MIDI Lab BRD
AMF MIDI Lab — Business Requirements Document
Combined BRD and Functional Specification
Version 1.1 | Incorporates corrected technology architecture from Layer 12 audit
Executive Summary
AMF MIDI Lab is a personal-use web application for organizing and practicing the Adaptive Musician's Framework (AMF). It serves as a structured learning hub for AMF Semester 1 and a MIDI-first practice-material generator. The long-term version becomes a creative practice environment that generates jam tracks, MIDI exercises, charts, triad sheets, pitch-class-set sheets, rhythm-placement sheets, and instrument-specific practice tracks for acoustic guitar and piano.
Core product strategy: Creation-first, not analysis-first. Instead of asking the app to analyze commercial recordings, the app generates controlled musical environments from structured AMF data. Because the system creates the MIDI and metadata itself, it knows chord events, form, rhythm attack placements, pitch-class sets, triad options, and practice goals — which makes accurate AMF worksheet generation tractable.
Primary near-term goal: A reliable AMF lesson organizer and MIDI practice generator for Semester 1.
Primary long-term goal: A creative musicianship lab that generates and assists with analysis of songs, exercises, and custom practice environments.
Core technical strategy: Open-source MIDI and symbolic-music tools as the core. Client-side Tone.js for all interactive real-time audio. Server-side FluidSynth for non-real-time audio export only. Self-hosted Demucs + Basic Pitch for the song study pipeline.
Architecture principle: Every AMF learning question must be explicitly attached to the generated environment that trains it. Generated content without a learning question is engagement theater, not education.
Product Vision and Strategic Direction
Why This Exists
The AMF curriculum is now large enough that it needs a software home. The existing documents define a rich learning system, but documents alone are not ideal for daily practice. The software becomes the operational layer: what to practice today, how to generate supporting materials, how to track progress, how to hear examples, and how to connect every drill to the larger framework.
Core Product Thesis
MIDI-first generation gives the system exact knowledge of notes, timing, form, and intention. The app can then generate educational overlays with high accuracy — worksheets, chord charts, rhythm grids, TPS maps, and practice cards — because the system knows what it generated.
Automatic audio analysis of commercial recordings is error-prone, especially for full-band recordings. The creation-first approach sidesteps this problem for the core learning use case.
AMF Learning Systems in Scope
| System | Semester 1 Coverage | Post-Semester 1 |
|---|---|---|
| PDC | Role choice, contribution decisions | Full PDC depth in genre labs |
| Blues Root | Always present; felt foundation | Emotional shaping across genres |
| Rhythm Cells | Two cells, placement | Full cell library, polyrhythm |
| RXP | 12-bar form feel | Long-time work, whole-form composition |
| TPS | Three colors (major, minor, spread) | Full triad library, upper structures |
| SHAPE | One motif, rhythmic development | Full phrase architecture |
| CAS-ARC | One-chorus ARC, multi-chorus route | Full composition architecture |
| PCS | 027 as color object | Full set library |
Business Requirements
Business Objectives
- Reduce friction between AMF theory and daily practice
- Generate accurate practice materials automatically from AMF parameters
- Provide evidence-based progress tracking aligned with AMF mastery levels
- Support adaptive difficulty — materials match the learner's current demonstrated level
- Make the AMF Internal Band metaphor tangible through interactive tools
Business Problem
The AMF body of work contains many systems, documents, partner modules, and curriculum tracks. Without a software layer, the learner must manually navigate Word and PDF documents, remember spaced repetition schedules, generate exercises by hand, and maintain progress tracking externally. This increases friction and weakens consistency.
Guiding Product Principles
- Creation before analysis: Generate clean practice materials before attempting complex automatic song interpretation
- Structured data before audio polish: The app must know what it generated so it can teach it
- Same curriculum, different instrument interfaces: Guitar and piano share AMF concepts but get instrument-specific practice details
- Adaptive difficulty as a first-class requirement: The MIDI generator must adjust harmonic complexity, tempo, and rhythmic density based on demonstrated mastery level — this is the highest-value single feature in the application
- Every generated environment has an attached AMF learning question: A 12-bar blues backing track without a specific learning question is just a backing track
- Assistive tools, not fake certainty: When automated analysis is added, label it as draft/assistive until user-reviewed
- Personal-use first: Do not overbuild multi-user SaaS features in MVP
- Pedagogy is a first-class feature: Spaced repetition, slow practice, visualization, feedback, and definitions of done are not add-ons
The Five Interactive Tool Concepts
Tool 1: Rhythm Grid Editor
What it does: A step-sequencer-style grid where the learner builds, plays, and modifies rhythm patterns using AMF Rhythm Cells as the vocabulary.
Educational purpose: Makes the relationship between notation, rhythm cells, and musical feel concrete and immediate. The learner can place a Charleston pattern on the grid, hear it immediately, then move it to an upbeat to hear the anticipated version — without any musical notation expertise required.
AMF learning question attached to every session: "Which rhythmic position creates the most forward momentum? Which position creates space?"
Grid design:
- 4/4 time, one bar, subdivided to eighth notes (8 slots)
- Optional 16th-note subdivision for more advanced cells
- Color-coded slots by rhythmic position: downbeat (slot 1), weak beats, upbeats, anticipated positions
- Each slot: click to toggle attack on/off; drag to extend note duration
- Cell library: pre-built AMF cells available as drag-and-drop (quarter-pulse, Charleston, shuffle, stop patterns)
- Loop playback: plays continuously until stopped, adjustable BPM
- Export: save as MIDI pattern, append to 12-Bar Generator session
Audio implementation: Tone.js Sequencer in browser. No server round-trip for audio. Latency target: <10ms click-to-sound.
AMF learning levels mapped to grid options:
| Mastery Level | Available Grid Options |
|---|---|
| Level 1–2 | 4 slots visible (quarter-note grid only), 3 pre-built cells |
| Level 3 | 8 slots (eighth-note grid), 5 cells |
| Level 4+ | Full 16-slot grid, all cells, custom pattern entry |
Tool 2: TPS Placement Explorer
What it does: An interactive piano keyboard with a bass note selector. The learner chooses a bass note and a triad, and the app shows the TPS color produced — naming the color, showing the register options, and playing the sound.
Educational purpose: Makes TPS visible and immediately audible. The learner can explore the emotional difference between a major triad over a root, a minor triad over the same root, and a spread triad — as sound, not just as a diagram.
AMF learning question attached to every session: "What is the emotional difference between a major triad and a spread triad over the same bass note? Which one would you use to open a chorus? Which one to close it?"
Interface design:
- Piano keyboard view (two octaves, interactive)
- Bass note selector: separate single-octave keyboard below the main keyboard
- Triad selector: major, minor, diminished, augmented, spread (open position)
- TPS color label displayed on selection: "D major triad over Bb root = TPS Color 3 (Major 3rd above)"
- Inversions: root position, first inversion, second inversion buttons
- Register: high, middle, low placement buttons — sound updates in real time
- Playback: click to hear selected voicing; hold to sustain
- Related AMF prompt: displays the AMF practice question for the selected color
Audio implementation: Tone.js Sampler with real piano samples loaded at startup. AlphaSynth (alphaTab) may be used for the guitar fretboard variant of this tool.
Guitar fretboard variant (TPS on guitar):
- Same bass note + triad selector logic
- Guitar fretboard display (6 strings, 12 frets) instead of piano keyboard
- Shows which strings/frets produce the selected TPS color
- alphaTab for fretboard rendering and AlphaSynth for playback
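The bass note plus triad selector logic shared by both variants can be sketched with plain pitch-class arithmetic. This is a hedged illustration only: triad_over_bass and its lookup tables are hypothetical names, the spread triad is omitted because it is a voicing rather than a different pitch-class set, and the mapping from interval to a numbered TPS color is left out because it is defined by the AMF curriculum, not by this sketch.

```python
# Pitch classes: C = 0 ... B = 11. Both sharp and flat spellings are accepted.
NOTE_TO_PC = {
    "C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
    "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10, "B": 11,
}
INTERVAL_NAMES = [
    "unison", "minor 2nd", "major 2nd", "minor 3rd", "major 3rd", "perfect 4th",
    "tritone", "perfect 5th", "minor 6th", "major 6th", "minor 7th", "major 7th",
]
TRIAD_INTERVALS = {
    "major": (0, 4, 7),
    "minor": (0, 3, 7),
    "diminished": (0, 3, 6),
    "augmented": (0, 4, 8),
}


def triad_over_bass(triad_root: str, quality: str, bass: str) -> dict:
    """Describe a triad placed over a bass note (hypothetical helper)."""
    root_pc = NOTE_TO_PC[triad_root]
    bass_pc = NOTE_TO_PC[bass]
    # Distance from the bass note up to the triad root, in semitones
    distance = (root_pc - bass_pc) % 12
    triad_pcs = [(root_pc + step) % 12 for step in TRIAD_INTERVALS[quality]]
    return {
        "interval_above_bass": INTERVAL_NAMES[distance],
        "triad_pitch_classes": triad_pcs,
    }


# The document's own example: D major triad over a Bb bass note
print(triad_over_bass("D", "major", "Bb"))
```

In the application itself this computation would more likely go through Tonal.js on the frontend; the Python form is only meant to show the arithmetic.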
Tool 3: PCS Set Explorer
What it does: An interactive tool for exploring pitch-class sets (PCS objects) over different bass notes. Semester 1 focuses on 027. Later semesters add 016, 013, and more complex sets.
Educational purpose: Makes the ambiguous and resonant quality of PCS objects tangible. The learner plays a 027 set over different bass notes and hears how the same set changes emotional color depending on the bass note beneath it.
AMF learning question attached to every session: "How does the same 027 set change its emotional quality over a major root vs. a minor root? When would you use it, and when would you avoid it?"
Interface design:
- Set selector: 027 (Semester 1), 016, 013, custom (post-Semester 1)
- Bass note selector
- Pitch display: shows the actual notes of the set given the root (e.g., "027 from A = A, B, E")
- Piano and guitar display: shows the set visually on both instrument maps
- Playback: click to hear; broken arpeggio and block chord options
- Chord context: select the current chord (I7, IV7, V7, Im7) — app shows which sets are appropriate and which create tension
- AMF annotation: "This set works as a passing color over IV7 in Month 2"
Music theory computation: Tonal.js handles set construction, note naming, and interval calculation. A FastAPI endpoint handles the AMF-specific annotations (which sets are curriculum-appropriate at which level).
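As a rough illustration of set construction, the sketch below builds the notes of a set from a root. It is not the Tonal.js call the frontend would use: pcs_notes is a hypothetical helper, it only handles set names made of single digits (so no pitch classes 10 or 11), and it spells every note with sharps, which is an assumption that proper note naming would replace.

```python
from typing import List

NOTE_TO_PC = {
    "C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
    "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10, "B": 11,
}
PC_TO_NOTE = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pcs_notes(set_name: str, root: str) -> List[str]:
    """Return the notes of a pitch-class set transposed to start on root."""
    root_pc = NOTE_TO_PC[root]
    # Each digit in the set name is a semitone offset from the root
    return [PC_TO_NOTE[(root_pc + int(digit)) % 12] for digit in set_name]


print(pcs_notes("027", "A"))  # ['A', 'B', 'E']
```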
Tool 4: 12-Bar Generator
What it does: Generates a complete, structured 12-bar blues backing track from AMF curriculum parameters. The output is MIDI and optional audio, plus an AMF worksheet showing what was generated and why.
This is the highest-priority generative tool. The Rhythm Grid and TPS Explorer teach individual elements; the 12-Bar Generator puts them together in a musical context.
AMF learning question attached to every generated track: Specific to the parameters. Example: "This track uses anticipation on bar 5 (IV chord). Listen for it. Can you place your own rhythm cell in the same anticipation position without losing the form?"
Generator parameters:
| Parameter | Options |
|---|---|
| Key | A, Bb, C, E (others post-Semester 1) |
| Tempo | 50–120 BPM (slider) |
| Feel | Straight, swing, shuffle, slow blues |
| Month | 1 (Stabilize), 2 (Vary), 3 (Adapt) |
| Difficulty | Auto (uses tracked mastery level) or manual |
| Instrument tracks | Click, bass, drums, piano comp, guitar comp, PCS melodic exercise |
| Number of choruses | 1–6 |
| AMF system focus | Rhythm Cells, TPS, SHAPE, CAS-ARC (selects which worksheet is generated) |
| Chord voicing density | Sparse, medium, full |
Adaptive difficulty — mandatory requirement:
The generator must not be a static template picker. Adaptive difficulty means:
- Tempo: Lower for earlier mastery levels (Month 1 target: 60–70 BPM; Month 3 allows up to 100 BPM)
- Harmonic complexity: Month 1 = plain I7/IV7/V7. Month 2 = adds passing chords and brief substitutions. Month 3 = minor blues, extended chords
- Rhythmic density: Month 1 = sparse downbeat-focused comping. Month 2 = anticipations and stops. Month 3 = full rhythmic vocabulary
- The app should read the learner's tracked mastery level and suggest parameters accordingly. Manual override is always available.
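The rules above can be sketched as a parameter suggestion function. Everything here is illustrative: suggest_parameters and MONTH_PROFILES are hypothetical names, the Month 2 tempo range and the Month 3 lower bound are assumptions (the text only gives 60–70 BPM for Month 1 and a 100 BPM ceiling for Month 3), and the linear spread of tempo across mastery levels 1 to 6 is one possible policy rather than a stated requirement.

```python
from typing import Dict, Optional

MONTH_PROFILES = {
    1: {"tempo_range": (60, 70),
        "harmony": "plain I7/IV7/V7",
        "rhythm": "sparse downbeat-focused comping"},
    2: {"tempo_range": (60, 85),   # assumption: range not given in the text
        "harmony": "passing chords and brief substitutions",
        "rhythm": "anticipations and stops"},
    3: {"tempo_range": (60, 100),  # ceiling from the text; floor is an assumption
        "harmony": "minor blues, extended chords",
        "rhythm": "full rhythmic vocabulary"},
}
MAX_LEVEL = 6  # mastery is tracked on a 1-6 scale in the progress tracker


def suggest_parameters(month: int, mastery_level: int,
                       override: Optional[Dict] = None) -> Dict:
    """Suggest generator parameters; a manual override always wins."""
    profile = MONTH_PROFILES[month]
    low, high = profile["tempo_range"]
    level = min(max(mastery_level, 1), MAX_LEVEL)
    # Assumed policy: spread tempo linearly across the mastery scale
    tempo = round(low + (high - low) * (level - 1) / (MAX_LEVEL - 1))
    suggestion = {
        "month": month,
        "tempo_bpm": tempo,
        "harmony": profile["harmony"],
        "rhythm": profile["rhythm"],
    }
    if override:
        suggestion.update(override)
    return suggestion


print(suggest_parameters(month=1, mastery_level=1))
print(suggest_parameters(month=3, mastery_level=6, override={"tempo_bpm": 90}))
```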
Generated output:
- MIDI file (downloadable)
- Audio preview (optional — rendered server-side via FluidSynth as MP3)
- Chord chart (HTML, printable)
- RXP rhythm grid (visual representation of what the backing track plays)
- TPS worksheet (which triads are available over each chord in the form)
- AMF learning question (specific to the parameters chosen)
- Practice card (what to practice against this track, step-by-step)
Architecture note: FastAPI generates the MIDI structure and returns note data to the frontend. The frontend Tone.js handles preview playback from the note data. FluidSynth only runs when the user requests a downloadable MP3 export.
Tool 5: Moises-Assisted Song Workspace (Revised: Demucs + Basic Pitch Pipeline)
What it does: A workspace where the learner studies a real song — an anchor song or other repertoire — by uploading audio, running stem separation and audio-to-MIDI conversion, annotating the results, and generating AMF practice tasks from the structured findings.
Why it exists: Real recordings teach things that generated backing tracks cannot. The ability to isolate a bass line, slow down a guitar phrase, or convert a melody to MIDI and compare it against your own playing is genuinely useful for AMF development. The question is how to build this without API dependencies or audio quality limitations.
Revised pipeline (replacing Moises API):
User uploads audio file
|
v
Demucs v4 (async job, server-side)
-> stems: vocal, drums, bass, other
|
v
Basic Pitch (on "other" stem)
-> MIDI transcription of harmonic/melodic content
|
v
User reviews and annotates
-> chord symbols, section markers, tempo, key, notes
|
v
App generates AMF practice tasks
-> PDC, RXP, TPS, PCS, guitar, piano prompts
Workspace features:
- Audio file upload (MP3, WAV, FLAC)
- Stem player: play/mute individual stems (vocal, drums, bass, other)
- MIDI viewer: display Basic Pitch output, allow note correction
- Annotation fields: chord chart, section labels (intro, verse, chorus, turnaround), tempo, key, capo/pitch notes
- AMF prompt generator: takes annotations → outputs PDC listening questions, RXP placement analysis, TPS color map, practice tasks for guitar and piano
- Reference links: link to Moises session, Spotify track, or YouTube for external reference
AMF learning question attached to every workspace session: Specific to the song. Example for Blue Monk: "Identify the first bar in the recording where Monk leaves a deliberate rest longer than two beats. What harmonic function is on that bar? What does the silence say?"
Demucs v4 processing specs:
- CPU-only processing on DigitalOcean Ubuntu server: approximately 60–90 seconds for a 3-minute track
- Result cached after first processing — second access is instant
- Job queue: async worker handles processing; user notified when stems are ready
- No GPU required for personal-use scale
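The caching behavior in these specs can be sketched by keying results on a hash of the uploaded file, so the same audio is never separated twice. This is a simplified, synchronous, in-memory illustration: StemCache and fake_separator are hypothetical, the real separation call to Demucs is replaced by a stand-in function, and a production version would store stems on disk and run the job in the async worker.

```python
import hashlib
from typing import Callable, Dict

Stems = Dict[str, bytes]


class StemCache:
    """Cache separation results by the SHA-256 of the uploaded audio bytes."""

    def __init__(self, separate: Callable[[bytes], Stems]) -> None:
        self._separate = separate          # the expensive separation job
        self._results: Dict[str, Stems] = {}
        self.separation_runs = 0           # counts how often the job actually ran

    def get_stems(self, audio: bytes) -> Stems:
        key = hashlib.sha256(audio).hexdigest()
        if key not in self._results:
            # First access: run the slow job and remember the result
            self.separation_runs += 1
            self._results[key] = self._separate(audio)
        return self._results[key]


def fake_separator(audio: bytes) -> Stems:
    # Stand-in for the real Demucs job; returns empty stems with the
    # stem labels used in this document.
    return {name: b"" for name in ("vocal", "drums", "bass", "other")}


cache = StemCache(fake_separator)
cache.get_stems(b"same upload")
cache.get_stems(b"same upload")  # served from cache, no second run
print(cache.separation_runs)     # 1
```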
Moises as manual companion: If the user also uses Moises externally, they can paste or enter their Moises session notes directly into the workspace annotation fields. The workspace is not dependent on Moises — it is compatible with Moises as an optional data source.
Corrected Technology Stack
Frontend
| Technology | Purpose | Why |
|---|---|---|
| Next.js | React framework, routing, SSR | Production-ready, component ecosystem |
| Tone.js | Client-side audio scheduling and synthesis | <10ms latency target for interactive tools; built on the W3C Web Audio API; used in production by Chrome Music Lab, Strudel |
| Tonal.js | Music theory computations | Chord analysis, scale construction, interval calculations, note transformations |
| VexFlow | Notation snippets, rhythm examples | Production-mature for programmatic notation generation |
| alphaTab | Guitar tablature + notation + AlphaSynth playback | Combines display and playback for guitar track in one library |
Critical architecture rule: All interactive, real-time audio must use Tone.js in the browser. Server-side FluidSynth introduces 200–500ms latency per click — this destroys the feedback loop that makes interactive learning tools educationally valuable. Tone.js in the browser targets <10ms click-to-sound latency.
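A short calculation shows why the server round-trip figure matters musically. At 120 BPM, the top of the generator's tempo range, an eighth note lasts 250 ms, so a 200–500 ms delay is between 0.8 and 2 eighth notes late, while 10 ms is a small fraction of one. The helper names below are hypothetical and exist only for this arithmetic.

```python
def note_duration_ms(bpm: float, notes_per_beat: int) -> float:
    """Duration of one subdivision in milliseconds (2 = eighths, 4 = sixteenths)."""
    return 60_000 / bpm / notes_per_beat


def latency_in_eighths(latency_ms: float, bpm: float) -> float:
    """Express a latency as a fraction of an eighth note at the given tempo."""
    return latency_ms / note_duration_ms(bpm, 2)


for bpm in (50, 120):
    eighth = note_duration_ms(bpm, 2)
    print(
        f"{bpm} BPM: eighth = {eighth:.0f} ms, "
        f"200 ms = {latency_in_eighths(200, bpm):.2f} eighths, "
        f"500 ms = {latency_in_eighths(500, bpm):.2f} eighths, "
        f"10 ms = {latency_in_eighths(10, bpm):.2f} eighths"
    )
```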
Backend
| Technology | Purpose | Constraints |
|---|---|---|
| FastAPI | REST API, MIDI generation endpoint, job management | Python; handles MIDI data generation; returns note lists to frontend (not audio) |
| Python | General backend language | |
| pretty_midi | MIDI file generation | v0.2.11; use fluidsynth() synthesis method; avoid synthesize() (sine waves, unusable) |
| mido | MIDI I/O, lower-level manipulation | More actively maintained than pretty_midi for I/O |
| music21 | Music theory, harmonic analysis | MUST NOT be in real-time request paths. Heavy object model (deepcopy operations are slow). Offline/pre-computed use only: generating curriculum content, pre-analyzing chord charts, building TPS maps at content creation time. |
| PostgreSQL | User data, progress tracking, generated content cache | |
Audio and Analysis Pipeline
| Technology | Purpose | Notes |
|---|---|---|
| FluidSynth | Server-side audio export | Non-real-time only: generates downloadable MP3/WAV backing tracks from MIDI |
| SoundFont (TBD) | Audio sample quality | First-class content decision — see SoundFont note below |
| Demucs v4 | Stem separation | Self-hosted, async job, CPU processing 60–90s/track, results cached |
| Basic Pitch | Audio-to-MIDI transcription | Spotify open-source; applied to Demucs "other" stem for melodic content |
SoundFont Selection — First-Class Content Decision
SoundFont selection determines the musical quality of all server-side generated audio. The default GeneralUser GS SoundFont is musically adequate but not blues-authentic. Blues-appropriate piano and guitar sounds are central to the AMF learning experience.
SoundFont selection criteria:
- Piano: must have realistic electric/acoustic piano samples appropriate for blues and jazz-blues (not GM-style synthesized piano)
- Guitar: acoustic and electric guitar samples with appropriate attack and resonance
- Drums: shuffled and straight blues drum kit sounds
- File size: manageable for server deployment
Decision gate: SoundFont must be selected and evaluated before launch. Options to evaluate include MuseScore's official SoundFont, Salamander Grand Piano, SGM-v2.01, and specialized blues/jazz SoundFonts. This is not a decision to defer to post-launch.
Repository Structure
amf-midi-lab/
  README.md
  docs/
    brd-fsd/
    product-notes/
    amf-source-library/
  apps/
    web/                        # Next.js frontend
      components/
        RhythmGrid/             # Tone.js step sequencer
        TpsExplorer/            # Piano keyboard + Tone.js sampler
        PcsExplorer/            # PCS set visualization
        TwelveBarGenerator/     # Parameter UI + track display
        SongWorkspace/          # Demucs + Basic Pitch UI
      lib/
        tone/                   # Tone.js initialization and shared audio context
        tonal/                  # Tonal.js chord/scale helpers
    api/                        # FastAPI backend
      routers/
        midi.py                 # MIDI generation endpoints
        curriculum.py           # Curriculum data endpoints
        progress.py             # Progress tracking endpoints
        workspace.py            # Song workspace endpoints
      services/
        midi_generator.py       # pretty_midi + mido generation
        demucs_worker.py        # Demucs v4 async job
        basic_pitch_worker.py   # Basic Pitch transcription
        # NOTE: music21 service is offline/batch only; not imported in hot path
      music21_offline/          # Offline analysis scripts (never called at request time)
  packages/
    amf-core/                   # Shared types, curriculum constants, AMF system maps
    music-core/                 # Music theory helpers, MIDI/form abstractions
    worksheet-core/             # Worksheet generation templates
  workers/
    audio-renderer/             # FluidSynth/ffmpeg export jobs
    stem-separator/             # Demucs v4 jobs
    midi-importer/              # Upload parse/tag jobs
  data/
    seed/                       # Seed data for curriculum, AMF system constants
    examples/
  storage/
    assets/                     # Uploaded/generated assets
    generated/
    soundfonts/                 # SoundFont files for FluidSynth
  scripts/
    dev-setup.sh
    render-midi.sh
    seed-db.py
  tests/
    unit/
    integration/
Core Data Structures
FormMap
{
"id": "form_12bar_a_blues",
"title": "12-Bar Blues in A",
"meter": "4/4",
"bars": 12,
"choruses": 3,
"sections": [
{"label": "A1", "bars": [1,2,3,4], "function": "establish"},
{"label": "A2", "bars": [5,6,7,8], "function": "develop_return"},
{"label": "Turnaround", "bars": [9,10,11,12], "function": "tension_complete"}
]
}
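A FormMap like the one above is only useful if its sections account for every bar exactly once. The check below is a minimal sketch of that validation; form_map_problems is a hypothetical helper, not part of the specified API.

```python
from typing import Dict, List


def form_map_problems(form: Dict) -> List[str]:
    """Return a list of coverage problems; an empty list means the form is consistent."""
    expected = set(range(1, form["bars"] + 1))
    seen = [bar for section in form["sections"] for bar in section["bars"]]
    problems = []
    missing = sorted(expected - set(seen))
    if missing:
        problems.append(f"bars not covered by any section: {missing}")
    duplicated = sorted({bar for bar in seen if seen.count(bar) > 1})
    if duplicated:
        problems.append(f"bars assigned to more than one section: {duplicated}")
    outside = sorted(set(seen) - expected)
    if outside:
        problems.append(f"bars outside the form: {outside}")
    return problems


form = {
    "id": "form_12bar_a_blues",
    "bars": 12,
    "sections": [
        {"label": "A1", "bars": [1, 2, 3, 4]},
        {"label": "A2", "bars": [5, 6, 7, 8]},
        {"label": "Turnaround", "bars": [9, 10, 11, 12]},
    ],
}
print(form_map_problems(form))  # []
```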
ChordEvent
{
"bar": 1,
"beat": 1,
"duration_beats": 4,
"symbol": "A7",
"root": "A",
"quality": "dominant7",
"function": "I7",
"available_tps": ["A major triad", "C# diminished fragment", "E minor over A"],
"available_pcs": ["027", "016"],
"amf_month": 1
}
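To show how a list of ChordEvent records might be produced, the sketch below generates one chorus of plain dominant-seventh changes. The bar layout (I7 for four bars, IV7 for two, I7 for two, then V7, IV7, I7, V7) is one common 12-bar form and is an assumption here, as are the names twelve_bar_chord_events, TWELVE_BAR, and DEGREE_SEMITONES. The AMF-specific fields (available_tps, available_pcs, amf_month) are left out because they come from curriculum data.

```python
from typing import Dict, List

# One spelling per pitch class, chosen so the supported keys (A, Bb, C, E)
# and their IV and V roots are spelled conventionally.
NOTE_NAMES = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]
DEGREE_SEMITONES = {"I7": 0, "IV7": 5, "V7": 7}
TWELVE_BAR = ["I7", "I7", "I7", "I7",
              "IV7", "IV7", "I7", "I7",
              "V7", "IV7", "I7", "V7"]  # assumed form; other variants exist


def twelve_bar_chord_events(key: str) -> List[Dict]:
    """Build one chord event per bar for a plain 12-bar blues in the given key."""
    tonic = NOTE_NAMES.index(key)
    events = []
    for bar, function in enumerate(TWELVE_BAR, start=1):
        root = NOTE_NAMES[(tonic + DEGREE_SEMITONES[function]) % 12]
        events.append({
            "bar": bar,
            "beat": 1,
            "duration_beats": 4,
            "symbol": f"{root}7",
            "root": root,
            "quality": "dominant7",
            "function": function,
        })
    return events


for event in twelve_bar_chord_events("A")[:5]:
    print(event["bar"], event["symbol"], event["function"])
```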
RhythmGrid
{
"meter": "4/4",
"subdivision": "eighth",
"slots_per_bar": 8,
"attacks": [1, 4, 8],
"labels": ["downbeat", "upbeat", "upbeat"],
"anticipations": [{"slot": 8, "anticipates": "bar+1 beat 1"}],
"stops": [4, 8],
"amf_cell_name": "Charleston",
"amf_month": 1
}
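Slot numbers in a RhythmGrid have to be converted to beat positions and then to time before they can be scheduled or written to MIDI. The sketch below assumes slots are numbered from 1 (so slot 1 is beat 1) and assumes a straight, unswung grid; slot_to_beat and slot_to_seconds are hypothetical helper names.

```python
def slot_to_beat(slot: int, slots_per_bar: int = 8, beats_per_bar: int = 4) -> float:
    """Beat position of a 1-based slot, e.g. 2.5 means the 'and' of beat 2."""
    if not 1 <= slot <= slots_per_bar:
        raise ValueError(f"slot must be between 1 and {slots_per_bar}")
    return 1 + (slot - 1) * beats_per_bar / slots_per_bar


def slot_to_seconds(slot: int, bpm: float,
                    slots_per_bar: int = 8, beats_per_bar: int = 4) -> float:
    """Offset of a slot from the start of the bar, in seconds, at a straight feel."""
    beats_from_bar_start = slot_to_beat(slot, slots_per_bar, beats_per_bar) - 1
    return beats_from_bar_start * 60 / bpm


for slot in (1, 4):
    print(slot, slot_to_beat(slot), slot_to_seconds(slot, bpm=120))
```

Swing and shuffle feels would need an extra offset for off-beat slots, which this sketch does not model.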
PracticeTask
{
"title": "Month 1 Guitar: Muted Pulse Through 12 Bars",
"semester": 1,
"month": 1,
"instrument": "acoustic_guitar",
"duration_options": {"minimum": 10, "standard": 25, "extended": 45},
"systems": ["PDC", "Blues Root", "Rhythm Cell", "RXP"],
"instructions": "Play muted strings through the full form. Keep the groove alive without harmonic changes.",
"amf_learning_question": "Can you feel the 12-bar cycle as three four-bar sections by the end of three passes?",
"definition_of_done": "Complete three choruses at slow tempo without losing form; record one pass and identify one improvement.",
"adaptive_difficulty": {
"mastery_level_1_2": {"tempo_bpm": 55, "bars": 4},
"mastery_level_3": {"tempo_bpm": 65, "bars": 12},
"mastery_level_4": {"tempo_bpm": 75, "bars": 12, "add_chord_changes": true}
}
}
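Reading the adaptive_difficulty block of a PracticeTask could look like the sketch below. The function name difficulty_for_level is hypothetical, and reusing the mastery_level_4 entry for levels above 4 is an assumption, since the example task defines no higher bucket.

```python
from typing import Dict


def difficulty_for_level(task: Dict, mastery_level: int) -> Dict:
    """Pick the adaptive-difficulty settings that match a mastery level."""
    buckets = task["adaptive_difficulty"]
    if mastery_level <= 2:
        return buckets["mastery_level_1_2"]
    if mastery_level == 3:
        return buckets["mastery_level_3"]
    # Assumption: levels above 4 fall back to the level-4 settings
    return buckets["mastery_level_4"]


task = {
    "title": "Month 1 Guitar: Muted Pulse Through 12 Bars",
    "adaptive_difficulty": {
        "mastery_level_1_2": {"tempo_bpm": 55, "bars": 4},
        "mastery_level_3": {"tempo_bpm": 65, "bars": 12},
        "mastery_level_4": {"tempo_bpm": 75, "bars": 12, "add_chord_changes": True},
    },
}
print(difficulty_for_level(task, 3))  # {'tempo_bpm': 65, 'bars': 12}
```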
GeneratedExercise
{
"id": "uuid",
"created_at": "timestamp",
"parameters": {
"key": "A",
"tempo": 70,
"feel": "shuffle",
"month": 1,
"tracks": ["click", "bass", "drums", "piano_comp"],
"choruses": 3,
"amf_system_focus": "Rhythm Cells"
},
"amf_learning_question": "This backing track has a constant quarter-note pulse in the bass. Can you maintain your own rhythm cell on top without rushing?",
"form_map": "form_12bar_a_blues",
"chord_events": [...],
"rhythm_grid": {...},
"midi_file_path": "storage/generated/exercise_uuid.mid",
"audio_file_path": null,
"worksheet_data": {...}
}
Sprint Plan
Sprint 1 — MVP: Rhythm Grid + TPS Explorer + 12-Bar Generator Backend
Deliverable: A working minimal app with the three highest-priority interactive tools.
Backend work:
- FastAPI project scaffolding
- POST /midi/generate-12bar endpoint: accepts FormMap parameters, returns ChordEvents and RhythmGrid JSON
- GET /curriculum/semester/1 endpoint: returns week-by-week curriculum data
- PostgreSQL schema: users, generated_exercises, progress_tracking
- pretty_midi MIDI file generation from ChordEvents
- File storage for generated MIDI
Frontend work:
- Next.js project scaffolding
- Tone.js initialization (shared AudioContext, BPM sync)
- Tonal.js integration (chord note resolution)
- Rhythm Grid component: 8-slot grid, Tone.js Sequencer, 3 pre-built cells
- TPS Explorer component: piano keyboard, bass note selector, Tone.js Sampler (piano samples)
- 12-Bar Generator form: parameter inputs, fetch /midi/generate-12bar, display chord chart, play via Tone.js
Not in Sprint 1:
- FluidSynth audio export (download button deferred)
- PCS Explorer
- Progress tracking UI
- Demucs/Basic Pitch pipeline
Sprint 1 acceptance criteria:
- Rhythm Grid plays a selected cell at the chosen BPM with under 10ms click-to-sound latency
- TPS Explorer plays any selected voicing with under 10ms click-to-sound latency
- 12-Bar Generator returns a chord chart for A blues at 70 BPM in under 1 second
- Form, tempo, and feel parameters change the generated output predictably
- Every generated environment displays an AMF learning question
Sprint 2 — PCS Explorer + Progress Tracking UI + Recording Upload
Deliverable: PCS Explorer tool, full progress tracker interface, and recording upload/playback.
Backend work:
- GET /pcs/sets and POST /pcs/build endpoints (Tonal.js-derived, or FastAPI with music theory logic)
- Progress tracking endpoints: POST /progress/skill, GET /progress/summary
- Recording upload endpoint: accept MP3/WAV, store locally, return playback URL
- Self-evaluation questions endpoint: GET /curriculum/week/{n}/questions
Frontend work:
- PCS Explorer component: set selector, bass note, piano and guitar display, Tone.js playback
- Progress Tracker UI: skill table with 1–6 level selectors, monthly checkpoints, Definitions of Done
- Weekly Sprint Log component: recording upload field, structured self-evaluation questions
- Recording player: simple playback of uploaded recordings within the sprint log
Sprint 2 acceptance criteria:
- PCS Explorer shows correct notes for 027 over any selected root
- Progress Tracker persists skill levels across sessions
- Weekly sprint log self-evaluation questions are loaded from curriculum data (not hardcoded)
- Recording can be uploaded and played back within the sprint log UI
Sprint 3 — Song Workspace (Demucs + Basic Pitch Pipeline)
Deliverable: Complete Song Workspace — upload audio, run stem separation and transcription, annotate, generate AMF practice tasks.
Backend work:
- Async job queue setup (Celery or simple worker pattern)
- Demucs v4 integration: POST /workspace/upload triggers async Demucs job
- Basic Pitch integration: on Demucs completion, runs Basic Pitch on "other" stem
- GET /workspace/job/{id} endpoint: poll job status
- AMF task generator: takes song annotations → returns PDC, RXP, TPS practice tasks
- Result caching: processed stems and MIDI stored so re-upload is not required
Frontend work:
- Song Workspace page: file upload, job status indicator, stem player UI
- Annotation form: chord symbols, section labels, tempo, key, notes
- MIDI viewer: display Basic Pitch output, basic note correction
- AMF practice task display: generated prompts per annotation
Sprint 3 acceptance criteria:
- 3-minute MP3 upload → Demucs stems delivered in under 120 seconds on CPU
- Stem player correctly mutes/solos individual stems
- Basic Pitch MIDI output is displayed with correct note pitches (timing may require user correction)
- AMF practice tasks are generated from annotation data
Post-Sprint 3 — Audio Export and Worksheet Generation
Deliverable: FluidSynth audio export for generated exercises, PDF worksheet export.
Backend work:
- FluidSynth integration: POST /midi/render-audio runs FluidSynth + ffmpeg server-side, returns MP3 download
- Worksheet generator: takes GeneratedExercise → produces HTML worksheet with chord chart, RXP grid, TPS map
- PDF export: browser print-to-PDF or server-side PDF generation
Audio export acceptance criteria:
- Generated 12-bar backing track in A at 70 BPM renders to MP3 in under 30 seconds
- MP3 audio quality is musically acceptable (SoundFont review complete before this sprint)
Architecture Diagram
Browser
|-- Next.js (React components)
| |
| |-- Tone.js (Web Audio API)
| | Interactive audio: Rhythm Grid, TPS Explorer,
| | PCS Explorer, 12-Bar preview playback
| | Latency: <10ms click-to-sound
| |
| |-- Tonal.js
| | Music theory: chord construction, scale computation,
| | interval arithmetic, note naming
| |
| |-- VexFlow
| | Notation snippets, rhythm examples
| |
| |-- alphaTab
| Guitar tablature display + AlphaSynth playback
|
|-- HTTP API calls
|
v
FastAPI (Python)
|
|-- MIDI generation (pretty_midi, mido)
| Returns: note list JSON → frontend Tone.js plays it
| Returns: MIDI file → downloadable
|
|-- Curriculum data (static + database)
|
|-- Progress tracking
|
|-- Async job manager
|
|-- Demucs v4 worker (stem separation)
| 60-90s/track CPU, results cached
|
|-- Basic Pitch worker (audio-to-MIDI)
|
|-- FluidSynth/ffmpeg export worker
NON-REAL-TIME ONLY
For downloadable MP3 backing tracks
NOT for interactive per-click audio
|
v
PostgreSQL
Users, progress, generated exercises, workspace data
File Storage
Generated MIDI files
Generated audio (MP3/WAV exports)
Uploaded audio (song workspace)
Processed stems (Demucs output)
SoundFonts (for FluidSynth)
music21 — OFFLINE ONLY
Pre-computes: curriculum chord maps, TPS color maps,
PCS relationship tables, theoretical annotations.
NEVER called from a FastAPI request handler.
Runs as offline data generation scripts in /scripts/
Security, Privacy, and Copyright
Security:
- Personal-use MVP: deploy behind authentication (HTTP Basic Auth or simple token-based auth) before any public exposure
- Environment variables for all secrets (API keys, database credentials) and for deployment-specific settings such as SoundFont paths
- Uploaded audio served through authenticated endpoints, not from public web root
- Log job status and errors; do not log file contents or audio data
Copyright:
- Do not store or distribute copyrighted audio publicly
- Generated AMF exercises are original — created from user parameters, not from copyrighted recordings
- User-uploaded audio in the Song Workspace is for personal study only — not shared or reused
- Moises-derived outputs are subject to Moises terms and the source recording's copyright status
Open-source license tracking: Create a dependency register (name, version, URL, license, product use) early, and verify each license against the project's repository when the register is created. Tone.js: MIT. Tonal.js: MIT. pretty_midi: MIT. music21: BSD. Demucs: MIT. Basic Pitch: Apache 2.0. VexFlow: MIT. alphaTab: MPL-2.0.
Acceptance Criteria and Definition of Done
Interactive tool readiness (Sprint 1):
- Audio latency: <10ms click-to-sound in Chrome and Firefox (measured with Web Audio API currentTime logging)
- Correctness: TPS Explorer produces correct notes for any root and triad combination (validated against Tonal.js ground truth)
- Correctness: Rhythm Grid produces correct MIDI timing (validated by MIDI file analysis)
MIDI generation readiness:
- 12-bar blues in any supported key at any supported tempo generates in under 500ms
- Generated MIDI parses correctly with mido and produces correct note events
- Adaptive difficulty parameters produce measurably different outputs at Level 2 vs Level 4
AMF learning question requirement (all tools, all sprints):
- Every generated practice environment displays a specific AMF learning question before or alongside the generated material
- Learning questions are stored in curriculum data, not hardcoded in components
- Questions change based on AMF system focus and mastery level parameters
Adaptive difficulty requirement (12-Bar Generator):
- Tempo decreases automatically for Level 1–2 mastery inputs
- Harmonic complexity (chord substitutions, extensions) is absent in Level 1–2, present in Level 4+
- Rhythmic density of generated comp tracks is sparse for Level 1–2, full for Level 4+
- Manual override is always available regardless of mastery level detection