The Wrong Premise
When I asked Gemini to spec out an API playground, the first response covered parameter input forms, formatted response display, and multi-language code snippets. Not bad — but that is a spec for a JSON-returning web service, not a mastering API.
After correcting that this was a music mastering API, the response shifted to drag-and-drop WAV upload, waveform visualizers, and A/B testing. Still wrong.
Gemini never understood why the API exists in the first place.
Why a Mastering API Is Different
aimastering.dev is offered as an API, but not for the usual scalability or billing reasons. Routing the user's audio file through our own infrastructure is itself the problem: it puts the user at a disadvantage.
Passing files through an unreliable uploader, downloader, or Cloud Run instance is the risk we refuse to impose on users. That is the core of the design philosophy.
A playground that "uploads WAV in the browser and plays it back" directly contradicts this design philosophy.
"Don't Route Through My Infrastructure"
It took three corrections before Gemini finally understood: the goal is to eliminate my server as a bottleneck or risk factor entirely.
Pushed to its conclusion, the role of the playground changes fundamentally. Not "a place to try things" but "a place to prepare users to run things in their own environment."
Control plane (API commands) passes through our server. Data plane (audio file) goes directly from the user's environment to the engine.
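A minimal sketch of that split, assuming a hypothetical client: only the parameter negotiation round-trips to the API, while the WAV is processed by a locally deployed engine. The /api/v1/plan endpoint and the local_mastering import are illustrative, not the shipped interface.

import os

import requests

from local_mastering import master_file  # hypothetical local engine entry point

API_KEY = os.environ["API_KEY"]

# Control plane: only parameters cross the wire, never audio.
plan = requests.post(
    "https://aimastering.dev/api/v1/plan",  # illustrative endpoint
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"target_lufs": -14, "genre": "techno"},
    timeout=30,
).json()

# Data plane: the WAV is read and written entirely on the user's machine.
master_file("track.wav", "track_mastered.wav", **plan["engine_settings"])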
The site offers "Source Code: Own the engine" — 923 lines of physics-based DSP, Jiles-Atherton transformer modeling, Koren triode equations, Studer tape emulation — deployable on the user's own infrastructure. The playground is also the pre-purchase inspection room for that product.
The Inversion — Command Generator
The ideal playground does not have a "Run" button as the primary action. Instead, "Copy to Terminal" sits at the center.
Every time the user adjusts target_lufs or genre in the UI, the cURL snippet below updates in real time.
curl -X POST \
  /api/v1/master \
  -H "Authorization: Bearer $API_KEY" \
  -F "file=@track.wav" \
  -F "target_lufs=-14" \
  -F "genre=techno"

{
  "job_id": "mst_a8f3...",
  "status": "processing",
  "analysis": {
    "lufs_in": -18.2,
    "lufs_target": -14.0,
    "agents_agreed": true,
    "consensus_score": 0.94
  }
}

These 4 lines are the complete API. Analysis, consensus, DSP — all in one call. The playground's primary job is to hand the user these 4 lines as a command ready to run on their own machine right now.
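A minimal sketch of the real-time command generator described above, assuming the UI state arrives as a plain dict. Nothing executes on our side; the function only formats a string for the user to copy.

def build_curl(params: dict, endpoint: str = "/api/v1/master") -> str:
    """Render the ready-to-run cURL command from the current UI state."""
    lines = [
        "curl -X POST \\",
        f"  {endpoint} \\",
        '  -H "Authorization: Bearer $API_KEY" \\',
        '  -F "file=@track.wav" \\',
    ]
    lines += [f'  -F "{key}={value}" \\' for key, value in params.items()]
    lines[-1] = lines[-1].rstrip(" \\")  # no continuation after the last flag
    return "\n".join(lines)

# Regenerate on every UI change; the user copies, we never see the file.
print(build_curl({"target_lufs": -14, "genre": "techno"}))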
| Generic API Playground | aimastering.dev (Ideal) |
|---|---|
| "Run it on our server" | "Here's how it runs in your Docker" |
| "Buy an API key" | "Own this engine (923 lines)" |
| "We store your data" | "4-line cURL from your environment, no storage" |
| "AI makes it sound better somehow" | "3 models negotiate a physics-based solution" |
Making Consensus Transparent
The core of aimastering.dev is a 3-agent Nash equilibrium consensus. Not an average — a negotiation. That is the site's claim.
Visualizing the consensus in the playground gives users concrete grounds to trust the API enough to run it from their own environment. Instead of "something was processed on a server," the user sees that intelligence was applied, directly in the response JSON.
{
  "consensus_minutes": {
    "agent_a": {
      "role": "Spectral Analyst",
      "finding": "Low-mids too warm (+3dB). Adjusting tilt.",
      "target_lufs": -14.0,
      "eq_delta": { "250hz": -2.8, "500hz": -1.1 }
    },
    "agent_b": {
      "role": "Dynamic Expert",
      "finding": "Dynamic range exceeds target. Requesting 1.5dB compression.",
      "target_lufs": -14.2,
      "comp_ratio": 2.1
    },
    "agent_c": {
      "role": "Tonal Surgeon",
      "finding": "Widen above 8kHz, mono below 200Hz.",
      "stereo_width_8k": 1.35,
      "mono_below_200hz": true
    },
    "arbiter": {
      "method": "weighted_median",
      "final_target_lufs": -14.1,
      "unresolved_tensions": ["stereo_width vs low_end_density"],
      "confidence": 0.94
    }
  }
}

The unresolved_tensions field matters. Leaving unresolved disagreements in the log rather than hiding them is the design choice that makes this system trustworthy to engineers.
Connecting to Source Code Sales
aimastering.dev goes beyond an API service. The 923-line physics-based DSP is available for purchase at $79 (Regular) or $299 (Extended / SaaS / redistribute).
Regular ($79):
- DSP engine
- Analysis engine
- Docker compose
- Deploy guide

Extended ($299):
- DSP engine
- Analysis engine
- Docker compose
- Deploy guide
- MIT core
- SaaS / redistribute
A user who verifies the results in the playground gains the motivation to ask: "Can I run this logic on my own infrastructure?" The playground is not an API demo — it is a confidence-building path toward a source code purchase.
aimastering-dev/
│
├ server/
│ ├ middleware/
│ ├ models/
│ ├ routers/
│ ├ services/
│ └ main.py
│
├ docs/
│ ├ quickstart.md
│ ├ api-reference.md
│ ├ authentication.md
│ └ error-codes.md
│
├ README.md
├ Dockerfile
├ local_mastering.py ← the "real output" of the playground
├ test_local_dsp.py
├ .gitignore
└ .gitattributes

The existence of local_mastering.py at the root is the heart of the philosophy. Configure parameters in the playground, and this file — with those settings embedded — becomes downloadable. That is the command generator in its complete form.
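A minimal sketch of how that download could be produced, assuming a hypothetical template step on the playground side. TEMPLATE, render_local_script, and the dsp_engine import are illustrative, not the shipped code.

TEMPLATE = '''\
# Generated by the aimastering.dev playground; settings embedded below.
# {{PLAYGROUND_SETTINGS}}

from dsp_engine import master  # hypothetical import of the purchased engine

if __name__ == "__main__":
    master("track.wav", "track_mastered.wav",
           target_lufs=TARGET_LUFS, genre=GENRE)
'''

def render_local_script(params: dict) -> str:
    """Embed the playground's current settings as constants in the script."""
    settings = "\n".join(f"{k.upper()} = {v!r}" for k, v in params.items())
    return TEMPLATE.replace("# {{PLAYGROUND_SETTINGS}}", settings)

print(render_local_script({"target_lufs": -14.0, "genre": "techno"}))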
Ideal Spec Summary
The specification that emerged from this conversation:
- "Copy to Terminal" as the primary action; every UI change regenerates the cURL snippet in real time.
- Consensus minutes, including unresolved_tensions, rendered directly from the response JSON.
- A downloadable local_mastering.py with the configured parameters embedded.
- No audio ever routed through our infrastructure; the data plane stays in the user's environment.
Verdict — Is It Legitimate?
At the end I asked Gemini directly: "Is the audio analysis Gemini and the consensus system legitimate, or is it a marketing claim?" The answer was unambiguous.
Evidence 1: GPT-5.4 (Engineer) / Claude Opus (Structure Guard) / Gemini Pro (Form Analyst) roles are hardcoded into the implementation. Each agent reads the same analysis_json independently and outputs its own target_envelopes.
Evidence 2: consensus_arbiter.py uses a weighted median, not averaging, and implements hard-constraint filtering to prevent runaway model outputs.
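A minimal sketch of weighted-median arbitration with hard-constraint filtering, assuming hypothetical weights and LUFS bounds; the real consensus_arbiter.py is not reproduced here.

def weighted_median(values, weights):
    """Return the value where cumulative weight first reaches half the total."""
    half = sum(weights) / 2.0
    acc = 0.0
    for value, weight in sorted(zip(values, weights)):
        acc += weight
        if acc >= half:
            return value

def arbitrate(proposals, weights, bounds=(-23.0, -8.0)):
    """Hard constraint filter: drop out-of-bounds proposals, then take the weighted median."""
    kept = [(v, w) for v, w in zip(proposals, weights) if bounds[0] <= v <= bounds[1]]
    values, kept_weights = zip(*kept)
    return weighted_median(values, kept_weights)

# Three agents propose target LUFS values; the arbiter settles on -14.1.
print(arbitrate([-14.0, -14.2, -14.1], weights=[1.0, 0.8, 1.0]))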
Evidence 3: The AI does not touch DSP knobs directly; it outputs a target specification sheet, and the Control Layer converts that into physical parameters. The Jiles-Atherton and Koren equations are defined as actual DSP stages in the code.
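A minimal sketch of that indirection, with entirely hypothetical field names and scaling constants; the point is the pattern (spec sheet in, physical parameters out), not the real mapping.

def spec_to_physical(spec: dict) -> dict:
    """Translate an agent-produced target spec into DSP stage parameters.

    The models only describe targets; this layer owns the physical knobs.
    """
    return {
        # Tape stage (Studer-style emulation): drive scaled to the loudness gap.
        "tape_drive_db": 0.5 * max(0.0, spec["lufs_target"] - spec["lufs_in"]),
        # Triode stage (Koren equations): bias nudged by requested compression.
        "triode_bias_v": -1.5 - 0.1 * spec.get("comp_ratio", 1.0),
        # Transformer stage (Jiles-Atherton model): saturation capped at 1.0.
        "transformer_sat": min(1.0, spec.get("stereo_width_8k", 1.0) / 2.0),
    }

print(spec_to_physical({"lufs_in": -18.2, "lufs_target": -14.0, "comp_ratio": 2.1}))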
"10 years of audio research" is not hyperbole. The 923 lines of physics and the consensus architecture demonstrate it. The playground is where that proof is delivered.
As the developer, what I want to say fits in one line:
"Don't trust my infrastructure. Trust this logic. Run it in your environment."