Defining the Name — What is TRIVIUM?
A governance structure in which multiple evaluating agents — each holding contradictory but legitimate claims — converge on a single resolution through deliberation. Applying the medieval liberal arts Trivium (grammar, logic, rhetoric) as three audio AI agents. A proper noun demanded by the structure of the mastering problem itself.
| Agent | Role |
|---|---|
| GRAMMATICA | Guardian of Physical Law Jiles-Atherton model, BS.1770-4 spec, clip prevention — monitors the grammar that keeps audio from breaking |
| LOGICA | Interpreter of Musical Structure Reads Gemini scan data and logically constructs energy transitions from Intro → Drop → Outro |
| RHETORICA | Director of Aesthetic Expression Tube harmonics, high-frequency shimmer — responsible for the sensory parameters that persuade the listener |
Premise — Necessity, Not Homage
The first reaction to applying TRIVIUM to mastering is usually "why medieval scholarship now?" It isn't a historical exercise.
From an acoustic engineering perspective, mastering is structurally a task that requires consensus. It is not the kind of problem where a single agent can declare "this is the optimal solution."
Three Contradictory Truths
There are three axes mastering perpetually oscillates between. Each is "correct." None of the three can be simultaneously maximized.
| Agent | Correct Answer It Defends | The Cost |
|---|---|---|
| GRAMMATICA | Engineering validity — LUFS, True Peak, phase coherence, BS.1770-4 spec compliance | Strict adherence kills impact |
| LOGICA | Structural integrity — arrangement continuity, low-end stability, cross-section coherence | Over-tidying erases character |
| RHETORICA | Aesthetic sensibility — high-frequency shimmer, track-specific edge, audience emotional response | Over-excitement causes ear fatigue |
These three are always in trade-off. Maximizing GRAMMATICA at RHETORICA's expense yields spec-compliant, boring mastering. Let RHETORICA run unchecked and LOGICA raises alarms. This tension is not a bug — it is the exact shape of what mastering is.
Formalizing as Nash Equilibrium
Described in game-theoretic terms, the solution this three-way consensus is seeking is a Nash Equilibrium: the point at which no agent, given the strategies of the others, can improve its own utility by unilaterally changing its strategy.
p ∈ ParameterSpace
∏ Ui(p)
i ∈ { GRAMMATICA, LOGICA, RHETORICA }
Each agent's utility function Ui(p) returns how satisfied it is with parameter set p according to its own evaluation axis. The product is maximized because if any single agent evaluates p near zero, the entire product collapses to zero — meaning each agent has veto power built into the math.
This veto is the mathematical basis for the do_not_damage list. When LOGICA flags "never reduce low-end density in this section," that is ULOGICA(p) → 0 for any p that violates it.
Role Definitions for Three Agents
Role assignment to specific models is fixed as follows. These do not rotate — rotating models breaks role consistency across sessions.
Guarantees engineering validity: LUFS, TP, phase, spectral balance, BS.1770-4 compliance. Rejects proposals without numeric justification.
Monitors arrangement continuity, structural integrity, and cross-section contextual coherence. Holds veto on flattening or character erosion.
Evaluates track-specific aesthetics, genre expectations, and the sensory dimensions of the listening experience. Vetoes "spec-compliant but characterless" outcomes.
Field-level weights mirror the role assignments. RHETORICA leads on macro_form (0.50), GRAMMATICA leads on whole_track_targets (0.55), LOGICA leads on failure_conditions (0.50) — each agent holds maximum influence in its domain of expertise.
Why 'Dumb AI' Kept Failing
Their mental model:
Same logic as applying a filter to an image. The uploader is central because the filter has to receive the file.
In this view, AI's role is just selecting which filter to apply. More presets, more parameter UI — the essence doesn't change. Musical judgment is treated as template matching, not intelligence.
There's a question they will never answer: "Should I boost the low end through this transition, or hold back to set up the next drop?" A filter can't answer this. The answer requires simultaneous reference to context, structure, and aesthetics — and the weighing of those three against each other.
Transformation vs. Intelligence Deliberation
This is where aimastering.dev differs at the root:
| Conventional (Transformation) | aimastering.dev (Consensus) |
|---|---|
| Audio → filter → output | Audio → scan → knowledge |
| Preset or parameter selection | Multi-angle deliberation by 3 agents |
| Static transformation logic | Convergence process toward Nash Equilibrium |
| "Which filter?" is the question | "What to maximize, what to sacrifice?" is the question |
| Output is audio | Output is a target specification (formplan) |
| DSP runs fixed presets | DSP runs section-adaptive dynamic parameters |
Next Questions — ControlLayer & Visualisation
Two questions remained open at the end of this discussion:
How to efficiently distribute the dense numerical output from the analysis scan across the 3 TRIVIUM agents, induce productive conflict between them, and ultimately translate the consensus into 14-stage DSP parameters (Transformer saturation, Tube bias values, etc.). The concrete implementation logic for the Control Layer.
How to visualise the TRIVIUM deliberation process in the playground so users understand the difference from preset-based mastering. UI design that shows GRAMMATICA, LOGICA, and RHETORICA's positions, their conflict, and their convergence on a single screen.
Both are means of proving the engineering necessity of TRIVIUM to the outside world. The next post begins with Question A — starting from the input/output interface design of the ControlLayer.