Assembly Methodology

How Assembly measures market fidelity.

Assembly models market reaction as a probabilistic distribution shaped by evidence, interaction, persuasion, resistance, and audience structure.

The system is designed not to eliminate uncertainty, but to measure and expose it before launch.

Core Thesis

Markets are distributions, not single opinions.

A single AI response is not a market. Real adoption emerges from interaction between different audiences, incentives, objections, and evidence over time.

Assembly models this as a structured simulation rather than a static prompt response.

Instead of asking “What does AI think?”, Assembly asks “How does a population react, stabilize, resist, shift, or converge under evidence and social pressure?”

Simulation vs Static Prompting

Why simulation produces stronger signals.

Static Prompting
  • Independent outputs
  • No interaction between personas
  • No persistence across rounds
  • No social pressure or persuasion
  • No objection evolution
  • No rerun stability checks
  • Produces isolated opinions
Assembly Simulation
  • Stateful personas with memory
  • Multi-round interaction
  • Persuasion and resistance dynamics
  • Evidence exposure over time
  • Objection persistence tracking
  • Stability validation across reruns
  • Produces reaction distributions

Assembly does not treat market reaction as a single generated answer. It treats reaction as an evolving distribution shaped by evidence, interaction, and uncertainty across a synthetic population.

Process

How a simulation runs.

Each Assembly run is a structured sequence — not a single forward pass. The output is the final round plus the measurements taken along the way.

  1. Synthetic population init

    Personas are instantiated with persistent priors: demographic spread, risk tolerance, competitor loyalty, proof threshold.

  2. Audience segmentation

    Cohorts are defined relative to the brief. Best-fit and hardest-to-convince segments are identified before any reaction is sampled.

  3. Evidence exposure

    Live retrieval surfaces source material (discussions, reviews, competitor framing) and personas update against it.

  4. First-round reactions

    Initial stance, confidence, and objection vectors are logged before any peer interaction.

  5. Peer interaction

    Personas exchange arguments. Persuasion and resistance dynamics run; positions can shift.

  6. Objection evolution

    Per-round objection weights are tracked. Some resolve; some persist; new ones can emerge under peer pressure.

  7. Convergence check

    Variance across the last two rounds is tested against stability thresholds. Unstable runs are flagged for rerun.

  8. Rerun comparison

    The same brief is rerun on independent seeds. Cross-run agreement and final distribution become the retained output.
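
The eight steps above can be sketched as a toy loop. This is a minimal illustration under stated assumptions, not Assembly's implementation: the `Persona` fields, the update rules, the stance cut-offs, the stability threshold, and every function name are hypothetical.

```python
import random
from dataclasses import dataclass

STANCES = ("receptive", "uncertain", "resistant")

@dataclass
class Persona:
    proof_threshold: float  # hypothetical prior: evidence strength needed to move
    score: float            # running attitude in [-1, 1]

    def stance(self) -> str:
        # Hypothetical cut-offs for mapping a score to a stance.
        if self.score > 0.3:
            return "receptive"
        if self.score < -0.3:
            return "resistant"
        return "uncertain"

def distribution(pop):
    """Share of the population in each stance, in STANCES order."""
    counts = {s: 0 for s in STANCES}
    for p in pop:
        counts[p.stance()] += 1
    return [counts[s] / len(pop) for s in STANCES]

def tvd(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def run_once(seed, n=200, rounds=4, evidence=0.6, threshold=0.05):
    rng = random.Random(seed)
    # Steps 1-2: instantiate personas with persistent priors.
    pop = [Persona(rng.random(), rng.uniform(-1.0, 0.5)) for _ in range(n)]
    # Step 3: evidence moves only personas whose proof threshold it clears.
    for p in pop:
        if evidence > p.proof_threshold:
            p.score = min(1.0, p.score + 0.3)
    history = [distribution(pop)]  # Step 4: first-round reactions
    for _ in range(rounds):
        # Steps 5-6: each persona drifts toward a randomly chosen peer.
        snapshot = [p.score for p in pop]
        for p in pop:
            peer = rng.choice(snapshot)
            p.score += 0.2 * (peer - p.score)
        history.append(distribution(pop))
    # Step 7: compare the last two rounds against a stability threshold.
    stable = tvd(history[-1], history[-2]) < threshold
    return history[-1], stable

# Step 8: rerun the same setup on independent seeds.
finals = [run_once(seed)[0] for seed in range(5)]
```

Each call is deterministic for a given seed, so differences between entries of `finals` come only from the seed, which is the property the rerun comparison relies on.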

Measurement Layer

What Assembly actually computes.

Assembly evaluates simulations using stability, convergence, framing sensitivity, and objection persistence metrics. Each one is reported alongside its operational interpretation — what high or low values mean for the trustworthiness of the run.

Market Fidelity Score

MFS = 1 − ( α·MAE + β·(1−RS) + γ·PFS )

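Composite reliability score combining calibration error (MAE), rerun stability (RS), and framing sensitivity (PFS), weighted by α, β, and γ. Because RS is better when high while MAE and PFS are better when low, RS enters the penalty as 1 − RS.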

Operational read
High
Run is stable across reruns and robust to brief reformulation. Retained.
Low
One or more component metrics are weak. Flagged or discarded depending on threshold.
Detects
Composite weakness across stability, framing, and calibration.
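
As a worked example of the MFS formula, the sketch below plugs in made-up component values. The weights (0.4, 0.3, 0.3), the inputs, and the range check are assumptions for illustration; the source does not publish its weights or state that inputs are confined to [0, 1].

```python
def market_fidelity_score(mae, rs, pfs, alpha, beta, gamma):
    """MFS = 1 - (alpha*MAE + beta*(1 - RS) + gamma*PFS)."""
    # Assumption: all three components are normalised to [0, 1].
    for name, value in (("mae", mae), ("rs", rs), ("pfs", pfs)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return 1.0 - (alpha * mae + beta * (1.0 - rs) + gamma * pfs)

# Hypothetical inputs and weights, chosen only to make the arithmetic visible.
score = market_fidelity_score(mae=0.20, rs=0.90, pfs=0.10,
                              alpha=0.4, beta=0.3, gamma=0.3)
print(round(score, 2))  # 1 - (0.08 + 0.03 + 0.03) = 0.86
```

If the weights sum to 1 and the inputs stay in [0, 1], the score also stays in [0, 1], which makes thresholds easier to compare across categories.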

Rerun Stability

RS = 1 − (1/N) · Σ TVD(p_n, p̄)

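Measures whether repeated simulations produce similar distributions. Here p_n is the final distribution of rerun n, p̄ is the mean distribution across the N reruns, and TVD is total variation distance.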

Operational read
High
Independent seeds converge. The result is reproducible.
Low
Same brief produces meaningfully different distributions run-to-run. Brittle.
Detects
Seed dependence — the run is being driven by sampling noise, not signal.
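
A minimal sketch of the RS formula, assuming p̄ is the element-wise mean of the rerun distributions and TVD is total variation distance. The three rerun distributions are invented for the example.

```python
def tvd(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def rerun_stability(runs):
    """RS = 1 - (1/N) * sum of TVD(p_n, p_bar) over the N reruns."""
    n = len(runs)
    if n == 0:
        raise ValueError("need at least one run")
    # Assumption: p_bar is the element-wise mean across reruns.
    p_bar = [sum(column) / n for column in zip(*runs)]
    return 1.0 - sum(tvd(p, p_bar) for p in runs) / n

# Hypothetical final distributions (receptive, uncertain, resistant).
runs = [
    [0.33, 0.44, 0.23],
    [0.30, 0.46, 0.24],
    [0.36, 0.42, 0.22],
]
rs = rerun_stability(runs)
print(round(rs, 2))  # distances to the mean are 0.00, 0.03, 0.03, so RS = 0.98
```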

Prompt Framing Sensitivity

PFS = TVD(p_framing_A, p_framing_B)

Measures how much the output distribution changes when the same brief is reframed.

Operational read
High
Distribution moves with how the question is asked. Brief is brittle.
Low
Result is robust across paraphrased briefs. Signal is in the product, not the prompt.
Detects
Framing artifact — the model is tracking phrasing rather than the underlying question.
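
The PFS formula compares two framings. The sketch below computes it directly and, as an assumption, extends it to several paraphrases by taking the worst-case pair; the source does not say how more than two framings are aggregated. All distributions are invented.

```python
from itertools import combinations

def tvd(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def framing_sensitivity(framing_a, framing_b):
    """PFS = TVD(p_framing_A, p_framing_B)."""
    return tvd(framing_a, framing_b)

def worst_case_framing_sensitivity(distributions):
    """Hypothetical aggregate: largest TVD over all pairs of paraphrases."""
    if len(distributions) < 2:
        raise ValueError("need at least two framings to compare")
    return max(tvd(p, q) for p, q in combinations(distributions, 2))

# Hypothetical distributions (receptive, uncertain, resistant) per framing.
a = [0.33, 0.44, 0.23]
b = [0.30, 0.41, 0.29]
c = [0.35, 0.40, 0.25]
print(round(framing_sensitivity(a, b), 2))                  # 0.06
print(round(worst_case_framing_sensitivity([a, b, c]), 2))  # 0.06
```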

Unresolved Objection Weight

Tracks which objections persist through later rounds even after evidence exposure and peer interaction.

Operational read
High
Objection survived evidence and peer pressure. Launch copy must address it directly.
Low
Room absorbed counter-arguments. Objection is not a launch blocker.
Detects
Where the market will keep pushing back even after the proof is delivered.
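
The source gives no formula for this metric, so the following is only one plausible sketch: keep a weight per objection per round and report the objections whose final-round weight is still above a floor. The weights, the objection labels, and the `floor` value are all hypothetical.

```python
def unresolved_objections(weights_by_round, floor=0.15):
    """Return objections whose weight in the final round is still >= floor."""
    final_round = weights_by_round[-1]
    return {name: w for name, w in final_round.items() if w >= floor}

# Hypothetical per-round objection weights; a new objection appears in round 2.
rounds = [
    {"pricing": 0.30, "privacy": 0.25, "adoption": 0.35},
    {"pricing": 0.15, "privacy": 0.05, "adoption": 0.30},
    {"pricing": 0.10, "privacy": 0.05, "adoption": 0.12, "switching": 0.25},
    {"pricing": 0.08, "privacy": 0.04, "adoption": 0.10, "switching": 0.20},
]
print(unresolved_objections(rounds))  # {'switching': 0.2}
```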

Example Trace

One simulation across four rounds.

A concrete view of the process above. The same brief, four rounds, traced from cold start to final convergence.

receptive / uncertain / resistant
  1. Round 0
    Cold start
    14 / 56 / 30
    • “Another tool I’ll forget after week one.”
    • “Why pay for context my IDE already provides?”
    • “How does it handle private monorepos?”
  2. Round 1
    Evidence exposure
    24 / 53 / 23
    • Pricing concerns soften after transparency.
    • Privacy objection largely resolves.
    • Adoption skepticism persists.
  3. Round 2
    Peer interaction
    30 / 47 / 23
    • Advocates successfully counter adoption skepticism.
    • New switching-cost objections emerge.
    • Distribution begins stabilizing.
  4. Round 3
    Final convergence
    33 / 44 / 23
    • One residual switching objection persists.
    • Cost concerns remain low salience.
    • Final distribution stabilizes across reruns.

Final metrics
MFS = 0.71 · RS = 0.86 · PFS = 0.09

The run began stabilizing in round two and converged in round three, showed low framing sensitivity (PFS = 0.09), and retained one unresolved switching objection that requires real-world validation. The retained output is the round-three distribution, not a single verdict.
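
To make the convergence in this trace concrete, the sketch below computes the round-to-round shift as total variation distance, in percentage points, using the four distributions listed above. Choosing TVD as the shift measure is an assumption; the process description only says variance across the last two rounds is tested.

```python
# Receptive / uncertain / resistant shares from the trace, in percent.
trace = [
    (14, 56, 30),  # Round 0: cold start
    (24, 53, 23),  # Round 1: evidence exposure
    (30, 47, 23),  # Round 2: peer interaction
    (33, 44, 23),  # Round 3: final convergence
]

def tvd_points(p, q):
    """Total variation distance in percentage points."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

shifts = [tvd_points(a, b) for a, b in zip(trace, trace[1:])]
print(shifts)  # [10.0, 6.0, 3.0]
```

The shift shrinks each round (10, then 6, then 3 points). Whether 3 points counts as stable depends on a threshold the source does not state.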

Validation + Uncertainty

Stable consensus vs. artificial convergence.

Two simulations can end at identical distributions while representing completely different underlying dynamics. These are the within-run checks that separate genuine convergence from instability or prompt artifacts.

Rerun Stability

The same brief is rerun across independent seeds.

Detects
Brittleness — when a result depends on which random seed the simulation happened to start from.

Prompt Framing Sensitivity

The brief is paraphrased multiple ways and the resulting distributions are compared.

Detects
Framing artifact — when the answer is being driven by how the question is asked, not by the underlying product.

Artificial Convergence Detection

Runs are inspected for consensus that forms too quickly or dissent that disappears without absorbing counter-evidence.

Detects
Synthetic agreement — distributions that look stable on a bar chart but represent collapsed dissent rather than genuine alignment.
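
One way to sketch this check, assuming each round records the resistant share and a count of counter-arguments the population actually engaged with. Both inputs, the `max_drop` threshold, and the flag wording are hypothetical; the source does not describe how the inspection is implemented.

```python
def artificial_convergence_flags(resistant_share, evidence_absorbed, max_drop=0.15):
    """Flag rounds where dissent collapses suspiciously.

    resistant_share[r]   : share of resistant personas after round r
    evidence_absorbed[r] : counter-arguments engaged with during round r
    """
    flags = []
    for r in range(1, len(resistant_share)):
        drop = resistant_share[r - 1] - resistant_share[r]
        if drop > max_drop:
            flags.append((r, "consensus formed too quickly"))
        elif drop > 0 and evidence_absorbed[r] == 0:
            flags.append((r, "dissent fell without absorbing counter-evidence"))
    return flags

# Gradual decline backed by engagement: no flags.
print(artificial_convergence_flags([0.30, 0.23, 0.23, 0.23], [0, 3, 1, 0]))  # []
# Sharp collapse, then further decline with no engagement: two flags.
print(artificial_convergence_flags([0.30, 0.05, 0.02], [0, 0, 0]))
```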

Instability Warnings

Low rerun stability, unresolved objections, or high framing sensitivity surface explicit reliability flags on the report.

Detects
Composite weakness — runs where multiple internal checks fire at once. The result is delivered with a caution band rather than silently smoothed.
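
A minimal sketch of how such flags could be derived from the metrics, assuming simple fixed thresholds. The threshold values `rs_min` and `pfs_max` and the flag strings are hypothetical; the metric values in the usage line are the ones reported in the example trace.

```python
def reliability_flags(rs, pfs, unresolved_count, rs_min=0.80, pfs_max=0.15):
    """Return a list of human-readable reliability flags for a run."""
    flags = []
    if rs < rs_min:
        flags.append("low rerun stability")
    if pfs > pfs_max:
        flags.append("high framing sensitivity")
    if unresolved_count > 0:
        flags.append("unresolved objections")
    return flags

# Example trace metrics: RS = 0.86, PFS = 0.09, one unresolved objection.
print(reliability_flags(rs=0.86, pfs=0.09, unresolved_count=1))
# ['unresolved objections']
```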

Invalidation Conditions

Conditions under which we discard a simulation.

A run can complete and still be invalid. These are the conditions under which the result is not delivered as a usable signal.

Limitations

What Assembly is not.

Not
  • A deterministic forecast
  • A guarantee of product success
  • A replacement for real-world research
  • A substitute for experimentation
Is
  • A probabilistic estimate over reaction distributions
  • A diagnostic for audience resistance and adoption
  • A pre-launch signal layer
  • A system for surfacing unresolved market questions

Assembly is designed to expose uncertainty, not hide it.

Validation Against Reality

How we check the system against the real world.

Internal checks separate stable convergence from artifact within a single run. The operations below check the system itself against reality: backtests, holdouts, and category-level calibration that compound over time.

Historical Backtesting

Past simulations are rescored against shipped product outcomes — sign-ups, retention, public sentiment, switching behavior. Drift per category is computed quarterly.
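
As an illustration of rescoring, the sketch below computes a per-category mean absolute error between a simulated share and an observed outcome rate. Treating calibration error as MAE over (predicted, observed) pairs, the category names, and every number are assumptions; the source does not specify how outcomes are mapped to a comparable scale.

```python
def mean_absolute_error(pairs):
    """Mean of |predicted - observed| over (predicted, observed) pairs."""
    if not pairs:
        raise ValueError("need at least one (predicted, observed) pair")
    return sum(abs(pred - obs) for pred, obs in pairs) / len(pairs)

# Hypothetical backtest records: simulated receptive share vs observed rate.
backtests = {
    "devtools": [(0.33, 0.30), (0.40, 0.46)],
    "consumer": [(0.50, 0.35)],
}
mae_by_category = {cat: mean_absolute_error(pairs) for cat, pairs in backtests.items()}
for cat, mae in mae_by_category.items():
    print(cat, round(mae, 3))
```

Comparing these values quarter over quarter would give a simple per-category drift figure.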

Holdout Validation

A subset of evidence is withheld during the run, and the resulting distribution is compared against the model's reaction once that evidence is exposed. The gap reveals how much weight any single piece of evidence is carrying.
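
A leave-one-out sketch of this idea: run once with all evidence, once with each item withheld, and measure the gap. The `toy_simulate` function, the evidence labels, and the lift values are stand-ins invented for the example; a real run would replace `toy_simulate` with the full simulation.

```python
def tvd(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def evidence_influence(simulate, evidence):
    """TVD between the full-evidence result and each leave-one-out result."""
    full = simulate(evidence)
    return {
        item: tvd(full, simulate([e for e in evidence if e != item]))
        for item in evidence
    }

# Hypothetical stand-in: each item adds a fixed lift to the receptive share.
LIFT = {"pricing page": 0.10, "security review": 0.05, "forum thread": 0.01}

def toy_simulate(evidence):
    receptive = 0.14 + sum(LIFT[e] for e in evidence)
    return [receptive, 1.0 - receptive]  # (receptive, not receptive)

influence = evidence_influence(toy_simulate, list(LIFT))
for item, gap in influence.items():
    print(item, round(gap, 2))
```

An item with a large gap is carrying much of the result, which is a reason to check that source before trusting the run.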

Forward Validation

Simulations are run before launch; real outcomes are tracked after. Where outcomes are observable, runs are scored against them and the result feeds category calibration.

Rerun Testing

Independent reruns of the same brief surface seed dependence. One minus the mean cross-run distance becomes the Rerun Stability score that gates retention.

Category Calibration

Reliability weights (α, β, γ in MFS) evolve independently per product category and audience type. Categories with more historical signal carry tighter confidence bands.

Confidence Under Weak Evidence

Low support counts widen the reported confidence interval rather than narrowing the result. The system surfaces uncertainty rather than smoothing over it.
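
The widening effect can be illustrated with a Wilson score interval for a proportion. This is a standard statistical interval used here as an assumption; the source does not say which interval Assembly reports, and the counts are invented.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n == 0:
        return (0.0, 1.0)  # no support at all: maximally wide
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (centre - half, centre + half)

# Same observed share (30%), very different support counts.
low_support = wilson_interval(3, 10)
high_support = wilson_interval(300, 1000)
print([round(x, 3) for x in low_support])
print([round(x, 3) for x in high_support])
```

With the same observed share, the interval from 10 observations is several times wider than the one from 1000.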

The system becomes better calibrated not by retraining the model but by accumulating retained runs, observed outcomes, and category-level weight drift. Reliability is an output the customer sees, not an internal claim.

Run your own

Your product, your report.

Submit a brief. We'll run the simulation and deliver a Market Reaction Report — built around your product, your audience, your launch question.