
At 8:03am, “differentiate this” is not a strategy.

The teacher has a grade-level RI.7.2 lesson, one learner profile, and about twelve minutes before reality starts making demands.

Source + tests

Directed walkthrough: chaos → receipts → Tomorrow Mode

Raw morning: IEP fragments, lesson asks, twelve minutes.

Evidence pass: Every recommendation gets a receipt before it gets pretty.

Classroom artifact: Tomorrow Mode turns the same standard into a usable plan.

Case: Learner 7A

Default MCP call: compact

The 8:03am Problem

The challenge is not generating a prettier worksheet. It is translating adult-facing learner context into a same-standard classroom move before the bell starts doing emotional damage.

Lesson target

RI.7.2 stays intact: central idea, summary, supporting details.

Learner barrier

Vocabulary load, annotation choice, stamina, and task initiation get turned into explicit supports.

Teacher burden

No “support as needed.” That phrase is where accountability goes to become mist.

Receipts Become The Interface

The cinematic bit is not decoration. It mirrors the data model: each support is a bridge from learner evidence to lesson demand to UDL alignment to a progress check.
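To make the bridge concrete, here is a minimal sketch of how one support could carry its four links. The interface, field names, and sample text are all assumptions made for illustration; none of them are taken from the repo's actual schema or packet.

```typescript
// Hypothetical shape: one support and the four links that justify it.
interface Receipt {
  learnerEvidence: string; // quoted from the adult-facing learner context
  lessonDemand: string;    // what the grade-level lesson asks the learner to do
  udlAlignment: string;    // the UDL principle the support maps to
  progressCheck: string;   // how the teacher can tell whether it worked
}

interface Support {
  id: string;
  action: string;
  receipt: Receipt;
}

// Render the bridge in the same order the page describes it.
function bridge(support: Support): string {
  const r = support.receipt;
  return [r.learnerEvidence, r.lessonDemand, r.udlAlignment, r.progressCheck].join(" -> ");
}

// Placeholder content, invented for this sketch only.
const example: Support = {
  id: "support-1",
  action: "Pre-teach three key terms before the first read",
  receipt: {
    learnerEvidence: "placeholder quote about vocabulary load",
    lessonDemand: "identify the central idea of the passage",
    udlAlignment: "Representation",
    progressCheck: "learner uses one key term in the summary",
  },
};

console.log(bridge(example));
```

Keeping the four links as required fields means a support with a missing link fails to type-check rather than quietly shipping without its evidence.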

IEP quote → Lesson demand → UDL alignment → RI.7.2 preserved

Tomorrow Mode

The final artifact should feel like something a teacher could use in class, not like someone pasted a policy document into a blender and hoped.

Click The Evidence, Not The Vibes

Every recommendation opens into a receipt: source quote, lesson demand, UDL alignment, barrier, and check. The artifact gets to be beautiful because the data is boring in the right places.
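One way to keep the data "boring in the right places" is to refuse any receipt that is missing one of the five parts. The sketch below assumes hypothetical field names for those parts; it is not the repo's actual schema or validator.

```typescript
// Hypothetical field names for the five receipt parts named above.
const REQUIRED_FIELDS = [
  "sourceQuote",
  "lessonDemand",
  "udlAlignment",
  "barrier",
  "check",
] as const;

type ReceiptField = (typeof REQUIRED_FIELDS)[number];
type Receipt = Record<ReceiptField, string>;

// Return the names of any parts that are absent or blank.
function missingFields(draft: Partial<Receipt>): ReceiptField[] {
  return REQUIRED_FIELDS.filter((field) => (draft[field] ?? "").trim() === "");
}

// A draft with only two parts filled in reports the other three.
const gaps = missingFields({ sourceQuote: "placeholder quote", lessonDemand: "summarize the text" });
console.log(gaps);
```

A generator could run a check like this before rendering, so an incomplete receipt never reaches the handout.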

No Hand-Wavy Accommodations Detector

It catches the kind of advice that sounds caring until a teacher has to use it.

Bad recommendation

“Support as needed. Give an easier version if the text is too hard.”
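A detector like this can start as a small phrase scan. The pattern list and the function name below are assumptions chosen to match the bad example above; the real detector's rules are not shown on this page and may differ.

```typescript
// Illustrative patterns only, picked to match the sample recommendation.
const VAGUE_PATTERNS: RegExp[] = [
  /\bas needed\b/i,
  /\beasier version\b/i,
  /\bif (?:the text is )?too hard\b/i,
];

// Return each vague phrase found, lowercased, in pattern order.
function findVaguePhrases(text: string): string[] {
  const hits: string[] = [];
  for (const pattern of VAGUE_PATTERNS) {
    const match = text.match(pattern);
    if (match) hits.push(match[0].toLowerCase());
  }
  return hits;
}

const bad = "Support as needed. Give an easier version if the text is too hard.";
const specific = "Pre-teach five vocabulary words before the first read.";

console.log(findVaguePhrases(bad));
console.log(findVaguePhrases(specific));
```

A phrase scan misses vagueness that is worded differently, so it works as a guardrail alongside review rather than as a verdict.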


Lightweight MCP, Heavy Receipts When Asked

Default calls stay compact. Deep evidence loads on demand through explain_modification, so clients do not burn context just to learn which button to press.
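The compact-first idea can be sketched with two plain functions: one returns only ids and titles, the other loads a full receipt for a single id. Only the tool name explain_modification comes from this page; the types, function names, and sample data below are hypothetical, and the sketch deliberately avoids any MCP SDK calls.

```typescript
interface FullModification {
  id: string;
  title: string;
  receipt: {
    sourceQuote: string;
    lessonDemand: string;
    udlAlignment: string;
    barrier: string;
    check: string;
  };
}

type CompactModification = Pick<FullModification, "id" | "title">;

// Placeholder packet, invented for this sketch.
const PACKET: FullModification[] = [
  {
    id: "mod-1",
    title: "Pre-teach key terms",
    receipt: {
      sourceQuote: "placeholder quote about vocabulary load",
      lessonDemand: "identify the central idea",
      udlAlignment: "Representation",
      barrier: "vocabulary load",
      check: "learner uses one key term in the summary",
    },
  },
];

// Default call: small payload, enough to pick the next call.
function listModifications(): CompactModification[] {
  return PACKET.map(({ id, title }) => ({ id, title }));
}

// On-demand call, standing in for explain_modification.
function explainModification(id: string): FullModification | undefined {
  return PACKET.find((m) => m.id === id);
}

console.log(JSON.stringify(listModifications()).length, JSON.stringify(PACKET).length);
```

The client pays for the receipt text only when it asks for one modification by id, which is what keeps the default response small.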

Judge It Without Guesswork

The repo is set up so a reviewer can map the challenge rubric to concrete proof quickly. Not vibes. Not a hallway speech. Click, inspect, verify.

Submission map

Four judging doors, all wired to evidence.

The showcase carries the story. The MCP carries the compact tool surface. The tests carry the paranoia.

Output quality

Tomorrow Mode handout, receipts, and quality report are generated from the same packet data.

Open: packet + Receipts Rail

Architecture decisions

Compact-first MCP calls keep startup and default responses small, with full evidence available on demand.

Open: MCP payload meter

Code quality

Deterministic generator, schemas, stdio smoke tests, visual QA, and artifact regeneration sit behind one check.

Run: npm run submission:check

Domain understanding

Supports preserve RI.7.2, map to UDL, avoid disability-label leaks, and flag vague accommodations.
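As one example of such a guardrail, a label-leak check can scan learner-facing text for terms that belong only in adult-facing context. The term list and function name are assumptions for illustration; the repo's actual rule set is not shown here.

```typescript
// Illustrative term list; a real list would be set by the project's own policy.
const BLOCKED_TERMS = ["IEP", "disability", "dyslexia", "ADHD"];

// Return every blocked term that appears as a whole word, case-insensitively.
function findLabelLeaks(handoutText: string): string[] {
  return BLOCKED_TERMS.filter((term) => new RegExp(`\\b${term}\\b`, "i").test(handoutText));
}

const clean = "Circle the two words you want to look up first.";
const leaky = "Per the IEP, circle the two words you want to look up first.";

console.log(findLabelLeaks(clean));
console.log(findLabelLeaks(leaky));
```

Whole-word matching keeps the check from flagging unrelated words, at the cost of missing plurals and other variants unless they are listed too.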

Open: quality gate

Five-Minute Reviewer Path

The submission should be easy to judge without spelunking. Start with the show, poke the receipts, then let the verification command be gloriously boring.

Watch

Play reviewer demo

One guided pass through the classroom problem, packet, evidence rail, quality gate, and MCP payload meter.

Probe

Open one receipt

Click a recommendation or run npm run demo:reviewer to see the exact compact-first tool rhythm.

Verify

Run the check

npm run submission:check builds, tests, smoke-tests MCP stdio, launches visual QA, and prints the reviewer workflow.

Inspect

Read the proof

src/generator.ts, src/server.ts, and src/generator.test.ts carry the domain logic, MCP surface, and guardrails.

Teacher artifact: Handout, classroom-ready output

Receipts: Evidence audit, quote-level grounding

MCP overhead: Smoke report, measured stdio budgets

Guardrails: Tests, rubric checks in code

Submission posture

Human-directed taste up front. Deterministic evidence and tests underneath. No giant ingestion theater wearing a tiny hat.