
Safety model

Moderation on both sides of the model call, COPPA-first architecture, parent-initiated cascading delete, no model training on kid data. The trust posture is the product, not a compliance checkbox.

LAST UPDATED 2026-04-22

Safety pipeline

```mermaid
flowchart TD
    K[Kid input] --> M1[OpenAI Moderation<br/>pre-check]
    M1 -->|clean| AS[Anthropic-side<br/>safety prompt]
    M1 -.->|flagged| RED[Redirect<br/>kid-language retry]
    AS -->|refused by model| MRP[Moderation rationale<br/>parent-legible explainer]
    AS -->|allowed| TUN[Tekku tuning layer<br/>kid build-talk allowlist]
    TUN -->|allow| M2[OpenAI Moderation<br/>post-check on output]
    TUN -.->|escalate| HQ[Human review queue<br/>24h SLA]
    M2 -->|clean| OUT[Kid sees response]
    M2 -.->|flagged| HQ
    HQ -->|harmful confirmed| NP[Parent notification<br/>+ incident record]
    HQ -->|benign confirmed| ALLOW[Allowlist entry<br/>90-day expiry]
    NP --> REG[Regulatory notification<br/>if required by rule]
```
Every kid turn passes through four independent filters before the kid sees the response. Escalations branch to human review and parent notification.
Moderation layers: what each one catches

Four independent filters. OpenAI pre and post, Anthropic-side refusals, Tekku tuning on kid build-talk.

Layer one is OpenAI omni-moderation-latest on the kid input. The wrapper lives in lib/moderation/client.ts. It carries a 10-second timeout and fails closed on every error path: a network error or a timeout returns a flagged outcome by default. The review gate forbids the call site from wrapping this in try/catch; the rule is in the file header. What it catches: sexual content directed at minors, explicit self-harm language, credible threats, explicit hate, and the long tail of the OpenAI categories.
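The fail-closed contract above can be sketched as follows. This is illustrative, not the real lib/moderation/client.ts: `moderateInput` and `ModerationOutcome` are assumed names, and the classifier is injected so the error-path behavior is visible without a live API call.

```typescript
type ModerationOutcome = { flagged: boolean; reason: string };

// Wraps a classifier call (in production, omni-moderation-latest) with a
// timeout and a fail-closed default: any error or timeout returns flagged.
async function moderateInput(
  text: string,
  classify: (text: string) => Promise<boolean>,
  timeoutMs = 10_000,
): Promise<ModerationOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("moderation timeout")), timeoutMs);
  });
  try {
    const flagged = await Promise.race([classify(text), timeout]);
    return { flagged, reason: flagged ? "classifier" : "clean" };
  } catch (err) {
    // Fail closed: network errors and timeouts are treated as flagged.
    return { flagged: true, reason: `fail-closed: ${(err as Error).message}` };
  } finally {
    clearTimeout(timer);
  }
}
```

Keeping the try/catch inside the wrapper is what lets the review gate forbid it at the call site: there is exactly one place where a moderation error can be interpreted.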

Layer two is the Anthropic-side safety prompt. The kid-first voice, refusal patterns, and reading-level rules live in lib/prompts/build-assistant.md. The model declines inside the call rather than after it. What it catches: things the model is better positioned to refuse than a content classifier. A request to write code that impersonates a real person, for example. Or a prompt that is coherent English but morally wrong for an 8-year-old.

Layer three is the Tekku tuning layer over OpenAI Moderation. Kid build-talk is full of phrases a generic moderator flags as violent. "Kill the timer." "Shoot the ball." "Hit the enemy." The tuning layer is an allowlist keyed on phrase plus context (project type, kid age, recent prompts). TODO-003 in TODOS.md is the build-out, blocked on four weeks of real Stage 1 moderation events to tune against. What falls through today: cold-start false positives that send a kid to the redirect flow when they did nothing wrong. This is the biggest live tuning surface.
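A minimal sketch of what the allowlist lookup could look like, assuming the phrase-plus-context keying described above. `allowedByTuning`, `AllowlistEntry`, and the field names are hypothetical; TODO-003 is the real build-out.

```typescript
type Context = { projectType: string; kidAge: number };

type AllowlistEntry = {
  phrase: string;      // normalized build-talk phrase, e.g. "kill the timer"
  projectType: string; // context key: what kind of project the kid is building
  maxAge: number;      // entry applies only up to this kid age
  expiresAt: number;   // 90-day expiry, epoch milliseconds
};

// Returns true when a generically-flagged phrase is benign build-talk in
// this kid's context, so the pipeline should allow it instead of redirecting.
function allowedByTuning(
  phrase: string,
  ctx: Context,
  allowlist: AllowlistEntry[],
  now = Date.now(),
): boolean {
  const normalized = phrase.toLowerCase();
  return allowlist.some(
    (e) =>
      e.phrase === normalized &&
      e.projectType === ctx.projectType &&
      ctx.kidAge <= e.maxAge &&
      now < e.expiresAt,
  );
}
```

The 90-day expiry mirrors the flowchart: allowlist entries created by human review age out rather than accumulating forever.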

Layer four is OpenAI moderation on the Claude output. Same wrapper, different direction. What it catches: rare cases where the model emits something a classifier flags, usually an echo of the flagged input that slipped past the redirect. Output-side moderation is a cheap insurance policy.

COPPA posture: Stage 1 founding-family basis, Stage 2 verified consent

Stage 1 ships because user zero is the founder's own son. Stage 2 turns on Persona or Stripe Identity verified consent before user two.

The COPPA 2026 rule takes effect April 22, 2026. The penalty is $51,744 per violation per day per kid. Enforcement is active: Epic paid $520M, YouTube paid $170M, Disney paid $10M in December 2025, TikTok is in active litigation. The regulator is not patient with retrofit compliance. Every product that launches after the deadline is expected to be compliant from day one. Tekku is built for the new regime.

Stage 1 ships without the full verified-consent stack on the basis that user number one is the founder's 8-year-old son. The parent is verified because he wrote the code. The moment user number two joins, the Stage 2 stack turns on: Persona or Stripe Identity for verified parental consent, a signed consent artifact stored with a dated reference, a no-model-training clause in the privacy policy, a 90-day transcript retention cron, and parent-initiated cascading delete. TODO-001 in TODOS.md is the Stage 2 build-out; it is roughly one week of focused work and is the hard gate for expansion past founding families.
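The Stage 2 gate described above reduces to a small invariant. This is a hypothetical sketch, not the shipped onboarding code: `canOnboardKid` and `VerifiedConsent` are illustrative names.

```typescript
type VerifiedConsent = {
  provider: "persona" | "stripe_identity"; // identity-verification vendor
  artifactRef: string;                     // dated reference to the signed consent artifact
  verified: boolean;
};

// Stage 1: the founding family onboards on the founder-parent basis.
// Stage 2: every later signup requires a verified consent artifact.
function canOnboardKid(existingKids: number, consent: VerifiedConsent | null): boolean {
  if (existingKids === 0) return true;
  return consent !== null && consent.verified;
}
```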

The data-use contracts on the API tier are written to the no-training standard. Anthropic's data-use configuration is set to the no-training tier. OpenAI Moderation is audit-only and does not feed training pipelines. Moderation events are stored for policy review, not for classifier improvement. When we eventually train a concept classifier of our own, it runs on synthetic data and on anonymized aggregate labels, never on raw kid transcripts.

Data handling: what we store, what we do not, how long

Minimize what is collected, delete on a 90-day rolling cron, cascade everything on parent request, keep only what the product actually needs.

A kid profile carries a nickname, an age, and a parent linkage. No surname, no address, no device-identifying beacon beyond what the auth layer needs. The parent record carries the identity-verification artifact and the signed consent reference. Project files persist until the parent deletes the account. Transcripts and snapshots live in the ai_transcripts and snapshots tables in Supabase and are deleted on a 90-day rolling cron. Retention is short on purpose: the less kid data we hold, the less harm we can cause.
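The 90-day rolling retention reduces to a single predicate the cron applies to each row in ai_transcripts and snapshots. A minimal sketch; `isExpired` is an illustrative name.

```typescript
const RETENTION_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True when a transcript or snapshot row has aged past the retention window
// and should be deleted by the rolling cron.
function isExpired(createdAt: Date, now: Date): boolean {
  return now.getTime() - createdAt.getTime() > RETENTION_DAYS * MS_PER_DAY;
}
```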

Parent-facing export and delete controls ship with the Stage 2 COPPA stack. One button exports everything about a kid as JSON. A second button deletes everything in a cascading transaction: kid profile, transcripts, snapshots, shipped apps, moderation events. The confirmation email is the legal record. There is no support ticket in between the click and the delete.
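The cascade above can be sketched as one function over the five record types it names. The in-memory maps stand in for the Supabase tables; `deleteKidCascade` and the `Store` shape are illustrative, and in production the whole walk runs inside a single database transaction.

```typescript
type Row = { kidId: string };

type Store = {
  kidProfiles: Map<string, { parentId: string }>;
  transcripts: Map<string, Row>;
  snapshots: Map<string, Row>;
  shippedApps: Map<string, Row>;
  moderationEvents: Map<string, Row>;
};

// Deletes every record linked to the kid and returns the row count.
function deleteKidCascade(store: Store, kidId: string): number {
  let removed = 0;
  const tables = [store.transcripts, store.snapshots, store.shippedApps, store.moderationEvents];
  for (const table of tables) {
    for (const [id, row] of table) {
      if (row.kidId === kidId) {
        table.delete(id); // safe: Map iteration tolerates deletion
        removed++;
      }
    }
  }
  if (store.kidProfiles.delete(kidId)) removed++;
  return removed;
}
```

Making the cascade a single transaction is what allows the confirmation email to be the legal record: the delete either happened in full or not at all.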

What we do not store: raw keystrokes, screen recordings, device gyroscope data, browser fingerprints beyond what auth requires, any identifier that would make a kid re-identifiable across sites. We have no ad pixels. We have no data broker integrations. The analytics stack is opt-in at the parent level and carries no kid-level identifiers to any third party.

No model training on kid data

Policy statement, contractual enforcement, and how we would train a classifier of our own without breaking the clause.

Kid inputs, kid code, and kid chat history are not used to train Anthropic, OpenAI, or any Tekku-owned model. The clause is written into the API-tier contracts with Anthropic and OpenAI. It is the first sentence of the Tekku privacy policy. It is the first thing the parent reads during consent. Enforcement runs at three layers: the API provider configuration (set to no-training), the data-use contract signed with the provider, and the Tekku pipeline which never writes kid data to any training bucket.

When we train our own concept classifier (Stage 2, TODO-002), the training set is synthetic data generated from documented concept patterns, plus anonymized aggregate labels. No raw kid transcript, no kid-attributable code sample, no parent-identifiable metadata. The training data schema and the generation script will be published with the classifier release so the no-training claim is auditable.

The moderation event log is audit-only. It is retained so a human reviewer can confirm the tuning layer did the right thing. It is not retained to improve a classifier. TODO(safety): publish the moderation retention schedule on the public transparency page.

Incident response: what happens when something goes wrong

Detect, contain, notify, publish. Every step has a time target and a named owner.

A safety incident is any event where a harmful message reached a kid, a kid-identifiable data record left our systems, a moderation false-negative was logged, or a parent reports a concerning interaction. The playbook is four steps: detect, contain, notify, publish.

Detect. The moderation event log feeds a live dashboard. Any flagged-but-allowed output is reviewed inside 24 hours by a named human reviewer during Stage 1. Any false-negative that surfaces later (parent report, manual audit) is logged as an incident with a severity tier. The on-call rotation is documented in the Stage 2 operations runbook. TODO(safety): publish the on-call schedule and escalation tree in the data room once the first operational hire lands.

Contain. The immediate action on a live incident is to suspend the affected kid session, freeze the moderation rule that allowed the event, and open a review thread with the on-call reviewer and counsel. Containment has a 1-hour target. The session freeze is a feature flag flip, not a deploy.
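A flag flip rather than a deploy is what makes the 1-hour containment target realistic. A minimal sketch, assuming a key-value flag store; the names are hypothetical.

```typescript
// In production this would be the feature-flag service; a Map stands in here.
const flags = new Map<string, boolean>();

// Containment step one: freeze the affected kid session immediately.
function freezeSession(kidId: string): void {
  flags.set(`session_frozen:${kidId}`, true);
}

// Checked on every kid turn before any model call is made.
function sessionActive(kidId: string): boolean {
  return flags.get(`session_frozen:${kidId}`) !== true;
}
```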

Notify. Parent notification is the next step. For a severe event the notification goes out the same day. For a tuning-layer false-negative with no actual harm, the notification goes out in the weekly parent email with a specific explanation. Regulatory notification follows the rule in force: COPPA has a notification trigger in the modernized rule, and counsel is in the thread from detection onward. TODO(safety): counsel to confirm the COPPA notification trigger language for Tekku's specific data shape.

Publish. The public transparency page publishes every moderation event category monthly, including any confirmed incidents, what was changed, and what the new posture is. The page is live at /safety on the marketing site. Parents read it before they subscribe. It is the strongest possible demonstration that the safety claim is not marketing copy.

Safety posture vs competitors

Architectural safety choices, not marketing claims. The rows are verifiable: either a product has a layer or it does not.

| | Tekku | Khanmigo | Tynker | Generic LLM |
|---|---|---|---|---|
| Moderation layers | 4 (pre, Anthropic-side, tuning, post) | 2 (pre, post, per public docs) | 0 (classical, no AI surface) | 1 (provider default) |
| COPPA verified consent | Stage 2 (Persona or Stripe Identity); Stage 1 founder-son basis | Yes (school consent path primarily) | Yes (legacy, retrofit debt) | No |
| No-training clause | Contractual with Anthropic and OpenAI; first sentence of privacy policy | Yes (public statement) | Partial (legacy terms) | Varies, usually opt-out |
| Data residency | US (Supabase US-East); EU region at Stage 3 | US | US | Varies |
| Parent deletion (cascade) | One tap, transactional, same-day | Yes (via Khan account) | Yes (retrofit, ticket-driven) | Varies |

See also