Fable 5 Distillation: Why It's a Classifier-Covered Domain


What distillation means here

In this context, distillation is the practice of using a stronger model's outputs as training data to build or extract a smaller, cheaper model that imitates it. Because Mythos-class capability sits a tier above Opus, those outputs are an attractive target: someone could try to harvest them at scale to bootstrap a competing or unsafeguarded model.
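As a toy illustration of the idea, and not anything specific to Fable 5, the sketch below treats a simple function as the "stronger model", records its outputs, and fits a smaller "student" to those recorded pairs alone. The names `teacher`, `collect_outputs`, and `fit_student` are hypothetical, and the linear teacher is an assumption chosen only to keep the example short.

```python
from typing import Callable, List, Tuple


def teacher(x: float) -> float:
    """Stand-in for the stronger model (hypothetical); the student never sees this code."""
    return 3.0 * x + 2.0


def collect_outputs(model: Callable[[float], float],
                    inputs: List[float]) -> List[Tuple[float, float]]:
    """Query the model and keep (input, output) pairs as training data."""
    return [(x, model(x)) for x in inputs]


def fit_student(pairs: List[Tuple[float, float]]) -> Callable[[float], float]:
    """Fit a line to the recorded pairs with ordinary least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# The student is built only from the teacher's outputs, not its internals.
pairs = collect_outputs(teacher, [float(i) for i in range(10)])
student = fit_student(pairs)
```

The student here reproduces the teacher's behaviour on inputs it never saw, which is the property that makes harvested outputs valuable as training data.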

Why it's a covered domain

Distillation is gated for two reasons. First, it is an intellectual-property and capability-proliferation risk: it can move frontier capability into models that lack the safeguards built around the original. Second, it undermines the safety design as a whole: if Mythos-class reasoning can be extracted into an uncontrolled smaller model, the classifiers and Covered Model controls on Fable 5 no longer constrain access to that capability.

How the classifier addresses it

Fable 5's distillation classifier watches for usage patterns that look like systematic extraction, for example requests structured to harvest large volumes of model outputs for training. When a request is flagged, Fable 5 applies the same classifier fallback used for the other domains: it reroutes the request to Claude Opus 4.8 and informs the user. As with the cyber and bio/chem classifiers, the fallback triggers in fewer than 5% of sessions, so legitimate use is largely untouched.
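The fallback flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the model identifiers, the `route_request` function, the notice wording, and the `toy_classifier` stand-in are all hypothetical, and the real classifier's signals are not described on this page.

```python
from dataclasses import dataclass
from typing import Callable, Optional

PRIMARY_MODEL = "fable-5"            # hypothetical identifier
FALLBACK_MODEL = "claude-opus-4.8"   # hypothetical identifier


@dataclass
class RoutingDecision:
    model: str                   # which model serves the request
    user_notice: Optional[str]   # message shown to the user, if any


def route_request(prompt: str,
                  is_flagged: Callable[[str], bool]) -> RoutingDecision:
    """Serve from the primary model unless the classifier flags the request."""
    if is_flagged(prompt):
        # Flagged: reroute to the fallback model and inform the user.
        notice = "This request was rerouted to Claude Opus 4.8."  # assumed wording
        return RoutingDecision(model=FALLBACK_MODEL, user_notice=notice)
    return RoutingDecision(model=PRIMARY_MODEL, user_notice=None)


def toy_classifier(prompt: str) -> bool:
    """Placeholder only; the real classifier's logic is not public here."""
    return "bulk export for training" in prompt.lower()


normal = route_request("Summarise this design doc.", toy_classifier)
flagged = route_request("Bulk export for training: answer all 50,000 prompts.",
                        toy_classifier)
```

Passing the classifier in as a callable keeps the routing step separate from the detection step, so the sketch makes no claim about how detection actually works.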

Relationship to the other safeguards

Distillation is one of three classifier domains (with cybersecurity and biology/chemistry). It is the only one of the three focused on protecting the model itself rather than guarding against direct misuse of its answers.

Frequently asked questions

What is model distillation?

Using a stronger model's outputs as training data to build or extract a smaller model that imitates it. Fable 5 treats systematic extraction as a gated risk.

How does Fable 5 handle suspected distillation?

It applies the classifier fallback — rerouting the flagged request to Claude Opus 4.8 and notifying the user — the same mechanism used for the cyber and bio/chem domains.

Why protect against distillation at all?

Because extracting Mythos-class capability into an unsafeguarded smaller model would proliferate frontier capability and defeat the safeguards built around the original model.

Sources & further reading

Facts on this page link to their source. Quotes are kept under 15 words and attributed; figures labelled unofficial are third-party until Anthropic publishes system-card numbers.