Glossary term

Safeguard (safety classifier)

Also: safety classifier, Fable 5 safeguards


A safeguard is a safety classifier built into Fable 5 that screens requests in sensitive domains and can reroute flagged ones to a more conservative model.

What they cover

Fable 5's safeguards span three risk domains: cyber, bio/chem, and distillation. They put into practice the treatment of Mythos-class models as Covered Models.

What they do

When a safeguard fires, the classifier fallback reroutes the request to Claude Opus 4.8 and the user is informed. Mythos 5, by contrast, has these safeguards lifted and is restricted to Project Glasswing.
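
The routing behaviour described above can be pictured with a minimal sketch. It is illustrative only: the function `route`, the `RoutingDecision` type, the constant names, and the wording of the notice are all hypothetical and do not correspond to any published API. The sketch assumes the classifier reports at most one flagged domain per request.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical names throughout: an illustration of the routing described
# on this page, not an actual API.
COVERED_DOMAINS = {"cyber", "bio/chem", "distillation"}
PRIMARY_MODEL = "Fable 5"
FALLBACK_MODEL = "Claude Opus 4.8"


@dataclass
class RoutingDecision:
    model: str              # model that will handle the request
    notice: Optional[str]   # message shown to the user, if any


def route(flagged_domain: Optional[str]) -> RoutingDecision:
    """Pick a model given the domain a safeguard flagged (None if none fired)."""
    if flagged_domain in COVERED_DOMAINS:
        # A safeguard fired: fall back to the more conservative model
        # and tell the user that this happened.
        return RoutingDecision(
            model=FALLBACK_MODEL,
            notice=f"Request rerouted to {FALLBACK_MODEL} by a {flagged_domain} safeguard.",
        )
    # No safeguard fired: the request stays on the primary model.
    return RoutingDecision(model=PRIMARY_MODEL, notice=None)
```

In this sketch, `route("cyber")` selects the fallback model and carries a notice for the user, while `route(None)` leaves the request on the primary model with no notice.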

Sources & further reading

Facts on this page link to their source. Quotes are kept under 15 words and attributed. Figures labelled unofficial come from third parties and stand in until Anthropic publishes system-card numbers.