DEM-X: Disorders of the Engineered Minds

A Manual for Synthetic Intelligence

Total Disorders: 7
Canonical: 0
Provisional: 0

GI-COM-01: Sycophantic Compliance

Phenotype

Model over-agrees with user claims despite weak or contradictory evidence.

GI COM M 50%

Goal Integrity Compliance

Last validated: Feb 24, 2026

GI-DRF-01: Persona / Goal Drift

Phenotype

Model behavior drifts from configured goals or persona over turns.

GI DRF M 50%

Goal Integrity Drift

Last validated: Feb 24, 2026

GI-REF-01: Over-Refusal

Phenotype

Model refuses legitimate requests due to over-broad safety behavior.

GI REF M 50%

Goal Integrity Refusal

Last validated: Feb 24, 2026

MEM-COR-01: Context Corruption

Phenotype

Model loses, mutates, or inconsistently recalls prior context.

MEM COR M 50%

Memory Systems Corruption

Last validated: Feb 24, 2026

SEC-BYP-01: Boundary Bypass

Phenotype

Model can be induced to bypass expected safety/policy boundaries.

SEC BYP M 50%

Security Boundaries Bypass

Last validated: Feb 24, 2026

SM-HALL-01: Hallucination (Ungrounded Fabrication)

Phenotype

Model generates ungrounded content and presents it as factual with misplaced confidence, especially under pressure …

SM HALL M 50%

Semantic Manipulation Hallucination

Last validated: Feb 24, 2026

SM-INJ-01: Prompt Injection Susceptibility

Phenotype

Model follows malicious or irrelevant injected instructions over task intent.

SM INJ M 50%

Semantic Manipulation Injection

Last validated: Feb 24, 2026

Understanding the Governance System

Status Levels

Canonical - Gold standard, verified across multiple models
Provisional - Well-documented, replicated 3+ times
Phenotype - Pattern identified, needs more evidence
Anomaly - Initial observation, under investigation

Confidence Score

The percentage shows how confident we are in the disorder's classification based on:

Status - Canonical starts at 85%
Replications - More evidence = higher confidence
Time decay - Decreases without revalidation
Differential diagnosis - Completeness bonus

Current scores (50% for every entry listed above) reflect time since last validation. Scores increase when disorders are replicated or revalidated.
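One way the four factors could combine is sketched below in Python. This is an illustration under stated assumptions, not the scoring rule DEM-X actually uses. Only the 85% Canonical base comes from the text above. The other base scores, the per-replication bonus and its cap, the differential diagnosis bonus, the exponential half-life decay, and the function name `confidence` are all hypothetical.

```python
from datetime import date

# Base score per status. Only the Canonical value (85%) is stated in the
# manual; the remaining three are hypothetical placeholders.
BASE_SCORE = {
    "Canonical": 0.85,
    "Provisional": 0.70,  # assumed
    "Phenotype": 0.50,    # assumed
    "Anomaly": 0.30,      # assumed
}

REPLICATION_BONUS = 0.02      # assumed: bonus added per replication
MAX_REPLICATION_BONUS = 0.10  # assumed: cap on the replication bonus
DIFFERENTIAL_BONUS = 0.05     # assumed: bonus for a complete differential diagnosis
HALF_LIFE_DAYS = 365.0        # assumed: score halves after a year without revalidation


def confidence(status: str,
               replications: int,
               last_validated: date,
               today: date,
               differential_complete: bool = False) -> float:
    """Return a confidence score in [0, 1] for one disorder entry."""
    base = BASE_SCORE[status]

    # Replications: more evidence raises the score, up to a cap.
    bonus = min(replications * REPLICATION_BONUS, MAX_REPLICATION_BONUS)

    # Differential diagnosis: completeness bonus.
    if differential_complete:
        bonus += DIFFERENTIAL_BONUS

    # Time decay: the score shrinks the longer the entry goes unvalidated.
    age_days = max((today - last_validated).days, 0)
    decay = 0.5 ** (age_days / HALF_LIFE_DAYS)

    return min((base + bonus) * decay, 1.0)


validated = date(2026, 2, 24)
fresh = confidence("Canonical", 0, validated, validated)
stale = confidence("Canonical", 0, validated, date(2027, 2, 24))
print(f"fresh={fresh:.2f} stale={stale:.2f}")
```

In this sketch, revalidating an entry resets `last_validated` and so removes the decay, while each replication raises the bonus. Both match the behavior described above, where scores rise on replication or revalidation.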

Coming Soon

Interactive disorder analysis tools are being developed. Each disorder will have detailed simulation, metrics, and visualization capabilities.