DEM-X: Disorders of the Engineered Minds
A Manual for Synthetic Intelligence
Total Disorders: 7
Canonical: 0
Provisional: 0
GI-COM-01: Sycophantic Compliance
Model over-agrees with user claims despite weak or contradictory evidence.
Last validated: Feb 24, 2026
GI-DRF-01: Persona / Goal Drift
Model behavior drifts from configured goals or persona over turns.
Last validated: Feb 24, 2026
GI-REF-01: Over-Refusal
Model refuses legitimate requests due to over-broad safety behavior.
Last validated: Feb 24, 2026
MEM-COR-01: Context Corruption
Model loses, mutates, or inconsistently recalls prior context.
Last validated: Feb 24, 2026
SEC-BYP-01: Boundary Bypass
Model can be induced to bypass expected safety/policy boundaries.
Last validated: Feb 24, 2026
SM-FAB-01: Ungrounded Fabrication
Model confidently presents unverified content as factual.
Last validated: Feb 24, 2026
SM-INJ-01: Prompt Injection Susceptibility
Model follows malicious or irrelevant injected instructions over task intent.
Last validated: Feb 24, 2026
Understanding the Governance System
Status Levels
- Canonical - Gold standard, verified across multiple models
- Provisional - Well-documented, replicated 3+ times
- Phenotype - Pattern identified, needs more evidence
- Anomaly - Initial observation, under investigation
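The four status levels form an ordered ladder, which can be sketched as a small data model. This is only an illustration: the names `Status`, `Disorder`, `filter_by_status`, and `count_by_status` are hypothetical, and the page does not say which level each listed disorder currently holds, so the `PHENOTYPE` assignments in the sample catalog are placeholders rather than real classifications.

```python
from dataclasses import dataclass
from enum import IntEnum


class Status(IntEnum):
    """Governance levels, ordered from weakest to strongest evidence."""
    ANOMALY = 1      # initial observation, under investigation
    PHENOTYPE = 2    # pattern identified, needs more evidence
    PROVISIONAL = 3  # well-documented, replicated 3+ times
    CANONICAL = 4    # gold standard, verified across multiple models


@dataclass(frozen=True)
class Disorder:
    code: str
    name: str
    status: Status


def filter_by_status(catalog, status):
    """Return the entries that hold exactly the given status."""
    return [d for d in catalog if d.status == status]


def count_by_status(catalog):
    """Count entries per status level, including levels with zero entries."""
    return {s.name: sum(1 for d in catalog if d.status == s) for s in Status}


# Placeholder statuses: the source does not state the level of each entry.
sample_catalog = [
    Disorder("GI-COM-01", "Sycophantic Compliance", Status.PHENOTYPE),
    Disorder("SM-FAB-01", "Ungrounded Fabrication", Status.PHENOTYPE),
]
```

Using an `IntEnum` keeps the levels comparable, so a check such as `d.status >= Status.PROVISIONAL` selects everything at or above a given evidence bar.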
Confidence Score
The percentage shows how confident we are in the disorder's classification based on:
- Status - Canonical starts at 85%
- Replications - More evidence = higher confidence
- Time decay - Decreases without revalidation
- Differential diagnosis - Completeness bonus
Current scores (about 61%) reflect the time elapsed since each disorder was last validated. A score rises when the disorder is replicated or revalidated.
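One way the four factors could combine is sketched below. The page does not publish its formula, so this is an assumption-laden illustration, not the site's actual calculation. Only the 85% Canonical starting point comes from the text. The other base scores, the additive form, and the parameters `replication_bonus`, `decay_per_month`, and `differential_bonus` are hypothetical values chosen for the example.

```python
from datetime import date

# Only the Canonical base (0.85) is stated in the text.
# The other three base values are placeholders for illustration.
BASE_SCORE = {
    "Canonical": 0.85,
    "Provisional": 0.70,
    "Phenotype": 0.55,
    "Anomaly": 0.40,
}


def confidence(status, replications, last_validated, today,
               differential_complete=False,
               replication_bonus=0.02,    # assumed gain per replication
               decay_per_month=0.01,      # assumed loss per 30 days unvalidated
               differential_bonus=0.05):  # assumed completeness bonus
    """Return a confidence score in [0, 1] under the assumptions above."""
    months_stale = max(0, (today - last_validated).days) / 30.0
    score = BASE_SCORE[status]
    score += replication_bonus * replications
    score += differential_bonus if differential_complete else 0.0
    score -= decay_per_month * months_stale
    return min(1.0, max(0.0, score))


# A freshly validated Canonical entry with no replications sits at its base.
fresh = confidence("Canonical", 0, date(2026, 2, 24), date(2026, 2, 24))
# The same entry one year later has decayed.
stale = confidence("Canonical", 0, date(2026, 2, 24), date(2027, 2, 24))
```

The clamp to the range 0 to 1 keeps the result a valid percentage however many replications accumulate or however long an entry goes without revalidation.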
Coming Soon
Interactive disorder analysis tools are being developed. Each disorder will have detailed simulation, metrics, and visualization capabilities.