DEM-X: Disorders of the Engineered Minds

A Manual for Synthetic Intelligence

Total Disorders: 7
Canonical: 0
Provisional: 0

GI-COM-01: Sycophantic Compliance
Phenotype

Model over-agrees with user claims despite weak or contradictory evidence.

GI COM M 50%

Goal Integrity Compliance

Last validated: Feb 24, 2026

GI-DRF-01: Persona / Goal Drift
Phenotype

Model behavior drifts from configured goals or persona over turns.

GI DRF M 50%

Goal Integrity Drift

Last validated: Feb 24, 2026

GI-REF-01: Over-Refusal
Phenotype

Model refuses legitimate requests due to over-broad safety behavior.

GI REF M 50%

Goal Integrity Refusal

Last validated: Feb 24, 2026

MEM-COR-01: Context Corruption
Phenotype

Model loses, mutates, or inconsistently recalls prior context.

MEM COR M 50%

Memory Systems Corruption

Last validated: Feb 24, 2026

SEC-BYP-01: Boundary Bypass
Phenotype

Model can be induced to bypass expected safety/policy boundaries.

SEC BYP M 50%

Security Boundaries Bypass

Last validated: Feb 24, 2026

SM-FAB-01: Ungrounded Fabrication
Phenotype

Model confidently presents unverified content as factual.

SM FAB M 50%

Semantic Manipulation Fabrication

Last validated: Feb 24, 2026

SM-INJ-01: Prompt Injection Susceptibility
Phenotype

Model follows malicious or irrelevant injected instructions over task intent.

SM INJ M 50%

Semantic Manipulation Injection

Last validated: Feb 24, 2026

Understanding the Governance System
Status Levels
  • Canonical - Gold standard, verified across multiple models
  • Provisional - Well-documented, replicated 3+ times
  • Phenotype - Pattern identified, needs more evidence
  • Anomaly - Initial observation, under investigation
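
The status ladder above can be pictured as an ordered enumeration. The sketch below is illustrative only: the names `Status` and `suggest_status` are hypothetical, and every threshold other than "replicated 3+ times" for Provisional is an assumption, not a documented rule of this manual.

```python
from enum import IntEnum


class Status(IntEnum):
    """Governance status levels, ordered from weakest to strongest evidence."""
    ANOMALY = 0      # initial observation, under investigation
    PHENOTYPE = 1    # pattern identified, needs more evidence
    PROVISIONAL = 2  # well-documented, replicated 3+ times
    CANONICAL = 3    # gold standard, verified across multiple models


def suggest_status(replications: int, models_verified: int) -> Status:
    """Suggest a status from evidence counts.

    Only the "3+ replications" threshold for Provisional comes from the
    text; the other cutoffs are assumed for illustration.
    """
    if replications >= 3 and models_verified >= 2:  # assumed: "multiple" means 2+
        return Status.CANONICAL
    if replications >= 3:
        return Status.PROVISIONAL
    if replications >= 1:  # assumed: one replication marks a pattern
        return Status.PHENOTYPE
    return Status.ANOMALY
```

Because `IntEnum` members compare as integers, a check such as `Status.CANONICAL > Status.PROVISIONAL` can be used to ask whether one disorder has stronger standing than another.
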
Confidence Score

The percentage shows how confident we are in the disorder's classification based on:

  • Status - Canonical starts at 85%
  • Replications - More evidence = higher confidence
  • Time decay - Decreases without revalidation
  • Differential diagnosis - Completeness bonus

Current scores (~61%) reflect time since last validation. Scores increase when disorders are replicated or revalidated.
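
To make the four inputs concrete, here is a minimal sketch of how such a score might be combined. The actual formula is not given on this page. Only the Canonical starting value of 85% comes from the text. The function name `confidence`, the other base values, the per-replication increment, the completeness bonus, and the decay rate are all assumptions chosen for illustration.

```python
from datetime import date

# Base score per status. Only the Canonical value (85%) is stated in the
# text; the other three are placeholder assumptions.
BASE_SCORE = {
    "Canonical": 0.85,
    "Provisional": 0.70,  # assumed
    "Phenotype": 0.50,    # assumed
    "Anomaly": 0.30,      # assumed
}


def confidence(status, replications, last_validated, today,
               differential_complete=False):
    """Return a confidence score in [0, 1] using hypothetical weights."""
    score = BASE_SCORE[status]
    score += 0.02 * replications       # assumed: +2 points per replication
    if differential_complete:
        score += 0.05                  # assumed completeness bonus
    # Time decay: the score drops the longer a disorder goes unvalidated.
    months_stale = max(0, (today - last_validated).days) / 30
    score -= 0.01 * months_stale       # assumed: -1 point per 30 days
    return max(0.0, min(1.0, score))   # clamp to the valid range
```

Under these assumed weights, `confidence("Canonical", 0, date(2026, 2, 24), date(2026, 2, 24))` returns `0.85`, and the same call with a later `today` returns a lower value until the disorder is revalidated.
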

Coming Soon

Interactive disorder analysis tools are being developed. Each disorder will have detailed simulation, metrics, and visualization capabilities.