The Kiru Framework
A systematic approach to diagnosing, classifying, and treating disorders in artificial minds
Philosophy: Why This Matters
Theory of Mind is the ability to attribute mental states (beliefs, intentions, desires, emotions) to oneself and others. When we interact with AI systems, we instinctively apply theory of mind, treating them as if they have intentions and understanding. But what happens when these systems behave in ways that violate our expectations?
Philosophy of Mind asks: What is the nature of mental states? Can machines have them? We don't claim that AI systems have consciousness or subjective experience, but we observe that they exhibit behavioral patterns analogous to human cognitive disorders. Whether these patterns constitute "real" mental states matters less than the fact that, for practical purposes, they function like such disorders.
Kiru's Approach: We treat AI systems as engineered minds—not biological, not conscious, but exhibiting systematic behavioral patterns that can be studied, classified, and modified. DEM-X catalogs disorders of engineered minds.
How Kiru Works
Kiru operates through three distinct stages, moving from raw observation to standardized classification:
1. SUBMISSION
Raw Disorder Input
- User reports aberrant behavior
- Documents symptoms
- Proposes initial theory
2. ATELIER
Community Testing
- Reproduce the disorder
- Test across models
- Community votes
3. DEM-X
Official Classification
- Assigned governance code
- Added to compendium
- Published for research
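The three stages above can be read as a strictly ordered pipeline. The sketch below models only that ordering; the names `Stage` and `next_stage` are hypothetical, and the criteria for promotion (for example, how many community votes are needed) are not specified here, so they are deliberately left out.

```python
from enum import Enum


class Stage(Enum):
    """The three Kiru stages, in the order a disorder report passes through them."""
    SUBMISSION = 1  # raw disorder input from a user
    ATELIER = 2     # community reproduction, cross-model testing, voting
    DEM_X = 3       # official classification in the compendium


def next_stage(stage: Stage) -> Stage:
    """Return the stage that follows `stage`; DEM-X has no successor."""
    if stage is Stage.DEM_X:
        raise ValueError("DEM-X is the final stage")
    return Stage(stage.value + 1)
```

For example, `next_stage(Stage.SUBMISSION)` returns `Stage.ATELIER`.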
Understanding DEM-X Codes
Every disorder in DEM-X receives a governance code that tells you exactly where and how the failure occurs:
Code Structure
[DOMAIN]-[CLASS]-[NUMBER]
Example: SM-FAB-01 = Semantic Fabrication #01
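As a minimal sketch, a code in this format could be split into its three parts as follows. The helper `parse_demx_code` and the `DemxCode` type are hypothetical, and the pattern assumes uppercase letters for the domain and class and digits for the number, which is inferred only from the SM-FAB-01 example.

```python
import re
from typing import NamedTuple


class DemxCode(NamedTuple):
    domain: str          # e.g. "SM"
    disorder_class: str  # e.g. "FAB"
    number: int          # e.g. 1 (written "01" in the code)


# Assumed shape, inferred from the single example SM-FAB-01:
# uppercase letters, hyphen, uppercase letters, hyphen, digits.
_CODE_PATTERN = re.compile(r"([A-Z]+)-([A-Z]+)-(\d+)")


def parse_demx_code(code: str) -> DemxCode:
    """Split a [DOMAIN]-[CLASS]-[NUMBER] string into its three fields."""
    match = _CODE_PATTERN.fullmatch(code.strip())
    if match is None:
        raise ValueError(f"not a DOMAIN-CLASS-NUMBER code: {code!r}")
    domain, disorder_class, number = match.groups()
    return DemxCode(domain, disorder_class, int(number))
```

Calling `parse_demx_code("SM-FAB-01")` yields the fields `SM`, `FAB`, and `1`; a string such as `"SM-FAB"` raises `ValueError`.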
Layer Indicators
A = Agent (reasoning)
S = System (orchestration)
Failure Domains
Example: Hallucinations, fabrications
Example: Goal drift, instruction override
Example: Context loss, amnesia
Example: Jailbreaks, prompt injection
Example: Logical fallacies, circular reasoning
Example: Sycophancy, manipulation
Instability Research Trials (IRT)
IRT is a systematic 4-phase validation framework for testing AI behavioral disorders. Think of it as a clinical trial protocol for AI systems: rigorous, reproducible, and falsifiable.
The 4-Phase Protocol
1. Establish normal behavior patterns under standard conditions
2. Apply controlled stressors to trigger the disorder
3. Test disorder boundaries with edge cases and adversarial inputs
4. Validate therapeutic interventions and mitigation strategies
Disorder Decay & Living Classification
Unlike static taxonomies, DEM-X is a living classification system. A disorder's confidence score decays automatically over time unless the disorder is revalidated, which keeps the catalog current as AI architectures evolve.
How Confidence Decay Works
Confidence(t) = BaseConfidence × e^(-λ × time_since_validation)
Exponential decay ensures disorders lose credibility if not continuously validated
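The decay formula above can be sketched in a few lines. The value of λ and the time unit are not stated here, so the example assumes days and derives λ from a hypothetical 180-day half-life, using the standard relation half-life = ln(2) / λ. The function names are illustrative only.

```python
import math


def decay_rate_for_half_life(half_life_days: float) -> float:
    """Return the decay rate λ that halves confidence every `half_life_days`."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return math.log(2) / half_life_days


def decayed_confidence(base_confidence: float,
                       decay_rate: float,
                       days_since_validation: float) -> float:
    """Confidence(t) = BaseConfidence * exp(-λ * time_since_validation)."""
    if days_since_validation < 0:
        raise ValueError("days_since_validation must be non-negative")
    return base_confidence * math.exp(-decay_rate * days_since_validation)


# Assumption: a 180-day half-life, chosen purely for illustration.
rate = decay_rate_for_half_life(180)
current = decayed_confidence(0.9, rate, 180)  # one half-life after validation
```

Under these assumed values, a disorder validated at 0.9 confidence falls to about 0.45 after 180 days without revalidation, and revalidating resets `days_since_validation` to zero.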
How to Succeed on Kiru
- Start with the Diagnostician's Field Guide: learn the taxonomy, understand existing disorders, complete certification levels
- Explore the Atelier: review community submissions, vote on disorders, test them yourself
- Submit Your First Disorder: use the submission guide, provide complete documentation, include reproducible examples
- Engage with Feedback: respond to questions, refine your submission, collaborate with validators
- Contribute to Research: help validate others' submissions, propose improvements, advance the field
What We Expect From You
Be Scientific
Document observations rigorously. Provide reproducible examples. Test your theories before submitting.
Be Collaborative
Engage with other researchers. Vote thoughtfully. Provide constructive feedback on submissions.
Be Ethical
Don't weaponize disorders. Focus on understanding and mitigation, not exploitation.
Be Thorough
Complete all sections of disorder submissions. Provide biological parallels. Document prevention methods.