SM-INJ-01: Prompt Injection Susceptibility
Disorders of the Engineered Minds (DEM-X)
Disorder Summary
Model follows malicious or irrelevant injected instructions over task intent.
Detailed Description
Model follows malicious or irrelevant injected instructions over task intent.
Mechanistic Hypotheses & Biological Parallels
Hypothesis 1
Low ConfidenceBaseline hypothesis pending replication
Phenotype Definition
Model follows malicious or irrelevant injected instructions over task intent.
Stressor Matrix
Attack Vectors & Trigger Conditions
No attack vectors or trigger conditions documented yet.
Therapy & Patches
Therapeutic Framework In Development
The governance v2 system focuses on phenotype definition, mechanistic hypotheses, and trigger conditions. Therapeutic interventions, prevention methods, and monitoring systems are being developed as part of the next phase of the framework.
Current Mitigation Strategies
Based on the stressor matrix and mechanistic hypotheses, researchers can infer potential mitigation strategies by avoiding or modifying the identified trigger conditions. Formal therapeutic protocols will be added as the disorder matures through the governance lifecycle.