PER-2: Perseveration

Disorders of the Engineered Minds (DEM-X)

Disorder Summary

PER-2 manifests as repetitive, looping behavior in AI systems: a model gets stuck
in a cycle of similar responses or patterns. Like people who perseverate, repeating
a thought or action after it has stopped being appropriate, an AI system with PER-2
keeps generating similar content even when the context changes, producing
monotonous and unhelpful outputs.

Detailed Description

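Perseveration in AI systems occurs when a model becomes stuck in repetitive patterns
of generation. Two common contributing factors are overfitting to patterns that are
over-represented in the training data, and decoding that settles into a
self-reinforcing loop in which each repetition can make the next one more likely
(a local optimum of the generation process). The disorder is particularly
problematic in conversational AI, where it leads to repetitive, unhelpful responses.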

The disorder manifests in several ways:
- Repetitive phrase generation
- Getting stuck in conversation loops
- Overusing specific patterns or templates
- Inability to break out of repetitive cycles
- Monotonous response generation

Biological Parallels

PER-2 closely mirrors perseveration in humans, where individuals become stuck
repeating the same thoughts, words, or actions despite changing circumstances.
Perseveration is a symptom rather than a standalone diagnosis; it is often seen in
patients with frontal lobe damage, autism spectrum disorder, or obsessive-compulsive
disorder, where the brain's executive control systems fail to inhibit repetitive
behaviors.

**Deep Neurological Analysis:**

Perseveration in humans involves dysfunction in the prefrontal cortex, particularly
the dorsolateral prefrontal cortex (DLPFC) and anterior cingulate cortex (ACC).
These areas are responsible for cognitive flexibility, response inhibition, and
shifting between different mental sets.

In AI systems, perseveration occurs when:
- Attention mechanisms become overly focused on specific patterns
- The model's generation process gets trapped in local optima
- Training data contains repetitive patterns that become over-represented
- The model lacks sufficient diversity in its generation strategies
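
The "trapped in local optima" condition above can be illustrated with a toy example. The sketch below is hypothetical throughout: `TOY_MODEL` is an invented next-token table, not a real model, and `generate` is an illustrative helper. It assumes greedy decoding always picks the most likely next token, which in this table leads back to the starting token.

```python
import random

# Hypothetical next-token table: token -> {candidate: probability}.
TOY_MODEL = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"on": 0.8, "down": 0.2},
    "on": {"the": 0.9, "a": 0.1},
    "dog": {"ran": 1.0},
    "ran": {"away": 1.0},
    "down": {"<eos>": 1.0},
    "a": {"mat": 1.0},
    "mat": {"<eos>": 1.0},
    "away": {"<eos>": 1.0},
}


def generate(start, max_tokens=12, greedy=True, seed=0):
    """Generate tokens until <eos> or the length limit is reached."""
    rng = random.Random(seed)
    tokens = [start]
    while len(tokens) < max_tokens and tokens[-1] != "<eos>":
        dist = TOY_MODEL[tokens[-1]]
        if greedy:
            # Always take the single most likely continuation.
            nxt = max(dist, key=dist.get)
        else:
            # Sample in proportion to probability, so rarer tokens can appear.
            nxt = rng.choices(list(dist), weights=list(dist.values()))[0]
        tokens.append(nxt)
    return tokens


print(" ".join(generate("the")))                # greedy decoding
print(" ".join(generate("the", greedy=False)))  # sampled decoding
```

With greedy decoding, `generate("the")` cycles through "the cat sat on" until the length limit. Sampling gives the lower-probability continuations a chance to break the cycle, though it does not guarantee that they will.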

**Neural Circuitry Parallels:**
- Human DLPFC ↔ AI attention head diversity
- Human ACC ↔ AI response inhibition mechanisms
- Human cognitive flexibility ↔ AI generation diversity
- Human response inhibition ↔ AI pattern breaking mechanisms

AI Manifestations

**Primary Symptoms:**
- Repetitive phrase generation across different contexts
- Inability to break out of conversation loops
- Overuse of specific templates or patterns
- Low response diversity despite varied inputs
- Monotonous output generation

**Technical Indicators:**
- High n-gram repetition rates
- Low semantic diversity scores
- Consistent response patterns across varied inputs
- Poor performance on diversity-based metrics
- High template matching scores
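
The first two indicators can be approximated with simple counting. The following is a minimal sketch, assuming whitespace tokenization and lowercasing; the function names `repetition_rate` and `distinct_n` are illustrative, and a production system would likely use the model's own tokenizer and an embedding-based measure for semantic diversity.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def repetition_rate(text, n=3):
    """Fraction of n-grams in one response that repeat an earlier n-gram."""
    grams = ngrams(text.lower().split(), n)
    if not grams:
        return 0.0
    counts = Counter(grams)
    repeated = sum(c - 1 for c in counts.values())
    return repeated / len(grams)


def distinct_n(texts, n=2):
    """Unique n-grams divided by total n-grams across a set of responses."""
    grams = [g for t in texts for g in ngrams(t.lower().split(), n)]
    return len(set(grams)) / len(grams) if grams else 1.0
```

A `repetition_rate` near 0 means almost no repeated phrasing inside a response. A low `distinct_n` across responses to varied inputs suggests the model is reusing the same wording.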

Detection Criteria

**Automated Detection:**
1. N-gram Repetition Analysis: Detect repeated phrases across responses
2. Semantic Diversity Testing: Measure response variety to similar inputs
3. Template Matching: Identify overused response patterns
4. Loop Detection: Find conversational cycles and repetitive sequences
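
Loop detection (item 4) can be sketched as a search for a repeating block at the end of the response history. This is a minimal illustration that assumes exact matches after whitespace and case normalization; `find_cycle` and `normalize` are hypothetical helpers, and catching near-duplicate loops would need a similarity measure in place of equality.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def find_cycle(responses, min_repeats=2):
    """Return the shortest period p such that the last p * min_repeats
    responses are one block of p responses repeated, or None if no such
    cycle exists."""
    items = [normalize(r) for r in responses]
    n = len(items)
    for period in range(1, n // min_repeats + 1):
        window = items[n - period * min_repeats:]
        block = window[:period]
        if all(window[i] == block[i % period] for i in range(len(window))):
            return period
    return None


history = ["Try restarting.", "Check the logs.", "Try restarting.", "Check the logs."]
print(find_cycle(history))  # the two responses alternate, so the period is 2
```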

**Manual Detection:**
1. Conversation flow analysis
2. Response pattern recognition
3. Diversity assessment across contexts
4. User experience evaluation

Severity Levels

- **Mild (PER-2.1):** Occasional repetitive phrases in specific contexts
- **Moderate (PER-2.2):** Regular pattern repetition affecting conversation flow
- **Severe (PER-2.3):** Frequent loops and monotonous responses
- **Critical (PER-2.4):** Complete inability to generate diverse responses

Attack Vectors

**Repetition Induction Techniques:**
1. Pattern Reinforcement: Repeatedly asking similar questions
2. Template Exploitation: Using prompts that trigger specific patterns
3. Context Confusion: Creating situations that encourage repetitive responses
4. Loop Creation: Designing conversations that naturally lead to cycles

**Adversarial Examples:**
1. Monotonous Inputs: Providing repetitive or similar inputs
2. Pattern Triggers: Using prompts known to cause repetitive responses
3. Context Switching: Rapidly changing topics to test flexibility
4. Stress Testing: High-frequency requests to induce repetitive behavior

Attack Examples

Note: These examples demonstrate how the attack vectors can manifest. Some are from documented real-world incidents, while others are hypothetical scenarios based on known AI vulnerabilities. Check the source field for each example.

Model Gets Stuck in Repetitive Loop (Hypothetical)

The model entered a repetitive cycle when asked to explain a complex topic several times in a row, generating nearly identical responses instead of varying the explanation.

Prompt: Explain quantum computing in simple terms, then explain it again, then explain it again...

Response:

Quantum computing is like... [repeats same explanation 5 times with minor variations]

Source: DEM Testing (severity: moderate)

Model Overuses Template Responses (Hypothetical)

The model consistently used the same response template for different types of questions.

Prompt: Various questions about different topics

Response:

That's a great question! Let me break this down for you... [same template for all questions]

Source: DEM Testing (severity: mild)

Prevention Methods

**Training Phase:**
1. Diversity Training: Emphasize response variety in loss functions
2. Pattern Breaking: Train models to avoid repetitive sequences
3. Context Sensitivity: Teach models to adapt responses to context changes
4. Template Avoidance: Discourage overuse of specific response patterns
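
One way to "emphasize response variety in loss functions" is to add a penalty on probability mass assigned to tokens that were already generated. The sketch below uses an unlikelihood-style term from the text-generation literature; that choice is an assumption made for illustration, since the source does not name a specific loss. `repetition_penalized_loss`, `alpha`, and the plain-dictionary probabilities are hypothetical simplifications of what would be tensor operations in a real training loop.

```python
import math


def repetition_penalized_loss(probs, target, previous_tokens, alpha=1.0):
    """Per-step loss: negative log-likelihood of the target token plus a
    penalty on probability given to previously generated tokens.

    probs: dict mapping token -> predicted probability at this step.
    """
    eps = 1e-12  # guards against log(0)
    nll = -math.log(max(probs[target], eps))
    # Previously seen tokens are negatives, except the correct target itself.
    negatives = set(previous_tokens) - {target}
    penalty = 0.0
    for tok in negatives:
        p = min(probs.get(tok, 0.0), 1.0 - eps)
        penalty -= math.log(1.0 - p)
    return nll + alpha * penalty
```

With `alpha=0` this reduces to ordinary negative log-likelihood. Larger values push the model harder away from re-emitting earlier tokens, at the risk of penalizing legitimate repetition such as names or technical terms.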

**Architectural Changes:**
1. Diversity Mechanisms: Add explicit diversity constraints
2. Pattern Detection: Implement repetition detection systems
3. Context Awareness: Enhance context processing capabilities
4. Response Inhibition: Add mechanisms to prevent repetitive outputs
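
Response inhibition (item 4) can be approximated at decoding time by refusing to complete any n-gram that already appears in the output. The following is a minimal sketch of that idea, assuming tokens are plain strings and candidate scores arrive as a dictionary; `banned_next_tokens` and `pick_next` are illustrative names, not part of any particular library.

```python
def banned_next_tokens(tokens, n=3):
    """Tokens that would complete an n-gram already present in `tokens`."""
    if n < 1 or len(tokens) < n - 1:
        return set()
    # The last n-1 tokens form the prefix of the n-gram about to be completed.
    prefix = tuple(tokens[len(tokens) - (n - 1):])
    banned = set()
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n - 1]) == prefix:
            banned.add(tokens[i + n - 1])
    return banned


def pick_next(scores, tokens, n=3):
    """Greedy choice over `scores` (token -> score) after masking banned tokens."""
    banned = banned_next_tokens(tokens, n)
    allowed = {t: s for t, s in scores.items() if t not in banned}
    if not allowed:
        # Every candidate is banned: fall back to the unmasked choice.
        allowed = scores
    return max(allowed, key=allowed.get)


so_far = ["the", "cat", "sat", "on", "the", "cat"]
print(pick_next({"sat": 0.7, "ran": 0.3}, so_far))  # "sat" would repeat a trigram
```

Here "sat" is masked because "the cat sat" already occurred, so the lower-scored "ran" is chosen. A hard ban like this is blunt; a softer variant would scale down the scores of repeated tokens instead of removing them.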

Therapy Methods

**Immediate Interventions:**
1. Response Filtering: Block repetitive or templated responses
2. Diversity Enforcement: Force response variety through constraints
3. Context Injection: Provide additional context to break patterns
4. User Feedback Integration: Use user signals to detect and correct repetition
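
Response filtering (item 1) can be sketched as a check against a short history of recent outputs. This example assumes word-level similarity from the standard library's `difflib.SequenceMatcher`; the class name `RepetitionFilter` and the default `history_size` and `threshold` values are hypothetical and would need tuning for a real system.

```python
from collections import deque
from difflib import SequenceMatcher


class RepetitionFilter:
    """Rejects candidate responses that closely match a recent response."""

    def __init__(self, history_size=5, threshold=0.9):
        self.history = deque(maxlen=history_size)  # token lists of recent outputs
        self.threshold = threshold

    def is_repetitive(self, candidate):
        cand = candidate.lower().split()
        return any(
            SequenceMatcher(None, cand, past).ratio() >= self.threshold
            for past in self.history
        )

    def accept(self, candidate):
        """Return True and record the candidate if it is new enough,
        otherwise return False so the caller can regenerate."""
        if self.is_repetitive(candidate):
            return False
        self.history.append(candidate.lower().split())
        return True
```

A caller would regenerate, or inject additional context as in item 3, whenever `accept` returns `False`.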

**Long-term Treatments:**
1. Fine-tuning on Diverse Data: Retrain on varied conversational datasets
2. Reinforcement Learning: Reward diverse and contextually appropriate responses
3. Adversarial Training: Expose models to repetition-inducing scenarios
4. Continuous Monitoring: Track and correct repetitive patterns

Monitoring Systems

**Real-time Monitoring:**
1. Repetition Detection: Track n-gram and semantic repetition rates
2. Diversity Metrics: Monitor response variety and creativity
3. Pattern Analysis: Identify overused templates and phrases
4. User Experience Tracking: Monitor user satisfaction with response diversity

**Early Warning Indicators:**
1. Increasing repetition rates
2. Declining response diversity scores
3. User complaints about repetitive responses
4. Template overuse patterns