The AI Error Taxonomy
AI errors are not random; they cluster into predictable patterns. Knowing the pattern tells you where to look:
Factual hallucination: Confident false statements about facts, statistics, citations, people, and events. Most common in knowledge-recall tasks without provided source material. Detection: verify specific claims against primary sources.
Instruction drift: The output drifts from the original instruction — usually a change in length, format, tone, or scope. Most common in long or multi-part prompts. Detection: compare output to your stated requirements.
Plausible fabrication: False but plausible-sounding claims. This is the most dangerous type because such claims pass casual review. Most common when AI is asked about specific organizations, people, or niche topics. Detection: ask "what's your source?" for any specific claim, then confirm that the source exists and actually supports the claim.
Context loss: The AI ignores or misapplies context you provided. Most common in long conversations where the relevant context sits far back from the current prompt. Detection: re-read the AI's output with your context in mind and check for obvious mismatches.
Over-hedging: Output is so qualified and balanced that it is not useful. Most common for sensitive or complex topics. Detection: read for actionability. Can you actually do something with this?
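The detection step for factual hallucination and plausible fabrication (verify specific claims) can be partly mechanized by surfacing the sentences that contain checkable specifics. The following is a minimal sketch, not a definitive method: the pattern names and regular expressions are illustrative assumptions, and the function only flags candidates. Verifying them against primary sources remains a manual step.

```python
import re

# Hypothetical patterns for "specific claims" worth verifying.
# These are assumptions for illustration; tune them for your own content.
CLAIM_PATTERNS = {
    "statistic": re.compile(r"\b\d+(?:\.\d+)?\s?(?:%|percent\b)"),
    "year": re.compile(r"\b(?:19|20)\d{2}\b"),
    "attribution": re.compile(
        r"\b(?:according to|reported by|a study by)\b", re.IGNORECASE
    ),
}


def flag_claims(text: str) -> list[tuple[str, list[str]]]:
    """Return (sentence, matched_claim_types) for sentences needing a source check."""
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    flagged = []
    for sentence in sentences:
        kinds = [
            name for name, pattern in CLAIM_PATTERNS.items()
            if pattern.search(sentence)
        ]
        if kinds:
            flagged.append((sentence, kinds))
    return flagged


if __name__ == "__main__":
    draft = (
        "Revenue grew 42% in 2021. The team is happy. "
        "According to Acme, churn fell."
    )
    for sentence, kinds in flag_claims(draft):
        print(f"VERIFY ({', '.join(kinds)}): {sentence}")
```

A sentence that is not flagged is not thereby verified. The sketch narrows where to look; it does not replace the check itself.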
Building Quality Checks Into Workflows
The goal is not to manually verify everything — that defeats the purpose of AI assistance. The goal is calibrated verification: systematic checks at the right moments for the right types of errors.
Checkpoint design:
- Identify which error types are most likely for each AI task in your workflow
- Build a specific check for each: "Before sending, verify any statistics in this email against their original source"
- Document the checks as part of the workflow (not in your head)
High-stakes vs. low-stakes differentiation: Not all AI outputs need the same scrutiny. Client-facing and decision-informing outputs need rigorous review. Internal working documents can tolerate more roughness. Calibrate your review intensity to the stakes.
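One way to keep checks documented rather than in your head is to record each task, its likely error types, and its checks as data. The sketch below is only an illustration: the `Checkpoint` class, the example tasks, and the rule that low-stakes tasks run only their first check are all assumptions, not a prescribed policy.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    task: str                 # the AI-assisted task in your workflow
    likely_errors: list[str]  # error types from the taxonomy
    checks: list[str]         # the specific check written down for each
    high_stakes: bool = False # client-facing or decision-informing?


# Illustrative entries only; these tasks and checks are hypothetical.
WORKFLOW = [
    Checkpoint(
        task="client email draft",
        likely_errors=["factual hallucination", "instruction drift"],
        checks=[
            "Verify any statistics against their original source",
            "Compare length and tone to the stated requirements",
        ],
        high_stakes=True,
    ),
    Checkpoint(
        task="internal meeting notes",
        likely_errors=["context loss"],
        checks=["Skim for mismatches with the meeting context"],
    ),
]


def required_checks(task: str) -> list[str]:
    """Return the checks to run for a task, scaled to its stakes."""
    for checkpoint in WORKFLOW:
        if checkpoint.task == task:
            # Assumed policy: high-stakes outputs get every check,
            # low-stakes outputs get only the first (most important) one.
            if checkpoint.high_stakes:
                return checkpoint.checks
            return checkpoint.checks[:1]
    raise KeyError(f"No checkpoint documented for task: {task}")


if __name__ == "__main__":
    for check in required_checks("client email draft"):
        print("-", check)
```

Raising an error for an undocumented task is a deliberate choice here: it exposes AI tasks that have entered the workflow without any check being designed for them.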
The Pre-Send Checklist
For any AI-assisted output going to an external audience:
- Are there any specific factual claims? If yes, are they verified?
- Does this match the tone and voice appropriate for this recipient?
- Is there anything in here that would be embarrassing if it were wrong?
- Does this actually answer what was asked?
- Would I be comfortable with my name on this exactly as it is?
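If the checklist is used often, it can be encoded so that an unanswered question counts as a failure instead of being silently skipped. This is a minimal sketch under that assumption; the item wording and the function names are illustrative restatements of the questions above, not a standard tool.

```python
# Each item restates one checklist question as a statement to confirm.
PRE_SEND_CHECKLIST = [
    "Specific factual claims are verified (or there are none)",
    "Tone and voice fit this recipient",
    "Nothing here would be embarrassing if it were wrong",
    "It answers what was actually asked",
    "I am comfortable with my name on this exactly as it is",
]


def unresolved_items(answers: dict[str, bool]) -> list[str]:
    """Return items not explicitly confirmed; missing answers count as unresolved."""
    return [item for item in PRE_SEND_CHECKLIST if not answers.get(item, False)]


def ready_to_send(answers: dict[str, bool]) -> bool:
    """True only when every checklist item has been confirmed."""
    return not unresolved_items(answers)


if __name__ == "__main__":
    answers = {item: True for item in PRE_SEND_CHECKLIST}
    answers[PRE_SEND_CHECKLIST[0]] = False  # a statistic is still unverified
    print("Ready:", ready_to_send(answers))
    for item in unresolved_items(answers):
        print("Unresolved:", item)
```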