🔤 Tokens: The Unit of Thought
A language model doesn't read text the way you do. It breaks language into tokens: chunks that are often whole words but can also be fragments of words. Common words are usually one token. Rare or long words may be split into several. Punctuation is often its own token.
Why tokens matter in practice:
- Context window limits (like "128K context") are in tokens, not words
- Every generated token is conditioned on the tokens that precede it in the context window
- Unusual words or names get split unexpectedly, making the model less reliable on them
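To make the splitting concrete, here is a minimal sketch of a greedy longest-match subword tokenizer. The vocabulary (VOCAB) is invented for illustration, and real tokenizers learn a far larger vocabulary from data and use different algorithms, so treat this as a toy model of the idea rather than how any particular AI product tokenizes text.

```python
# Toy greedy longest-match subword tokenizer.
# VOCAB is a made-up vocabulary for illustration only; production
# tokenizers learn tens of thousands of entries from training data.
VOCAB = {"the", "cat", "sat", "un", "believ", "able", ".", ","}


def tokenize(text: str) -> list[str]:
    tokens = []
    # Separate punctuation from words so it becomes its own token.
    spaced = text.lower().replace(".", " .").replace(",", " ,")
    for word in spaced.split():
        start = 0
        while start < len(word):
            # Try the longest vocabulary entry that matches at this position.
            for end in range(len(word), start, -1):
                piece = word[start:end]
                if piece in VOCAB:
                    tokens.append(piece)
                    start = end
                    break
            else:
                # Nothing matched: fall back to a single-character token.
                tokens.append(word[start])
                start += 1
    return tokens


print(tokenize("The cat sat."))    # common words: one token each
print(tokenize("unbelievable"))    # longer word: several pieces
print(tokenize("Zyx"))             # unusual name: split into characters
```

In this toy setup the unusual name costs three tokens while each common word costs one, which is the effect the last bullet describes.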
🧠 The Context Window: Your AI's Working Memory
An AI model only "knows" what's currently in its context window — your system prompt, the conversation history, and the current message. Think of it as working memory, not long-term memory.
Each conversation with an AI is a fresh piece of paper. Everything on that paper is available to it. Anything not on that paper doesn't exist from the model's perspective.
What this means in practice:
- In very long conversations, earlier messages may be dropped as the context fills up
- The AI can't access previous conversations unless the platform stores and re-injects them
- To make AI "remember" something important, keep it in the current context
- Context windows are large (100K+ tokens for recent Claude and ChatGPT models, depending on the model and plan) but not infinite
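The "dropped as the context fills up" behavior can be sketched as a simple budget calculation. This is one plausible strategy (keep the system prompt, then keep the newest messages that fit), not a description of how any specific platform manages its context. The names fit_to_window and count_tokens are hypothetical, and counting one token per whitespace-separated word is a deliberately crude stand-in for a real tokenizer.

```python
def count_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def fit_to_window(system_prompt: str, history: list[str], max_tokens: int) -> list[str]:
    """Keep the system prompt plus the most recent messages that fit the budget."""
    budget = max_tokens - count_tokens(system_prompt)
    kept = []
    # Walk backwards from the newest message; stop at the first one that does not fit.
    for message in reversed(history):
        cost = count_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    # Restore chronological order for the messages that survived.
    return [system_prompt] + kept[::-1]


history = ["my name is Ada", "what is a token", "and a context window"]
print(fit_to_window("Be brief", history, max_tokens=10))
```

With a budget of 10 tokens, the oldest message ("my name is Ada") no longer fits, so from the model's perspective the name was never mentioned.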
🎲 Temperature: Why the Same Prompt Gets Different Results
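AI doesn't always pick the single most probable next token, since that tends to produce repetitive output. Instead, it samples from a probability distribution over possible next tokens. Temperature controls how this sampling works: a low temperature concentrates probability on the most likely tokens, making output more predictable, while a high temperature flattens the distribution, making output more varied.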
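Running the same prompt twice and getting different results isn't a bug. It's how sampling works. For creative tasks, variation is a feature. When you need consistency, lower the temperature where the platform exposes that setting, or spell out the exact format and constraints you want in the prompt.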
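A minimal sketch of temperature sampling, assuming the common formulation in which the model's raw scores (logits) are divided by the temperature before a softmax turns them into probabilities. The token list and logit values below are made up for illustration, and the function names are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    # Divide scores by the temperature, then normalize into probabilities.
    scaled = [x / temperature for x in logits]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens: list[str], logits: list[float],
                 temperature: float, rng: random.Random) -> str:
    # Draw one token at random, weighted by its probability.
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


tokens = ["blue", "grey", "green"]   # hypothetical candidate next tokens
logits = [2.0, 1.0, 0.5]             # hypothetical model scores

for temperature in (0.1, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])

rng = random.Random()
print([sample_token(tokens, logits, 1.0, rng) for _ in range(5)])
```

At a low temperature nearly all of the probability lands on the top-scoring token, so repeated runs agree. At higher temperatures the probabilities move closer together, so repeated runs diverge more often.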
🌀 Why AI "Hallucinates"
The model is always generating a statistically plausible continuation of the text it has seen so far. When you ask about a specific fact, it doesn't retrieve it from a database. It generates a response that looks like a factual answer, in the style factual answers are written.
If the fact was well-represented in training data, the output is often accurate. If it was rare, obscure, or recent, the model still generates something that looks like a confident factual claim — because that's what confident factual claims look like in text.
The practical rule: Any specific claim — a date, a statistic, a quote, a citation — should be independently verified before you use it. Use AI for synthesis, structure, and generation. Use primary sources for facts.