Why I Keep My Knowledge Base Under 12 Files

The default assumption when building a knowledge base for an LLM agent is additive: more documents, more coverage, better results. I checked that assumption against how agents actually behaved and found that it stops holding past a certain file count.

The Practical Case for 12

Twelve files is not a magic number. It is the upper bound I arrived at after observing performance degradation in direct-injection setups once the file count went past it.

The mechanism is straightforward. Every file injected into a context window consumes tokens. At an average knowledge base file size of 1,000–2,500 tokens, 12 files occupy roughly 12,000–30,000 tokens before the system prompt, conversation history, or tool definitions are even counted. That is a substantial fraction of usable context for most deployments.
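
The arithmetic is easy to check with a short sketch. The per-file range comes from the paragraph above; the 128,000-token window is an assumed figure for illustration only, so substitute the actual limit of whatever model you deploy.

```python
# Rough token cost of direct injection, using the per-file range from the text.
FILE_COUNT = 12
TOKENS_PER_FILE = (1_000, 2_500)  # low and high average file size, from the text
CONTEXT_WINDOW = 128_000          # assumption: illustrative window size


def injection_budget(file_count: int, per_file: tuple[int, int]) -> tuple[int, int]:
    """Return the (low, high) token cost of injecting file_count files."""
    low, high = per_file
    return file_count * low, file_count * high


low, high = injection_budget(FILE_COUNT, TOKENS_PER_FILE)
print(f"{FILE_COUNT} files: {low:,}-{high:,} tokens")
print(
    f"Share of an assumed {CONTEXT_WINDOW:,}-token window: "
    f"{low / CONTEXT_WINDOW:.0%}-{high / CONTEXT_WINDOW:.0%}"
)
```

Smaller windows make the same 12 files proportionally more expensive, which is why the count alone is not the whole story.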

Past 12 files, two things happen:

  • Relevance dilution: the model is processing a larger proportion of content that does not apply to the current query
  • Attention dispersion: with more material to attend to, precision on any individual file decreases

RAG vs. Direct Injection

The 12-file limit applies specifically to direct injection — the approach where files are loaded into context unconditionally. Retrieval-augmented generation (RAG) sidesteps this by selecting only the most relevant chunks at query time, keeping the injected volume low regardless of knowledge base size.
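
To make the contrast concrete, here is a minimal sketch of query-time selection. It is not a production retriever: word-count vectors stand in for real embeddings, and the chunk names and contents are hypothetical. The point is only that the injected volume is set by k, not by how many chunks exist.

```python
import math
from collections import Counter


def vectorize(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: dict[str, str], k: int = 2) -> list[str]:
    """Return the names of the k chunks most similar to the query."""
    q = vectorize(query)
    ranked = sorted(
        chunks,
        key=lambda name: cosine(q, vectorize(chunks[name])),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical knowledge base chunks.
chunks = {
    "deploy.md": "steps to deploy the service to production",
    "style.md": "code style rules for python modules",
    "oncall.md": "who to page when production deploy fails",
}

# Only the top k chunks are injected, however large the knowledge base grows.
print(retrieve("how do I deploy to production", chunks, k=2))
```

A real pipeline would swap `vectorize` for an embedding model and the dictionary for a vector store, which is exactly the infrastructure cost discussed next.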

The tradeoff is infrastructure complexity. RAG requires an embedding pipeline, a vector store, and retrieval logic. For small knowledge bases, that overhead is not justified. For knowledge bases exceeding 15–20 files, it is the correct approach.

The decision boundary:

  • 12 files or fewer, high average relevance: direct injection
  • Over 15 files, or low average per-query relevance: RAG
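
The same boundary can be written as a small function. This is a sketch under two assumptions: 12 is treated as inclusive, since it is described earlier as an upper bound, and `high_relevance` is a hypothetical caller-supplied judgment rather than a measured score. The list gives no rule for 13 to 15 highly relevant files, so the sketch flags that range instead of guessing.

```python
def choose_strategy(file_count: int, high_relevance: bool) -> str:
    """Map the decision boundary onto a function.

    high_relevance is a hypothetical flag: True if most files apply
    to most queries, False otherwise.
    """
    if file_count > 15 or not high_relevance:
        return "rag"
    if file_count <= 12:
        return "direct_injection"
    # 13-15 files with high relevance: no rule is stated, so flag it.
    return "judgment_call"


for count, relevant in [(8, True), (8, False), (14, True), (20, True)]:
    print(count, relevant, choose_strategy(count, relevant))
```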

What Goes in the 12

Selection discipline matters more than raw count. I structure knowledge bases around these categories:

  • Core behavioral constraints (1–2 files)
  • Domain reference material directly relevant to the agent's tasks (4–6 files)
  • Current project state or handover document (1–2 files)
  • Operational procedures or templates (2–3 files)

Everything else is archived and available on demand, not injected by default.
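
One way to enforce that discipline is a manifest check run before the agent starts. The sketch below is an illustration, not a description of any particular tooling: the category keys and file names are hypothetical, and the ranges are copied from the list above. Note that the upper ends of those ranges add up to 13, so the overall cap has to be checked separately from the per-category counts.

```python
MAX_INJECTED = 12  # upper bound on injected files

# Per-category (min, max) file counts, taken from the list above.
CATEGORY_RANGES = {
    "constraints": (1, 2),
    "domain_reference": (4, 6),
    "project_state": (1, 2),
    "procedures": (2, 3),
}


def check_manifest(manifest: dict[str, list[str]]) -> list[str]:
    """Return a list of problems; an empty list means the manifest fits."""
    problems = []
    for category, (lo, hi) in CATEGORY_RANGES.items():
        n = len(manifest.get(category, []))
        if not lo <= n <= hi:
            problems.append(f"{category}: {n} files, expected {lo}-{hi}")
    # Only injected categories count toward the cap; "archive" is ignored.
    total = sum(len(manifest.get(c, [])) for c in CATEGORY_RANGES)
    if total > MAX_INJECTED:
        problems.append(f"total: {total} files exceeds {MAX_INJECTED}")
    return problems


# Hypothetical manifest: 10 injected files plus an on-demand archive.
manifest = {
    "constraints": ["rules.md"],
    "domain_reference": ["api.md", "schema.md", "glossary.md", "limits.md", "faq.md"],
    "project_state": ["handover.md"],
    "procedures": ["release.md", "incident.md", "review-template.md"],
    "archive": ["2022-notes.md", "old-api.md"],
}

print(check_manifest(manifest) or "manifest fits")
```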

The goal is a context window that reads like a well-edited briefing document, not an undifferentiated file dump.
