AI · RAG · Engineering

We rebuilt our RAG stack — here's what actually moved the needle

After a year shipping AI features for production teams, we rewrote our RAG pipeline. These eight changes accounted for 90% of the quality lift.

Rafael Mendes · Mar 12, 2026

The honest version

Most RAG demos look great until you put real customer data behind them. Here's what actually moved retrieval quality for our clients in 2026.

1. Stop chunking on character counts

Semantic chunking with overlap, anchored on document structure (headings, list boundaries), consistently outperforms naïve 800-character splits.
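To make the idea concrete, here is a minimal sketch of structure-anchored chunking. It is not our production chunker: it assumes Markdown-style `#` headings and blank-line-separated blocks, and it packs blocks by size instead of using embeddings to find semantic boundaries. The names `chunk_by_structure`, `max_chars`, and `overlap_blocks` are hypothetical.

```python
import re

# Assumption: headings look like Markdown ("# Title", "## Section", ...).
HEADING = re.compile(r"^#{1,6}\s")


def split_blocks(text):
    """Split on blank lines so paragraphs and list runs stay intact."""
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text)]
    return [b for b in blocks if b]


def chunk_by_structure(text, max_chars=800, overlap_blocks=1):
    """Pack whole blocks into chunks, starting a new chunk at every heading.

    When a chunk fills up mid-section, the last `overlap_blocks` blocks are
    repeated at the start of the next chunk. Overlap never crosses a heading.
    """
    chunks, current = [], []
    for block in split_blocks(text):
        is_heading = bool(HEADING.match(block))
        size = sum(len(b) for b in current) + len(block)
        if current and (is_heading or size > max_chars):
            chunks.append("\n\n".join(current))
            if is_heading or overlap_blocks == 0:
                current = []
            else:
                current = current[-overlap_blocks:]
        current.append(block)
    if current:
        chunks.append("\n\n".join(current))
    return chunks


doc = (
    "# Intro\n\nAlpha paragraph.\n\nBeta paragraph.\n\n"
    "# Setup\n\n- step one\n- step two"
)
for chunk in chunk_by_structure(doc, max_chars=30):
    print(repr(chunk))
```

The point of the design is that a chunk boundary never lands in the middle of a paragraph or list, and a heading always travels with the text it introduces.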

2. Hybrid retrieval is non-negotiable

Combining BM25 with dense vectors, fused with reciprocal rank fusion (RRF), beat either approach in isolation on every benchmark we ran.
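Reciprocal rank fusion is short enough to sketch in full. Each document scores the sum of 1 / (k + rank) over every ranking it appears in, so only rank positions matter and the raw BM25 and cosine scores never need to be normalized against each other. The sketch below assumes each retriever returns an ordered list of document IDs; the function name and the sample IDs are illustrative, and k = 60 is the commonly used default, not a value we tuned.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list, best first.

    `rankings` is a list of lists, each ordered from most to least relevant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties on doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["d1", "d2", "d3"]   # hypothetical lexical results
dense_hits = ["d3", "d1", "d4"]  # hypothetical vector results
print(reciprocal_rank_fusion([bm25_hits, dense_hits]))
# → ['d1', 'd3', 'd2', 'd4']
```

Documents that both retrievers agree on (`d1`, `d3`) rise to the top, while documents found by only one retriever still make the list.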

3. Evals before prompts

If you can't measure quality, you can't improve it. We start every engagement by writing 30–50 graded eval cases with the customer.
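A small harness is enough to get started. The sketch below is one possible shape, not our actual tooling: it assumes each eval case pairs a question with the IDs of the documents a grader marked as relevant, and it scores retrieval with recall@k. `EvalCase`, `run_evals`, and the `kb-…` IDs are hypothetical, and `retrieve` stands in for whatever retrieval function is under test.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    question: str
    relevant_ids: set  # doc IDs a human grader marked as relevant


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant docs that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def run_evals(cases, retrieve, k=5):
    """Mean recall@k over all cases; `retrieve` maps a question to doc IDs."""
    scores = [recall_at_k(retrieve(c.question), c.relevant_ids, k) for c in cases]
    return sum(scores) / len(scores) if scores else 0.0


cases = [
    EvalCase("How do I reset my password?", {"kb-12"}),
    EvalCase("What is the refund policy?", {"kb-7", "kb-9"}),
]

# Toy retriever standing in for the real pipeline.
fake_index = {
    "How do I reset my password?": ["kb-12", "kb-3"],
    "What is the refund policy?": ["kb-7", "kb-1"],
}
print(run_evals(cases, fake_index.__getitem__, k=2))
# → 0.75
```

With a number like this in place, every chunking or retrieval change can be checked against the same cases before anyone touches a prompt.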