AI · RAG · Engineering

We rebuilt our RAG stack — here's what actually moved the needle

After a year shipping AI features for production teams, we rewrote our RAG pipeline. These eight changes accounted for 90% of the quality lift.

Rafael Mendes · Mar 12, 2026

The honest version

Most RAG demos look great until you put real customer data behind them. Here's what actually moved retrieval quality for our clients in 2026.

Semantic chunking with overlap, anchored on document structure (headings, list boundaries), consistently outperforms naïve 800-character splits.
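
To make the idea concrete, here is a minimal sketch of structure-anchored chunking. It is not our production splitter. It assumes markdown-style input where headings start with `#` and blocks are separated by blank lines, and the function name `chunk_by_structure` and its parameters `max_chars` and `overlap_blocks` are made up for illustration.

```python
import re


def chunk_by_structure(text: str, max_chars: int = 800, overlap_blocks: int = 1) -> list[str]:
    """Pack paragraph-level blocks into chunks, breaking at headings.

    Assumptions (illustrative only): markdown-style headings, blank-line
    separated blocks, and overlap measured in whole blocks, not characters.
    """
    # Split on blank lines, and also just before any line that starts a heading.
    blocks = [b.strip() for b in re.split(r"\n\s*\n|\n(?=#{1,6} )", text) if b.strip()]

    chunks: list[str] = []
    current: list[str] = []
    for block in blocks:
        starts_section = block.startswith("#")
        too_long = sum(len(b) for b in current) + len(block) > max_chars
        if current and (starts_section or too_long):
            chunks.append("\n\n".join(current))
            # Carry trailing blocks forward as overlap, but never across a heading.
            if starts_section or overlap_blocks <= 0:
                current = []
            else:
                current = current[-overlap_blocks:]
        current.append(block)
    if current:
        chunks.append("\n\n".join(current))
    return chunks


if __name__ == "__main__":
    doc = "# Setup\n\nInstall the package.\n\nConfigure the key.\n\n# Usage\n\nCall the API."
    for chunk in chunk_by_structure(doc):
        print(repr(chunk))
```

Because overlap is carried in whole blocks, a chunk can exceed `max_chars` by the size of the carried blocks. A real splitter would also need to handle list boundaries and oversized single blocks, which this sketch ignores.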

BM25 + dense vectors, fused with reciprocal-rank fusion, beats either approach in isolation on every benchmark we ran.
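
For readers who have not seen reciprocal-rank fusion, the sketch below shows the core of it: each document scores the sum of 1 / (k + rank) over every ranked list it appears in. The function name and document IDs are illustrative, and k = 60 is the commonly used default from the RRF literature rather than a value we tuned.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each input list is ordered best-first. k dampens the influence of
    top ranks; 60 is a conventional default, assumed here for illustration.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    bm25_hits = ["d1", "d2", "d3"]   # hypothetical lexical results
    dense_hits = ["d3", "d1", "d4"]  # hypothetical vector results
    print(reciprocal_rank_fusion([bm25_hits, dense_hits]))
    # → ['d1', 'd3', 'd2', 'd4']
```

RRF uses only ranks, so the BM25 and vector similarity scores never need to be normalized onto a common scale.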

If you can't measure quality, you can't improve it. We start every engagement by writing 30–50 graded eval cases with the customer.
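
As a rough illustration of what an eval harness over those cases can look like, the sketch below computes mean recall@k. It assumes each case carries a query plus a set of document IDs a grader marked relevant (binary labels, which is a simplification). `EvalCase`, `recall_at_k`, and the stub retriever are hypothetical names, not part of any library.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    query: str
    relevant_ids: set[str]  # IDs a human grader marked as relevant (assumed non-empty)


def recall_at_k(cases: list[EvalCase], retrieve: Callable[[str], list[str]], k: int = 5) -> float:
    """Mean fraction of each case's relevant documents found in the top k results."""
    if not cases:
        raise ValueError("need at least one eval case")
    total = 0.0
    for case in cases:
        top_k = set(retrieve(case.query)[:k])
        total += len(top_k & case.relevant_ids) / len(case.relevant_ids)
    return total / len(cases)


if __name__ == "__main__":
    # Stub retriever standing in for the real pipeline.
    canned = {
        "reset password": ["a", "x", "b"],
        "billing cycle": ["c"],
    }
    cases = [
        EvalCase("reset password", {"a", "b"}),
        EvalCase("billing cycle", {"c"}),
    ]
    print(recall_at_k(cases, lambda q: canned[q], k=2))
    # → 0.75
```

Running the same cases before and after each pipeline change gives a single number to compare. Graded (non-binary) relevance would call for a metric such as nDCG instead.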