RAG Without the Vector DB: Say Hi To Multilevel LLM Routing!
Less infrastructure, fewer headaches, better results!
Building effective documentation-based AI chatbots has traditionally required complex RAG (Retrieval-Augmented Generation) systems with vector databases and embeddings.
But what if there’s a simpler way?
Today, I’m exploring an alternative approach that uses multilevel LLM routing instead of traditional RAG methods.
After building SimplerLLM to help developers like me work with LLMs more easily, I recently added an LLM router feature and have been experimenting with it in different use cases.
I’ve been testing it for AI agent decision-making, AI team collaboration, and now as an alternative to RAG.
In this post, I’ll compare multilevel LLM routing with traditional RAG for documentation search, with simple prototypes to demonstrate both concepts.
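Before diving in, here’s a minimal sketch of what two-level routing looks like, written against the raw OpenAI chat API rather than SimplerLLM’s router (the documentation tree, model name, and pick helper are illustrative assumptions, not the library’s actual interface):

```python
# Minimal sketch of multilevel LLM routing: level 1 picks a documentation
# section, level 2 picks a page within it. No embeddings or vector index.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical documentation tree: section -> page titles
DOCS = {
    "Installation": ["Requirements", "Quick Start"],
    "API Reference": ["LLM Class", "Vector Storage"],
}

def pick(options: list[str], question: str) -> str:
    """Ask the LLM to choose the single best option for the question."""
    prompt = (
        f"Question: {question}\n"
        f"Options: {', '.join(options)}\n"
        "Reply with exactly one option from the list, and nothing else."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat model works here
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

question = "How do I install the library?"
section = pick(list(DOCS.keys()), question)   # level 1: route to a section
page = pick(DOCS[section], question)          # level 2: route to a page
print(section, "->", page)  # the chosen page's content would then ground the final answer
```

The key idea: each level narrows the search with a cheap classification call, so the full documentation never needs to be embedded or indexed.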
The Documentation Problem
Documentation is essential but often challenging to navigate.
Many companies now use AI-powered search to help users find answers quickly. There are two main approaches: