This blog explores optimizing Retrieval-Augmented Generation (RAG) systems by testing various chunking and retrieval methods. Among static, overlap, and agentic chunking, agentic chunking paired with similarity search emerged as the best-performing combination. While agentic chunking requires more time and resources, its superior accuracy and relevance make it an excellent choice for high-quality RAG setups. The post provides the testing methodology, results, and recommendations for developers who want to implement these techniques efficiently.
Are you confident that your Retrieval-Augmented Generation (RAG) system is performing at its best? With the constant development of new methods like dynamic search and agentic chunking, keeping up with advancements can feel overwhelming. Many developers miss out on techniques that could revolutionize their systems.
I’ve dedicated hours to researching, testing, and refining these methods, aiming to create an optimal RAG setup for any project. Among the standout techniques is agentic chunking, a method that intelligently identifies chunks of data to provide the most relevant input for large language models. The results? Increased relevance and effectiveness.
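To make that concrete, here is a minimal sketch of agentic chunking, assuming a paragraph-level pass in which an LLM decides whether each new paragraph continues the current topic. The prompt wording and the `call_llm` helper are placeholders, not the exact code from my setup.

```python
# Minimal sketch of agentic chunking: an LLM judges whether each paragraph
# continues the current topic, so every chunk ends up holding one coherent idea.
# `call_llm` is a placeholder for whatever chat-completion client you use.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def agentic_chunk(document: str) -> list[str]:
    paragraphs = [p.strip() for p in document.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current: list[str] = []

    for paragraph in paragraphs:
        if not current:
            current.append(paragraph)
            continue

        prompt = (
            "You are grouping a document into coherent chunks.\n\n"
            f"Current chunk:\n{' '.join(current)}\n\n"
            f"Next paragraph:\n{paragraph}\n\n"
            "Does the next paragraph belong to the same topic? Answer YES or NO."
        )
        if call_llm(prompt).strip().upper().startswith("YES"):
            current.append(paragraph)         # same topic: keep growing the chunk
        else:
            chunks.append(" ".join(current))  # topic shift: close the current chunk
            current = [paragraph]

    if current:
        chunks.append(" ".join(current))
    return chunks
```

In practice you would also cap chunk size so a single chunk cannot grow past your embedding model's context window.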
Best of all, I’m sharing my findings and methods with you for free. In this post, I’ll walk you through the methodology, the codebase, and the data-backed results that reveal the ultimate setup for building a high-performing RAG system.
To test the effectiveness of these chunking and retrieval methods, I used the Amazon 2023 annual report as a dataset. For each query, the system compared the retrieved answers against pre-verified correct answers. The tests looped through all combinations of chunking and retrieval methods, calculating similarity scores to determine the best-performing setup.
Here’s a quick breakdown of the testing process:

1. Chunk the Amazon 2023 annual report with each chunking method: static, overlap, and agentic.
2. Index the chunks and retrieve with each retrieval method: similarity, keyword, and hybrid.
3. For every query, compare the retrieved answer against a pre-verified correct answer.
4. Calculate similarity scores for each chunking-retrieval combination to rank the setups.
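As an illustration, the evaluation loop can be structured roughly as below. The helper names (`chunk_document`, `build_retriever`, `answer_query`, `similarity_score`) stand in for the actual codebase and are assumptions, as is summing the scores per combination.

```python
# Rough sketch of the evaluation loop: every chunking method is paired with
# every retrieval method, and each pairing is scored against verified answers.
# chunk_document, build_retriever, answer_query and similarity_score are
# assumed helpers standing in for the actual project code.
from itertools import product

CHUNKING_METHODS = ["static", "overlap", "agentic"]
RETRIEVAL_METHODS = ["similarity", "keyword", "hybrid"]

def evaluate(report_text: str, qa_pairs: list[tuple[str, str]]) -> dict[str, float]:
    scores: dict[str, float] = {}
    for chunking, retrieval in product(CHUNKING_METHODS, RETRIEVAL_METHODS):
        chunks = chunk_document(report_text, method=chunking)       # split the report
        retriever = build_retriever(chunks, method=retrieval)       # index the chunks
        total = 0.0
        for question, verified_answer in qa_pairs:
            answer = answer_query(retriever, question)              # retrieve and answer
            total += similarity_score(answer, verified_answer)      # compare to ground truth
        scores[f"{chunking}-{retrieval}"] = total
    return scores
```

Sorting the returned scores is what produces a ranking like the one below.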
After running extensive tests, the combination of agentic chunking and similarity search emerged as the top performer:
Agentic-Similarity: 28.9746
Agentic-Hybrid: 28.6162
Overlap-Hybrid: 28.4701
Overlap-Similarity: 28.2474
Agentic-Keyword: 27.5131
Static-Hybrid: 27.3789
Static-Similarity: 27.3786
Static-Keyword: 26.8396
Overlap-Keyword: 26.7235
These results highlight the superiority of agentic chunking, especially when paired with similarity search. Unlike static or overlap chunking, which cut the text at fixed boundaries, agentic chunking groups related information into coherent chunks, which improves retrieval accuracy.
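For readers who want to see how the three retrieval strategies above differ in code, here is a hedged sketch: similarity search ranks chunks by embedding similarity, keyword search by BM25, and hybrid blends the two. The choice of `sentence-transformers` and `rank_bm25`, and the 50/50 weighting, are my assumptions rather than the exact setup behind the scores above.

```python
# Sketch of the three retrieval strategies compared above.
# Library choices and the 0.5/0.5 weighting are illustrative assumptions.
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def similarity_scores(query: str, chunks: list[str]) -> np.ndarray:
    chunk_vecs = model.encode(chunks, normalize_embeddings=True)
    query_vec = model.encode([query], normalize_embeddings=True)[0]
    return chunk_vecs @ query_vec                      # cosine similarity

def keyword_scores(query: str, chunks: list[str]) -> np.ndarray:
    bm25 = BM25Okapi([c.lower().split() for c in chunks])
    return np.array(bm25.get_scores(query.lower().split()))

def hybrid_scores(query: str, chunks: list[str], alpha: float = 0.5) -> np.ndarray:
    sim = similarity_scores(query, chunks)
    kw = keyword_scores(query, chunks)
    kw = kw / (kw.max() + 1e-9)                        # normalize BM25 to [0, 1]
    return alpha * sim + (1 - alpha) * kw

def top_k(query: str, chunks: list[str], k: int = 3) -> list[str]:
    scores = hybrid_scores(query, chunks)
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]
```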
While agentic chunking delivered the best results in these tests, it's worth noting some trade-offs: because an LLM is involved in forming the chunks, the chunking step takes more time and costs more than static or overlap chunking.
However, with advancements in AI models becoming faster and more affordable, these challenges are diminishing rapidly. Even with the current technology, the quality improvements are well worth the investment.
Incorporating advanced methods like agentic chunking into your RAG system can dramatically improve its effectiveness. By following the setup outlined here, you can build a system that not only retrieves the most relevant information but also does so with precision and efficiency.
If you’re interested in more AI tutorials or want to see a live demo of this RAG system, check out my accompanying video. Don’t forget to like, subscribe, and share your thoughts in the comments. Together, let’s take AI to the next level!
Luuk Alleman, founder of Everyman AI, specializes in creating impactful AI solutions using large language models and machine learning to help businesses streamline operations and gain insights.