# RAG Blueprint Documentation

## Overview
The RAG Blueprint project is a Retrieval-Augmented Generation (RAG) system that integrates with several data sources to provide intelligent document search and analysis. It combines a range of large language models with your knowledge bases to deliver accurate, context-aware responses through a chat interface.
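A typical query follows a retrieve-then-generate flow: the question is embedded, similar document chunks are fetched from the vector store, and the language model answers using that retrieved context. The sketch below only illustrates this flow; the component and method names (`embedder.embed`, `vector_store.search`, `llm.generate`) are hypothetical placeholders, not the blueprint's actual API.

```python
# Minimal illustration of the retrieve-then-generate flow described above.
# All objects passed in are hypothetical placeholders, not the blueprint's API.

def answer_query(query: str, embedder, vector_store, llm, top_k: int = 5) -> str:
    # 1. Embed the user query with the configured embedding model.
    query_vector = embedder.embed(query)

    # 2. Retrieve the most similar document chunks from the vector store.
    chunks = vector_store.search(query_vector, limit=top_k)

    # 3. Build a prompt that grounds the language model in the retrieved context.
    context = "\n\n".join(chunk.text for chunk in chunks)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

    # 4. Generate the final, context-aware response.
    return llm.generate(prompt)
```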
## Data Sources

| Data Source | Description |
|---|---|
| Confluence | Enterprise wiki and knowledge base integration |
| Notion | Workspace and document management integration |
| PDF | PDF document processing and text extraction |
Check how to configure data sources here.
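The exact configuration format is covered in the linked guide; the snippet below is only a hypothetical sketch of the kind of settings each data source typically needs (endpoints, credentials, and the content to index). Every key shown here is an assumption, not the blueprint's actual schema.

```python
# Hypothetical data source settings, sketched as a plain Python dict.
# The real keys and format are defined by the blueprint's configuration guide.
import os

datasource_config = {
    "confluence": {
        "base_url": os.environ.get("CONFLUENCE_URL"),         # e.g. https://example.atlassian.net/wiki
        "api_token": os.environ.get("CONFLUENCE_API_TOKEN"),   # never hard-code credentials
        "space_keys": ["ENG", "DOCS"],                         # spaces to index
    },
    "notion": {
        "api_token": os.environ.get("NOTION_API_TOKEN"),
        "database_ids": ["<database-id>"],                     # databases to index
    },
    "pdf": {
        "input_dir": "./data/pdfs",                            # local directory of PDF files to ingest
    },
}
```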
## Embedding Models

| Models | Provider | Description |
|---|---|---|
| * | HuggingFace | Open-source embedding models from HuggingFace, run locally |
| * | OpenAI | Embedding models provided by OpenAI |
| * | VoyageAI | Embedding models provided by VoyageAI |
Check how to configure embedding models here.
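For orientation, the snippet below shows how two of the supported providers are commonly called on their own, independent of the blueprint's configuration mechanism: a local HuggingFace model via `sentence-transformers`, and a hosted OpenAI model. The specific model names are examples, not required choices.

```python
# Illustrative embedding calls for two supported providers.

# HuggingFace: an open-source model running locally via sentence-transformers.
from sentence_transformers import SentenceTransformer

local_model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
local_vectors = local_model.encode(["What is retrieval-augmented generation?"])

# OpenAI: a hosted embedding model (requires OPENAI_API_KEY in the environment).
from openai import OpenAI

client = OpenAI()
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="What is retrieval-augmented generation?",
)
openai_vector = response.data[0].embedding
```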
## Language Models

| Model | Provider | Description |
|---|---|---|
| * | OpenAI | Language models provided by OpenAI |
| * | OpenAILike | Self-hosted language models exposed through an OpenAI-compatible API |
Check how to configure the LLM here.
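As a rough illustration, the same `openai` client can talk to both kinds of provider: the hosted OpenAI API, and a self-hosted model served behind an OpenAI-compatible endpoint (for example vLLM). The model names and local URL below are placeholders; the blueprint's own LLM settings live in the linked guide.

```python
# Illustrative chat-completion calls; model names and the local endpoint URL
# are placeholders, not values required by the blueprint.
from openai import OpenAI

# OpenAI-hosted model (requires OPENAI_API_KEY in the environment).
openai_client = OpenAI()
reply = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize our onboarding docs."}],
)

# Self-hosted model exposed through an OpenAI-compatible API:
# the same client works, pointed at the local server's base URL.
local_client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
local_reply = local_client.chat.completions.create(
    model="my-local-model",  # placeholder name of the locally served model
    messages=[{"role": "user", "content": "Summarize our onboarding docs."}],
)
```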
## Vector Databases

| Vector Store | Description |
|---|---|
| Qdrant | High-performance vector similarity search engine |
| Chroma | Lightweight embedded vector database |
Check how to configure the vector store here.
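To give a feel for the embedded option, the sketch below indexes and queries a couple of chunks with Chroma's in-memory client. The three-dimensional vectors are toys standing in for real embedding-model output, and the collection name and texts are made up for the example.

```python
# Illustrative use of Chroma as an embedded, in-memory vector store.
import chromadb

client = chromadb.Client()  # in-memory instance; no server required
collection = client.create_collection(name="docs")

# Index two document chunks with precomputed (toy) embeddings.
collection.add(
    ids=["doc-1", "doc-2"],
    documents=["Confluence holds the engineering wiki.", "Notion holds project notes."],
    embeddings=[[0.1, 0.2, 0.3], [0.2, 0.1, 0.0]],
)

# Retrieve the chunk most similar to a (toy) query embedding.
results = collection.query(query_embeddings=[[0.1, 0.2, 0.25]], n_results=1)
print(results["documents"])
```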
## Key Features
- Multiple Knowledge Base Integration: Seamless extraction from several data sources (Confluence, Notion, PDF)
- Wide Model Support: A broad choice of embedding and language models
- Vector Search: Efficient similarity search using vector stores
- Interactive Chat: User-friendly Chainlit interface for querying the knowledge base (a minimal sketch follows this list)
- Performance Monitoring: Query and response tracking with Langfuse
- Evaluation: Comprehensive evaluation metrics using RAGAS
- Setup Flexibility: Easy and flexible pipeline setup
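As a rough sketch of the chat layer, a minimal Chainlit app looks like the following. Here `run_rag_pipeline` is a hypothetical stand-in for the blueprint's retrieval-and-generation step, and the app would be started with `chainlit run app.py`.

```python
# Minimal Chainlit chat app sketch; run_rag_pipeline is a hypothetical placeholder.
import chainlit as cl


def run_rag_pipeline(question: str) -> str:
    # Placeholder: the real pipeline would retrieve context and call the LLM.
    return f"(answer to: {question})"


@cl.on_message
async def on_message(message: cl.Message):
    # Pass the user's message through the pipeline and send the answer back.
    answer = run_rag_pipeline(message.content)
    await cl.Message(content=answer).send()
```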