RAG Blueprint Documentation

Overview

The RAG blueprint project is a Retrieval-Augmented Generation system that integrates with several data sources to provide document search and analysis. It pairs large language models with knowledge bases to deliver accurate, context-aware responses through a chat interface.

Data Sources

Data Source    Description
Confluence     Enterprise wiki and knowledge base integration
Notion         Workspace and document management integration
PDF            PDF document processing and text extraction

Check how to configure data sources here.

Embedding Models

Model Provider    Description
HuggingFace       Open-source embedding models from HuggingFace that run locally
OpenAI            Embedding models provided by OpenAI
VoyageAI          Embedding models provided by VoyageAI

Check how to configure an embedding model here.

Language Models

Model Provider    Description
OpenAI            Language models provided by OpenAI
OpenAILike        Self-hosted language models served through an OpenAI-like API

Check how to configure an LLM here.

Vector Databases

Vector Store    Description
Qdrant          High-performance vector similarity search engine
Chroma          Lightweight embedded vector database

Check how to configure a vector store here.

Key Features

  • Multiple Knowledge Base Integration: Extraction from several data sources (Confluence, Notion, PDF)
  • Wide Model Support: A choice of embedding and language models from several providers
  • Vector Search: Efficient similarity search using vector stores
  • Interactive Chat: User-friendly interface for querying knowledge, built with Chainlit
  • Performance Monitoring: Query and response tracking with Langfuse
  • Evaluation: Evaluation metrics computed with RAGAS
  • Setup Flexibility: Easy and flexible setup of the pipeline
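
To make the Vector Search feature listed above more concrete, here is a minimal sketch of similarity search over embedded documents. It is an illustration only, not the project's implementation: the function names (cosine_similarity, top_k), the in-memory dictionary standing in for a vector store, and the three-dimensional vectors are all hypothetical. A real deployment would delegate this step to Qdrant or Chroma and use vectors produced by one of the embedding models.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def top_k(query_vector, store, k=2):
    """Rank stored vectors by similarity to the query and keep the best k."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in store.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy "vector store": document id -> embedding.
# Real embeddings have hundreds or thousands of dimensions.
store = {
    "confluence-page": [0.9, 0.1, 0.0],
    "notion-doc": [0.1, 0.9, 0.0],
    "pdf-report": [0.7, 0.3, 0.1],
}

# Hypothetical query embedding.
results = top_k([1.0, 0.0, 0.0], store, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

In a RAG pipeline, the documents returned by a search like this are passed to the language model as context for answering the user's question. Dedicated vector stores typically replace the linear scan shown here with approximate nearest-neighbor indexes, so that search stays fast as the collection grows.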

Quick Start