---
title: QA-bot
app_file: app.py
sdk: gradio
sdk_version: 4.44.0
---
# PDF Question-Answering App using LangChain, Pinecone, and Mistral
This project is a Retrieval-Augmented Generation (RAG) app designed to perform question answering (QA) on PDF documents. It uses the LangChain framework for embeddings, Pinecone for vector storage, and the Mistral language model for generating responses to user queries.
## Features
- PDF Handling: Load and split PDF files into manageable chunks for processing.
- Embeddings: I use `SentenceTransformerEmbeddings` to create embeddings for the document chunks.
- Vector Storage: Pinecone stores the document embeddings and efficiently retrieves the chunks most relevant to a user's question.
- LLM Integration: I tried running LLMs locally with Ollama, but due to limited compute resources I use the hosted Mistral API instead for faster and better responses (a minimal end-to-end sketch follows this list).
- Environment Variables: Secrets such as API keys are managed securely through a `.env` file.
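These pieces compose into a single retrieve-and-answer flow. The sketch below is a minimal illustration, assuming the `langchain-community`, `langchain-pinecone`, and `langchain-mistralai` integrations; the index name `qa-bot`, the embedding model, and the `k` value are illustrative placeholders, not necessarily the values used in `app.py`.

```python
from langchain_community.embeddings import SentenceTransformerEmbeddings
from langchain_mistralai import ChatMistralAI
from langchain_pinecone import PineconeVectorStore

# Embed chunks and questions with the same SentenceTransformer model.
embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

# Connect to an existing Pinecone index that already holds the PDF chunks
# (reads PINECONE_API_KEY from the environment). Index name is a placeholder.
vectorstore = PineconeVectorStore.from_existing_index(
    index_name="qa-bot", embedding=embeddings
)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# Hosted Mistral model (reads MISTRAL_API_KEY from the environment).
llm = ChatMistralAI(model="mistral-small-latest")

question = "What is the main topic of the document?"
context = "\n\n".join(doc.page_content for doc in retriever.invoke(question))
answer = llm.invoke(
    f"Answer the question using only this context:\n{context}\n\nQuestion: {question}"
)
print(answer.content)
```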
## Requirements
- Python 3.12
- Install dependencies with `pip install -r requirements.txt`
- The following tech stack is used: `langchain`, `pinecone`
- Make sure to sign up for Pinecone and Mistral and create a Pinecone API key and a Mistral API key.
## Setup
1. Clone the repository:

   ```bash
   git clone https://github.com/m-umar-j/RAG-APP
   cd RAG-APP
   ```
2. Install the requirements:

   ```bash
   pip install -r requirements.txt
   ```
3. Create a `.env` file in your root directory and add your Pinecone API key:

   ```
   PINECONE_API_KEY=your-pinecone-api-key
   ```
4. Modify the PDF path to point at your document:

   ```python
   file_path = "/path/to/data.pdf"
   ```
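The sketch below shows how these settings feed into ingestion, assuming `python-dotenv` for the `.env` file and LangChain's `PyPDFLoader` and `RecursiveCharacterTextSplitter`; the chunk sizes and the `MISTRAL_API_KEY` variable name are assumptions, so check `app.py` for the actual values.

```python
import os
from dotenv import load_dotenv
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

load_dotenv()  # reads .env from the project root
pinecone_api_key = os.environ["PINECONE_API_KEY"]
mistral_api_key = os.environ.get("MISTRAL_API_KEY")  # variable name assumed

file_path = "/path/to/data.pdf"

# Load the PDF and split it into overlapping chunks for embedding.
docs = PyPDFLoader(file_path).load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)  # illustrative sizes
chunks = splitter.split_documents(docs)
print(f"{len(chunks)} chunks ready for embedding")
```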