Assignment 3 submission: FastAPI and React code, written with Cursor's help.
Files changed:

- README.md +37 -0
- aimakerspace/vectordatabase.py +4 -0
- app.py +1 -1
- backend/README.md +93 -0
- backend/app/__init__.py +0 -0
- backend/app/dependencies.py +0 -0
- backend/app/main.py +0 -0
- backend/app/models.py +0 -0
- backend/app/services/__init__.py +0 -0
- backend/app/services/chat_service.py +0 -0
- backend/app/services/document_processor.py +0 -0
- backend/app/services/vector_store.py +0 -0
- backend/requirements.txt +0 -0
README.md
CHANGED
@@ -167,6 +167,20 @@ Simply put, this downloads the file as a temp file, we load it in with `TextFileLoader`

#### ❓ QUESTION #1:

Why do we want to support streaming? What about streaming is important, or useful?

+ Streaming responses from LLMs provides several key benefits:
+
+ 1. **Improved User Experience**: Instead of waiting for the entire response to be generated before seeing anything, users see the response being built word by word, making the interaction feel more natural and engaging, similar to how humans communicate.
+
+ 2. **Faster First Token Display**: Users get immediate feedback that the system is working, rather than staring at a blank screen while waiting for a complete response. This reduces perceived latency and user anxiety.
+
+ 3. **Memory Efficiency**: Streaming processes and displays tokens as they arrive, rather than buffering the entire response before showing it. This is especially important for longer responses that could otherwise consume significant memory.
+
+ 4. **Early Error Detection**: If there are issues with response generation, they can be detected and handled early in the process, rather than only after the full response completes.
+
+ 5. **Cancellation Capability**: Users can cancel long responses midway if they realize they don't need the complete answer, saving both computation time and cost.
+
+ In our implementation, we use async streaming to handle these token-by-token responses efficiently while keeping the user interface responsive.

### On Chat Start:
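To make these benefits concrete, here is a minimal sketch of token-by-token streaming with the `openai` async client and Chainlit. The model name and the bare pass-through of the user's message are placeholder assumptions; the actual app streams through its RAG chain rather than calling the API directly.

```python
# Minimal streaming sketch (assumes openai>=1.x and chainlit are installed).
# The model name and direct pass-through of the user message are placeholders.
import chainlit as cl
from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment


@cl.on_message
async def on_message(message: cl.Message):
    msg = cl.Message(content="")  # empty message that we stream tokens into

    stream = await client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": message.content}],
        stream=True,  # the API yields chunks as tokens are generated
    )
    async for chunk in stream:
        token = chunk.choices[0].delta.content
        if token:  # some chunks carry no text (role/finish markers)
            await msg.stream_token(token)  # UI updates immediately per token

    await msg.send()  # finalize the streamed message
```

The key point is that `stream_token` pushes each chunk to the browser as soon as it arrives, which is what produces the fast first-token display described above.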
@@ -210,6 +224,25 @@ Now, we'll save that into our user session!

Why are we using User Session here? What about Python makes us need to use this? Why not just store everything in a global variable?

+ The need for user sessions in Python web applications like our Chainlit app stems from a few key characteristics of Python and web servers:
+
+ 1. **Process-Based Request Handling**: Python web servers typically handle requests across separate processes or threads. Each process has its own memory space, so a global variable set while handling one request won't reliably be there for the next.
+
+ 2. **Memory Isolation**: Unlike a desktop application, where state can live in memory for the program's lifetime, a web application must explicitly manage state across requests. Each time a user sends a message, it's a new request that runs in isolation.
+
+ 3. **Concurrent Users**: Multiple users accessing the application simultaneously each need their own separate state. Global variables would be shared across all users, leading to data mixing and security issues; user sessions provide isolated storage for each user's conversation context.
+
+ 4. **Stateless HTTP Protocol**: Web applications run over HTTP, which is stateless by nature: each request is independent and knows nothing about previous requests. Sessions provide a way to maintain state across them.
+
+ In contrast, some other programming environments handle this differently:
+
+ - **JavaScript (Node.js)**: Can hold state in closure scope over a persistent WebSocket connection, maintaining context without explicit session management.
+ - **Elixir/Erlang**: Uses the actor model, where each user conversation can be a long-lived process (actor) that naturally maintains its own state.
+ - **Go**: Often uses goroutines that hold state for the duration of a user's connection.
+
+ By using Chainlit's user session management, we ensure each user has their own isolated vector database and chain instance, preventing cross-talk between different users' conversations and preserving conversation context. A minimal sketch follows.
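Here is a minimal, self-contained sketch of Chainlit's session store; the message list below is a stand-in for the vector database and chain the real app stores at chat start.

```python
# Per-user state via Chainlit's session store. The message list is a stand-in
# for the vector database and chain the real app stores at chat start.
import chainlit as cl


@cl.on_chat_start
async def on_chat_start():
    # Stored per connection, so concurrent users never share this list.
    cl.user_session.set("history", [])


@cl.on_message
async def on_message(message: cl.Message):
    history = cl.user_session.get("history")  # this user's state only
    history.append(message.content)
    await cl.Message(
        content=f"You've sent {len(history)} message(s) in this session."
    ).send()
```

If the list were a module-level global instead, two users chatting at once would read and write the same object.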
### On Message

First, we load our chain from the user session:

@@ -330,10 +363,14 @@ Try uploading a text file and asking some questions!

Upload a PDF file of the recent DeepSeek-R1 paper and ask the following questions:

1. What is RL and how does it help reasoning?
+    - Answered the question correctly, but to me this would qualify as going out of context, since the paper does not explain RL.
2. What is the difference between DeepSeek-R1 and DeepSeek-R1-Zero?
+    - Replied correctly with `I don't know the answer`.
3. What is this paper about?
+    - Replied correctly with `I don't know the answer`, but it does respond with content for a more specific question like `What is DeepSeek-V3?`

Does this application pass your vibe check? Are there any immediate pitfalls you're noticing?

+ - In my opinion, the vibe check failed for the first question.

## 🚧 CHALLENGE MODE 🚧
aimakerspace/vectordatabase.py
CHANGED
@@ -12,6 +12,10 @@ def cosine_similarity(vector_a: np.array, vector_b: np.array) -> float:
    norm_b = np.linalg.norm(vector_b)
    return dot_product / (norm_a * norm_b)

+def euclidean_distance(vector_a: np.array, vector_b: np.array) -> float:
+    """Computes the Euclidean distance between two vectors."""
+    return np.linalg.norm(vector_a - vector_b)
+
class VectorDatabase:
    def __init__(self, embedding_model: EmbeddingModel = None):
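As a quick illustration of how the new metric behaves differently from cosine similarity (the vectors are made up; the two functions mirror the ones in this file):

```python
import numpy as np

def cosine_similarity(vector_a: np.array, vector_b: np.array) -> float:
    dot_product = np.dot(vector_a, vector_b)
    norm_a = np.linalg.norm(vector_a)
    norm_b = np.linalg.norm(vector_b)
    return dot_product / (norm_a * norm_b)

def euclidean_distance(vector_a: np.array, vector_b: np.array) -> float:
    return np.linalg.norm(vector_a - vector_b)

a = np.array([1.0, 0.0])
b = np.array([3.0, 0.0])  # same direction as a, three times the magnitude

print(cosine_similarity(a, b))   # 1.0 -- direction only, magnitude ignored
print(euclidean_distance(a, b))  # 2.0 -- magnitude difference still counts
```

Note the inverted ordering: with a distance metric, the nearest neighbors are the smallest values, so a search helper that sorts descending for cosine similarity would need to sort ascending (or negate the distance) if switched to `euclidean_distance`.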
app.py
CHANGED
@@ -26,7 +26,7 @@ Question:
user_role_prompt = UserRolePrompt(user_prompt_template)

class RetrievalAugmentedQAPipeline:
-    def __init__(self, llm: ChatOpenAI
+    def __init__(self, llm: ChatOpenAI, vector_db_retriever: VectorDatabase) -> None:
        self.llm = llm
        self.vector_db_retriever = vector_db_retriever
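With the restored signature, wiring up the pipeline looks roughly like this (a sketch written as if inside `app.py`; the `ChatOpenAI` import path is an assumption based on the usual `aimakerspace` package layout):

```python
# Sketch only: constructing the pipeline with the fixed two-argument signature.
from aimakerspace.vectordatabase import VectorDatabase
from aimakerspace.openai_utils.chatmodel import ChatOpenAI  # assumed import path

vector_db = VectorDatabase()  # would be populated with document chunks first
pipeline = RetrievalAugmentedQAPipeline(
    llm=ChatOpenAI(),
    vector_db_retriever=vector_db,
)
```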
backend/README.md
ADDED
@@ -0,0 +1,93 @@
# Document Chat Backend

This is the FastAPI backend for the Document Chat application that enables document uploading, processing, and chat-based Q&A functionality.

## Project Structure

```
backend/
├── app/
│   ├── __init__.py
│   ├── main.py                   # FastAPI application and route definitions
│   ├── models.py                 # Pydantic models for request/response validation
│   ├── services/
│   │   ├── __init__.py
│   │   ├── document_processor.py # Document processing and chunking
│   │   ├── vector_store.py       # Vector storage and similarity search
│   │   └── chat_service.py       # OpenAI chat completion handling
│   └── dependencies.py           # FastAPI dependencies and shared resources
└── requirements.txt              # Python package dependencies
```

## Setup

1. Create a virtual environment:
   ```bash
   python -m venv .venv
   source .venv/bin/activate  # On Windows: .venv\Scripts\activate
   ```

2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

3. Create a `.env` file in the root directory with your OpenAI API key:
   ```
   OPENAI_API_KEY=your_api_key_here
   ```

4. Run the development server:
   ```bash
   uvicorn app.main:app --reload
   ```

The API will be available at `http://localhost:8000`.
## API Endpoints

- `POST /upload`
  - Upload PDF or TXT files for processing
  - Returns a session ID for subsequent chat interactions

- `POST /chat`
  - Send messages to chat with the uploaded document
  - Requires a session ID from a previous upload
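A sketch of exercising the two endpoints from Python. The field names (`file`, `session_id`, `message`) are assumptions about the request and response models, which live in `models.py`:

```python
# Upload-then-chat flow. Field names are assumptions; check models.py
# for the actual request/response schemas.
import requests

BASE_URL = "http://localhost:8000"

with open("paper.pdf", "rb") as f:
    upload = requests.post(f"{BASE_URL}/upload", files={"file": f})
upload.raise_for_status()
session_id = upload.json()["session_id"]  # assumed response field

chat = requests.post(
    f"{BASE_URL}/chat",
    json={"session_id": session_id, "message": "What is this paper about?"},
)
chat.raise_for_status()
print(chat.json())
```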

## Development

- The `services` directory contains the core business logic:
  - `document_processor.py`: Handles file uploads and text chunking
  - `vector_store.py`: Manages document embeddings and similarity search
  - `chat_service.py`: Interfaces with OpenAI for chat completions
- `models.py` defines the data structures using Pydantic
- `main.py` contains the FastAPI application and route handlers
- `dependencies.py` manages shared resources and dependencies
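Since the committed backend modules are empty placeholders so far, here is a hedged skeleton of the shape this section describes. The route paths come from this README; the model fields and the in-memory session store are illustrative assumptions, not the eventual implementation:

```python
# Skeleton for app/main.py and app/models.py, combined into one file for brevity.
# Routes match this README; everything else is an illustrative assumption.
import uuid

from fastapi import FastAPI, UploadFile
from pydantic import BaseModel

app = FastAPI()
sessions: dict[str, bytes] = {}  # stand-in for the real vector store


class ChatRequest(BaseModel):  # would live in models.py
    session_id: str
    message: str


@app.post("/upload")
async def upload(file: UploadFile):
    session_id = str(uuid.uuid4())
    sessions[session_id] = await file.read()  # real code would chunk and embed
    return {"session_id": session_id}


@app.post("/chat")
async def chat(request: ChatRequest):
    # Real code would embed the question, retrieve similar chunks,
    # and call the OpenAI chat completion API.
    document = sessions[request.session_id]
    return {"answer": f"(stub) document is {len(document)} bytes"}
```

The real `document_processor` and `vector_store` services would replace the `sessions` dict with chunked, embedded documents.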

## Requirements

See `requirements.txt` for a full list of Python dependencies. Key packages include:

- FastAPI
- OpenAI
- PyPDF2
- python-multipart
- python-dotenv
- uvicorn

To create the directory structure, you can run these commands:

```bash
mkdir -p backend/app/services
touch backend/app/__init__.py
touch backend/app/main.py
touch backend/app/models.py
touch backend/app/dependencies.py
touch backend/app/services/__init__.py
touch backend/app/services/document_processor.py
touch backend/app/services/vector_store.py
touch backend/app/services/chat_service.py
touch backend/requirements.txt
```

This creates all of the files and directories in the structure shown above.

backend/app/__init__.py
ADDED
File without changes

backend/app/dependencies.py
ADDED
File without changes

backend/app/main.py
ADDED
File without changes

backend/app/models.py
ADDED
File without changes

backend/app/services/__init__.py
ADDED
File without changes

backend/app/services/chat_service.py
ADDED
File without changes

backend/app/services/document_processor.py
ADDED
File without changes

backend/app/services/vector_store.py
ADDED
File without changes

backend/requirements.txt
ADDED
File without changes