vin00d committed
Commit 1d570ce
· 1 Parent(s): e937461

Assignment 3 submission, plus FastAPI and React code written with Cursor's help.
README.md CHANGED
@@ -167,6 +167,20 @@ Simply put, this downloads the file as a temp file, we load it in with `TextFile
 #### ❓ QUESTION #1:
 
 Why do we want to support streaming? What about streaming is important, or useful?
+ Streaming responses from LLMs provide several key benefits:
+
+ 1. **Improved User Experience**: Instead of waiting for the entire response to be generated before seeing anything, users watch the response being built word by word, making the interaction feel more natural and engaging, similar to how humans communicate.
+
+ 2. **Faster First Token Display**: Users get immediate feedback that the system is working, rather than staring at a blank screen while a complete response is assembled. This reduces perceived latency.
+
+ 3. **Memory Efficiency**: Streaming processes and displays tokens as they arrive, rather than buffering the entire response first. This is especially important for long responses that would otherwise consume significant memory.
+
+ 4. **Early Error Detection**: If response generation goes wrong, the problem can be detected and handled partway through, rather than only after the full response completes.
+
+ 5. **Cancellation Capability**: Users can cancel a long response midway once they have what they need, saving both computation time and cost.
+
+ In our implementation, we use async streaming to handle these token-by-token responses while keeping the user interface responsive, as sketched below.
+
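To make the token-by-token flow concrete, here is a minimal sketch of async streaming using the `openai` async client. The model name and prompt are illustrative, and this is not the app's actual Chainlit handler:

```python
# Minimal async-streaming sketch (assumes openai>=1.x and
# OPENAI_API_KEY set in the environment).
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()


async def stream_answer(question: str) -> None:
    # stream=True yields chunks as tokens are generated, so the first
    # token can be displayed long before the full response is complete.
    stream = await client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": question}],
        stream=True,
    )
    async for chunk in stream:
        token = chunk.choices[0].delta.content
        if token:  # some chunks (e.g., the final one) carry no content
            print(token, end="", flush=True)


asyncio.run(stream_answer("Why stream LLM responses?"))
```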
 
 ### On Chat Start:
 
 
@@ -210,6 +224,25 @@ Now, we'll save that into our user session!
 
 Why are we using User Session here? What about Python makes us need to use this? Why not just store everything in a global variable?
 
+ The need for user sessions in Python web applications like our Chainlit app stems from a few key characteristics of Python web servers:
+
+ 1. **Multi-Process Request Handling**: Python web servers often spread requests across multiple worker processes or threads. Each process has its own memory space, so a global variable set while handling one request isn't guaranteed to be visible, or even to still exist, when the next request is handled.
+
+ 2. **Memory Isolation**: Unlike a desktop application, where state can live in memory for the program's whole lifetime, a web application must explicitly manage state across requests. Each message a user sends arrives as a new request that runs in isolation.
+
+ 3. **Concurrent Users**: Multiple users accessing the application simultaneously each need their own separate state. Global variables would be shared across all users, leading to data mixing and security issues; user sessions give each user isolated storage for their conversation context.
+
+ 4. **Stateless HTTP Protocol**: HTTP is stateless by nature: each request is independent and knows nothing about previous requests. Sessions provide a way to maintain state across them.
+
+ In contrast, some other programming environments handle this differently:
+
+ - **JavaScript (Node.js)**: Can maintain state in closure scope over a WebSocket connection, so a persistent connection keeps its context without explicit session management.
+ - **Elixir/Erlang**: Uses the actor model, where each user conversation can be a long-lived process (actor) that naturally maintains its own state.
+ - **Go**: Often uses goroutines that hold state for the duration of a user's connection.
+
+ By using Chainlit's user session management, we ensure each user has their own isolated vector database and chain instance, preventing cross-talk between different users' conversations and maintaining conversation context properly (see the sketch after this section).
+
+
 ### On Message
 
 First, we load our chain from the user session:
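As a hedged sketch of that pattern, storing a per-user chain at chat start and retrieving it on every message: `build_chain` and `arun` are hypothetical stand-ins, not functions from this repo, while the `cl.user_session` calls are Chainlit's actual session API.

```python
# Per-user state with Chainlit's user session: each connected user gets
# an isolated store, so there is no cross-talk through globals.
import chainlit as cl


@cl.on_chat_start
async def on_chat_start() -> None:
    # Built once per user; a module-level global here would be shared
    # by every connected user instead.
    chain = build_chain()  # hypothetical factory for the RAG pipeline
    cl.user_session.set("chain", chain)


@cl.on_message
async def on_message(message: cl.Message) -> None:
    chain = cl.user_session.get("chain")  # this user's chain only
    response = await chain.arun(message.content)  # hypothetical method
    await cl.Message(content=response).send()
```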
 
@@ -330,10 +363,14 @@ Try uploading a text file and asking some questions!
 Upload a PDF file of the recent DeepSeek-R1 paper and ask the following questions:
 
 1. What is RL and how does it help reasoning?
+ - Answered the question correctly, but to me this qualifies as going outside the context, since the paper does not explain RL.
 2. What is the difference between DeepSeek-R1 and DeepSeek-R1-Zero?
+ - Replied correctly with `I don't know the answer`.
 3. What is this paper about?
+ - Replied correctly with `I don't know the answer`, but it does respond with content for a more specific question like `What is DeepSeek-V3`.
 
 Does this application pass your vibe check? Are there any immediate pitfalls you're noticing?
+ - In my opinion, the vibe check failed on the first question.
 
 ## 🚧 CHALLENGE MODE 🚧
 
aimakerspace/vectordatabase.py CHANGED
@@ -12,6 +12,10 @@ def cosine_similarity(vector_a: np.array, vector_b: np.array) -> float:
     norm_b = np.linalg.norm(vector_b)
     return dot_product / (norm_a * norm_b)
 
+def euclidean_distance(vector_a: np.array, vector_b: np.array) -> float:
+    """Computes the Euclidean distance between two vectors."""
+    return np.linalg.norm(vector_a - vector_b)
+
 
 class VectorDatabase:
     def __init__(self, embedding_model: EmbeddingModel = None):
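A quick self-contained check of how the new metric behaves differently from `cosine_similarity` (both functions are duplicated here so the snippet runs standalone): cosine similarity ignores magnitude, while Euclidean distance does not.

```python
import numpy as np


def cosine_similarity(vector_a: np.array, vector_b: np.array) -> float:
    dot_product = np.dot(vector_a, vector_b)
    norm_a = np.linalg.norm(vector_a)
    norm_b = np.linalg.norm(vector_b)
    return dot_product / (norm_a * norm_b)


def euclidean_distance(vector_a: np.array, vector_b: np.array) -> float:
    """Computes the Euclidean distance between two vectors."""
    return np.linalg.norm(vector_a - vector_b)


a = np.array([1.0, 0.0])
b = np.array([2.0, 0.0])
print(cosine_similarity(a, b))   # 1.0 -- same direction, magnitude ignored
print(euclidean_distance(a, b))  # 1.0 -- magnitude difference is measured
```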
app.py CHANGED
@@ -26,7 +26,7 @@ Question:
 user_role_prompt = UserRolePrompt(user_prompt_template)
 
 class RetrievalAugmentedQAPipeline:
-    def __init__(self, llm: ChatOpenAI(), vector_db_retriever: VectorDatabase) -> None:
+    def __init__(self, llm: ChatOpenAI, vector_db_retriever: VectorDatabase) -> None:
         self.llm = llm
         self.vector_db_retriever = vector_db_retriever
 
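This one-character fix matters because an annotation should name a type: `llm: ChatOpenAI()` constructs an instance at class-definition time and uses that object as the annotation, while `llm: ChatOpenAI` refers to the class itself. A tiny illustration with a hypothetical `Widget` class:

```python
# Why the fix matters: annotations are evaluated at definition time.
class Widget:
    def __init__(self) -> None:
        print("constructed!")


# Bug pattern: Widget() runs immediately just to build the annotation
# object, printing "constructed!" before this function is ever called.
def process_bad(w: Widget()) -> None:
    pass


# Fixed pattern: the class itself is the annotation; nothing runs early.
def process_good(w: Widget) -> None:
    pass
```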
backend/README.md ADDED
@@ -0,0 +1,93 @@
+ # Document Chat Backend
+
+ This is the FastAPI backend for the Document Chat application that enables document uploading, processing, and chat-based Q&A functionality.
+
+ ## Project Structure
+ ```
+ backend/
+ ├── app/
+ │   ├── __init__.py
+ │   ├── main.py                    # FastAPI application and route definitions
+ │   ├── models.py                  # Pydantic models for request/response validation
+ │   ├── services/
+ │   │   ├── __init__.py
+ │   │   ├── document_processor.py  # Document processing and chunking
+ │   │   ├── vector_store.py        # Vector storage and similarity search
+ │   │   └── chat_service.py        # OpenAI chat completion handling
+ │   └── dependencies.py            # FastAPI dependencies and shared resources
+ └── requirements.txt               # Python package dependencies
+ ```
+
+ ## Setup
+
+ 1. Create a virtual environment:
+    ```bash
+    python -m venv .venv
+    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
+    ```
+
+ 2. Install dependencies:
+    ```bash
+    pip install -r requirements.txt
+    ```
+
+ 3. Create a `.env` file in the root directory with your OpenAI API key:
+    ```
+    OPENAI_API_KEY=your_api_key_here
+    ```
+
+ 4. Run the development server:
+    ```bash
+    uvicorn app.main:app --reload
+    ```
+
+ The API will be available at `http://localhost:8000`.
+
+ ## API Endpoints
+
+ - `POST /upload`
+   - Upload PDF or TXT files for processing
+   - Returns a session ID for subsequent chat interactions
+
+ - `POST /chat`
+   - Send messages to chat with the uploaded document
+   - Requires a session ID from a previous upload
+
+ ## Development
+
+ - The `services` directory contains the core business logic:
+   - `document_processor.py`: Handles file uploads and text chunking
+   - `vector_store.py`: Manages document embeddings and similarity search
+   - `chat_service.py`: Interfaces with OpenAI for chat completions
+
+ - `models.py` defines the data structures using Pydantic
+ - `main.py` contains the FastAPI application and route handlers
+ - `dependencies.py` manages shared resources and dependencies
+
+ ## Requirements
+
+ See `requirements.txt` for a full list of Python dependencies. Key packages include:
+ - FastAPI
+ - OpenAI
+ - PyPDF2
+ - python-multipart
+ - python-dotenv
+ - uvicorn
+
+ To create the directory structure, you can run these commands:
+
+ ```bash
+ mkdir -p backend/app/services
+ touch backend/app/__init__.py
+ touch backend/app/main.py
+ touch backend/app/models.py
+ touch backend/app/dependencies.py
+ touch backend/app/services/__init__.py
+ touch backend/app/services/document_processor.py
+ touch backend/app/services/vector_store.py
+ touch backend/app/services/chat_service.py
+ touch backend/requirements.txt
+ ```
+
+ This creates all the files and directories in the structure shown above.
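The `POST /upload` and `POST /chat` endpoints described in the new README suggest routes shaped roughly like the sketch below. This is a minimal, assumption-laden illustration (in-memory session store, stubbed chunking, retrieval, and chat), not the repository's actual `main.py`:

```python
# Hedged sketch of the two routes the backend README describes.
import uuid

from fastapi import FastAPI, HTTPException, UploadFile
from pydantic import BaseModel

app = FastAPI()
sessions: dict[str, list[str]] = {}  # session_id -> document chunks


class ChatRequest(BaseModel):
    session_id: str
    message: str


@app.post("/upload")
async def upload(file: UploadFile) -> dict:
    # Real code would parse PDFs, chunk the text, and embed the chunks.
    text = (await file.read()).decode("utf-8", errors="ignore")
    session_id = str(uuid.uuid4())
    sessions[session_id] = [text]
    return {"session_id": session_id}


@app.post("/chat")
async def chat(req: ChatRequest) -> dict:
    if req.session_id not in sessions:
        raise HTTPException(status_code=404, detail="Unknown session")
    # Real code would retrieve relevant chunks and call the LLM here.
    return {"answer": f"(stubbed) you said: {req.message}"}
```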
backend/app/__init__.py ADDED
File without changes
backend/app/dependencies.py ADDED
File without changes
backend/app/main.py ADDED
File without changes
backend/app/models.py ADDED
File without changes
backend/app/services/__init__.py ADDED
File without changes
backend/app/services/chat_service.py ADDED
File without changes
backend/app/services/document_processor.py ADDED
File without changes
backend/app/services/vector_store.py ADDED
File without changes
backend/requirements.txt ADDED
File without changes