Update README.md
README.md
CHANGED
@@ -1,290 +1,10 @@
-
-
-
-
-
-
-
-
-
-
-- [Personal-Website - Rajarshi Roy](https://rajarshi12321.github.io/rajarshi_portfolio/)
-
-
-## Table of Contents
-
-- [Legal Agent Codebase](#legal-agent-codebase)
-- [Table of Contents](#table-of-contents)
-- [(DEMO README)](#demo-readme)
-- [About The Project](#about-the-project)
-- [Functionality](#functionality)
-- [Code Structure](#code-structure)
-- [Dependencies](#dependencies)
-- [Working with the code](#working-with-the-code)
-- [Deploying the project from your side (In AWS)](#deploying-the-project-from-your-side-in-aws)
-- [Contributing](#contributing)
-- [Contact](#contact)
-- [License](#license)
-
-# (DEMO README)
-
-## About The Project
-
-This project aims to implement a document retrieval and question answering system using LangChain, leveraging Google's Gemini-1.5-flash language model and a FAISS vector database. The system is designed to handle legal queries, providing comprehensive and accurate answers by combining information retrieval from a local knowledge base with web search capabilities when necessary.
-
-## Functionality
-
-The system follows a multi-stage workflow:
-
-1. **Query Input:** The user provides a query (e.g., a legal question).
-
-2. **FAISS Retrieval:** The query is embedded using Google Generative AI embeddings, and the FAISS index (a local vector database) is queried to retrieve the most relevant documents.
-
-3. **Grounded Response Generation:** A `DocSummarizerPipeline` summarizes the retrieved documents, focusing on the user's query. This summary attempts to directly answer the question using only the retrieved documents.
-
-4. **Response Evaluation:** An `IntermediateStateResponseEvaluator` assesses the quality and completeness of the generated response. This evaluation uses the Gemini model to determine if the response sufficiently answers the query.
-
-5. **Web Search (Conditional):** If the generated response is deemed insufficient, a `WebSearchAgent` performs a web search using DuckDuckGo to gather additional information. The results are then incorporated into the final response.
-
-6. **Response Output:** The final answer, either from the document summary or the combined document/web search result, is returned to the user.
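
To make step 2 of the workflow above more concrete, the following is a minimal sketch of what vector retrieval amounts to: ranking stored document vectors by similarity to a query vector. It is not the project's code and does not use FAISS or the Google embeddings; the hand-written three-dimensional vectors, the `cosine` and `top_k` helpers, and the choice of cosine similarity as the metric are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    index: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[str]:
    """Brute-force nearest-neighbour search: score every entry, keep the k best."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


# Hypothetical toy "index": (document label, embedding) pairs.
toy_index = [
    ("contract law note", [0.9, 0.1, 0.0]),
    ("tenancy act summary", [0.1, 0.9, 0.0]),
    ("cooking recipe", [0.0, 0.0, 1.0]),
]

# A query vector that points mostly along the first axis.
results = top_k([0.8, 0.2, 0.0], toy_index, k=2)
print(results)
```

A real vector store avoids scoring every entry and may use a different distance metric, but the input and output have the same shape: a query vector goes in, and the closest documents come out.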
-
-
-## Code Structure
-
-The code is organized into several classes and functions:
-
-* **`FaissRetriever`:** Loads and interacts with the FAISS index, retrieving relevant documents based on a query.
-
-* **`DocSummarizerPipeline`:** Summarizes retrieved documents using the Gemini model, generating a concise answer focused on the user's query. It uses a carefully crafted prompt to ensure the response is structured and informative.
-
-* **`WebSearchAgent`:** Performs web searches using DuckDuckGo and integrates the results into the response.
-
-* **`IntermediateStateResponseEvaluator`:** Evaluates the quality of the generated response using the Gemini model, determining if additional information is needed.
-
-* **`State` (TypedDict):** Defines the data structure for passing information between stages of the workflow.
-
-* **Workflow Functions (`faiss_content_retriever`, `grounded_response`, `response_judge`, `web_response`):** These functions represent individual nodes in the LangGraph workflow.
-
-* **`StateGraph`:** Defines the workflow using LangGraph, managing the flow of data between the different stages. Conditional logic is implemented to determine whether a web search is necessary.
-
-* **`run_user_query`:** The main function that takes a user query and processes it through the LangGraph workflow.
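
The node-and-conditional-edge structure described in the list above can be sketched in plain Python. This is only an illustration under stated assumptions: it does not use LangGraph, the node bodies are stubs, and the state keys (`documents`, `answer`, `is_sufficient`, `force_sufficient`) as well as the `next_node` and `run_graph` helpers are hypothetical. Only the four node names come from the README.

```python
from typing import Any, Callable, Dict, Optional

State = Dict[str, Any]
Node = Callable[[State], State]


def faiss_content_retriever(state: State) -> State:
    # Stub: a real node would query the FAISS index here.
    return {"documents": ["(retrieved passage)"]}


def grounded_response(state: State) -> State:
    # Stub: a real node would summarize the documents with the LLM.
    return {"answer": f"Summary of {len(state['documents'])} document(s)"}


def response_judge(state: State) -> State:
    # Stub: a real node would ask the LLM to grade the answer.
    return {"is_sufficient": bool(state.get("force_sufficient", False))}


def web_response(state: State) -> State:
    # Stub: a real node would run a web search and merge the results.
    return {"answer": state["answer"] + " + web results"}


NODES: Dict[str, Node] = {
    "faiss_content_retriever": faiss_content_retriever,
    "grounded_response": grounded_response,
    "response_judge": response_judge,
    "web_response": web_response,
}


def next_node(current: str, state: State) -> Optional[str]:
    """Edges of the graph; the edge after the judge is conditional."""
    if current == "faiss_content_retriever":
        return "grounded_response"
    if current == "grounded_response":
        return "response_judge"
    if current == "response_judge":
        return None if state["is_sufficient"] else "web_response"
    return None  # web_response is a terminal node


def run_graph(state: State) -> State:
    """Run nodes in sequence, merging each node's partial update into the state."""
    current: Optional[str] = "faiss_content_retriever"
    while current is not None:
        state = {**state, **NODES[current](state)}
        current = next_node(current, state)
    return state


print(run_graph({"query": "What is a tenancy agreement?"})["answer"])
```

Each node returns only the keys it changes, and the runner merges them into the shared state, which is the same division of labour the README attributes to the `State` TypedDict and the workflow functions.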
-<div align="center">
-Agent Workflow:
-
-
-</div>
-
-## Dependencies
-
-The code relies on several libraries:
-
-* `langgraph`
-* `langchain-core`
-* `langchain-google-genai`
-* `IPython`
-* `dotenv`
-* `google.generativeai`
-* `langchain.chains.question_answering`
-* `langchain.prompts`
-* `langchain.vectorstores`
-* `langchain_community.tools`
-* `langchain.agents`
-
-
-
-## Working with the code
-
-
-I have commented most of the necessary information in the respective files.
-
-To run this project locally, please follow these steps:
-
-1. Clone the repository:
-
-```shell
-git clone https://github.com/Rajarshi12321/legal-agent.git
-```
-
-
-2. **Create a Virtual Environment** (Optional but recommended)
-It's a good practice to create a virtual environment to manage project dependencies. Run the following command:
-```shell
-conda create -p <Environment_Name> python=<python version> -y
-```
-Example:
-```shell
-conda create -p venv python=3.9 -y
-```
-Note:
-- It is important to use python=3.9 or above for proper use of LangChain, or else you will get unexpected errors.
-
-
-3. **Activate the Virtual Environment** (Optional)
-Activate the virtual environment:
-```shell
-conda activate <Environment_Name>/
-```
-Example:
-```shell
-conda activate venv/
-```
-
-4. **Install Dependencies**
-
-- Run the following command to install project dependencies:
-```
-pip install -r requirements.txt
-```
-
-Ensure you have Python installed on your system (Python 3.9 or higher is recommended).<br />
-Once the dependencies are installed, you're ready to use the project.
-
-5. Create a .env file in the root directory and add your Gemini and LangChain credentials as follows:
-```shell
-GOOGLE_API_KEY = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
-```
-
-
-6. Run the Chainlit app: Execute the following command in your terminal.
-```shell
-chainlit run app.py
-```
-
-
-7. Access the app: Open your web browser and navigate to http://localhost:8000/ to use the Legal Agent app.
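
Before launching the app it can help to confirm that the key from step 5 is actually readable. The sketch below is not part of the project: it parses `.env`-style text with the standard library only (the project itself lists `dotenv` as a dependency), and the `parse_env` and `require_key` helpers are hypothetical names introduced for illustration.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY = "value" lines, ignoring blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_key(values: Dict[str, str], name: str = "GOOGLE_API_KEY") -> str:
    """Return the named value or fail early with a readable error."""
    key = values.get(name, "")
    if not key:
        raise RuntimeError(f"{name} is missing; add it to the .env file")
    return key


# Same shape as the .env line shown in step 5 (placeholder value).
sample = 'GOOGLE_API_KEY = "xxxxxxxx"\n'
print(require_key(parse_env(sample)))
```

To check a real file, the text could be read with `pathlib.Path(".env").read_text()` and passed to `parse_env`.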
-
-## Deploying the project from your side (In AWS)
-
-I have already made a GitHub Actions file in `.github\workflows\main.yaml`.
-To use it, you need the following prerequisites:
-
-1. Make an IAM user from your AWS account
-- 1. Log in to the AWS console.
-
-- 2. Create an IAM user for deployment
-
-#with specific access
-
-1. EC2 access: EC2 is a virtual machine
-
-2. ECR: Elastic Container Registry to save your Docker image in AWS
-
-
-#Policy: (You need to select these policies when building the user)
-
-1. AmazonEC2ContainerRegistryFullAccess
-
-2. AmazonEC2FullAccess
-
-2. Building the full infrastructure using Terraform </br></br>
-First, configure your AWS account with the credentials of the created IAM user by running `aws configure`, so that Terraform knows which account to use </br></br>
-NOTE: If you don't want to use Terraform for building the infrastructure, you can also build it manually from the AWS console:</br>
-For reference, watch this video from the `3:47:20` time frame: [Youtube link](https://www.youtube.com/watch?v=86BKEv0X2xU)</br></br>
-Go to the Terraform directory `infrastructure\terraform` and execute the following commands:
-</br></br>
-Initializing Terraform
-```shell
-terraform init
-```
-Forming a plan according to the described infrastructure
-```shell
-terraform plan
-```
-Applying the planned infrastructure to build the necessary resources
-```shell
-terraform apply -auto-approve
-```
-
-</br></br>
-3. After this, you need to configure your EC2 instance to install Docker:
-</br>Run the following commands:
-```shell
-sudo apt-get update -y
-
-sudo apt-get upgrade
-
-
-curl -fsSL https://get.docker.com -o get-docker.sh
-
-sudo sh get-docker.sh
-
-sudo usermod -aG docker ubuntu
-
-newgrp docker
-```
-4. After this, you need to configure the self-hosted runner for GitHub Actions to actually deploy to the EC2 instance:
-</br></br>
-Check out the [YouTube video](https://www.youtube.com/watch?v=86BKEv0X2xU) for reference from the 3:54:38 time frame
-</br></br>
-The commands for setting up the self-hosted runner will look like this: </br></br>
-(NOTE: Do use the commands from your actions runner, the below commands are just for your reference)
-
-```shell
-mkdir actions-runner && cd actions-runner
-
-curl -o actions-runner-linux-x64-2.316.1.tar.gz -L https://github.com/actions/runner/releases/download/v2.316.1/actions-runner-linux-x64-2.316.1.tar.gz
-
-
-echo "d62de2400eeeacd195db91e2ff011bfb646cd5d85545e81d8f78c436183e09a8  actions-runner-linux-x64-2.316.1.tar.gz" | shasum -a 256 -c
-
-
-tar xzf ./actions-runner-linux-x64-2.316.1.tar.gz
-
-./config.sh --url https://github.com/Rajarshi12321/main_app_deploy --token <YOUR_RUNNER_TOKEN>
-
-./run.sh
-
-```
-
-Name the runner `self-hosted`
-
-5. Follow the [YouTube video](https://www.youtube.com/watch?v=86BKEv0X2xU) from the `3:57:14` time frame to know which secret keys and values to add to your GitHub Actions secrets. Additionally, you have to add `GOOGLE_API_KEY` to the secrets, with the same key name as used in `.env` and your API key as the value.
-
-6. Finally, after doing all this, you can run your GitHub Actions workflow smoothly; it follows the instructions in `.github\workflows\main.yaml`
-</br></br>
-**Description: About the deployment by main.yaml**
-
-1. Build the Docker image of the source code
-
-2. Push your Docker image to ECR
-
-3. Launch your EC2
-
-4. Pull your image from ECR in EC2
-
-5. Launch your Docker image in EC2
-
-
-Now making any changes in any file except the readme.md file and assets folder (which contains images for readme) will trigger the GitHub Actions CI/CD pipeline for development.
-
-NOTE: Do keep an eye on the state of the `self-hosted` runner. If it is `idle` or `offline`, fix the condition by connecting to the EC2 instance and running the `run.sh` file:
-
-```shell
-cd actions-runner
-
-./run.sh
-```
-
-
-
-## Contributing
-I welcome contributions to improve the functionality and performance of the app. If you'd like to contribute, please follow these guidelines:
-
-1. Fork the repository and create a new branch for your feature or bug fix.
-
-2. Make your changes and ensure that the code is well-documented.
-
-3. Test your changes thoroughly to maintain app reliability.
-
-4. Create a pull request, detailing the purpose and changes made in your contribution.
-
-## Contact
-
-Rajarshi Roy - [[email protected]](mailto:[email protected])
-
-
-
-## License
-This project is licensed under the MIT License. Feel free to modify and distribute it as per the terms of the license.
-
-I hope this README provides you with the necessary information to get started on the road to Generative AI with Google Gemini and LangChain.
+---
+title: Legal-Agent
+emoji: 🌖
+colorFrom: gray
+colorTo: blue
+sdk: docker
+sdk_version: 5.18.0
+app_file: app.py
+pinned: false
+---