Rajarshi-Roy-research committed on
Commit f5c6b19 · verified · 1 Parent(s): e9cd6c4

Update README.md

  1. README.md +10 -290
README.md CHANGED
@@ -1,290 +1,10 @@
1
- # Legal Agent Codebase
2
-
3
- - [LinkedIn - Rajarshi Roy](https://www.linkedin.com/in/rajarshi-roy-learner/)
4
-
5
- - [Github - Rajarshi Roy](https://github.com/Rajarshi12321/)
6
-
7
- - [Medium - Rajarshi Roy](https://medium.com/@rajarshiroy.machinelearning)
8
-
9
- - [Kaggle - Rajarshi Roy](https://www.kaggle.com/rajarshiroy0123/)
10
- - [Mail - Rajarshi Roy](mailto:[email protected])
11
- - [Personal-Website - Rajarshi Roy](https://rajarshi12321.github.io/rajarshi_portfolio/)
12
-
13
-
14
- ## Table of Contents
15
-
16
- - [Legal Agent Codebase](#legal-agent-codebase)
17
- - [Table of Contents](#table-of-contents)
18
- - [(DEMO README)](#demo-readme)
19
- - [About The Project](#about-the-project)
20
- - [Functionality](#functionality)
21
- - [Code Structure](#code-structure)
22
- - [Dependencies](#dependencies)
23
- - [Working with the code](#working-with-the-code)
24
- - [Deploying the project from your side (In AWS)](#deploying-the-project-from-your-side-in-aws)
25
- - [Contributing](#contributing)
26
- - [Contact](#contact)
27
- - [License](#license)
28
-
29
- # (DEMO README)
30
-
31
- ## About The Project
32
-
33
- This project implements a document retrieval and question answering system using LangChain, Google's Gemini-1.5-flash language model, and a FAISS vector database. The system is designed to handle legal queries: it answers from a local knowledge base first and falls back to web search when that answer is insufficient.
34
-
35
- ## Functionality
36
-
37
- The system follows a multi-stage workflow:
38
-
39
- 1. **Query Input:** The user provides a query (e.g., a legal question).
40
-
41
- 2. **FAISS Retrieval:** The query is embedded using Google Generative AI embeddings, and the FAISS index (a local vector database) is queried to retrieve the most relevant documents.
42
-
43
- 3. **Grounded Response Generation:** A `DocSummarizerPipeline` summarizes the retrieved documents, focusing on the user's query. This summary attempts to directly answer the question using only the retrieved documents.
44
-
45
- 4. **Response Evaluation:** An `IntermediateStateResponseEvaluator` assesses the quality and completeness of the generated response. This evaluation uses the Gemini model to determine if the response sufficiently answers the query.
46
-
47
- 5. **Web Search (Conditional):** If the generated response is deemed insufficient, a `WebSearchAgent` performs a web search using DuckDuckGo to gather additional information. The results are then incorporated into the final response.
48
-
49
- 6. **Response Output:** The final answer, either from the document summary or the combined document/web search result, is returned to the user.
50
-
51
-
52
- ## Code Structure
53
-
54
- The code is organized into several classes and functions:
55
-
56
- * **`FaissRetriever`:** Loads and interacts with the FAISS index, retrieving relevant documents based on a query.
57
-
58
- * **`DocSummarizerPipeline`:** Summarizes retrieved documents using the Gemini model, generating a concise answer focused on the user's query. It uses a carefully crafted prompt to ensure the response is structured and informative.
59
-
60
- * **`WebSearchAgent`:** Performs web searches using DuckDuckGo and integrates the results into the response.
61
-
62
- * **`IntermediateStateResponseEvaluator`:** Evaluates the quality of the generated response using the Gemini model, determining if additional information is needed.
63
-
64
- * **`State` (TypedDict):** Defines the data structure for passing information between stages of the workflow.
65
-
66
- * **Workflow Functions (`faiss_content_retriever`, `grounded_response`, `response_judge`, `web_response`):** These functions represent individual nodes in the LangGraph workflow.
67
-
68
- * **`StateGraph`:** Defines the workflow using LangGraph, managing the flow of data between the different stages. Conditional logic is implemented to determine whether a web search is necessary.
69
-
70
- * **`run_user_query`:** The main function that takes a user query and processes it through the LangGraph workflow.
71
- <div align="center">
72
- Agent Workflow:
73
-
74
- ![alt text](assets/agent_workflow.png)
75
- </div>
76
-
77
- ## Dependencies
78
-
79
- The code relies on several libraries:
80
-
81
- * `langgraph`
82
- * `langchain-core`
83
- * `langchain-google-genai`
84
- * `IPython`
85
- * `dotenv`
86
- * `google.generativeai`
87
- * `langchain.chains.question_answering`
88
- * `langchain.prompts`
89
- * `langchain.vectorstores`
90
- * `langchain_community.tools`
91
- * `langchain.agents`
92
-
93
-
94
-
95
- ## Working with the code
96
-
97
-
98
- I have commented most of the necessary information in the respective files.
99
-
100
- To run this project locally, please follow these steps:
101
-
102
- 1. Clone the repository:
103
-
104
- ```shell
105
- git clone https://github.com/Rajarshi12321/legal-agent.git
106
- ```
107
-
108
-
109
- 2. **Create a Virtual Environment** (Optional but recommended)
110
- It's a good practice to create a virtual environment to manage project dependencies. Run the following command:
111
- ```shell
112
- conda create -p <Environment_Name> python==<python version> -y
113
- ```
114
- Example:
115
- ```shell
116
- conda create -p venv python=3.9 -y
117
- ```
118
- Note:
119
- - Use Python 3.9 or above for LangChain to work properly; older versions can cause unexpected errors.
120
-
121
-
122
- 3. **Activate the Virtual Environment** (Optional)
123
- Activate the virtual environment based on your operating system:
124
- ```shell
125
- conda activate <Environment_Name>/
126
- ```
127
- Example:
128
- ```shell
129
- conda activate venv/
130
- ```
131
-
132
- 4. **Install Dependencies**
133
-
134
- - Run the following command to install project dependencies:
135
- ```
136
- pip install -r requirements.txt
137
- ```
138
-
139
- Ensure you have Python installed on your system (Python 3.9 or higher is recommended).<br />
140
- Once the dependencies are installed, you're ready to use the project.
141
-
142
- 5. Create a `.env` file in the root directory and add your Gemini API key as follows:
143
- ```shell
144
- GOOGLE_API_KEY = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
145
- ```
146
-
147
-
148
- 6. Run the Chainlit app: Execute the following command in your terminal.
149
- ```shell
150
- chainlit run app.py
151
- ```
152
-
153
-
154
- 7. Access the app: Open your web browser and navigate to http://localhost:8000/ to use the Legal Agent app.
155
-
156
- ## Deploying the project from your side (In AWS)
157
-
158
- I have already made a github actions file in `.github\workflows\main.yaml`
159
- To use it, you need the following prerequisites:
160
-
161
- 1. Make an IAM role from your AWS account
162
- - 1. Login to AWS console.
163
-
164
- - 2. Create IAM user for deployment
165
-
166
- #with specific access
167
-
168
- 1. EC2 access: EC2 provides the virtual machine
169
-
170
- 2. ECR: Elastic Container registry to save your docker image in aws
171
-
172
-
173
- #Policy: (You need to select these policies when building the user)
174
-
175
- 1. AmazonEC2ContainerRegistryFullAccess
176
-
177
- 2. AmazonEC2FullAccess
178
-
179
- 2. Building the full infrastructure using Terraform </br></br>
180
- First, configure your AWS account with the IAM credentials created above by running `aws configure`, so that Terraform knows which account to use. </br></br>
181
- NOTE: If you don't want to use terraform for building infrastructure you can also build this manually from aws console:</br>
182
- For reference watch this video from `3:47:20` time frame : [Youtube link](https://www.youtube.com/watch?v=86BKEv0X2xU)</br></br>
183
- Get to the terraform directory: `infrastructure\terraform` and execute the following commands:
184
- </br></br>
185
- Initializing Terraform
186
- ```shell
187
- terraform init
188
- ```
189
- Forming a plan according to the described infrastructure
190
- ```shell
191
- terraform plan
192
- ```
193
- Applying the planned infrastructure to build necessary resources
194
- ```shell
195
- terraform apply -auto-approve
196
- ```
197
-
198
- </br></br>
199
- 3. After this, configure your EC2 instance and install Docker:
200
- </br>Run the following commands:
201
- ```shell
202
- sudo apt-get update -y
203
-
204
- sudo apt-get upgrade
205
-
206
-
207
- curl -fsSL https://get.docker.com -o get-docker.sh
208
-
209
- sudo sh get-docker.sh
210
-
211
- sudo usermod -aG docker ubuntu
212
-
213
- newgrp docker
214
- ```
215
- 4. After this, configure the self-hosted runner for GitHub Actions so that it can deploy to the EC2 instance:
216
- </br></br>
217
- Check out the [YouTube video](https://www.youtube.com/watch?v=86BKEv0X2xU) for reference from the `3:54:38` time frame
218
- </br></br>
219
- The commands for setting up the self-hosted runner will look like this: </br></br>
220
- (NOTE: Do use the commands from your actions runner, the below commands are just for your reference)
221
-
222
- ```shell
223
- mkdir actions-runner && cd actions-runner
224
-
225
- curl -o actions-runner-linux-x64-2.316.1.tar.gz -L https://github.com/actions/runner/releases/download/v2.316.1/actions-runner-linux-x64-2.316.1.tar.gz
226
-
227
-
228
- echo "d62de2400eeeacd195db91e2ff011bfb646cd5d85545e81d8f78c436183e09a8 actions-runner-linux-x64-2.316.1.tar.gz" | shasum -a 256 -c
229
-
230
-
231
- tar xzf ./actions-runner-linux-x64-2.316.1.tar.gz
232
-
233
- ./config.sh --url https://github.com/Rajarshi12321/main_app_deploy --token <YOUR_RUNNER_TOKEN>
234
-
235
- ./run.sh
236
-
237
- ```
238
-
239
- name the runner as : `self-hosted`
240
-
241
- 5. Follow this [YouTube video](https://www.youtube.com/watch?v=86BKEv0X2xU) from the `3:57:14` time frame to learn which secret keys and values to add to your GitHub Actions secrets. Additionally, add `GOOGLE_API_KEY` to the secrets, using the same key name as in `.env` and your API key as the value.
242
-
243
- 6. Finally, after doing all this, you can run your GitHub Actions workflow, which follows the instructions in `.github\workflows\main.yaml`
244
- </br></br>
245
- **Description: About the deployment by main.yaml**
246
-
247
- 1. Build docker image of the source code
248
-
249
- 2. Push your docker image to ECR
250
-
251
- 3. Launch Your EC2
252
-
253
- 4. Pull Your image from ECR in EC2
254
-
255
- 5. Launch your docker image in EC2
256
-
257
-
258
- Now making any changes in any file except the readme.md file and assets folder (which contains images for readme) will trigger the github action CI/CD pipeline for development.
259
-
260
- NOTE: Keep an eye on the state of the `self-hosted` runner. If it is `offline`, connect to the EC2 instance and start it again by running the `run.sh` file:
261
-
262
- ```shell
263
- cd actions-runner
264
-
265
- ./run.sh
266
- ```
267
-
268
-
269
-
270
- ## Contributing
271
- I welcome contributions to improve the functionality and performance of the app. If you'd like to contribute, please follow these guidelines:
272
-
273
- 1. Fork the repository and create a new branch for your feature or bug fix.
274
-
275
- 2. Make your changes and ensure that the code is well-documented.
276
-
277
- 3. Test your changes thoroughly to maintain app reliability.
278
-
279
- 4. Create a pull request, detailing the purpose and changes made in your contribution.
280
-
281
- ## Contact
282
-
283
- Rajarshi Roy - [[email protected]](mailto:[email protected])
284
-
285
-
286
-
287
- ## License
288
- This project is licensed under the MIT License. Feel free to modify and distribute it as per the terms of the license.
289
-
290
- I hope this README provides you with the necessary information to get started with the road to Generative AI with Google Gemini and Langchain.
 
1
+ ---
2
+ title: Legal-Agent
3
+ emoji: 🌖
4
+ colorFrom: gray
5
+ colorTo: blue
6
+ sdk: docker
7
+ sdk_version: 5.18.0
8
+ app_file: app.py
9
+ pinned: false
10
+ ---
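
For readers of this commit, the multi-stage workflow described in the removed README (retrieve, summarize, judge, then conditionally search the web) can be illustrated with a minimal plain-Python sketch. This is not the repository's code: it does not use LangGraph, FAISS, or Gemini. The node names `faiss_content_retriever`, `grounded_response`, `response_judge`, `web_response`, and `run_user_query` come from the README, while the `State` fields, the in-memory `KNOWLEDGE_BASE`, the keyword matching, and the injected `search` callable are all assumptions made for illustration.

```python
from typing import Callable, List, Set, TypedDict


class State(TypedDict, total=False):
    """Data passed between workflow stages (field names are assumptions)."""
    query: str
    documents: List[str]
    response: str
    is_sufficient: bool


# Hypothetical in-memory knowledge base standing in for the FAISS index.
KNOWLEDGE_BASE = [
    "A contract requires offer, acceptance, and consideration.",
    "A tenant must receive written notice before eviction.",
]


def _tokens(text: str) -> Set[str]:
    # Lowercased words longer than three characters, punctuation stripped.
    words = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in words if len(w) > 3}


def faiss_content_retriever(state: State) -> State:
    # Stand-in for vector search: keep documents sharing a keyword with the query.
    query_tokens = _tokens(state["query"])
    docs = [d for d in KNOWLEDGE_BASE if query_tokens & _tokens(d)]
    return {**state, "documents": docs}


def grounded_response(state: State) -> State:
    # Stand-in for the summarizer: answer only from the retrieved documents.
    return {**state, "response": " ".join(state["documents"])}


def response_judge(state: State) -> State:
    # Stand-in for the LLM evaluator: an empty answer counts as insufficient.
    return {**state, "is_sufficient": bool(state["response"].strip())}


def web_response(state: State, search: Callable[[str], str]) -> State:
    # Append web results to whatever the grounded step produced.
    combined = f'{state["response"]} {search(state["query"])}'.strip()
    return {**state, "response": combined}


def run_user_query(query: str, search: Callable[[str], str]) -> State:
    state: State = {"query": query}
    state = faiss_content_retriever(state)
    state = grounded_response(state)
    state = response_judge(state)
    if not state["is_sufficient"]:  # conditional edge: fall back to web search
        state = web_response(state, search)
    return state
```

Passing the search function in as a parameter keeps the sketch runnable offline; in the workflow the README describes, that role is played by a DuckDuckGo-backed `WebSearchAgent`, and the `if` branch corresponds to the conditional edge in the `StateGraph`.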