Spaces: LunaticMaestro/rag

Deepak Sahu committed on Jan 22

Commit 060a333

1 Parent(s): 32b22f3

adding augmented generation

Files changed (2)

app.py CHANGED

@@ -16,7 +16,9 @@ Manually Downloaded as HTML files:
 ## Details
-1. Vector Store is built using FAISS prior to starting this app. Although the vector store size in KBs but the creation and loading of the store takes processing takes ~10GB RAM and lasts 5 mins. Hence **NOT BUILDING IT DURING RUNTIME OF APP**.
 '''

 ## Details
+1. Vector Store is built using FAISS prior to starting this app. Although the vector store is only KBs in size,
+    - creating and loading the store takes ~10GB RAM and about 5 mins. Hence **NOT BUILDING IT DURING RUNTIME OF APP**.
+    - `multi_process=True` was not working on the HF Space `free-tier`.
 '''

z_generate.py CHANGED

@@ -26,5 +26,48 @@ class ServerlessInference:
     def perform_rag(self, query:str):
         # First perform text search
-        relevant_docs = self.vs_text.similarity_search(query=query, k=5)
-        return relevant_docs[0].page_content

     def perform_rag(self, query:str):
         # First perform text search
+        # Retrieval
+        retrieved_docs = self.vs_text.similarity_search(query=query, k=5)
+        retrieved_docs_text = [doc.page_content for doc in retrieved_docs]  # We only need the text of the documents
+        context = "\nExtracted documents:\n"
+        # Separate documents with a newline so one document's text does not run into the next header
+        context += "\n".join(f"Document {i}:::\n{doc}" for i, doc in enumerate(retrieved_docs_text))
+        # Augmented Generation
+        messages: list[dict[str, str]] = [
+            {
+                "role": "system",
+                "content": """Using the information contained in the context,
+give a comprehensive answer to the question.
+Respond only to the question asked, response should be concise and relevant to the question.
+Provide the number of the source document when relevant.
+If the answer cannot be deduced from the context, do not give an answer.""",
+            },
+            {
+                "role": "user",
+                "content": """Context:
+{context}
+---
+Now here is the question you need to answer.
+Question: {question}""".format(context=context, question=query),
+            },
+        ]
+        completion = self.client.chat.completions.create(
+            model=self.model,
+            messages=messages,
+            max_tokens=500
+        )
+        response_text = completion.choices[0].message.content
+        return response_text
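
The prompt assembly in `perform_rag` can be tried without the vector store or the inference client. Below is a minimal sketch of that step only, under some assumptions: the helper names `build_context` and `build_messages` and the constant `SYSTEM_PROMPT` are hypothetical and not part of this commit, the retrieved documents are passed in as plain strings, and documents are separated by a newline so they do not run together.

```python
from typing import Dict, List

# System prompt text copied from the commit; the constant name is hypothetical.
SYSTEM_PROMPT = (
    "Using the information contained in the context,\n"
    "give a comprehensive answer to the question.\n"
    "Respond only to the question asked, response should be concise "
    "and relevant to the question.\n"
    "Provide the number of the source document when relevant.\n"
    "If the answer cannot be deduced from the context, do not give an answer."
)


def build_context(docs: List[str]) -> str:
    """Number each retrieved document and join them into one context string."""
    parts = [f"Document {i}:::\n{doc}" for i, doc in enumerate(docs)]
    return "\nExtracted documents:\n" + "\n".join(parts)


def build_messages(query: str, docs: List[str]) -> List[Dict[str, str]]:
    """Build the chat messages (system + user) for a chat-completion call."""
    user_content = (
        "Context:\n"
        f"{build_context(docs)}\n"
        "---\n"
        "Now here is the question you need to answer.\n"
        f"Question: {query}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]


if __name__ == "__main__":
    # Stand-in strings for the page_content of retrieved documents.
    for message in build_messages("What is FAISS?", ["first chunk", "second chunk"]):
        print(message["role"], "->", message["content"])
```

Keeping the message construction in a plain function like this makes it possible to inspect the exact prompt before it is passed as `messages` to `self.client.chat.completions.create`.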