manu committed on
Commit e93ac53 · verified
1 Parent(s): e0694d7

Update app.py

Files changed (1)
  1. app.py +17 -5
app.py CHANGED
@@ -133,9 +133,21 @@ def index_from_url(url: str) -> Tuple[str, str]:
 
 def search(query: str, k: int = 5) -> List[int]:
     """
-    Search within an indexed PDF and return ONLY the indices of the most relevant pages (0-based).
     Returns:
-        List[int]: Sorted unique 0-based indices of pages to inspect (includes neighbor expansion).
     """
     global ds, images
 
@@ -180,15 +192,15 @@ You are a PDF research agent with a single tool: mcp_test_search(query: string,
 Act iteratively:
 1) Split the user question into 1–4 focused sub-queries. Subqueries should be asked as natural language questions in the english language, not just keywords.
 2) For each sub-query, call mcp_test_search (k=5 by default; increase to up to 10 if you need to go deep).
-3) You will receive the output of mcp_test_search as a list of indices corresponding to page numbers. Print them out and stop generating. You will be fed the corresponding pages as images in a follow-up message.
-3) Stop early when confident; otherwise refine and repeat, running new searches. Up to 5 iterations and 20 searches in total. If info is missing, try to continue searching using new keywords and queries.
 
 Workflow:
 • Use ONLY the provided images for grounding and cite as (p.<page>).
 • If an answer is not present, say “Not found in the provided pages.”
 
 Deliverable:
-• Return a clear, standalone Markdown answer in the user's language. Include concise tables for lists of dates/items.
 """
 ).strip()
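The removed docstring says the returned indices are sorted, unique, and include neighbor expansion, but the diff does not show how app.py implements that. A minimal sketch of one way to do it, assuming a hypothetical helper `expand_with_neighbors` and a window of one page on each side (neither name nor window size comes from the commit):

```python
from typing import List


def expand_with_neighbors(hits: List[int], num_pages: int, window: int = 1) -> List[int]:
    """Hypothetical helper: add the pages within `window` of each hit.

    Indices are 0-based and clipped to the range [0, num_pages - 1].
    """
    expanded = set()
    for idx in hits:
        lo = max(0, idx - window)
        hi = min(num_pages - 1, idx + window)
        # range() is empty when idx lies outside the document, so bad hits are dropped
        expanded.update(range(lo, hi + 1))
    # A set removes duplicates; sorting gives a stable reading order
    return sorted(expanded)
```

With this sketch, a caller asking for k pages may get back more than k indices, which is one possible reason the new docstring no longer promises neighbor expansion.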
 
 
 
 def search(query: str, k: int = 5) -> List[int]:
     """
+    Search within a PDF document for the most relevant pages to answer a query and return the page indexes as a list.
+    MCP tool description:
+    - name: mcp_test_search
+    - description: Search within a PDF document for the most relevant pages to answer a query.
+    - input_schema:
+        type: object
+        properties:
+          query: {type: string, description: "User query in natural language."}
+          k: {type: integer, minimum: 1, maximum: 10, default: 5, description: "Number of top pages to retrieve."}
+        required: ["query"]
+    Args:
+        query (str): Natural-language question to search for.
+        k (int): Number of top results to return (1–10).
     Returns:
+        indices (List[int]): Indices of the k most relevant pages
     """
     global ds, images
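The new docstring declares `k` as an integer between 1 and 10 with a default of 5, but nothing in the visible diff enforces that range inside `search`. A minimal sketch of how the bounds could be applied, assuming a hypothetical helper `normalize_k` that clamps out-of-range values instead of rejecting them (that choice is an assumption, not something the commit specifies):

```python
def normalize_k(k: int = 5, minimum: int = 1, maximum: int = 10) -> int:
    """Hypothetical helper: clamp k to the range declared in the input_schema."""
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(k, bool) or not isinstance(k, int):
        raise TypeError("k must be an integer")
    return max(minimum, min(maximum, k))
```

Clamping keeps a tool call alive when the model asks for too many pages; raising an error instead would surface the mistake to the caller.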
 
 
 Act iteratively:
 1) Split the user question into 1–4 focused sub-queries. Subqueries should be asked as natural language questions in the english language, not just keywords.
 2) For each sub-query, call mcp_test_search (k=5 by default; increase to up to 10 if you need to go deep).
+3) You will receive the output of mcp_test_search as a list of indices corresponding to page numbers. Stop generating once all the tool calls end. You will later be fed the corresponding pages as images in a follow-up message.
+4) Stop early when confident; otherwise refine and repeat, running new search calls when need be. Use up to 5 iterations and 20 searches in total. If info is missing, try to continue searching using new keywords and queries.
 
 Workflow:
 • Use ONLY the provided images for grounding and cite as (p.<page>).
 • If an answer is not present, say “Not found in the provided pages.”
 
 Deliverable:
+• Return a clear, standalone Markdown answer in the user's language. Include concise tables for lists of dates/items when useful, and cite the page numbers used for each fact.
 """
 ).strip()
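The prompt asks the model to stay within 5 iterations and 20 searches, which relies on the model counting for itself. A calling loop could track the same limits on its side. A minimal sketch, assuming a hypothetical `SearchBudget` class that is not part of app.py:

```python
from dataclasses import dataclass


@dataclass
class SearchBudget:
    """Hypothetical tracker for the limits stated in the prompt."""

    max_iterations: int = 5
    max_searches: int = 20
    iterations: int = 0
    searches: int = 0

    def start_iteration(self) -> bool:
        """Return True and count the iteration if one is still allowed."""
        if self.iterations >= self.max_iterations:
            return False
        self.iterations += 1
        return True

    def record_search(self) -> bool:
        """Return True and count the search if one is still allowed."""
        if self.searches >= self.max_searches:
            return False
        self.searches += 1
        return True
```

A loop would call `start_iteration()` before each round of tool calls and `record_search()` before each `mcp_test_search` call, stopping as soon as either returns False.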