documentation
README.md
CHANGED
@@ -16,11 +16,14 @@ license: apache-2.0

 ## Introduction

-Question/Answering on scientific documents using LLMs
-
-Differently to most of the

-

 **Demos**:
 - (on HuggingFace spaces): https://lfoppiano-document-qa.hf.space/

 ## Introduction

+Question/Answering on scientific documents using LLMs: ChatGPT-3.5-turbo, Mistral-7b-instruct and Zephyr-7b-beta.
+The Streamlit application demonstrates the implementation of RAG (Retrieval Augmented Generation) on scientific documents, which we are developing at NIMS (National Institute for Materials Science) in Tsukuba, Japan.
+Unlike most projects, we focus on scientific articles.
+We extract only the full text using [Grobid](https://github.com/kermitt2/grobid), which provides cleaner results than a raw PDF2Text converter (which is comparable to what most other solutions use).

+Additionally, this frontend visualises named entities in LLM responses, extracting <span style="color:yellow">physical quantities, measurements</span> (with [grobid-quantities](https://github.com/kermitt2/grobid-quantities)) and <span style="color:blue">materials</span> mentions (with [grobid-superconductors](https://github.com/lfoppiano/grobid-superconductors)).
+
+The conversation is backed by a sliding window memory (the 4 most recent messages) that helps refer to information previously discussed in the chat.

 **Demos**:
 - (on HuggingFace spaces): https://lfoppiano-document-qa.hf.space/
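
The sliding window memory mentioned in the added README text can be illustrated with a minimal sketch. This is not the application's actual code: the class name `SlidingWindowMemory`, its methods, and the message format are hypothetical, and the real application may rely on a library-provided memory instead. The only detail taken from the README is the window of 4 most recent messages.

```python
from collections import deque
from typing import Deque, Dict, List


class SlidingWindowMemory:
    """Hypothetical helper that keeps only the most recent chat messages."""

    def __init__(self, window_size: int = 4) -> None:
        if window_size < 1:
            raise ValueError("window_size must be at least 1")
        # A deque with maxlen drops the oldest item automatically once full.
        self._messages: Deque[Dict[str, str]] = deque(maxlen=window_size)

    def add(self, role: str, content: str) -> None:
        """Record one message; older messages fall out of the window."""
        self._messages.append({"role": role, "content": content})

    def history(self) -> List[Dict[str, str]]:
        """Return the retained messages, oldest first."""
        return list(self._messages)

    def as_prompt_context(self) -> str:
        """Render the retained messages as text to prepend to a new prompt."""
        return "\n".join(f"{m['role']}: {m['content']}" for m in self._messages)


# Usage: after six messages, only the last four remain in memory.
memory = SlidingWindowMemory(window_size=4)
for i in range(1, 7):
    role = "user" if i % 2 else "assistant"
    memory.add(role, f"message {i}")

print(memory.as_prompt_context())
```

Bounding the history this way keeps the prompt size predictable, at the cost of forgetting anything discussed more than four messages ago.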