zakerytclarke committed
Commit 46490f9 · verified · 1 Parent(s): 0407638

Update README.md

Files changed (1): README.md (+210, -1)
README.md CHANGED

pipeline_tag: text2text-generation
---
# Teapot LLM

[Website](https://teapotai.com/) | [Demo](https://huggingface.co/spaces/teapotai/teapotchat) | [Discord](https://discord.gg/hPxGSn5dST)

TeapotAI is a small open-source language model (~300 million parameters) fine-tuned on synthetic data and optimized to run locally on resource-constrained devices such as smartphones and CPUs. Teapot can perform a variety of tasks, including hallucination-resistant Question Answering (QnA), Retrieval-Augmented Generation (RAG), and JSON extraction. Teapot is a model built by and for the community.

## Getting Started
We recommend using our library [teapotai](https://pypi.org/project/teapotai/) to quickly integrate our models into production environments, as it handles the overhead of model configuration, document embeddings, error handling, and prompt formatting. However, you can also use the model directly via the `transformers` library on Hugging Face.

### Installation

```bash
pip install teapotai
```

---

### 1. General Question Answering (QA)

Teapot can be used for general question answering based on a provided context. The model is optimized to respond conversationally and is trained to avoid answering questions that can't be answered from the given context, reducing hallucinations.

#### Example:

```python
from teapotai import TeapotAI

# Sample context
context = """
The Eiffel Tower is a wrought iron lattice tower in Paris, France. It was designed by Gustave Eiffel and completed in 1889.
It stands at a height of 330 meters and is one of the most recognizable structures in the world.
"""

teapot_ai = TeapotAI()

answer = teapot_ai.query(query="What is the height of the Eiffel Tower?", context=context)
print(answer) # => "The Eiffel Tower stands at a height of 330 meters."
```

#### Hallucination Example:

```python
from teapotai import TeapotAI

# Sample context without height information
context = """
The Eiffel Tower is a wrought iron lattice tower in Paris, France. It was designed by Gustave Eiffel and completed in 1889.
"""

teapot_ai = TeapotAI()

answer = teapot_ai.query(query="What is the height of the Eiffel Tower?", context=context)
print(answer) # => "I don't have information on the height of the Eiffel Tower."
```

---

### 2. Chat with Retrieval Augmented Generation (RAG)

Teapot can also use Retrieval-Augmented Generation (RAG) to determine which documents are relevant before answering a question. This is useful when you have many documents you want to use as context, ensuring the model answers based on the most relevant ones.

#### Example:

```python
from teapotai import TeapotAI

# Sample documents (in practice, these could be articles or longer documents)
documents = [
    "The Eiffel Tower is located in Paris, France. It was built in 1889 and stands 330 meters tall.",
    "The Great Wall of China is a historic fortification that stretches over 13,000 miles.",
    "The Amazon Rainforest is the largest tropical rainforest in the world, covering over 5.5 million square kilometers.",
    "The Grand Canyon is a natural landmark located in Arizona, USA, carved by the Colorado River.",
    "Mount Everest is the tallest mountain on Earth, located in the Himalayas along the border between Nepal and China.",
    "The Colosseum in Rome, Italy, is an ancient amphitheater known for its gladiator battles.",
    "The Sahara Desert is the largest hot desert in the world, located in North Africa.",
    "The Nile River is the longest river in the world, flowing through northeastern Africa.",
    "The Empire State Building is an iconic skyscraper in New York City that was completed in 1931 and stands at 1454 feet tall."
]

# Initialize TeapotAI with documents for RAG
teapot_ai = TeapotAI(documents=documents)

# Get the answer using RAG
answer = teapot_ai.chat([
    {
        "role": "system",
        "content": "You are an agent designed to answer facts about famous landmarks."
    },
    {
        "role": "user",
        "content": "What landmark was constructed in the 1800s?"
    }
])
print(answer) # => The Eiffel Tower was constructed in the 1800s.
```

#### Loading RAG Model:
You can save a model with pre-computed embeddings to reduce loading times. TeapotAI is pickle-compatible and can be saved and loaded as shown below.
```python
import pickle

# Pickle the TeapotAI model to a file with pre-computed embeddings
with open("teapot_ai.pkl", "wb") as f:
    pickle.dump(teapot_ai, f)

# Load the pickled model
with open("teapot_ai.pkl", "rb") as f:
    loaded_teapot_ai = pickle.load(f)

# You can now use the loaded instance as you would normally
print(len(loaded_teapot_ai.documents)) # => 9 documents with precomputed embeddings

loaded_teapot_ai.query("What city is the Eiffel Tower in?") # => "The Eiffel Tower is located in Paris, France."
```

---

### 3. Information Extraction

Teapot can be used to extract structured information from context using pre-defined JSON structures. The extract method takes a Pydantic model to ensure Teapot extracts the correct types. Teapot can infer fields based on names and will also leverage descriptions if available. This method can also be used with the RAG and query functionalities natively.

#### Example:

```python
from teapotai import TeapotAI
from pydantic import BaseModel, Field

# Sample text containing apartment details
apartment_description = """
This spacious 2-bedroom apartment is available for rent in downtown New York. The monthly rent is $2500.
It includes 1 bathroom and a fully equipped kitchen with modern appliances.

Pets are welcome!

Please reach out to us at 555-123-4567 or [email protected]
"""

# Define a Pydantic model for the data you want to extract
class ApartmentInfo(BaseModel):
    rent: float = Field(..., description="the monthly rent in dollars")
    bedrooms: int = Field(..., description="the number of bedrooms")
    bathrooms: int = Field(..., description="the number of bathrooms")
    phone_number: str

# Initialize TeapotAI
teapot_ai = TeapotAI()

# Extract the apartment details
extracted_info = teapot_ai.extract(ApartmentInfo, context=apartment_description)
print(extracted_info) # => ApartmentInfo(rent=2500.0 bedrooms=2 bathrooms=1 phone_number='555-123-4567')
```
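
As noted above, extraction can also be combined with RAG. The following is a minimal sketch, assuming that `extract` can draw on documents supplied at initialization when no explicit `context` is passed; the listing documents and the `ListingInfo` model are hypothetical examples, and the exact call pattern should be checked against the teapotai library documentation.

```python
from teapotai import TeapotAI
from pydantic import BaseModel, Field

# Hypothetical listing documents used as the retrieval corpus
documents = [
    "Downtown loft: 1 bedroom, 1 bathroom, $1800 per month. Call 555-000-1111.",
    "Suburban house: 4 bedrooms, 3 bathrooms, $3900 per month. Call 555-222-3333.",
]

class ListingInfo(BaseModel):
    rent: float = Field(..., description="the monthly rent in dollars")
    bedrooms: int = Field(..., description="the number of bedrooms")

# Initialize with documents so extraction can use retrieved context (assumed usage)
teapot_ai = TeapotAI(documents=documents)

extracted = teapot_ai.extract(ListingInfo)  # no explicit context; relies on RAG (assumption)
print(extracted)
```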

### Native Transformer Support
While we recommend using TeapotAI's library, you can load the base model directly with Hugging Face's Transformers library as follows:
```python
from transformers import pipeline

# Load the model
teapot_ai = pipeline("text2text-generation", "teapotai/teapotllm")

context = """
The Eiffel Tower is a wrought iron lattice tower in Paris, France. It was designed by Gustave Eiffel and completed in 1889.
It stands at a height of 330 meters and is one of the most recognizable structures in the world.
"""

question = "What is the height of the Eiffel Tower?"

answer = teapot_ai(context + "\n" + question)

print(answer[0].get('generated_text')) # => The Eiffel Tower stands at a height of 330 meters.
```
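
If you want finer control over tokenization and generation than the pipeline offers, the same checkpoint can also be loaded with the lower-level seq2seq classes. This is a minimal sketch; the generation parameters shown are illustrative, not values from the authors.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# teapotllm is fine-tuned from flan-t5-base, so the standard seq2seq classes apply
tokenizer = AutoTokenizer.from_pretrained("teapotai/teapotllm")
model = AutoModelForSeq2SeqLM.from_pretrained("teapotai/teapotllm")

context = "The Eiffel Tower stands at a height of 330 meters."
question = "What is the height of the Eiffel Tower?"

inputs = tokenizer(context + "\n" + question, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)  # illustrative budget

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```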
227
+
228
+ ---
229
+
230
+
231
+ ## Model Details
232
+ Teapot LLM is fine-tuned from [flan-t5-base](https://huggingface.co/google/flan-t5-base) on a [synthetic dataset](https://huggingface.co/datasets/teapotai/synthqa) of LLM tasks generated using [Llama-3.1-70B](https://huggingface.co/meta-llama/Llama-3.1-70B). Teapot

### Conversational Question Answering
Teapot is fine-tuned to provide friendly, conversational answers using context and documents provided as references.

### Hallucination Resistance
Teapot is trained to output only answers that can be derived from the provided context. Even though it is a small model, this makes its responses demonstrably more reliable, since it refuses to answer questions when there is insufficient data.

### Retrieval Augmented Generation
Teapot is further fine-tuned on the task of retrieval-augmented generation using a custom [embedding model](https://huggingface.co/teapotai/teapotembedding). During training we perform RAG across multiple documents, and the model learns to extract the details that are relevant for question answering.
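
To illustrate how retrieval with an embedding model typically sits in front of the LLM, here is a minimal sketch using cosine similarity over document embeddings. Loading teapotai/teapotembedding through the sentence-transformers API is an assumption here; the teapotai library handles this step for you.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Assumption: the embedding model can be loaded via sentence-transformers
embedder = SentenceTransformer("teapotai/teapotembedding")

documents = [
    "The Eiffel Tower is located in Paris, France and stands 330 meters tall.",
    "The Nile River is the longest river in the world.",
]

# Embed the documents once, then embed each incoming query
doc_embeddings = embedder.encode(documents, normalize_embeddings=True)
query_embedding = embedder.encode("How tall is the Eiffel Tower?", normalize_embeddings=True)

# Cosine similarity reduces to a dot product on normalized vectors
scores = doc_embeddings @ query_embedding
best_doc = documents[int(np.argmax(scores))]
print(best_doc)  # the most relevant document is then passed to the LLM as context
```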

### Information Extraction
Teapot has been trained to extract succinct answers in a variety of formats, enabling efficient document parsing. Teapot is trained natively to output standard data types such as numbers, strings, and even JSON.

### Training Details
- [Dataset] ~4 MB synthetic dataset consisting of QnA pairs in a variety of task-specific formats.
- [Methodology] The model is trained to mimic task-specific output formats, and is scored on its ability to output relevant, succinct, and verifiable answers in the requested format. A minimal fine-tuning sketch follows this list.
- [Hardware] Teapot was trained for ~2 hours on an A100 provided by Google Colab.
- [Hyperparameters] The model was trained with various learning rates and monitored to ensure task-specific performance was learned without catastrophic forgetting.
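
For readers who want to reproduce a similar setup, here is a minimal seq2seq fine-tuning sketch with Hugging Face `transformers`. The dataset column names and hyperparameters are illustrative assumptions, not the authors' actual training configuration.

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

# Assumption: the synthetic dataset exposes "prompt" and "response" text columns
dataset = load_dataset("teapotai/synthqa")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

def preprocess(batch):
    # Tokenize inputs and targets for seq2seq training
    model_inputs = tokenizer(batch["prompt"], truncation=True, max_length=512)
    labels = tokenizer(text_target=batch["response"], truncation=True, max_length=128)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset["train"].map(
    preprocess, batched=True, remove_columns=dataset["train"].column_names
)

args = Seq2SeqTrainingArguments(
    output_dir="teapot-finetune",
    per_device_train_batch_size=8,
    learning_rate=3e-4,   # illustrative; the authors swept several learning rates
    num_train_epochs=3,
    logging_steps=50,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```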

### Limitations and Risks
Teapot is trained specifically for question answering use cases and is not intended to be used for code generation, creative writing, or critical decision-making applications. Teapot has only been trained on the languages supported by flan-t5 and has not been evaluated for performance in languages other than English.

### License
This model, the embedding model, and the synthetic dataset are all provided open source under the MIT License.

## Questions, Feature Requests?

We hope you find TeapotAI useful and are continuously working to improve our models. Please reach out to us on our [Discord](https://discord.gg/hPxGSn5dST) for any technical help or feature requests. We look forward to seeing what our community can build!