sccastillo commited on
Commit
9761db6
·
verified ·
1 Parent(s): 88ca848

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +66 -3
README.md CHANGED
@@ -11,17 +11,80 @@ tags:
11
  base_model: unsloth/llama-3-8b-bnb-4bit
12
  ---
13
 
14
- # LLama 3 for router module in RAG
15
 
16
  While developing complex RAG applications, I found a common need for router functionality to map user queries to different system workflows (and APIs). The router acts as a dispatcher that can enhance responsiveness and accuracy by choosing the best workflow or API based on the query context. This implies that we need to produce structured output from unstructured input text.
17
 
18
  To this end, I will undertake a simple exercise to fine-tune the new Llama 3 model to process text input and generate JSON-like output. My hope is that we can avoid some external dependencies for this part of the system by seamlessly integrating various models to reinforce complex applications in production settings. It is my belief that building a robust critical infrastructure for the semantic modules requires choosing the right LLM for a given task.
19
  For training, we will use structured data from [azizshaw](https://huggingface.co/azizshaw/text_to_json). The dataset has 485 rows and contains 'input', 'output' and 'instruction' columns.
20
 
21
- For a quick evaluation, let's use another dataset for text-to-JSON, the **Diverse Restricted JSON Data Extraction**, curated by: The paraloq analytics team ([here](https://huggingface.co/datasets/paraloq/json_data_extraction))
22
 
 
23
 
24
- # Uploaded model
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
25
  - **Developed by:** sccastillo
26
  - **License:** apache-2.0
27
  - **Finetuned from model :** unsloth/llama-3-8b-bnb-4bit
 
11
  base_model: unsloth/llama-3-8b-bnb-4bit
12
  ---
13
 
14
+ ## LLama 3 for router module in RAG (a toy example)
15
 
16
  While developing complex RAG applications, I found a common need for router functionality to map user queries to different system workflows (and APIs). The router acts as a dispatcher that can enhance responsiveness and accuracy by choosing the best workflow or API based on the query context. This implies that we need to produce structured output from unstructured input text.
17
 
18
  To this end, I will undertake a simple exercise to fine-tune the new Llama 3 model to process text input and generate JSON-like output. My hope is that we can avoid some external dependencies for this part of the system by seamlessly integrating various models to reinforce complex applications in production settings. It is my belief that building a robust critical infrastructure for the semantic modules requires choosing the right LLM for a given task.
19
  For training, we will use structured data from [azizshaw](https://huggingface.co/azizshaw/text_to_json). The dataset has 485 rows and contains 'input', 'output' and 'instruction' columns.
20
 
21
+ For a quick evaluation, let's use another dataset for text-to-JSON, the **Diverse Restricted JSON Data Extraction**, curated by: The paraloq analytics team ([here](https://huggingface.co/datasets/paraloq/json_data_extraction)).
22
 
23
+ Run the model for inference:
24
 
25
+ ```python
26
+ # alpaca_prompt = Copied from above
27
+ FastLanguageModel.for_inference(model) # Enable native 2x faster inference
28
+ inputs = tokenizer(
29
+ [
30
+ alpaca_prompt.format(
31
+ """
32
+ Convert this text into a JSON object. Create field names that meaningfully represent the data being reported.
33
+ It is extremely important that you construct a well-formed object.
34
+ """, # instruction
35
+ "**Medical Document** **Patient Information** * Patient ID: PT123456 * Name: Jane Doe * Date of Birth: 1980-01-01 * Gender: Female * Medical Conditions: * Asthma * Hypertension **Prescription Information** * Prescription ID: RX123456 * Date Prescribed: 2023-03-08 * Date Expires: 2023-09-07 * Status: Active **Medication Information** * Medication ID: MD123456 * Name: Albuterol * Dosage: 200 mcg * Units: mcg * Instructions: Inhale 2 puffs every 4-6 hours as needed for shortness of breath. * Refills: 3 **Pharmacy Information** * Pharmacy ID: PH123456 * Name: CVS Pharmacy * Address: 123 Main Street, Anytown, CA 12345 * Phone: (123) 456-7890 **Additional Information** * The patient has been using Albuterol for the past 5 years to manage her asthma. * The patient has been advised to use a spacer device with the Albuterol inhaler to improve the delivery of the medication to the lungs. * The patient should avoid using Albuterol more than 4 times per day. * The patient should contact her doctor if her asthma symptoms worsen or if she experiences any side effects from the medication. **Instructions for the Patient** * Take Albuterol exactly as prescribed by your doctor. * Do not take more than the prescribed dosage. * Use a spacer device with the Albuterol inhaler. * Avoid using Albuterol more than 4 times per day. * Contact your doctor if your asthma symptoms worsen or if you experience any side effects from the medication. **Signature** [Doctor's Name] [Date]", # input
36
+ "", # output - leave this blank for generation!
37
+ )
38
+ ], return_tensors = "pt").to("cuda")
39
+
40
+ outputs = model.generate(**inputs, max_new_tokens = 1000, use_cache = True)
41
+ tokenizer.batch_decode(outputs)
42
+ ```
43
+
44
+ ```
45
+ import json
46
+ text = "{'feature1': {'detail': {'text': 'Medical Document', 'pid': 'PT123456', 'name': 'Jane Doe', 'dob': '1980-01-01', 'gender': 'Female', 'conditions': ['Asthma', 'Hypertension']}, 'detail2': {'text': 'Prescription Information', 'pid': 'RX123456', 'date': '2023-03-08', 'expires': '2023-09-07','status': 'Active'}, 'detail3': {'text': 'Medication Information', 'id': 'MD123456', 'name': 'Albuterol', 'dosage': '200 mcg', 'units':'mcg', 'instructions': 'Inhale 2 puffs every 4-6 hours as needed for shortness of breath.','refills': '3'}, 'detail4': {'text': 'Pharmacy Information', 'id': 'PH123456', 'name': 'CVS Pharmacy', 'address': '123 Main Street, Anytown, CA 12345', 'phone': '(123) 456-7890'}}, 'feature2': {'detail': {'text': 'The patient has been using Albuterol for the past 5 years to manage her asthma.', 'pid': '', 'name': '', 'dob': '', 'gender': '', 'conditions': []}, 'detail2': {'text': 'The patient has been advised to use a spacer device with the Albuterol inhaler to improve the delivery of the medication to the lungs.', 'pid': '', 'name': '', 'date': '', 'expires': '','status': ''}, 'detail3': {'text': 'The patient should avoid using Albuterol more than 4 times per day.', 'id': '', 'name': '', 'dosage': '', 'units': '', 'instructions': '','refills': ''}, 'detail4': {'text': 'The patient should contact her doctor if her asthma symptoms worsen or if she experiences any side effects from the medication.', 'pid': '', 'name': '', 'address': '', 'phone': ''}}}"
47
+ output = text.replace("'", '"')
48
+ data_dict = json.loads(output)
49
+ len(data_dict)
50
+ pprint.pprint(data_dict['feature1'])
51
+ ```
52
+ The result:
53
+
54
+ ```
55
+ {'detail': {'conditions': ['Asthma', 'Hypertension'],
56
+ 'dob': '1980-01-01',
57
+ 'gender': 'Female',
58
+ 'name': 'Jane Doe',
59
+ 'pid': 'PT123456',
60
+ 'text': 'Medical Document'},
61
+ 'detail2': {'date': '2023-03-08',
62
+ 'expires': '2023-09-07',
63
+ 'pid': 'RX123456',
64
+ 'status': 'Active',
65
+ 'text': 'Prescription Information'},
66
+ 'detail3': {'dosage': '200 mcg',
67
+ 'id': 'MD123456',
68
+ 'instructions': 'Inhale 2 puffs every 4-6 hours as needed for '
69
+ 'shortness of breath.',
70
+ 'name': 'Albuterol',
71
+ 'refills': '3',
72
+ 'text': 'Medication Information',
73
+ 'units': 'mcg'},
74
+ 'detail4': {'address': '123 Main Street, Anytown, CA 12345',
75
+ 'id': 'PH123456',
76
+ 'name': 'CVS Pharmacy',
77
+ 'phone': '(123) 456-7890',
78
+ 'text': 'Pharmacy Information'}}
79
+ ```
80
+
81
+ ## Results Notes
82
+
83
+ - Considering that we are working with a toy example and a 4-byte quantization, the results seem like a good starting point.
84
+ - As we fine-tune the model with examples of strings using single quotes enclosed names, the model learns to use this notation, resulting in output generated with single quotes. This approach is far from optimal for securing our workflow and ensuring robust code.
85
+ - Another point to note is that the response tends to repeat information.
86
+
87
+ ## Uploaded model
88
  - **Developed by:** sccastillo
89
  - **License:** apache-2.0
90
  - **Finetuned from model :** unsloth/llama-3-8b-bnb-4bit