sccastillo/llama3_router

sccastillo committed on Apr 24, 2024

Commit 0f2d447 (verified) · 1 Parent(s): bb49ef2

Update README.md

Files changed (1)

README.md +1 -1

@@ -17,7 +17,7 @@ While developing complex RAG applications, I found a common need for router func
 To this end, I undertook a simple exercise to fine-tune the new Llama 3 model to process text input and generate JSON-like output (here is the [colab](https://colab.research.google.com/drive/1Vj0LOjU_5N9VWLpY-AG91dgdGD88Vjwm?usp=sharing)). My hope was that we could avoid some external dependencies for this part of the system by seamlessly integrating various models to reinforce complex applications in production settings. I believed that building a robust critical infrastructure for the semantic modules required choosing the right LLM for a given task.
-For training, we used structured data from [azizshaw](https://huggingface.co/azizshaw/text_to_json). The dataset contained 485 rows and included 'input', 'output', and 'instruction' columns.
+For training, we used structured data from [azizshaw](https://huggingface.co/datasets/azizshaw/text_to_json). The dataset contained 485 rows and included 'input', 'output', and 'instruction' columns.
 For a quick evaluation, we used another dataset for text-to-JSON, the **Diverse Restricted JSON Data Extraction**, curated by the paraloq analytics team ([here](https://huggingface.co/datasets/paraloq/json_data_extraction)).
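
The one-line change in this commit adds the `datasets/` path segment to the training-data link: on the Hugging Face Hub, dataset repositories live under `/datasets/<owner>/<name>`, while a bare `/<owner>/<name>` path points to a model repository. A minimal sketch of that convention follows; the helper name `hub_url` is hypothetical and not part of any library, and it only builds the URL string without checking that the repository exists.

```python
HF_BASE = "https://huggingface.co"

# Path prefix per repository type on the Hub web UI.
# Model repos have no prefix; datasets and Spaces do.
_PREFIXES = {"model": "", "dataset": "datasets/", "space": "spaces/"}


def hub_url(repo_id: str, repo_type: str = "model") -> str:
    """Build the web URL for a Hub repository (illustrative helper, not a library API)."""
    if repo_type not in _PREFIXES:
        raise ValueError(f"unknown repo_type: {repo_type!r}")
    return f"{HF_BASE}/{_PREFIXES[repo_type]}{repo_id}"


# The link before the commit (resolves as a model repo path)
old_link = hub_url("azizshaw/text_to_json")
# The link after the commit (dataset repo path)
new_link = hub_url("azizshaw/text_to_json", repo_type="dataset")
```

Generating links this way, rather than typing them by hand, is one way a README could avoid the kind of missing-prefix error this commit corrects.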