mertNB committed on
Commit
36808c1
1 Parent(s): 41bbbd3

Update README.md

Files changed (1)
  1. README.md +24 -4
README.md CHANGED
@@ -11,11 +11,15 @@ tags:
 # **Rago v2 13B**
 **Rago v2 13B is a [Llama 2 13B](https://huggingface.co/meta-llama/Llama-2-13b-hf)-based retrieval-augmented generation-optimized model built by [Neural Bridge AI](https://www.neuralbridge.ai/) and trained on [RAG Full Dataset 20000](https://huggingface.co/datasets/neural-bridge/rag-full-20000). It is available under [Apache license 2.0](https://www.apache.org/licenses/LICENSE-2.0.html).**

- ## **Model Details**
- Rago v2 13B model is a retrieval-augmented generation-optimized (RAGO) model that enhances large language models by integrating an external authoritative knowledge base (context) for generating responses. This integration significantly improves the model's ability to produce relevant, accurate, and context-specific output across specialized domains or internal data without necessitating retraining. It addresses key challenges of large language models (LLMs), such as unpredictability, reliance on potentially outdated data, and the propagation of incorrect information, thereby improving user trust in AI applications. Rago v2 13B, specifically, is an advancement built upon the [Llama 2 13B](https://huggingface.co/meta-llama/Llama-2-13b-hf) model, optimized for retrieval-augmented generation, making it particularly effective in contextually aware response generation.
+ ## **Model Description**
+ Rago v2 13B is a retrieval-augmented generation-optimized (RAGO) model that enhances large language models by integrating an external authoritative knowledge base (context) for generating responses. This integration significantly improves the model's ability to produce relevant, accurate, and context-specific output across specialized domains or internal data without necessitating retraining. It addresses key challenges of large language models (LLMs), such as unpredictability, reliance on potentially outdated data, and the propagation of incorrect information, thereby improving user trust in AI applications. Rago v2 13B, specifically, is an advancement built upon the [Llama 2 13B](https://huggingface.co/meta-llama/Llama-2-13b-hf) model, optimized for retrieval-augmented generation, making it particularly effective in contextually aware response generation.

 ```python
- model = "neural-bridge/Rago-v2-7b"
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import transformers
+ import torch
+
+ model = "neural-bridge/Rago-v2-13b"

 tokenizer = AutoTokenizer.from_pretrained(model)
 pipeline = transformers.pipeline(
@@ -49,4 +53,20 @@ def print_result(generated_text):
 
 for seq in sequences:
     print_result(seq["generated_text"])
- ```
+ ```
+
+ ## **Model Details**
+
+ ### Training Data
+
+ Rago v2 13B is trained on [Neural Bridge's RAG Full 20000](https://huggingface.co/datasets/neural-bridge/rag-full-20000), a dataset that mixes [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb), [gsm8k](https://huggingface.co/datasets/gsm8k), and [RAG Hallucination Dataset 1000](https://huggingface.co/datasets/neural-bridge/rag-hallucination-dataset-1000).
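A minimal sketch of pulling this mixture down for inspection with the `datasets` library is shown below; the `train` split name and the column layout are assumptions, not taken from the dataset card.

```python
# Sketch: load the RAG Full 20000 mixture for a quick look.
# The "train" split and a context/question/answer column layout are assumed here.
from datasets import load_dataset

rag_full = load_dataset("neural-bridge/rag-full-20000", split="train")

print(rag_full)     # row count and column names
print(rag_full[0])  # a single example from the mixture
```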
+
+ ### Training Details
+
+ Rago v2 13B is built upon [Llama 2 13B](https://huggingface.co/meta-llama/Llama-2-13b-hf) using [LoRA](https://arxiv.org/abs/2106.09685). By taking advantage of the power of RAG, it is designed to improve the model's ability to produce relevant, accurate, and context-specific output across specialized domains or internal data, and to address key challenges of LLMs such as unpredictability, reliance on potentially outdated data, and the propagation of incorrect information. The model has the same architecture as [Llama 2 13B](https://huggingface.co/meta-llama/Llama-2-13b-hf), with the LoRA adapters added on top. It is trained on an NVIDIA A100 for around 2 days with a 1e-5 learning rate (cosine scheduler) and the following LoRA parameters, sketched as a `peft` configuration after the list:
+ * LoRA Rank (R): 64
+ * LoRA Alpha: 16
+ * LoRA Dropout: 0.1
+ * Target Modules: *q_proj, k_proj, v_proj, o_proj*
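Written with the `peft` library, the adapter configuration above might look like the following sketch; the precision, task type, and training loop are assumptions rather than the released training code.

```python
# Sketch: the LoRA parameters listed above, expressed as a peft LoraConfig.
# Only the four values and the target modules come from the model card; the rest is assumed.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-hf",
    torch_dtype=torch.bfloat16,  # assumed precision
)

lora_config = LoraConfig(
    r=64,                        # LoRA Rank (R)
    lora_alpha=16,               # LoRA Alpha
    lora_dropout=0.1,            # LoRA Dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# The 1e-5 learning rate with a cosine schedule would then be set in the trainer,
# e.g. transformers.TrainingArguments(learning_rate=1e-5, lr_scheduler_type="cosine", ...).
```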
+
+ Rago v2 13B is boosted by a custom data collator to enhance the model's performance. It is trained embracing the masked language modeling (MLM) approach: by masking only the answer part of the training data, the model is pushed to generate more accurate responses. Thanks to this custom data collator, an improvement in the factuality performance of the model is observed.
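The custom collator itself is not published. A common way to focus the loss on the answer in a causal-LM fine-tune is to set the labels of every non-answer token to -100 so they are ignored by the cross-entropy loss; the sketch below illustrates that idea only, and its prompt layout and field names are hypothetical.

```python
# Sketch of an answer-only loss collator: tokens before the answer get label -100,
# so only the answer span contributes to the loss. The prompt template and column names
# ("context", "question", "answer") are hypothetical, not the model's actual format.
import torch

def collate_answer_only(batch, tokenizer, max_length=1024):
    input_ids, labels = [], []
    for ex in batch:
        prompt = f"Context: {ex['context']}\nQuestion: {ex['question']}\nAnswer: "
        prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
        answer_ids = tokenizer(ex["answer"] + tokenizer.eos_token,
                               add_special_tokens=False)["input_ids"]
        ids = (prompt_ids + answer_ids)[:max_length]
        lbls = ([-100] * len(prompt_ids) + answer_ids)[:max_length]
        input_ids.append(ids)
        labels.append(lbls)

    # Right-pad to the longest sequence in the batch; padded positions stay out of the loss.
    pad_id = tokenizer.pad_token_id if tokenizer.pad_token_id is not None else tokenizer.eos_token_id
    width = max(len(x) for x in input_ids)
    return {
        "input_ids": torch.tensor([x + [pad_id] * (width - len(x)) for x in input_ids]),
        "attention_mask": torch.tensor([[1] * len(x) + [0] * (width - len(x)) for x in input_ids]),
        "labels": torch.tensor([x + [-100] * (width - len(x)) for x in labels]),
    }
```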