Update README.md

README.md (changed)
# Model Card for dsfsi/OMT-LR-Mistral7b

<!-- Provide a quick summary of what the model is/does. -->

This model is the result of fine-tuning Mistral-7B-v0.1 on a downstream translation task in a low-resource setting. It translates English sentences into Zulu and Xhosa.

## Model Details
## Uses

The model can be used to translate English to Zulu and Xhosa. With further improvement it could translate domain-specific information from English to Zulu and Xhosa: for example, agricultural research written in English could be made accessible to small-scale farmers who speak Zulu or Xhosa. It could also be used in the education sector to teach core subjects in native South African languages, which could improve pupils' performance in those subjects.
### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

You can download the model, dsfsi/OMT-LR-Mistral7b, and prompt it to translate English sentences into Zulu or Xhosa.
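A minimal sketch of loading the model and prompting it with the `transformers` library; the prompt template and generation settings below are assumptions, so check the repository for the exact template used during fine-tuning:

```python
from functools import lru_cache

def build_prompt(sentence: str, target_lang: str = "Zulu") -> str:
    # Instruction-style prompt; the exact template used during
    # fine-tuning may differ -- check the repository.
    return f"Translate to {target_lang}: {sentence}"

@lru_cache(maxsize=1)
def load_translator():
    # Imported lazily so build_prompt() works without transformers installed.
    from transformers import pipeline
    return pipeline("text-generation", model="dsfsi/OMT-LR-Mistral7b")

def translate(sentence: str, target_lang: str = "Zulu") -> str:
    translator = load_translator()
    result = translator(build_prompt(sentence, target_lang), max_new_tokens=128)
    return result[0]["generated_text"]
```

Note that `translate(...)` downloads the full 7B checkpoint, so a GPU with enough memory (or quantized loading) is advisable.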
### Out-of-Scope Use
#### Preprocessing [optional]

See the repository for the Python code used to clean and prepare the dataset.
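The repository holds the actual preparation code; purely as an illustration of the kind of clean-up involved (this sketch is hypothetical, not the repository's code), parallel-corpus preparation typically drops empty pairs, removes duplicates, and filters likely misaligned sentence pairs:

```python
def clean_parallel(pairs, max_ratio=3.0):
    """Hypothetical clean-up for (english, target) sentence pairs:
    drop empties, deduplicate, and filter extreme length ratios."""
    seen = set()
    cleaned = []
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        if not src or not tgt:
            continue  # drop pairs with a missing side
        ratio = max(len(src), len(tgt)) / min(len(src), len(tgt))
        if ratio > max_ratio:
            continue  # likely misaligned pair
        key = (src.lower(), tgt.lower())
        if key in seen:
            continue  # exact duplicate
        seen.add(key)
        cleaned.append((src, tgt))
    return cleaned
```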
#### Training Hyperparameters

)
```
## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->
#### Testing Data

- [nwu-ctext/autshumato](https://huggingface.co/datasets/nwu-ctext/autshumato)
- [Helsinki-NLP/opus-100](https://huggingface.co/datasets/Helsinki-NLP/opus-100)
- [WMT22](https://huggingface.co/datasets/wmt22)
- [gsarti/flores_101](https://huggingface.co/datasets/gsarti/flores_101)
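These evaluation sets can be pulled with the `datasets` library; the subset/config names below are assumptions, so check each dataset card for the exact identifiers:

```python
# FLORES-101 codes for the two target languages (assumed; see the
# gsarti/flores_101 dataset card for the authoritative config names).
FLORES_CODES = {"Zulu": "zul", "Xhosa": "xho"}

def load_flores_devtest(language: str):
    # Imported lazily so the mapping above is usable without `datasets`.
    from datasets import load_dataset
    return load_dataset("gsarti/flores_101", FLORES_CODES[language], split="devtest")
```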
#### Metrics
Khoboko, P. W., Marivate, V., & Sefara, J. (2025). Optimizing translation for low-resource languages: Efficient fine-tuning with custom prompt engineering in large language models. Machine Learning with Applications, 20, 100649.

## Model Card Authors [optional]