# Model Card for GAIA (Gemma-3-Gaia-PT-BR-4b-it)

**GAIA** is an open, state-of-the-art language model for Brazilian Portuguese. It was developed by continuously pre-training the `google/gemma-3-4b-pt` model on an extensive, high-quality corpus of Portuguese data.

2. **Instruction-Following Capability Restoration:** To enable the model to follow instructions without traditional supervised fine-tuning (SFT), a weight merging operation was applied. This technique, described in the paper *“Balancing Continuous Pre-Training and Instruction Fine-Tuning: Optimizing Instruction-Following in LLMs”*, allows the model to integrate the knowledge acquired during continuous pre-training with the ability to interact in a chat format and follow instructions.
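
Although the merge script itself is not included here, the sketch below illustrates the general shape of such an operation as a simple linear interpolation between the continually pre-trained checkpoint and the original instruction-tuned model. The local checkpoint path, the `ALPHA` ratio, and the use of plain interpolation are illustrative assumptions; the actual procedure follows the merge methodology of the paper cited above.

```python
# Hedged sketch of a weight merge between a continually pre-trained checkpoint
# and the instruction-tuned reference model. The checkpoint path and ALPHA are
# placeholders, and plain linear interpolation is an assumption, not the exact
# recipe from the cited paper.
import torch
from transformers import AutoModelForCausalLM

IT_MODEL = "google/gemma-3-4b-it"                    # instruction-tuned reference
CONTINUED_PT = "path/to/continued-pretrained-gemma"  # hypothetical local checkpoint
ALPHA = 0.5                                          # interpolation weight (assumption)

# Depending on your transformers version, Gemma 3 checkpoints may need a
# different Auto class; AutoModelForCausalLM is used here for brevity.
it_model = AutoModelForCausalLM.from_pretrained(IT_MODEL, torch_dtype=torch.bfloat16)
pt_model = AutoModelForCausalLM.from_pretrained(CONTINUED_PT, torch_dtype=torch.bfloat16)

it_state = it_model.state_dict()
merged_state = {}
for name, pt_param in pt_model.state_dict().items():
    if name in it_state and it_state[name].shape == pt_param.shape:
        # Interpolate tensors shared by both checkpoints.
        merged_state[name] = ALPHA * pt_param + (1.0 - ALPHA) * it_state[name]
    else:
        # Keep parameters that exist only in the continually pre-trained model.
        merged_state[name] = pt_param

pt_model.load_state_dict(merged_state)
pt_model.save_pretrained("gemma-3-gaia-merged")
```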

- **Developed by:** The Brazilian Association of AI (ABRIA), the Center of Excellence in Artificial Intelligence (CEIA-UFG), Nama, Amadeus AI, and Google DeepMind.
- **Model:** GAIA
- **Model type:** Causal decoder-only Transformer-based language model.
- **Language(s):** Brazilian Portuguese (pt-BR)
- **License:** Gemma
- **Based on:** `google/gemma-3-4b-pt`

### Team

This project was made possible by the contributions of the following individuals:

- Dr. Celso Gonçalves Camilo-Junior
- Dr. Sávio Salvarino Teles de Oliveira
- Me. Lucas Araujo Pereira
- Marcellus Amadeus
- Daniel Fazzioni
- Artur Matos Andrade Novais
- Salatiel Abraão Avelar Jordão

### Model Sources

- **Repository:** [CEIA-UFG/Gemma-3-Gaia-PT-BR-4b-it](https://huggingface.co/CEIA-UFG/Gemma-3-Gaia-PT-BR-4b-it)
- **Paper (Merge Methodology):** [Balancing Continuous Pre-Training and Instruction Fine-Tuning: Optimizing Instruction-Following in LLMs](https://arxiv.org/pdf/2410.10739)

## Uses
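
For a quick start, here is a minimal sketch of loading the model for Portuguese chat with the Hugging Face `transformers` library. The prompt and generation settings are illustrative assumptions, and depending on your `transformers` version, Gemma 3 checkpoints may require a different pipeline task or model class.

```python
# Minimal, illustrative chat example; adjust max_new_tokens and the prompt
# for your use case.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="CEIA-UFG/Gemma-3-Gaia-PT-BR-4b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    # "Explain briefly what ENEM is."
    {"role": "user", "content": "Explique em poucas frases o que é o ENEM."},
]

output = generator(messages, max_new_tokens=200)
print(output[0]["generated_text"][-1]["content"])
```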

| Benchmark        | `google/gemma-3-4b-it` (Baseline) | GAIA (Our Model) |
|------------------|-----------------------------------|------------------|
| BlueX            | **0.6630**                        | 0.6575           |
| ENEM 2024        | 0.6556                            | **0.7000**       |
| ENEM (General)   | 0.7416                            | **0.7486**       |
| OAB (Bar Exam)   | **0.4502**                        | 0.4416           |

#### Summary

If you use this model in your research or application, please cite our work.

```bibtex
@misc{gaia-gemma-3-4b-2025,
  title={GAIA: An Open Language Model for Brazilian Portuguese},
  author={Camilo-Junior, C. G. and Oliveira, S. S. T. and Pereira, L. A. and Amadeus, M. and Fazzioni, D. and Novais, A. M. A. and Jordão, S. A. A.},
  year={2025},
  publisher={Hugging Face},
  journal={Hugging Face repository},
  howpublished={\url{https://huggingface.co/CEIA-UFG/Gemma-3-Gaia-PT-BR-4b-it}}
}
```