Fazzioni committed on
Commit 996d1d5 · verified · 1 Parent(s): 0fde545

Update README.md

Files changed (1)
  1. README.md +19 -8
README.md CHANGED
@@ -8,7 +8,7 @@ base_model:
---


- # Model Card for GAIA (gemma-3-4b-it-pt)
+ # Model Card for GAIA (Gemma-3-Gaia-PT-BR-4b-it)

**GAIA** is an open, state-of-the-art language model for Brazilian Portuguese. It was developed by continuously pre-training the `google/gemma-3-4b-pt` model on an extensive, high-quality corpus of Portuguese data.

@@ -25,15 +25,26 @@ The development process started with the base model `google/gemma-3-4b-pt` and i
2. **Instruction-Following Capability Restoration:** To enable the model to follow instructions without traditional supervised fine-tuning (SFT), a weight merging operation was applied. This technique, described in the paper *“Balancing Continuous Pre-Training and Instruction Fine-Tuning: Optimizing Instruction-Following in LLMs”*, allows the model to integrate the knowledge acquired during continuous pre-training with the ability to interact in a chat format and follow instructions.

- **Developed by:** The Brazilian Association of AI (ABRIA), the Center of Excellence in Artificial Intelligence (CEIA-UFG), Nama, Amadeus AI, and Google DeepMind.
- - **Model:** GAIA (gemma-3-4b-it-pt)
+ - **Model:** GAIA
- **Model type:** Causal decoder-only Transformer-based language model.
- **Language(s):** Brazilian Portuguese (pt-BR)
- **License:** Gemma
- **Based on:** `google/gemma-3-4b-pt`

+ ### Team
+ This project was made possible by the contributions of the following individuals:
+ - Dr. Celso Gonçalves Camilo-Junior
+ - Dr. Sávio Salvarino Teles de Oliveira
+ - Me. Lucas Araujo Pereira
+ - Marcellus Amadeus
+ - Daniel Fazzioni
+ - Artur Matos Andrade Novais
+ - Salatiel Abraão Avelar Jordão
+
+
### Model Sources

- - **Repository:** [CEIA-UFG/gemma-3-4b-it-pt](https://huggingface.co/CEIA-UFG/gemma-3-4b-it-pt)
+ - **Repository:** [CEIA-UFG/Gemma-3-Gaia-PT-BR-4b-it](https://huggingface.co/CEIA-UFG/Gemma-3-Gaia-PT-BR-4b-it)
- **Paper (Merge Methodology):** [Balancing Continuous Pre-Training and Instruction Fine-Tuning: Optimizing Instruction-Following in LLMs](https://arxiv.org/pdf/2410.10739)

## Uses
@@ -94,9 +105,9 @@ The model was evaluated on a set of multiple-choice benchmarks in Portuguese, co
| Benchmark | `google/gemma-3-4b-it` (Baseline) | GAIA (Our Model) |
|------------------|-----------------------------------|------------------|
| BlueX | **0.6630** | 0.6575 |
- | ENEM 2024 | 0.6556 | **0.7000** |
- | ENEM (General) | 0.7416 | **0.7486** |
- | OAB (Bar Exam) | **0.4502** | 0.4416 |
+ | ENEM 2024 | 0.6556 | **0.7000** |
+ | ENEM (General) | 0.7416 | **0.7486** |
+ | OAB (Bar Exam) | **0.4502** | 0.4416 |

#### Summary

@@ -110,9 +121,9 @@ If you use this model in your research or application, please cite our work.
```bibtex
@misc{gaia-gemma-3-4b-2025,
title={GAIA: An Open Language Model for Brazilian Portuguese},
- author={Center of Excellence in Artificial Intelligence (CEIA-UFG) and The Brazilian Association of AI (ABRIA) and Nama and Amadeus AI and Google DeepMind},
+ author={CAMILO-JUNIOR, C. G.; OLIVEIRA, S. S. T.; PEREIRA, L. A.; AMADEUS, M.; FAZZIONI, D.; NOVAIS, A. M. A.; JORDÃO, S. A. A.},
year={2025},
publisher={Hugging Face},
journal={Hugging Face repository},
- howpublished={\url{https://huggingface.co/CEIA-UFG/gemma-3-4b-it-pt}}
+ howpublished={\url{https://huggingface.co/CEIA-UFG/Gemma-3-Gaia-PT-BR-4b-it}}
}
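
Note on the "Instruction-Following Capability Restoration" step described in the README above: the stage merges the continually pre-trained weights with an instruction-tuned checkpoint. The sketch below is a minimal illustration of that idea (plain linear interpolation of parameters), not the exact recipe from the cited paper or from the GAIA authors; the checkpoint ids, the `AutoModelForCausalLM` entry point, the merge ratio, and the output path are illustrative assumptions.

```python
# Minimal sketch (illustrative only): merge two checkpoints by linear
# interpolation of their weights, the general idea behind restoring
# instruction-following after continued pre-training.
import torch
from transformers import AutoModelForCausalLM

ALPHA = 0.5  # illustrative merge ratio; the real value is a tuning choice

# Illustrative checkpoints: the (continually pre-trained) base and the official
# instruction-tuned model. The Gemma 3 multimodal checkpoints may require their
# dedicated model class instead of AutoModelForCausalLM.
cpt = AutoModelForCausalLM.from_pretrained("google/gemma-3-4b-pt", torch_dtype=torch.bfloat16)
it = AutoModelForCausalLM.from_pretrained("google/gemma-3-4b-it", torch_dtype=torch.bfloat16)

it_state = it.state_dict()
merged = {
    # Both checkpoints share one architecture, so floating-point tensors are
    # interpolated name by name; integer buffers are copied unchanged.
    name: ALPHA * param + (1.0 - ALPHA) * it_state[name]
    if param.is_floating_point() else param
    for name, param in cpt.state_dict().items()
}

cpt.load_state_dict(merged)
cpt.save_pretrained("gaia-merge-sketch")  # illustrative output directory
```

The cited paper studies how to balance such a merge so that instruction-following is restored without washing out the knowledge gained during continuous pre-training; the exact settings used for GAIA are not stated in this diff.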