---
library_name: transformers
tags:
- bitnet
- falcon-e
license: other
license_name: falcon-llm-license
license_link: https://falconllm.tii.ae/falcon-terms-and-conditions.html
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62441d1d9fdefb55a0b7d12c/KVAEDoch-o0HgA0e2L4HL.png)

# Table of Contents

0. [TL;DR](#tldr)
1. [Model Details](#model-details)
2. [Training Details](#training-details)
3. [Usage](#usage)
4. [Evaluation](#evaluation)
5. [Citation](#citation)

# TL;DR

Falcon-E is a series of 1.58-bit (BitNet-style) language models developed by the [Technology Innovation Institute](https://www.tii.ae). Each model in the series ships in three variants: the quantized BitNet checkpoint, a `prequantized` checkpoint for fine-tuning, and a `bfloat16` counterpart, delivering competitive benchmark scores at a fraction of the memory footprint of same-scale full-precision models.

# Model Details

## Model Description

- **Developed by:** [https://www.tii.ae](https://www.tii.ae)
- **Model type:** Causal decoder-only / Base version
- **Architecture:** Pure transformer, 1.58-bit version
- **Language(s) (NLP):** English
- **License:** Falcon-LLM License

# Training Details

For more details about the training protocol of this model, please refer to the [Falcon-E technical blogpost](https://falcon-lm.github.io/blog/falcon-edge/).

# Usage

You can currently run this model with either the Hugging Face transformers library or the [BitNet](https://github.com/microsoft/BitNet) library, and there are several ways to interact with the model depending on your target usage. Each model in the Falcon-E series comes in three variants: the BitNet model, a prequantized checkpoint for fine-tuning, and a `bfloat16` version of the BitNet model.

## Inference

### BitNet

```bash
# Clone and set up Microsoft's BitNet inference framework
git clone https://github.com/microsoft/BitNet && cd BitNet
pip install -r requirements.txt
# Download the 1.58-bit GGUF checkpoint, then start an interactive session
huggingface-cli download tiiuae/Falcon-E-3B-Instruct-GGUF ggml-model-i2_s.gguf --local-dir models/Falcon-E-3B-Instruct/
python run_inference.py -m models/Falcon-E-3B-Instruct/ggml-model-i2_s.gguf -p "You are a helpful assistant" -cnv
```
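
### Hugging Face transformers

The model can also be loaded through the transformers API. Below is a minimal generation sketch; it assumes the repository's default revision loads as a regular `bfloat16` causal LM, and the `model_id` and prompt are only illustrations (pick the Falcon-E variant you need):

```python
import torch

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon-E-1B-Base"  # illustrative; any Falcon-E checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
)

# Plain greedy generation from a short prompt
inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```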

## Fine-tuning

To fine-tune the model, load the `prequantized` revision of the checkpoint and use the [`onebitllms`](https://github.com/tiiuae/onebitllms) Python package (e.g. `pip install onebitllms`):

```diff
import torch

from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTTrainer
+ from onebitllms import replace_linear_with_bitnet_linear, quantize_to_1bit

model_id = "tiiuae/Falcon-E-1B-Base"

tokenizer = AutoTokenizer.from_pretrained(model_id, revision="prequantized")
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
+   revision="prequantized"
)
+ model = replace_linear_with_bitnet_linear(model)

trainer = SFTTrainer(
    model,
    ...
)

trainer.train()

+ quantize_to_1bit(output_directory)
```
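
Note that training runs on the `bfloat16` prequantized weights with the linear layers swapped for their BitNet counterparts; only the final `quantize_to_1bit` call converts the saved weights into the 1.58-bit format. In the snippet above, `output_directory` is a placeholder for the directory where the trainer wrote the fine-tuned checkpoint (e.g. the `output_dir` you passed to the trainer).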

# Evaluation

We report our internal pipeline benchmarks in the tables below.

**Note: evaluation results are normalized scores from the former Hugging Face Open LLM Leaderboard v2 tasks.**

<details>
<summary class="bold"> For 1B scale models and below </summary>

| Model | # Params | Mem Footprint | IFEVAL | Math-Hard | GPQA | MuSR | BBH | MMLU-Pro | Avg. |
| -------- | ------- | ------- | ------- | ------ | ----- | ----- | ----- | ------ | ---- |
| Qwen-2.5-0.5B | 0.5B | 1GB | 16.27 | 3.93 | 0.0 | 2.08 | 6.95 | 10.06 | 6.55 |
| SmolLM2-360M | 0.36B | 720MB | 21.15 | 1.21 | 0.0 | 7.73 | 5.54 | 1.88 | 6.25 |
| Qwen-2.5-1.5B | 1.5B | 3.1GB | 26.74 | 9.14 | 16.66 | 5.27 | 20.61 | 4.7 | 13.85 |
| Llama-3.2-1B | 1.24B | 2.47GB | 14.78 | 1.21 | 4.37 | 2.56 | 2.26 | 0.0 | 4.2 |
| SmolLM2-1.7B | 1.7B | 3.4GB | 24.4 | 2.64 | 9.3 | 4.6 | 12.64 | 3.91 | 9.58 |
| Falcon-3-1B-Base | 1.5B | 3GB | 24.28 | 3.32 | 11.34 | 9.71 | 6.76 | 3.91 | 9.89 |
| Hymba-1.5B-Base | 1.5B | 3GB | 22.95 | 1.36 | 7.69 | 5.18 | 10.25 | 0.78 | 8.04 |
| Falcon-E-1B-Base | 1.8B | **635MB** | 32.9 | 10.97 | 2.8 | 3.65 | 12.28 | 17.82 | 13.40 |

</details>
106
+
107
+
108
+ <details>
109
+ <summary class="bold"> For 3B scale models </summary>
110
+
111
+ | Model | Nb Params | Mem Footprint | IFEVAL | Math-Hard | GPQA | MuSR | BBH | MMLU-Pro | Avg. |
112
+ | -------- | ------- | ------- | ------- | ------ | ----- | ----- | ----- | ------ | ---- |
113
+ | Falcon-3-3B-Base | 3B | 6.46GB | 15.74 | 11.78 | 21.58 | 6.27 | 18.09 | 6.26 | 15.74 |
114
+ | Qwen2.5-3B | 3B | 6.17GB | 26.9 | 14.8 | 24.3 | 11.76 | 24.48 | 6.38 | 18.1 |
115
+ | Falcon-E-3B-Base | 3B | **955MB** | 36.67 | 13.45 | 8.67 | 4.14 | 19.83 | 27.16 | 18.32 |
116
+
117
+ </details>
118
+
119
+ Below are the results for instruction fine-tuned models:
120
+
121
+ <details>
122
+ <summary class="bold"> For 1B scale models and below </summary>
123
+
124
+ | Model | Nb Params | Mem Footprint | IFEVAL | Math-Hard | GPQA | MuSR | BBH | MMLU-Pro | Avg. |
125
+ | -------- | ------- | ------- | ------- | ------ | ----- | ----- | ----- | ------ | ---- |
126
+ | Qwen-2.5-0.5B-Instruct | 500M | 1GB | 30.71 | 0 | 8.43 | 0.94 | 7.75 | 0 | 6.59 |
127
+ | SmolLM2-360M-Instruct | 360M | 720MB | 38.42 | 1.51 | 4.17 | 2.77 | 1.3 | 0.67 | 8.14 |
128
+ | Qwen-2.5-1.5B-Instruct | 1.5B | 3.1GB | 44.76 | 22.05 | 19.81 | 3.19 | 19.99 | 0.78 | 18.43 |
129
+ | SmolLM2-1.7B | 1.7B | 3.4GB | 53.68 | 5.82 | 10.92 | 4.1 | 11.71 | 0 | 15.02 |
130
+ | Falcon-3-1B-Instruct | 1.5B | 3GB | 55.57 | 6.34 | 12.96 | 10.56 | 9.32 | 2.24 | 16.16 |
131
+ | Hymba-1.5B-Instruct | 1.5B | 3GB | 60.09 | 2.72 | 4.59 | 1.05 | 11.56 | 5.515 | 14.19 |
132
+ | Falcon-E-1B-Instruct | 1.8B | **635MB** | 54.35 | 9.12 | 16.5 | 2.51 | 19.42 | 9.64 | 18.59 |
133
+
134
+ </details>

<details>
<summary class="bold"> For 3B scale models </summary>

| Model | # Params | Mem Footprint | IFEVAL | Math-Hard | GPQA | MuSR | BBH | MMLU-Pro | Avg. |
| -------- | ------- | ------- | ------- | ------ | ----- | ----- | ----- | ------ | ---- |
| Falcon-3-3B-Instruct | 3B | 6.46GB | 69.77 | 25.0 | 26.29 | 11.13 | 22.28 | 5.15 | 26.6 |
| Qwen2.5-3B-Instruct | 3B | 6.17GB | 64.75 | 36.78 | 25.8 | 7.57 | 25.05 | 3.02 | 27.16 |
| Falcon-E-3B-Instruct | 3B | **955MB** | 60.97 | 15.3 | 23.59 | 2.12 | 26.45 | 7.45 | 22.65 |

</details>

## Useful Links

- View [our release blogpost](https://falcon-lm.github.io/blog/falcon-edge/).
- Learn more about the [`onebitllms` library](https://github.com/tiiuae/onebitllms).
- Feel free to join [our Discord server](https://discord.gg/fwXpMyGc) if you have questions or want to interact with our researchers and developers.

# Citation

If the Falcon-E family of models was helpful for your work, feel free to cite us:

```bibtex
@misc{tiionebitllms,
    title = {Falcon-E, a series of powerful, universal and fine-tunable 1.58bit language models.},
    author = {Falcon-LLM Team},
    month = {April},
    url = {https://falcon-lm.github.io/blog/falcon-edge},
    year = {2025}
}
```