lbourdois committed
Commit a8a42f4 · verified · 1 Parent(s): b58fe7d

Improve language tag


Hi! As the model is multilingual, this PR adds languages other than English to the language tag to improve referencing. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13.
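For reference, a metadata patch like this one can also be scripted rather than edited by hand; `huggingface_hub.metadata_update` accepts a `create_pr=True` flag to open the change as a pull request. The sketch below sticks to the standard library and just renders the new `language:` list; the helper name `language_block` is illustrative, not part of any API.

```python
# Sketch of the frontmatter change this PR makes: replace `language: [en]`
# with the 13 languages explicitly listed in the README.
# (huggingface_hub.metadata_update(repo_id, metadata, create_pr=True)
# could submit the same change as a pull request.)

LANGUAGES = [
    "zho", "eng", "fra", "spa", "por", "deu", "ita",
    "rus", "jpn", "kor", "vie", "tha", "ara",
]

def language_block(codes):
    """Render the YAML `language:` list for the model-card frontmatter."""
    return "language:\n" + "\n".join(f"- {c}" for c in codes)

print(language_block(LANGUAGES))
```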

Files changed (1)
  1. README.md +285 -273
README.md CHANGED
@@ -1,274 +1,286 @@
  ---
  language:
- - en
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
  license: other
  library_name: transformers
  tags:
  - generated_from_trainer
  base_model:
  - Qwen/Qwen2.5-7B-Instruct
  datasets:
  - Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1
  license_name: qwen
  license_link: https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/blob/main/LICENSE
  model-index:
  - name: cybertron-v4-qw7B-UNAMGS
    results:
    - task:
        type: text-generation
        name: Text Generation
      dataset:
        name: IFEval (0-Shot)
        type: HuggingFaceH4/ifeval
        args:
          num_few_shot: 0
      metrics:
      - type: inst_level_strict_acc and prompt_level_strict_acc
        value: 60.84
        name: strict accuracy
      source:
        url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-UNAMGS
        name: Open LLM Leaderboard
    - task:
        type: text-generation
        name: Text Generation
      dataset:
        name: BBH (3-Shot)
        type: BBH
        args:
          num_few_shot: 3
      metrics:
      - type: acc_norm
        value: 37.71
        name: normalized accuracy
      source:
        url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-UNAMGS
        name: Open LLM Leaderboard
    - task:
        type: text-generation
        name: Text Generation
      dataset:
        name: MATH Lvl 5 (4-Shot)
        type: hendrycks/competition_math
        args:
          num_few_shot: 4
      metrics:
      - type: exact_match
        value: 29.91
        name: exact match
      source:
        url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-UNAMGS
        name: Open LLM Leaderboard
    - task:
        type: text-generation
        name: Text Generation
      dataset:
        name: GPQA (0-shot)
        type: Idavidrein/gpqa
        args:
          num_few_shot: 0
      metrics:
      - type: acc_norm
        value: 10.85
        name: acc_norm
      source:
        url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-UNAMGS
        name: Open LLM Leaderboard
    - task:
        type: text-generation
        name: Text Generation
      dataset:
        name: MuSR (0-shot)
        type: TAUR-Lab/MuSR
        args:
          num_few_shot: 0
      metrics:
      - type: acc_norm
        value: 12.69
        name: acc_norm
      source:
        url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-UNAMGS
        name: Open LLM Leaderboard
    - task:
        type: text-generation
        name: Text Generation
      dataset:
        name: MMLU-PRO (5-shot)
        type: TIGER-Lab/MMLU-Pro
        config: main
        split: test
        args:
          num_few_shot: 5
      metrics:
      - type: acc
        value: 38.89
        name: accuracy
      source:
        url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-UNAMGS
        name: Open LLM Leaderboard
  ---

  # cybertron-v4-qw7B-UNAMGS

  **UNA IS BACK** Cybertron v4 UNA-MGS, Based on the amazing Qwen2.5 7B

  **SCORING #1 7-8B LLM WITH NO CONTAMINATION 21.11.2024 with avg. 31.82**

  ![cybertron-v4-MGS](https://huggingface.co/fblgit/cybertron-v4-qw7B-MGS/resolve/main/cybertron_v4MGS.png)

  This special edition went thru UNA at MLP layers just like [miniclaus-1.5B](https://huggingface.co/fblgit/miniclaus-qw1.5B-UNAMGS)

  Here we use our novel approach called `MGS`. Its up to you to figure out what it means. On top of that we used `UNA: Uniform Neural Alignment`

  Cybertron V4 went thru SFT with `MGS & UNA` over `Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1` dataset.

  ## Contamination Benchmark
  https://gair-nlp.github.io/benbench/

  - MATH:
  ```
  5gram-Qwen2.5-7B-Instruct-orgn-MATH-test.jsonl: 37.52666666666667
  5gram-Qwen2.5-7B-Instruct-orgn-MATH-train.jsonl: 46.36666666666667
  ```
  vs
  ```
  5gram-UNA-cybertron-v4-qw7B-MGS-orgn-MATH-test.jsonl: 37.42666666666667
  5gram-UNA-cybertron-v4-qw7B-MGS-orgn-MATH-train.jsonl: 46.053333333333335
  ```
  vs
  ```
  5gram-Homer-v0.5-orgn-MATH-test.jsonl: 38.77333333333333
  5gram-Homer-v0.5-orgn-MATH-train.jsonl: 47.16666666666667
  ```

  ## Quantz
  Available at bartowski repo

  https://huggingface.co/bartowski/cybertron-v4-qw7B-UNAMGS-GGUF

  # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
  Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_fblgit__cybertron-v4-qw7B-UNAMGS)

  | Metric |Value|
  |-------------------|----:|
  |Avg. |31.82|
  |IFEval (0-Shot) |60.84|
  |BBH (3-Shot) |37.71|
  |MATH Lvl 5 (4-Shot)|29.91|
  |GPQA (0-shot) |10.85|
  |MuSR (0-shot) |12.69|
  |MMLU-PRO (5-shot) |38.89|

  ## MGS & UNA & Details

  * MGS, `1+1 = 2 and not 3`
  * UNA, `1+1 = 2 obviously`

  We also followed https://arxiv.org/pdf/2410.21228 insights.

  ## Training procedure

  1 Epoch as usual.

  [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
  ```
  datasets:
    - path: Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1
      split: train
      type: chat_template
      field_messages: conversations
      message_field_role: from
      message_field_content: value
      roles:
        user: ["human", "user"]
        assistant: ["gpt", "assistant", "ai"]
        system: ["system"]
  ```

  ### Training hyperparameters

  The following hyperparameters were used during training:
  - seed: 42
  - distributed_type: multi-GPU
  - num_devices: 8
  - total_train_batch_size: 64
  - total_eval_batch_size: 16
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - num_epochs: 1

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:------:|:----:|:---------------:|
  | 0.7824 | 0.0003 | 1 | 0.5555 |
  | 0.5489 | 0.0503 | 144 | 0.4848 |
  | 0.5348 | 0.1006 | 288 | 0.4732 |
  | 0.5256 | 0.1509 | 432 | 0.4670 |
  | 0.5172 | 0.2012 | 576 | 0.4621 |
  | 0.4882 | 0.2515 | 720 | 0.4578 |
  | 0.4848 | 0.3018 | 864 | 0.4550 |
  | 0.4678 | 0.3520 | 1008 | 0.4522 |
  | 0.4686 | 0.4023 | 1152 | 0.4502 |
  | 0.4775 | 0.4526 | 1296 | 0.4474 |
  | 0.4464 | 0.5029 | 1440 | 0.4454 |
  | 0.4772 | 0.5532 | 1584 | 0.4438 |
  | 0.4546 | 0.6035 | 1728 | 0.4425 |
  | 0.4661 | 0.6538 | 1872 | 0.4411 |
  | 0.4569 | 0.7041 | 2016 | 0.4399 |
  | 0.4529 | 0.7544 | 2160 | 0.4390 |
  | 0.4409 | 0.8047 | 2304 | 0.4380 |
  | 0.4405 | 0.8550 | 2448 | 0.4370 |
  | 0.4642 | 0.9053 | 2592 | 0.4363 |
  | 0.4566 | 0.9556 | 2736 | 0.4359 |

  ### Framework versions

  - PEFT 0.13.2
  - Transformers 4.45.2 (UNA & MGS patch)
  - Pytorch 2.3.0+cu121
  - Datasets 3.0.1
  - Tokenizers 0.20.1

  ## Citations
  ```
  @misc{thebeagle-v2,
    title={TheBeagle v2: MGS},
    author={Xavier Murias},
    year={2024},
    publisher = {HuggingFace},
    journal = {HuggingFace repository},
    howpublished = {\url{https://huggingface.co/fblgit/TheBeagle-v2beta-32B-MGS}},
  }

  @misc{Magpie,
    title={Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing},
    author={Zhangchen Xu and Fengqing Jiang and Luyao Niu and Yuntian Deng and Radha Poovendran and Yejin Choi and Bill Yuchen Lin},
    year={2024},
    eprint={2406.08464},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
  }

  @misc{qwen2.5,
    title = {Qwen2.5: A Party of Foundation Models},
    url = {https://qwenlm.github.io/blog/qwen2.5/},
    author = {Qwen Team},
    month = {September},
    year = {2024}
  }

  @article{qwen2,
    title={Qwen2 Technical Report},
    author={An Yang and Baosong Yang and Binyuan Hui and Bo Zheng and Bowen Yu and Chang Zhou and Chengpeng Li and Chengyuan Li and Dayiheng Liu and Fei Huang and Guanting Dong and Haoran Wei and Huan Lin and Jialong Tang and Jialin Wang and Jian Yang and Jianhong Tu and Jianwei Zhang and Jianxin Ma and Jin Xu and Jingren Zhou and Jinze Bai and Jinzheng He and Junyang Lin and Kai Dang and Keming Lu and Keqin Chen and Kexin Yang and Mei Li and Mingfeng Xue and Na Ni and Pei Zhang and Peng Wang and Ru Peng and Rui Men and Ruize Gao and Runji Lin and Shijie Wang and Shuai Bai and Sinan Tan and Tianhang Zhu and Tianhao Li and Tianyu Liu and Wenbin Ge and Xiaodong Deng and Xiaohuan Zhou and Xingzhang Ren and Xinyu Zhang and Xipin Wei and Xuancheng Ren and Yang Fan and Yang Yao and Yichang Zhang and Yu Wan and Yunfei Chu and Yuqiong Liu and Zeyu Cui and Zhenru Zhang and Zhihao Fan},
    journal={arXiv preprint arXiv:2407.10671},
    year={2024}
  }

  @article{xu2024benchmarking,
    title={Benchmarking Benchmark Leakage in Large Language Models},
    author={Xu, Ruijie and Wang, Zengzhi and Fan, Run-Ze and Liu, Pengfei},
    year={2024},
    journal={arXiv preprint arXiv:2404.18824},
    url={https://arxiv.org/abs/2404.18824}
  }
  ```