BlackSamorez commited on
Commit
34ba1c9
·
verified ·
1 Parent(s): 763af25

Upload Qwen2ForCausalLM

Browse files
README.md ADDED
@@ -0,0 +1,199 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ library_name: transformers
3
+ tags: []
4
+ ---
5
+
6
+ # Model Card for Model ID
7
+
8
+ <!-- Provide a quick summary of what the model is/does. -->
9
+
10
+
11
+
12
+ ## Model Details
13
+
14
+ ### Model Description
15
+
16
+ <!-- Provide a longer summary of what this model is. -->
17
+
18
+ This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
19
+
20
+ - **Developed by:** [More Information Needed]
21
+ - **Funded by [optional]:** [More Information Needed]
22
+ - **Shared by [optional]:** [More Information Needed]
23
+ - **Model type:** [More Information Needed]
24
+ - **Language(s) (NLP):** [More Information Needed]
25
+ - **License:** [More Information Needed]
26
+ - **Finetuned from model [optional]:** [More Information Needed]
27
+
28
+ ### Model Sources [optional]
29
+
30
+ <!-- Provide the basic links for the model. -->
31
+
32
+ - **Repository:** [More Information Needed]
33
+ - **Paper [optional]:** [More Information Needed]
34
+ - **Demo [optional]:** [More Information Needed]
35
+
36
+ ## Uses
37
+
38
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
39
+
40
+ ### Direct Use
41
+
42
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
43
+
44
+ [More Information Needed]
45
+
46
+ ### Downstream Use [optional]
47
+
48
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
49
+
50
+ [More Information Needed]
51
+
52
+ ### Out-of-Scope Use
53
+
54
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
55
+
56
+ [More Information Needed]
57
+
58
+ ## Bias, Risks, and Limitations
59
+
60
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
61
+
62
+ [More Information Needed]
63
+
64
+ ### Recommendations
65
+
66
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
67
+
68
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
69
+
70
+ ## How to Get Started with the Model
71
+
72
+ Use the code below to get started with the model.
73
+
74
+ [More Information Needed]
75
+
76
+ ## Training Details
77
+
78
+ ### Training Data
79
+
80
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
81
+
82
+ [More Information Needed]
83
+
84
+ ### Training Procedure
85
+
86
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
87
+
88
+ #### Preprocessing [optional]
89
+
90
+ [More Information Needed]
91
+
92
+
93
+ #### Training Hyperparameters
94
+
95
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
96
+
97
+ #### Speeds, Sizes, Times [optional]
98
+
99
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
100
+
101
+ [More Information Needed]
102
+
103
+ ## Evaluation
104
+
105
+ <!-- This section describes the evaluation protocols and provides the results. -->
106
+
107
+ ### Testing Data, Factors & Metrics
108
+
109
+ #### Testing Data
110
+
111
+ <!-- This should link to a Dataset Card if possible. -->
112
+
113
+ [More Information Needed]
114
+
115
+ #### Factors
116
+
117
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
118
+
119
+ [More Information Needed]
120
+
121
+ #### Metrics
122
+
123
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
124
+
125
+ [More Information Needed]
126
+
127
+ ### Results
128
+
129
+ [More Information Needed]
130
+
131
+ #### Summary
132
+
133
+
134
+
135
+ ## Model Examination [optional]
136
+
137
+ <!-- Relevant interpretability work for the model goes here -->
138
+
139
+ [More Information Needed]
140
+
141
+ ## Environmental Impact
142
+
143
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
144
+
145
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
146
+
147
+ - **Hardware Type:** [More Information Needed]
148
+ - **Hours used:** [More Information Needed]
149
+ - **Cloud Provider:** [More Information Needed]
150
+ - **Compute Region:** [More Information Needed]
151
+ - **Carbon Emitted:** [More Information Needed]
152
+
153
+ ## Technical Specifications [optional]
154
+
155
+ ### Model Architecture and Objective
156
+
157
+ [More Information Needed]
158
+
159
+ ### Compute Infrastructure
160
+
161
+ [More Information Needed]
162
+
163
+ #### Hardware
164
+
165
+ [More Information Needed]
166
+
167
+ #### Software
168
+
169
+ [More Information Needed]
170
+
171
+ ## Citation [optional]
172
+
173
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
174
+
175
+ **BibTeX:**
176
+
177
+ [More Information Needed]
178
+
179
+ **APA:**
180
+
181
+ [More Information Needed]
182
+
183
+ ## Glossary [optional]
184
+
185
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
186
+
187
+ [More Information Needed]
188
+
189
+ ## More Information [optional]
190
+
191
+ [More Information Needed]
192
+
193
+ ## Model Card Authors [optional]
194
+
195
+ [More Information Needed]
196
+
197
+ ## Model Card Contact
198
+
199
+ [More Information Needed]
config.json ADDED
@@ -0,0 +1,3737 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "_name_or_path": "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B",
3
+ "architectures": [
4
+ "Qwen2ForCausalLM"
5
+ ],
6
+ "attention_dropout": 0.0,
7
+ "bos_token_id": 151643,
8
+ "eos_token_id": 151643,
9
+ "hidden_act": "silu",
10
+ "hidden_size": 5120,
11
+ "initializer_range": 0.02,
12
+ "intermediate_size": 13824,
13
+ "max_position_embeddings": 131072,
14
+ "max_window_layers": 48,
15
+ "model_type": "qwen2",
16
+ "num_attention_heads": 40,
17
+ "num_hidden_layers": 48,
18
+ "num_key_value_heads": 8,
19
+ "quantization_config": {
20
+ "bits": 4,
21
+ "group_size": 256,
22
+ "hadamard_size": 512,
23
+ "modules_to_not_convert": [
24
+ "lm_head"
25
+ ],
26
+ "p": 2,
27
+ "quant_method": "higgs",
28
+ "tune_metadata": {
29
+ "model.layers.0.mlp.down_proj": {
30
+ "K": 13824,
31
+ "M": 1,
32
+ "N": 5120,
33
+ "device": "cuda:1",
34
+ "dtype": "torch.float16",
35
+ "group_size": 256,
36
+ "num_bits": 4,
37
+ "num_sms": 128,
38
+ "template_id": 37
39
+ },
40
+ "model.layers.0.mlp.gate_proj": {
41
+ "K": 5120,
42
+ "M": 1,
43
+ "N": 13824,
44
+ "device": "cuda:1",
45
+ "dtype": "torch.float16",
46
+ "group_size": 256,
47
+ "num_bits": 4,
48
+ "num_sms": 128,
49
+ "template_id": 54
50
+ },
51
+ "model.layers.0.mlp.up_proj": {
52
+ "K": 5120,
53
+ "M": 1,
54
+ "N": 13824,
55
+ "device": "cuda:1",
56
+ "dtype": "torch.float16",
57
+ "group_size": 256,
58
+ "num_bits": 4,
59
+ "num_sms": 128,
60
+ "template_id": 54
61
+ },
62
+ "model.layers.0.self_attn.k_proj": {
63
+ "K": 5120,
64
+ "M": 1,
65
+ "N": 1024,
66
+ "device": "cuda:1",
67
+ "dtype": "torch.float16",
68
+ "group_size": 256,
69
+ "num_bits": 4,
70
+ "num_sms": 128,
71
+ "template_id": 59
72
+ },
73
+ "model.layers.0.self_attn.o_proj": {
74
+ "K": 5120,
75
+ "M": 1,
76
+ "N": 5120,
77
+ "device": "cuda:1",
78
+ "dtype": "torch.float16",
79
+ "group_size": 256,
80
+ "num_bits": 4,
81
+ "num_sms": 128,
82
+ "template_id": 46
83
+ },
84
+ "model.layers.0.self_attn.q_proj": {
85
+ "K": 5120,
86
+ "M": 1,
87
+ "N": 5120,
88
+ "device": "cuda:1",
89
+ "dtype": "torch.float16",
90
+ "group_size": 256,
91
+ "num_bits": 4,
92
+ "num_sms": 128,
93
+ "template_id": 46
94
+ },
95
+ "model.layers.0.self_attn.v_proj": {
96
+ "K": 5120,
97
+ "M": 1,
98
+ "N": 1024,
99
+ "device": "cuda:1",
100
+ "dtype": "torch.float16",
101
+ "group_size": 256,
102
+ "num_bits": 4,
103
+ "num_sms": 128,
104
+ "template_id": 59
105
+ },
106
+ "model.layers.1.mlp.down_proj": {
107
+ "K": 13824,
108
+ "M": 1,
109
+ "N": 5120,
110
+ "device": "cuda:1",
111
+ "dtype": "torch.float16",
112
+ "group_size": 256,
113
+ "num_bits": 4,
114
+ "num_sms": 128,
115
+ "template_id": 37
116
+ },
117
+ "model.layers.1.mlp.gate_proj": {
118
+ "K": 5120,
119
+ "M": 1,
120
+ "N": 13824,
121
+ "device": "cuda:1",
122
+ "dtype": "torch.float16",
123
+ "group_size": 256,
124
+ "num_bits": 4,
125
+ "num_sms": 128,
126
+ "template_id": 54
127
+ },
128
+ "model.layers.1.mlp.up_proj": {
129
+ "K": 5120,
130
+ "M": 1,
131
+ "N": 13824,
132
+ "device": "cuda:1",
133
+ "dtype": "torch.float16",
134
+ "group_size": 256,
135
+ "num_bits": 4,
136
+ "num_sms": 128,
137
+ "template_id": 54
138
+ },
139
+ "model.layers.1.self_attn.k_proj": {
140
+ "K": 5120,
141
+ "M": 1,
142
+ "N": 1024,
143
+ "device": "cuda:1",
144
+ "dtype": "torch.float16",
145
+ "group_size": 256,
146
+ "num_bits": 4,
147
+ "num_sms": 128,
148
+ "template_id": 59
149
+ },
150
+ "model.layers.1.self_attn.o_proj": {
151
+ "K": 5120,
152
+ "M": 1,
153
+ "N": 5120,
154
+ "device": "cuda:1",
155
+ "dtype": "torch.float16",
156
+ "group_size": 256,
157
+ "num_bits": 4,
158
+ "num_sms": 128,
159
+ "template_id": 46
160
+ },
161
+ "model.layers.1.self_attn.q_proj": {
162
+ "K": 5120,
163
+ "M": 1,
164
+ "N": 5120,
165
+ "device": "cuda:1",
166
+ "dtype": "torch.float16",
167
+ "group_size": 256,
168
+ "num_bits": 4,
169
+ "num_sms": 128,
170
+ "template_id": 46
171
+ },
172
+ "model.layers.1.self_attn.v_proj": {
173
+ "K": 5120,
174
+ "M": 1,
175
+ "N": 1024,
176
+ "device": "cuda:1",
177
+ "dtype": "torch.float16",
178
+ "group_size": 256,
179
+ "num_bits": 4,
180
+ "num_sms": 128,
181
+ "template_id": 59
182
+ },
183
+ "model.layers.10.mlp.down_proj": {
184
+ "K": 13824,
185
+ "M": 1,
186
+ "N": 5120,
187
+ "device": "cuda:2",
188
+ "dtype": "torch.float16",
189
+ "group_size": 256,
190
+ "num_bits": 4,
191
+ "num_sms": 128,
192
+ "template_id": 37
193
+ },
194
+ "model.layers.10.mlp.gate_proj": {
195
+ "K": 5120,
196
+ "M": 1,
197
+ "N": 13824,
198
+ "device": "cuda:2",
199
+ "dtype": "torch.float16",
200
+ "group_size": 256,
201
+ "num_bits": 4,
202
+ "num_sms": 128,
203
+ "template_id": 54
204
+ },
205
+ "model.layers.10.mlp.up_proj": {
206
+ "K": 5120,
207
+ "M": 1,
208
+ "N": 13824,
209
+ "device": "cuda:2",
210
+ "dtype": "torch.float16",
211
+ "group_size": 256,
212
+ "num_bits": 4,
213
+ "num_sms": 128,
214
+ "template_id": 54
215
+ },
216
+ "model.layers.10.self_attn.k_proj": {
217
+ "K": 5120,
218
+ "M": 1,
219
+ "N": 1024,
220
+ "device": "cuda:2",
221
+ "dtype": "torch.float16",
222
+ "group_size": 256,
223
+ "num_bits": 4,
224
+ "num_sms": 128,
225
+ "template_id": 59
226
+ },
227
+ "model.layers.10.self_attn.o_proj": {
228
+ "K": 5120,
229
+ "M": 1,
230
+ "N": 5120,
231
+ "device": "cuda:2",
232
+ "dtype": "torch.float16",
233
+ "group_size": 256,
234
+ "num_bits": 4,
235
+ "num_sms": 128,
236
+ "template_id": 46
237
+ },
238
+ "model.layers.10.self_attn.q_proj": {
239
+ "K": 5120,
240
+ "M": 1,
241
+ "N": 5120,
242
+ "device": "cuda:2",
243
+ "dtype": "torch.float16",
244
+ "group_size": 256,
245
+ "num_bits": 4,
246
+ "num_sms": 128,
247
+ "template_id": 46
248
+ },
249
+ "model.layers.10.self_attn.v_proj": {
250
+ "K": 5120,
251
+ "M": 1,
252
+ "N": 1024,
253
+ "device": "cuda:2",
254
+ "dtype": "torch.float16",
255
+ "group_size": 256,
256
+ "num_bits": 4,
257
+ "num_sms": 128,
258
+ "template_id": 59
259
+ },
260
+ "model.layers.11.mlp.down_proj": {
261
+ "K": 13824,
262
+ "M": 1,
263
+ "N": 5120,
264
+ "device": "cuda:2",
265
+ "dtype": "torch.float16",
266
+ "group_size": 256,
267
+ "num_bits": 4,
268
+ "num_sms": 128,
269
+ "template_id": 37
270
+ },
271
+ "model.layers.11.mlp.gate_proj": {
272
+ "K": 5120,
273
+ "M": 1,
274
+ "N": 13824,
275
+ "device": "cuda:2",
276
+ "dtype": "torch.float16",
277
+ "group_size": 256,
278
+ "num_bits": 4,
279
+ "num_sms": 128,
280
+ "template_id": 54
281
+ },
282
+ "model.layers.11.mlp.up_proj": {
283
+ "K": 5120,
284
+ "M": 1,
285
+ "N": 13824,
286
+ "device": "cuda:2",
287
+ "dtype": "torch.float16",
288
+ "group_size": 256,
289
+ "num_bits": 4,
290
+ "num_sms": 128,
291
+ "template_id": 54
292
+ },
293
+ "model.layers.11.self_attn.k_proj": {
294
+ "K": 5120,
295
+ "M": 1,
296
+ "N": 1024,
297
+ "device": "cuda:2",
298
+ "dtype": "torch.float16",
299
+ "group_size": 256,
300
+ "num_bits": 4,
301
+ "num_sms": 128,
302
+ "template_id": 59
303
+ },
304
+ "model.layers.11.self_attn.o_proj": {
305
+ "K": 5120,
306
+ "M": 1,
307
+ "N": 5120,
308
+ "device": "cuda:2",
309
+ "dtype": "torch.float16",
310
+ "group_size": 256,
311
+ "num_bits": 4,
312
+ "num_sms": 128,
313
+ "template_id": 46
314
+ },
315
+ "model.layers.11.self_attn.q_proj": {
316
+ "K": 5120,
317
+ "M": 1,
318
+ "N": 5120,
319
+ "device": "cuda:2",
320
+ "dtype": "torch.float16",
321
+ "group_size": 256,
322
+ "num_bits": 4,
323
+ "num_sms": 128,
324
+ "template_id": 46
325
+ },
326
+ "model.layers.11.self_attn.v_proj": {
327
+ "K": 5120,
328
+ "M": 1,
329
+ "N": 1024,
330
+ "device": "cuda:2",
331
+ "dtype": "torch.float16",
332
+ "group_size": 256,
333
+ "num_bits": 4,
334
+ "num_sms": 128,
335
+ "template_id": 59
336
+ },
337
+ "model.layers.12.mlp.down_proj": {
338
+ "K": 13824,
339
+ "M": 1,
340
+ "N": 5120,
341
+ "device": "cuda:2",
342
+ "dtype": "torch.float16",
343
+ "group_size": 256,
344
+ "num_bits": 4,
345
+ "num_sms": 128,
346
+ "template_id": 37
347
+ },
348
+ "model.layers.12.mlp.gate_proj": {
349
+ "K": 5120,
350
+ "M": 1,
351
+ "N": 13824,
352
+ "device": "cuda:2",
353
+ "dtype": "torch.float16",
354
+ "group_size": 256,
355
+ "num_bits": 4,
356
+ "num_sms": 128,
357
+ "template_id": 54
358
+ },
359
+ "model.layers.12.mlp.up_proj": {
360
+ "K": 5120,
361
+ "M": 1,
362
+ "N": 13824,
363
+ "device": "cuda:2",
364
+ "dtype": "torch.float16",
365
+ "group_size": 256,
366
+ "num_bits": 4,
367
+ "num_sms": 128,
368
+ "template_id": 54
369
+ },
370
+ "model.layers.12.self_attn.k_proj": {
371
+ "K": 5120,
372
+ "M": 1,
373
+ "N": 1024,
374
+ "device": "cuda:2",
375
+ "dtype": "torch.float16",
376
+ "group_size": 256,
377
+ "num_bits": 4,
378
+ "num_sms": 128,
379
+ "template_id": 59
380
+ },
381
+ "model.layers.12.self_attn.o_proj": {
382
+ "K": 5120,
383
+ "M": 1,
384
+ "N": 5120,
385
+ "device": "cuda:2",
386
+ "dtype": "torch.float16",
387
+ "group_size": 256,
388
+ "num_bits": 4,
389
+ "num_sms": 128,
390
+ "template_id": 46
391
+ },
392
+ "model.layers.12.self_attn.q_proj": {
393
+ "K": 5120,
394
+ "M": 1,
395
+ "N": 5120,
396
+ "device": "cuda:2",
397
+ "dtype": "torch.float16",
398
+ "group_size": 256,
399
+ "num_bits": 4,
400
+ "num_sms": 128,
401
+ "template_id": 46
402
+ },
403
+ "model.layers.12.self_attn.v_proj": {
404
+ "K": 5120,
405
+ "M": 1,
406
+ "N": 1024,
407
+ "device": "cuda:2",
408
+ "dtype": "torch.float16",
409
+ "group_size": 256,
410
+ "num_bits": 4,
411
+ "num_sms": 128,
412
+ "template_id": 59
413
+ },
414
+ "model.layers.13.mlp.down_proj": {
415
+ "K": 13824,
416
+ "M": 1,
417
+ "N": 5120,
418
+ "device": "cuda:2",
419
+ "dtype": "torch.float16",
420
+ "group_size": 256,
421
+ "num_bits": 4,
422
+ "num_sms": 128,
423
+ "template_id": 37
424
+ },
425
+ "model.layers.13.mlp.gate_proj": {
426
+ "K": 5120,
427
+ "M": 1,
428
+ "N": 13824,
429
+ "device": "cuda:2",
430
+ "dtype": "torch.float16",
431
+ "group_size": 256,
432
+ "num_bits": 4,
433
+ "num_sms": 128,
434
+ "template_id": 54
435
+ },
436
+ "model.layers.13.mlp.up_proj": {
437
+ "K": 5120,
438
+ "M": 1,
439
+ "N": 13824,
440
+ "device": "cuda:2",
441
+ "dtype": "torch.float16",
442
+ "group_size": 256,
443
+ "num_bits": 4,
444
+ "num_sms": 128,
445
+ "template_id": 54
446
+ },
447
+ "model.layers.13.self_attn.k_proj": {
448
+ "K": 5120,
449
+ "M": 1,
450
+ "N": 1024,
451
+ "device": "cuda:2",
452
+ "dtype": "torch.float16",
453
+ "group_size": 256,
454
+ "num_bits": 4,
455
+ "num_sms": 128,
456
+ "template_id": 59
457
+ },
458
+ "model.layers.13.self_attn.o_proj": {
459
+ "K": 5120,
460
+ "M": 1,
461
+ "N": 5120,
462
+ "device": "cuda:2",
463
+ "dtype": "torch.float16",
464
+ "group_size": 256,
465
+ "num_bits": 4,
466
+ "num_sms": 128,
467
+ "template_id": 46
468
+ },
469
+ "model.layers.13.self_attn.q_proj": {
470
+ "K": 5120,
471
+ "M": 1,
472
+ "N": 5120,
473
+ "device": "cuda:2",
474
+ "dtype": "torch.float16",
475
+ "group_size": 256,
476
+ "num_bits": 4,
477
+ "num_sms": 128,
478
+ "template_id": 46
479
+ },
480
+ "model.layers.13.self_attn.v_proj": {
481
+ "K": 5120,
482
+ "M": 1,
483
+ "N": 1024,
484
+ "device": "cuda:2",
485
+ "dtype": "torch.float16",
486
+ "group_size": 256,
487
+ "num_bits": 4,
488
+ "num_sms": 128,
489
+ "template_id": 59
490
+ },
491
+ "model.layers.14.mlp.down_proj": {
492
+ "K": 13824,
493
+ "M": 1,
494
+ "N": 5120,
495
+ "device": "cuda:2",
496
+ "dtype": "torch.float16",
497
+ "group_size": 256,
498
+ "num_bits": 4,
499
+ "num_sms": 128,
500
+ "template_id": 37
501
+ },
502
+ "model.layers.14.mlp.gate_proj": {
503
+ "K": 5120,
504
+ "M": 1,
505
+ "N": 13824,
506
+ "device": "cuda:2",
507
+ "dtype": "torch.float16",
508
+ "group_size": 256,
509
+ "num_bits": 4,
510
+ "num_sms": 128,
511
+ "template_id": 54
512
+ },
513
+ "model.layers.14.mlp.up_proj": {
514
+ "K": 5120,
515
+ "M": 1,
516
+ "N": 13824,
517
+ "device": "cuda:2",
518
+ "dtype": "torch.float16",
519
+ "group_size": 256,
520
+ "num_bits": 4,
521
+ "num_sms": 128,
522
+ "template_id": 54
523
+ },
524
+ "model.layers.14.self_attn.k_proj": {
525
+ "K": 5120,
526
+ "M": 1,
527
+ "N": 1024,
528
+ "device": "cuda:2",
529
+ "dtype": "torch.float16",
530
+ "group_size": 256,
531
+ "num_bits": 4,
532
+ "num_sms": 128,
533
+ "template_id": 59
534
+ },
535
+ "model.layers.14.self_attn.o_proj": {
536
+ "K": 5120,
537
+ "M": 1,
538
+ "N": 5120,
539
+ "device": "cuda:2",
540
+ "dtype": "torch.float16",
541
+ "group_size": 256,
542
+ "num_bits": 4,
543
+ "num_sms": 128,
544
+ "template_id": 46
545
+ },
546
+ "model.layers.14.self_attn.q_proj": {
547
+ "K": 5120,
548
+ "M": 1,
549
+ "N": 5120,
550
+ "device": "cuda:2",
551
+ "dtype": "torch.float16",
552
+ "group_size": 256,
553
+ "num_bits": 4,
554
+ "num_sms": 128,
555
+ "template_id": 46
556
+ },
557
+ "model.layers.14.self_attn.v_proj": {
558
+ "K": 5120,
559
+ "M": 1,
560
+ "N": 1024,
561
+ "device": "cuda:2",
562
+ "dtype": "torch.float16",
563
+ "group_size": 256,
564
+ "num_bits": 4,
565
+ "num_sms": 128,
566
+ "template_id": 59
567
+ },
568
+ "model.layers.15.mlp.down_proj": {
569
+ "K": 13824,
570
+ "M": 1,
571
+ "N": 5120,
572
+ "device": "cuda:2",
573
+ "dtype": "torch.float16",
574
+ "group_size": 256,
575
+ "num_bits": 4,
576
+ "num_sms": 128,
577
+ "template_id": 37
578
+ },
579
+ "model.layers.15.mlp.gate_proj": {
580
+ "K": 5120,
581
+ "M": 1,
582
+ "N": 13824,
583
+ "device": "cuda:2",
584
+ "dtype": "torch.float16",
585
+ "group_size": 256,
586
+ "num_bits": 4,
587
+ "num_sms": 128,
588
+ "template_id": 54
589
+ },
590
+ "model.layers.15.mlp.up_proj": {
591
+ "K": 5120,
592
+ "M": 1,
593
+ "N": 13824,
594
+ "device": "cuda:2",
595
+ "dtype": "torch.float16",
596
+ "group_size": 256,
597
+ "num_bits": 4,
598
+ "num_sms": 128,
599
+ "template_id": 54
600
+ },
601
+ "model.layers.15.self_attn.k_proj": {
602
+ "K": 5120,
603
+ "M": 1,
604
+ "N": 1024,
605
+ "device": "cuda:2",
606
+ "dtype": "torch.float16",
607
+ "group_size": 256,
608
+ "num_bits": 4,
609
+ "num_sms": 128,
610
+ "template_id": 59
611
+ },
612
+ "model.layers.15.self_attn.o_proj": {
613
+ "K": 5120,
614
+ "M": 1,
615
+ "N": 5120,
616
+ "device": "cuda:2",
617
+ "dtype": "torch.float16",
618
+ "group_size": 256,
619
+ "num_bits": 4,
620
+ "num_sms": 128,
621
+ "template_id": 46
622
+ },
623
+ "model.layers.15.self_attn.q_proj": {
624
+ "K": 5120,
625
+ "M": 1,
626
+ "N": 5120,
627
+ "device": "cuda:2",
628
+ "dtype": "torch.float16",
629
+ "group_size": 256,
630
+ "num_bits": 4,
631
+ "num_sms": 128,
632
+ "template_id": 46
633
+ },
634
+ "model.layers.15.self_attn.v_proj": {
635
+ "K": 5120,
636
+ "M": 1,
637
+ "N": 1024,
638
+ "device": "cuda:2",
639
+ "dtype": "torch.float16",
640
+ "group_size": 256,
641
+ "num_bits": 4,
642
+ "num_sms": 128,
643
+ "template_id": 59
644
+ },
645
+ "model.layers.16.mlp.down_proj": {
646
+ "K": 13824,
647
+ "M": 1,
648
+ "N": 5120,
649
+ "device": "cuda:2",
650
+ "dtype": "torch.float16",
651
+ "group_size": 256,
652
+ "num_bits": 4,
653
+ "num_sms": 128,
654
+ "template_id": 37
655
+ },
656
+ "model.layers.16.mlp.gate_proj": {
657
+ "K": 5120,
658
+ "M": 1,
659
+ "N": 13824,
660
+ "device": "cuda:2",
661
+ "dtype": "torch.float16",
662
+ "group_size": 256,
663
+ "num_bits": 4,
664
+ "num_sms": 128,
665
+ "template_id": 54
666
+ },
667
+ "model.layers.16.mlp.up_proj": {
668
+ "K": 5120,
669
+ "M": 1,
670
+ "N": 13824,
671
+ "device": "cuda:2",
672
+ "dtype": "torch.float16",
673
+ "group_size": 256,
674
+ "num_bits": 4,
675
+ "num_sms": 128,
676
+ "template_id": 54
677
+ },
678
+ "model.layers.16.self_attn.k_proj": {
679
+ "K": 5120,
680
+ "M": 1,
681
+ "N": 1024,
682
+ "device": "cuda:2",
683
+ "dtype": "torch.float16",
684
+ "group_size": 256,
685
+ "num_bits": 4,
686
+ "num_sms": 128,
687
+ "template_id": 59
688
+ },
689
+ "model.layers.16.self_attn.o_proj": {
690
+ "K": 5120,
691
+ "M": 1,
692
+ "N": 5120,
693
+ "device": "cuda:2",
694
+ "dtype": "torch.float16",
695
+ "group_size": 256,
696
+ "num_bits": 4,
697
+ "num_sms": 128,
698
+ "template_id": 46
699
+ },
700
+ "model.layers.16.self_attn.q_proj": {
701
+ "K": 5120,
702
+ "M": 1,
703
+ "N": 5120,
704
+ "device": "cuda:2",
705
+ "dtype": "torch.float16",
706
+ "group_size": 256,
707
+ "num_bits": 4,
708
+ "num_sms": 128,
709
+ "template_id": 46
710
+ },
711
+ "model.layers.16.self_attn.v_proj": {
712
+ "K": 5120,
713
+ "M": 1,
714
+ "N": 1024,
715
+ "device": "cuda:2",
716
+ "dtype": "torch.float16",
717
+ "group_size": 256,
718
+ "num_bits": 4,
719
+ "num_sms": 128,
720
+ "template_id": 59
721
+ },
722
+ "model.layers.17.mlp.down_proj": {
723
+ "K": 13824,
724
+ "M": 1,
725
+ "N": 5120,
726
+ "device": "cuda:2",
727
+ "dtype": "torch.float16",
728
+ "group_size": 256,
729
+ "num_bits": 4,
730
+ "num_sms": 128,
731
+ "template_id": 37
732
+ },
733
+ "model.layers.17.mlp.gate_proj": {
734
+ "K": 5120,
735
+ "M": 1,
736
+ "N": 13824,
737
+ "device": "cuda:2",
738
+ "dtype": "torch.float16",
739
+ "group_size": 256,
740
+ "num_bits": 4,
741
+ "num_sms": 128,
742
+ "template_id": 54
743
+ },
744
+ "model.layers.17.mlp.up_proj": {
745
+ "K": 5120,
746
+ "M": 1,
747
+ "N": 13824,
748
+ "device": "cuda:2",
749
+ "dtype": "torch.float16",
750
+ "group_size": 256,
751
+ "num_bits": 4,
752
+ "num_sms": 128,
753
+ "template_id": 54
754
+ },
755
+ "model.layers.17.self_attn.k_proj": {
756
+ "K": 5120,
757
+ "M": 1,
758
+ "N": 1024,
759
+ "device": "cuda:2",
760
+ "dtype": "torch.float16",
761
+ "group_size": 256,
762
+ "num_bits": 4,
763
+ "num_sms": 128,
764
+ "template_id": 59
765
+ },
766
+ "model.layers.17.self_attn.o_proj": {
767
+ "K": 5120,
768
+ "M": 1,
769
+ "N": 5120,
770
+ "device": "cuda:2",
771
+ "dtype": "torch.float16",
772
+ "group_size": 256,
773
+ "num_bits": 4,
774
+ "num_sms": 128,
775
+ "template_id": 46
776
+ },
777
+ "model.layers.17.self_attn.q_proj": {
778
+ "K": 5120,
779
+ "M": 1,
780
+ "N": 5120,
781
+ "device": "cuda:2",
782
+ "dtype": "torch.float16",
783
+ "group_size": 256,
784
+ "num_bits": 4,
785
+ "num_sms": 128,
786
+ "template_id": 46
787
+ },
788
+ "model.layers.17.self_attn.v_proj": {
789
+ "K": 5120,
790
+ "M": 1,
791
+ "N": 1024,
792
+ "device": "cuda:2",
793
+ "dtype": "torch.float16",
794
+ "group_size": 256,
795
+ "num_bits": 4,
796
+ "num_sms": 128,
797
+ "template_id": 59
798
+ },
799
+ "model.layers.18.mlp.down_proj": {
800
+ "K": 13824,
801
+ "M": 1,
802
+ "N": 5120,
803
+ "device": "cuda:2",
804
+ "dtype": "torch.float16",
805
+ "group_size": 256,
806
+ "num_bits": 4,
807
+ "num_sms": 128,
808
+ "template_id": 37
809
+ },
810
+ "model.layers.18.mlp.gate_proj": {
811
+ "K": 5120,
812
+ "M": 1,
813
+ "N": 13824,
814
+ "device": "cuda:2",
815
+ "dtype": "torch.float16",
816
+ "group_size": 256,
817
+ "num_bits": 4,
818
+ "num_sms": 128,
819
+ "template_id": 54
820
+ },
821
+ "model.layers.18.mlp.up_proj": {
822
+ "K": 5120,
823
+ "M": 1,
824
+ "N": 13824,
825
+ "device": "cuda:2",
826
+ "dtype": "torch.float16",
827
+ "group_size": 256,
828
+ "num_bits": 4,
829
+ "num_sms": 128,
830
+ "template_id": 54
831
+ },
832
+ "model.layers.18.self_attn.k_proj": {
833
+ "K": 5120,
834
+ "M": 1,
835
+ "N": 1024,
836
+ "device": "cuda:2",
837
+ "dtype": "torch.float16",
838
+ "group_size": 256,
839
+ "num_bits": 4,
840
+ "num_sms": 128,
841
+ "template_id": 59
842
+ },
843
+ "model.layers.18.self_attn.o_proj": {
844
+ "K": 5120,
845
+ "M": 1,
846
+ "N": 5120,
847
+ "device": "cuda:2",
848
+ "dtype": "torch.float16",
849
+ "group_size": 256,
850
+ "num_bits": 4,
851
+ "num_sms": 128,
852
+ "template_id": 46
853
+ },
854
+ "model.layers.18.self_attn.q_proj": {
855
+ "K": 5120,
856
+ "M": 1,
857
+ "N": 5120,
858
+ "device": "cuda:2",
859
+ "dtype": "torch.float16",
860
+ "group_size": 256,
861
+ "num_bits": 4,
862
+ "num_sms": 128,
863
+ "template_id": 46
864
+ },
865
+ "model.layers.18.self_attn.v_proj": {
866
+ "K": 5120,
867
+ "M": 1,
868
+ "N": 1024,
869
+ "device": "cuda:2",
870
+ "dtype": "torch.float16",
871
+ "group_size": 256,
872
+ "num_bits": 4,
873
+ "num_sms": 128,
874
+ "template_id": 59
875
+ },
876
+ "model.layers.19.mlp.down_proj": {
877
+ "K": 13824,
878
+ "M": 1,
879
+ "N": 5120,
880
+ "device": "cuda:2",
881
+ "dtype": "torch.float16",
882
+ "group_size": 256,
883
+ "num_bits": 4,
884
+ "num_sms": 128,
885
+ "template_id": 37
886
+ },
887
+ "model.layers.19.mlp.gate_proj": {
888
+ "K": 5120,
889
+ "M": 1,
890
+ "N": 13824,
891
+ "device": "cuda:2",
892
+ "dtype": "torch.float16",
893
+ "group_size": 256,
894
+ "num_bits": 4,
895
+ "num_sms": 128,
896
+ "template_id": 54
897
+ },
898
+ "model.layers.19.mlp.up_proj": {
899
+ "K": 5120,
900
+ "M": 1,
901
+ "N": 13824,
902
+ "device": "cuda:2",
903
+ "dtype": "torch.float16",
904
+ "group_size": 256,
905
+ "num_bits": 4,
906
+ "num_sms": 128,
907
+ "template_id": 54
908
+ },
909
+ "model.layers.19.self_attn.k_proj": {
910
+ "K": 5120,
911
+ "M": 1,
912
+ "N": 1024,
913
+ "device": "cuda:2",
914
+ "dtype": "torch.float16",
915
+ "group_size": 256,
916
+ "num_bits": 4,
917
+ "num_sms": 128,
918
+ "template_id": 59
919
+ },
920
+ "model.layers.19.self_attn.o_proj": {
921
+ "K": 5120,
922
+ "M": 1,
923
+ "N": 5120,
924
+ "device": "cuda:2",
925
+ "dtype": "torch.float16",
926
+ "group_size": 256,
927
+ "num_bits": 4,
928
+ "num_sms": 128,
929
+ "template_id": 46
930
+ },
931
+ "model.layers.19.self_attn.q_proj": {
932
+ "K": 5120,
933
+ "M": 1,
934
+ "N": 5120,
935
+ "device": "cuda:2",
936
+ "dtype": "torch.float16",
937
+ "group_size": 256,
938
+ "num_bits": 4,
939
+ "num_sms": 128,
940
+ "template_id": 46
941
+ },
942
+ "model.layers.19.self_attn.v_proj": {
943
+ "K": 5120,
944
+ "M": 1,
945
+ "N": 1024,
946
+ "device": "cuda:2",
947
+ "dtype": "torch.float16",
948
+ "group_size": 256,
949
+ "num_bits": 4,
950
+ "num_sms": 128,
951
+ "template_id": 59
952
+ },
953
+ "model.layers.2.mlp.down_proj": {
954
+ "K": 13824,
955
+ "M": 1,
956
+ "N": 5120,
957
+ "device": "cuda:1",
958
+ "dtype": "torch.float16",
959
+ "group_size": 256,
960
+ "num_bits": 4,
961
+ "num_sms": 128,
962
+ "template_id": 37
963
+ },
964
+ "model.layers.2.mlp.gate_proj": {
965
+ "K": 5120,
966
+ "M": 1,
967
+ "N": 13824,
968
+ "device": "cuda:1",
969
+ "dtype": "torch.float16",
970
+ "group_size": 256,
971
+ "num_bits": 4,
972
+ "num_sms": 128,
973
+ "template_id": 54
974
+ },
975
+ "model.layers.2.mlp.up_proj": {
976
+ "K": 5120,
977
+ "M": 1,
978
+ "N": 13824,
979
+ "device": "cuda:1",
980
+ "dtype": "torch.float16",
981
+ "group_size": 256,
982
+ "num_bits": 4,
983
+ "num_sms": 128,
984
+ "template_id": 54
985
+ },
986
+ "model.layers.2.self_attn.k_proj": {
987
+ "K": 5120,
988
+ "M": 1,
989
+ "N": 1024,
990
+ "device": "cuda:1",
991
+ "dtype": "torch.float16",
992
+ "group_size": 256,
993
+ "num_bits": 4,
994
+ "num_sms": 128,
995
+ "template_id": 59
996
+ },
997
+ "model.layers.2.self_attn.o_proj": {
998
+ "K": 5120,
999
+ "M": 1,
1000
+ "N": 5120,
1001
+ "device": "cuda:1",
1002
+ "dtype": "torch.float16",
1003
+ "group_size": 256,
1004
+ "num_bits": 4,
1005
+ "num_sms": 128,
1006
+ "template_id": 46
1007
+ },
1008
+ "model.layers.2.self_attn.q_proj": {
1009
+ "K": 5120,
1010
+ "M": 1,
1011
+ "N": 5120,
1012
+ "device": "cuda:1",
1013
+ "dtype": "torch.float16",
1014
+ "group_size": 256,
1015
+ "num_bits": 4,
1016
+ "num_sms": 128,
1017
+ "template_id": 46
1018
+ },
1019
+ "model.layers.2.self_attn.v_proj": {
1020
+ "K": 5120,
1021
+ "M": 1,
1022
+ "N": 1024,
1023
+ "device": "cuda:1",
1024
+ "dtype": "torch.float16",
1025
+ "group_size": 256,
1026
+ "num_bits": 4,
1027
+ "num_sms": 128,
1028
+ "template_id": 59
1029
+ },
1030
+ "model.layers.20.mlp.down_proj": {
1031
+ "K": 13824,
1032
+ "M": 1,
1033
+ "N": 5120,
1034
+ "device": "cuda:2",
1035
+ "dtype": "torch.float16",
1036
+ "group_size": 256,
1037
+ "num_bits": 4,
1038
+ "num_sms": 128,
1039
+ "template_id": 37
1040
+ },
1041
+ "model.layers.20.mlp.gate_proj": {
1042
+ "K": 5120,
1043
+ "M": 1,
1044
+ "N": 13824,
1045
+ "device": "cuda:2",
1046
+ "dtype": "torch.float16",
1047
+ "group_size": 256,
1048
+ "num_bits": 4,
1049
+ "num_sms": 128,
1050
+ "template_id": 54
1051
+ },
1052
+ "model.layers.20.mlp.up_proj": {
1053
+ "K": 5120,
1054
+ "M": 1,
1055
+ "N": 13824,
1056
+ "device": "cuda:2",
1057
+ "dtype": "torch.float16",
1058
+ "group_size": 256,
1059
+ "num_bits": 4,
1060
+ "num_sms": 128,
1061
+ "template_id": 54
1062
+ },
1063
+ "model.layers.20.self_attn.k_proj": {
1064
+ "K": 5120,
1065
+ "M": 1,
1066
+ "N": 1024,
1067
+ "device": "cuda:2",
1068
+ "dtype": "torch.float16",
1069
+ "group_size": 256,
1070
+ "num_bits": 4,
1071
+ "num_sms": 128,
1072
+ "template_id": 59
1073
+ },
1074
+ "model.layers.20.self_attn.o_proj": {
1075
+ "K": 5120,
1076
+ "M": 1,
1077
+ "N": 5120,
1078
+ "device": "cuda:2",
1079
+ "dtype": "torch.float16",
1080
+ "group_size": 256,
1081
+ "num_bits": 4,
1082
+ "num_sms": 128,
1083
+ "template_id": 46
1084
+ },
1085
+ "model.layers.20.self_attn.q_proj": {
1086
+ "K": 5120,
1087
+ "M": 1,
1088
+ "N": 5120,
1089
+ "device": "cuda:2",
1090
+ "dtype": "torch.float16",
1091
+ "group_size": 256,
1092
+ "num_bits": 4,
1093
+ "num_sms": 128,
1094
+ "template_id": 46
1095
+ },
1096
+ "model.layers.20.self_attn.v_proj": {
1097
+ "K": 5120,
1098
+ "M": 1,
1099
+ "N": 1024,
1100
+ "device": "cuda:2",
1101
+ "dtype": "torch.float16",
1102
+ "group_size": 256,
1103
+ "num_bits": 4,
1104
+ "num_sms": 128,
1105
+ "template_id": 59
1106
+ },
1107
+ "model.layers.21.mlp.down_proj": {
1108
+ "K": 13824,
1109
+ "M": 1,
1110
+ "N": 5120,
1111
+ "device": "cuda:2",
1112
+ "dtype": "torch.float16",
1113
+ "group_size": 256,
1114
+ "num_bits": 4,
1115
+ "num_sms": 128,
1116
+ "template_id": 37
1117
+ },
1118
+ "model.layers.21.mlp.gate_proj": {
1119
+ "K": 5120,
1120
+ "M": 1,
1121
+ "N": 13824,
1122
+ "device": "cuda:2",
1123
+ "dtype": "torch.float16",
1124
+ "group_size": 256,
1125
+ "num_bits": 4,
1126
+ "num_sms": 128,
1127
+ "template_id": 54
1128
+ },
1129
+ "model.layers.21.mlp.up_proj": {
1130
+ "K": 5120,
1131
+ "M": 1,
1132
+ "N": 13824,
1133
+ "device": "cuda:2",
1134
+ "dtype": "torch.float16",
1135
+ "group_size": 256,
1136
+ "num_bits": 4,
1137
+ "num_sms": 128,
1138
+ "template_id": 54
1139
+ },
1140
+ "model.layers.21.self_attn.k_proj": {
1141
+ "K": 5120,
1142
+ "M": 1,
1143
+ "N": 1024,
1144
+ "device": "cuda:2",
1145
+ "dtype": "torch.float16",
1146
+ "group_size": 256,
1147
+ "num_bits": 4,
1148
+ "num_sms": 128,
1149
+ "template_id": 59
1150
+ },
1151
+ "model.layers.21.self_attn.o_proj": {
1152
+ "K": 5120,
1153
+ "M": 1,
1154
+ "N": 5120,
1155
+ "device": "cuda:2",
1156
+ "dtype": "torch.float16",
1157
+ "group_size": 256,
1158
+ "num_bits": 4,
1159
+ "num_sms": 128,
1160
+ "template_id": 46
1161
+ },
1162
+ "model.layers.21.self_attn.q_proj": {
1163
+ "K": 5120,
1164
+ "M": 1,
1165
+ "N": 5120,
1166
+ "device": "cuda:2",
1167
+ "dtype": "torch.float16",
1168
+ "group_size": 256,
1169
+ "num_bits": 4,
1170
+ "num_sms": 128,
1171
+ "template_id": 46
1172
+ },
1173
+ "model.layers.21.self_attn.v_proj": {
1174
+ "K": 5120,
1175
+ "M": 1,
1176
+ "N": 1024,
1177
+ "device": "cuda:2",
1178
+ "dtype": "torch.float16",
1179
+ "group_size": 256,
1180
+ "num_bits": 4,
1181
+ "num_sms": 128,
1182
+ "template_id": 59
1183
+ },
1184
+ "model.layers.22.mlp.down_proj": {
1185
+ "K": 13824,
1186
+ "M": 1,
1187
+ "N": 5120,
1188
+ "device": "cuda:2",
1189
+ "dtype": "torch.float16",
1190
+ "group_size": 256,
1191
+ "num_bits": 4,
1192
+ "num_sms": 128,
1193
+ "template_id": 37
1194
+ },
1195
+ "model.layers.22.mlp.gate_proj": {
1196
+ "K": 5120,
1197
+ "M": 1,
1198
+ "N": 13824,
1199
+ "device": "cuda:2",
1200
+ "dtype": "torch.float16",
1201
+ "group_size": 256,
1202
+ "num_bits": 4,
1203
+ "num_sms": 128,
1204
+ "template_id": 54
1205
+ },
1206
+ "model.layers.22.mlp.up_proj": {
1207
+ "K": 5120,
1208
+ "M": 1,
1209
+ "N": 13824,
1210
+ "device": "cuda:2",
1211
+ "dtype": "torch.float16",
1212
+ "group_size": 256,
1213
+ "num_bits": 4,
1214
+ "num_sms": 128,
1215
+ "template_id": 54
1216
+ },
1217
+ "model.layers.22.self_attn.k_proj": {
1218
+ "K": 5120,
1219
+ "M": 1,
1220
+ "N": 1024,
1221
+ "device": "cuda:2",
1222
+ "dtype": "torch.float16",
1223
+ "group_size": 256,
1224
+ "num_bits": 4,
1225
+ "num_sms": 128,
1226
+ "template_id": 59
1227
+ },
1228
+ "model.layers.22.self_attn.o_proj": {
1229
+ "K": 5120,
1230
+ "M": 1,
1231
+ "N": 5120,
1232
+ "device": "cuda:2",
1233
+ "dtype": "torch.float16",
1234
+ "group_size": 256,
1235
+ "num_bits": 4,
1236
+ "num_sms": 128,
1237
+ "template_id": 46
1238
+ },
1239
+ "model.layers.22.self_attn.q_proj": {
1240
+ "K": 5120,
1241
+ "M": 1,
1242
+ "N": 5120,
1243
+ "device": "cuda:2",
1244
+ "dtype": "torch.float16",
1245
+ "group_size": 256,
1246
+ "num_bits": 4,
1247
+ "num_sms": 128,
1248
+ "template_id": 46
1249
+ },
1250
+ "model.layers.22.self_attn.v_proj": {
1251
+ "K": 5120,
1252
+ "M": 1,
1253
+ "N": 1024,
1254
+ "device": "cuda:2",
1255
+ "dtype": "torch.float16",
1256
+ "group_size": 256,
1257
+ "num_bits": 4,
1258
+ "num_sms": 128,
1259
+ "template_id": 59
1260
+ },
1261
+ "model.layers.23.mlp.down_proj": {
1262
+ "K": 13824,
1263
+ "M": 1,
1264
+ "N": 5120,
1265
+ "device": "cuda:2",
1266
+ "dtype": "torch.float16",
1267
+ "group_size": 256,
1268
+ "num_bits": 4,
1269
+ "num_sms": 128,
1270
+ "template_id": 37
1271
+ },
1272
+ "model.layers.23.mlp.gate_proj": {
1273
+ "K": 5120,
1274
+ "M": 1,
1275
+ "N": 13824,
1276
+ "device": "cuda:2",
1277
+ "dtype": "torch.float16",
1278
+ "group_size": 256,
1279
+ "num_bits": 4,
1280
+ "num_sms": 128,
1281
+ "template_id": 54
1282
+ },
1283
+ "model.layers.23.mlp.up_proj": {
1284
+ "K": 5120,
1285
+ "M": 1,
1286
+ "N": 13824,
1287
+ "device": "cuda:2",
1288
+ "dtype": "torch.float16",
1289
+ "group_size": 256,
1290
+ "num_bits": 4,
1291
+ "num_sms": 128,
1292
+ "template_id": 54
1293
+ },
1294
+ "model.layers.23.self_attn.k_proj": {
1295
+ "K": 5120,
1296
+ "M": 1,
1297
+ "N": 1024,
1298
+ "device": "cuda:2",
1299
+ "dtype": "torch.float16",
1300
+ "group_size": 256,
1301
+ "num_bits": 4,
1302
+ "num_sms": 128,
1303
+ "template_id": 59
1304
+ },
1305
+ "model.layers.23.self_attn.o_proj": {
1306
+ "K": 5120,
1307
+ "M": 1,
1308
+ "N": 5120,
1309
+ "device": "cuda:2",
1310
+ "dtype": "torch.float16",
1311
+ "group_size": 256,
1312
+ "num_bits": 4,
1313
+ "num_sms": 128,
1314
+ "template_id": 46
1315
+ },
1316
+ "model.layers.23.self_attn.q_proj": {
1317
+ "K": 5120,
1318
+ "M": 1,
1319
+ "N": 5120,
1320
+ "device": "cuda:2",
1321
+ "dtype": "torch.float16",
1322
+ "group_size": 256,
1323
+ "num_bits": 4,
1324
+ "num_sms": 128,
1325
+ "template_id": 46
1326
+ },
1327
+ "model.layers.23.self_attn.v_proj": {
1328
+ "K": 5120,
1329
+ "M": 1,
1330
+ "N": 1024,
1331
+ "device": "cuda:2",
1332
+ "dtype": "torch.float16",
1333
+ "group_size": 256,
1334
+ "num_bits": 4,
1335
+ "num_sms": 128,
1336
+ "template_id": 59
1337
+ },
1338
+ "model.layers.24.mlp.down_proj": {
1339
+ "K": 13824,
1340
+ "M": 1,
1341
+ "N": 5120,
1342
+ "device": "cuda:2",
1343
+ "dtype": "torch.float16",
1344
+ "group_size": 256,
1345
+ "num_bits": 4,
1346
+ "num_sms": 128,
1347
+ "template_id": 37
1348
+ },
1349
+ "model.layers.24.mlp.gate_proj": {
1350
+ "K": 5120,
1351
+ "M": 1,
1352
+ "N": 13824,
1353
+ "device": "cuda:2",
1354
+ "dtype": "torch.float16",
1355
+ "group_size": 256,
1356
+ "num_bits": 4,
1357
+ "num_sms": 128,
1358
+ "template_id": 54
1359
+ },
1360
+ "model.layers.24.mlp.up_proj": {
1361
+ "K": 5120,
1362
+ "M": 1,
1363
+ "N": 13824,
1364
+ "device": "cuda:2",
1365
+ "dtype": "torch.float16",
1366
+ "group_size": 256,
1367
+ "num_bits": 4,
1368
+ "num_sms": 128,
1369
+ "template_id": 54
1370
+ },
1371
+ "model.layers.24.self_attn.k_proj": {
1372
+ "K": 5120,
1373
+ "M": 1,
1374
+ "N": 1024,
1375
+ "device": "cuda:2",
1376
+ "dtype": "torch.float16",
1377
+ "group_size": 256,
1378
+ "num_bits": 4,
1379
+ "num_sms": 128,
1380
+ "template_id": 59
1381
+ },
1382
+ "model.layers.24.self_attn.o_proj": {
1383
+ "K": 5120,
1384
+ "M": 1,
1385
+ "N": 5120,
1386
+ "device": "cuda:2",
1387
+ "dtype": "torch.float16",
1388
+ "group_size": 256,
1389
+ "num_bits": 4,
1390
+ "num_sms": 128,
1391
+ "template_id": 46
1392
+ },
1393
+ "model.layers.24.self_attn.q_proj": {
1394
+ "K": 5120,
1395
+ "M": 1,
1396
+ "N": 5120,
1397
+ "device": "cuda:2",
1398
+ "dtype": "torch.float16",
1399
+ "group_size": 256,
1400
+ "num_bits": 4,
1401
+ "num_sms": 128,
1402
+ "template_id": 46
1403
+ },
1404
+ "model.layers.24.self_attn.v_proj": {
1405
+ "K": 5120,
1406
+ "M": 1,
1407
+ "N": 1024,
1408
+ "device": "cuda:2",
1409
+ "dtype": "torch.float16",
1410
+ "group_size": 256,
1411
+ "num_bits": 4,
1412
+ "num_sms": 128,
1413
+ "template_id": 59
1414
+ },
1415
+ "model.layers.25.mlp.down_proj": {
1416
+ "K": 13824,
1417
+ "M": 1,
1418
+ "N": 5120,
1419
+ "device": "cuda:3",
1420
+ "dtype": "torch.float16",
1421
+ "group_size": 256,
1422
+ "num_bits": 4,
1423
+ "num_sms": 128,
1424
+ "template_id": 37
1425
+ },
1426
+ "model.layers.25.mlp.gate_proj": {
1427
+ "K": 5120,
1428
+ "M": 1,
1429
+ "N": 13824,
1430
+ "device": "cuda:3",
1431
+ "dtype": "torch.float16",
1432
+ "group_size": 256,
1433
+ "num_bits": 4,
1434
+ "num_sms": 128,
1435
+ "template_id": 54
1436
+ },
1437
+ "model.layers.25.mlp.up_proj": {
1438
+ "K": 5120,
1439
+ "M": 1,
1440
+ "N": 13824,
1441
+ "device": "cuda:3",
1442
+ "dtype": "torch.float16",
1443
+ "group_size": 256,
1444
+ "num_bits": 4,
1445
+ "num_sms": 128,
1446
+ "template_id": 54
1447
+ },
1448
+ "model.layers.25.self_attn.k_proj": {
1449
+ "K": 5120,
1450
+ "M": 1,
1451
+ "N": 1024,
1452
+ "device": "cuda:3",
1453
+ "dtype": "torch.float16",
1454
+ "group_size": 256,
1455
+ "num_bits": 4,
1456
+ "num_sms": 128,
1457
+ "template_id": 59
1458
+ },
1459
+ "model.layers.25.self_attn.o_proj": {
1460
+ "K": 5120,
1461
+ "M": 1,
1462
+ "N": 5120,
1463
+ "device": "cuda:3",
1464
+ "dtype": "torch.float16",
1465
+ "group_size": 256,
1466
+ "num_bits": 4,
1467
+ "num_sms": 128,
1468
+ "template_id": 46
1469
+ },
1470
+ "model.layers.25.self_attn.q_proj": {
1471
+ "K": 5120,
1472
+ "M": 1,
1473
+ "N": 5120,
1474
+ "device": "cuda:3",
1475
+ "dtype": "torch.float16",
1476
+ "group_size": 256,
1477
+ "num_bits": 4,
1478
+ "num_sms": 128,
1479
+ "template_id": 46
1480
+ },
1481
+ "model.layers.25.self_attn.v_proj": {
1482
+ "K": 5120,
1483
+ "M": 1,
1484
+ "N": 1024,
1485
+ "device": "cuda:3",
1486
+ "dtype": "torch.float16",
1487
+ "group_size": 256,
1488
+ "num_bits": 4,
1489
+ "num_sms": 128,
1490
+ "template_id": 59
1491
+ },
1492
+ "model.layers.26.mlp.down_proj": {
1493
+ "K": 13824,
1494
+ "M": 1,
1495
+ "N": 5120,
1496
+ "device": "cuda:3",
1497
+ "dtype": "torch.float16",
1498
+ "group_size": 256,
1499
+ "num_bits": 4,
1500
+ "num_sms": 128,
1501
+ "template_id": 37
1502
+ },
1503
+ "model.layers.26.mlp.gate_proj": {
1504
+ "K": 5120,
1505
+ "M": 1,
1506
+ "N": 13824,
1507
+ "device": "cuda:3",
1508
+ "dtype": "torch.float16",
1509
+ "group_size": 256,
1510
+ "num_bits": 4,
1511
+ "num_sms": 128,
1512
+ "template_id": 54
1513
+ },
1514
+ "model.layers.26.mlp.up_proj": {
1515
+ "K": 5120,
1516
+ "M": 1,
1517
+ "N": 13824,
1518
+ "device": "cuda:3",
1519
+ "dtype": "torch.float16",
1520
+ "group_size": 256,
1521
+ "num_bits": 4,
1522
+ "num_sms": 128,
1523
+ "template_id": 54
1524
+ },
1525
+ "model.layers.26.self_attn.k_proj": {
1526
+ "K": 5120,
1527
+ "M": 1,
1528
+ "N": 1024,
1529
+ "device": "cuda:3",
1530
+ "dtype": "torch.float16",
1531
+ "group_size": 256,
1532
+ "num_bits": 4,
1533
+ "num_sms": 128,
1534
+ "template_id": 59
1535
+ },
1536
+ "model.layers.26.self_attn.o_proj": {
1537
+ "K": 5120,
1538
+ "M": 1,
1539
+ "N": 5120,
1540
+ "device": "cuda:3",
1541
+ "dtype": "torch.float16",
1542
+ "group_size": 256,
1543
+ "num_bits": 4,
1544
+ "num_sms": 128,
1545
+ "template_id": 46
1546
+ },
1547
+ "model.layers.26.self_attn.q_proj": {
1548
+ "K": 5120,
1549
+ "M": 1,
1550
+ "N": 5120,
1551
+ "device": "cuda:3",
1552
+ "dtype": "torch.float16",
1553
+ "group_size": 256,
1554
+ "num_bits": 4,
1555
+ "num_sms": 128,
1556
+ "template_id": 46
1557
+ },
1558
+ "model.layers.26.self_attn.v_proj": {
1559
+ "K": 5120,
1560
+ "M": 1,
1561
+ "N": 1024,
1562
+ "device": "cuda:3",
1563
+ "dtype": "torch.float16",
1564
+ "group_size": 256,
1565
+ "num_bits": 4,
1566
+ "num_sms": 128,
1567
+ "template_id": 59
1568
+ },
1569
+ "model.layers.27.mlp.down_proj": {
1570
+ "K": 13824,
1571
+ "M": 1,
1572
+ "N": 5120,
1573
+ "device": "cuda:3",
1574
+ "dtype": "torch.float16",
1575
+ "group_size": 256,
1576
+ "num_bits": 4,
1577
+ "num_sms": 128,
1578
+ "template_id": 37
1579
+ },
1580
+ "model.layers.27.mlp.gate_proj": {
1581
+ "K": 5120,
1582
+ "M": 1,
1583
+ "N": 13824,
1584
+ "device": "cuda:3",
1585
+ "dtype": "torch.float16",
1586
+ "group_size": 256,
1587
+ "num_bits": 4,
1588
+ "num_sms": 128,
1589
+ "template_id": 54
1590
+ },
1591
+ "model.layers.27.mlp.up_proj": {
1592
+ "K": 5120,
1593
+ "M": 1,
1594
+ "N": 13824,
1595
+ "device": "cuda:3",
1596
+ "dtype": "torch.float16",
1597
+ "group_size": 256,
1598
+ "num_bits": 4,
1599
+ "num_sms": 128,
1600
+ "template_id": 54
1601
+ },
1602
+ "model.layers.27.self_attn.k_proj": {
1603
+ "K": 5120,
1604
+ "M": 1,
1605
+ "N": 1024,
1606
+ "device": "cuda:3",
1607
+ "dtype": "torch.float16",
1608
+ "group_size": 256,
1609
+ "num_bits": 4,
1610
+ "num_sms": 128,
1611
+ "template_id": 59
1612
+ },
1613
+ "model.layers.27.self_attn.o_proj": {
1614
+ "K": 5120,
1615
+ "M": 1,
1616
+ "N": 5120,
1617
+ "device": "cuda:3",
1618
+ "dtype": "torch.float16",
1619
+ "group_size": 256,
1620
+ "num_bits": 4,
1621
+ "num_sms": 128,
1622
+ "template_id": 46
1623
+ },
1624
+ "model.layers.27.self_attn.q_proj": {
1625
+ "K": 5120,
1626
+ "M": 1,
1627
+ "N": 5120,
1628
+ "device": "cuda:3",
1629
+ "dtype": "torch.float16",
1630
+ "group_size": 256,
1631
+ "num_bits": 4,
1632
+ "num_sms": 128,
1633
+ "template_id": 46
1634
+ },
1635
+ "model.layers.27.self_attn.v_proj": {
1636
+ "K": 5120,
1637
+ "M": 1,
1638
+ "N": 1024,
1639
+ "device": "cuda:3",
1640
+ "dtype": "torch.float16",
1641
+ "group_size": 256,
1642
+ "num_bits": 4,
1643
+ "num_sms": 128,
1644
+ "template_id": 59
1645
+ },
1646
+ "model.layers.28.mlp.down_proj": {
1647
+ "K": 13824,
1648
+ "M": 1,
1649
+ "N": 5120,
1650
+ "device": "cuda:3",
1651
+ "dtype": "torch.float16",
1652
+ "group_size": 256,
1653
+ "num_bits": 4,
1654
+ "num_sms": 128,
1655
+ "template_id": 37
1656
+ },
1657
+ "model.layers.28.mlp.gate_proj": {
1658
+ "K": 5120,
1659
+ "M": 1,
1660
+ "N": 13824,
1661
+ "device": "cuda:3",
1662
+ "dtype": "torch.float16",
1663
+ "group_size": 256,
1664
+ "num_bits": 4,
1665
+ "num_sms": 128,
1666
+ "template_id": 54
1667
+ },
1668
+ "model.layers.28.mlp.up_proj": {
1669
+ "K": 5120,
1670
+ "M": 1,
1671
+ "N": 13824,
1672
+ "device": "cuda:3",
1673
+ "dtype": "torch.float16",
1674
+ "group_size": 256,
1675
+ "num_bits": 4,
1676
+ "num_sms": 128,
1677
+ "template_id": 54
1678
+ },
1679
+ "model.layers.28.self_attn.k_proj": {
1680
+ "K": 5120,
1681
+ "M": 1,
1682
+ "N": 1024,
1683
+ "device": "cuda:3",
1684
+ "dtype": "torch.float16",
1685
+ "group_size": 256,
1686
+ "num_bits": 4,
1687
+ "num_sms": 128,
1688
+ "template_id": 59
1689
+ },
1690
+ "model.layers.28.self_attn.o_proj": {
1691
+ "K": 5120,
1692
+ "M": 1,
1693
+ "N": 5120,
1694
+ "device": "cuda:3",
1695
+ "dtype": "torch.float16",
1696
+ "group_size": 256,
1697
+ "num_bits": 4,
1698
+ "num_sms": 128,
1699
+ "template_id": 46
1700
+ },
1701
+ "model.layers.28.self_attn.q_proj": {
1702
+ "K": 5120,
1703
+ "M": 1,
1704
+ "N": 5120,
1705
+ "device": "cuda:3",
1706
+ "dtype": "torch.float16",
1707
+ "group_size": 256,
1708
+ "num_bits": 4,
1709
+ "num_sms": 128,
1710
+ "template_id": 46
1711
+ },
1712
+ "model.layers.28.self_attn.v_proj": {
1713
+ "K": 5120,
1714
+ "M": 1,
1715
+ "N": 1024,
1716
+ "device": "cuda:3",
1717
+ "dtype": "torch.float16",
1718
+ "group_size": 256,
1719
+ "num_bits": 4,
1720
+ "num_sms": 128,
1721
+ "template_id": 59
1722
+ },
1723
+ "model.layers.29.mlp.down_proj": {
1724
+ "K": 13824,
1725
+ "M": 1,
1726
+ "N": 5120,
1727
+ "device": "cuda:3",
1728
+ "dtype": "torch.float16",
1729
+ "group_size": 256,
1730
+ "num_bits": 4,
1731
+ "num_sms": 128,
1732
+ "template_id": 37
1733
+ },
1734
+ "model.layers.29.mlp.gate_proj": {
1735
+ "K": 5120,
1736
+ "M": 1,
1737
+ "N": 13824,
1738
+ "device": "cuda:3",
1739
+ "dtype": "torch.float16",
1740
+ "group_size": 256,
1741
+ "num_bits": 4,
1742
+ "num_sms": 128,
1743
+ "template_id": 54
1744
+ },
1745
+ "model.layers.29.mlp.up_proj": {
1746
+ "K": 5120,
1747
+ "M": 1,
1748
+ "N": 13824,
1749
+ "device": "cuda:3",
1750
+ "dtype": "torch.float16",
1751
+ "group_size": 256,
1752
+ "num_bits": 4,
1753
+ "num_sms": 128,
1754
+ "template_id": 54
1755
+ },
1756
+ "model.layers.29.self_attn.k_proj": {
1757
+ "K": 5120,
1758
+ "M": 1,
1759
+ "N": 1024,
1760
+ "device": "cuda:3",
1761
+ "dtype": "torch.float16",
1762
+ "group_size": 256,
1763
+ "num_bits": 4,
1764
+ "num_sms": 128,
1765
+ "template_id": 59
1766
+ },
1767
+ "model.layers.29.self_attn.o_proj": {
1768
+ "K": 5120,
1769
+ "M": 1,
1770
+ "N": 5120,
1771
+ "device": "cuda:3",
1772
+ "dtype": "torch.float16",
1773
+ "group_size": 256,
1774
+ "num_bits": 4,
1775
+ "num_sms": 128,
1776
+ "template_id": 46
1777
+ },
1778
+ "model.layers.29.self_attn.q_proj": {
1779
+ "K": 5120,
1780
+ "M": 1,
1781
+ "N": 5120,
1782
+ "device": "cuda:3",
1783
+ "dtype": "torch.float16",
1784
+ "group_size": 256,
1785
+ "num_bits": 4,
1786
+ "num_sms": 128,
1787
+ "template_id": 46
1788
+ },
1789
+ "model.layers.29.self_attn.v_proj": {
1790
+ "K": 5120,
1791
+ "M": 1,
1792
+ "N": 1024,
1793
+ "device": "cuda:3",
1794
+ "dtype": "torch.float16",
1795
+ "group_size": 256,
1796
+ "num_bits": 4,
1797
+ "num_sms": 128,
1798
+ "template_id": 59
1799
+ },
1800
+ "model.layers.3.mlp.down_proj": {
1801
+ "K": 13824,
1802
+ "M": 1,
1803
+ "N": 5120,
1804
+ "device": "cuda:1",
1805
+ "dtype": "torch.float16",
1806
+ "group_size": 256,
1807
+ "num_bits": 4,
1808
+ "num_sms": 128,
1809
+ "template_id": 37
1810
+ },
1811
+ "model.layers.3.mlp.gate_proj": {
1812
+ "K": 5120,
1813
+ "M": 1,
1814
+ "N": 13824,
1815
+ "device": "cuda:1",
1816
+ "dtype": "torch.float16",
1817
+ "group_size": 256,
1818
+ "num_bits": 4,
1819
+ "num_sms": 128,
1820
+ "template_id": 54
1821
+ },
1822
+ "model.layers.3.mlp.up_proj": {
1823
+ "K": 5120,
1824
+ "M": 1,
1825
+ "N": 13824,
1826
+ "device": "cuda:1",
1827
+ "dtype": "torch.float16",
1828
+ "group_size": 256,
1829
+ "num_bits": 4,
1830
+ "num_sms": 128,
1831
+ "template_id": 54
1832
+ },
1833
+ "model.layers.3.self_attn.k_proj": {
1834
+ "K": 5120,
1835
+ "M": 1,
1836
+ "N": 1024,
1837
+ "device": "cuda:1",
1838
+ "dtype": "torch.float16",
1839
+ "group_size": 256,
1840
+ "num_bits": 4,
1841
+ "num_sms": 128,
1842
+ "template_id": 59
1843
+ },
1844
+ "model.layers.3.self_attn.o_proj": {
1845
+ "K": 5120,
1846
+ "M": 1,
1847
+ "N": 5120,
1848
+ "device": "cuda:1",
1849
+ "dtype": "torch.float16",
1850
+ "group_size": 256,
1851
+ "num_bits": 4,
1852
+ "num_sms": 128,
1853
+ "template_id": 46
1854
+ },
1855
+ "model.layers.3.self_attn.q_proj": {
1856
+ "K": 5120,
1857
+ "M": 1,
1858
+ "N": 5120,
1859
+ "device": "cuda:1",
1860
+ "dtype": "torch.float16",
1861
+ "group_size": 256,
1862
+ "num_bits": 4,
1863
+ "num_sms": 128,
1864
+ "template_id": 46
1865
+ },
1866
+ "model.layers.3.self_attn.v_proj": {
1867
+ "K": 5120,
1868
+ "M": 1,
1869
+ "N": 1024,
1870
+ "device": "cuda:1",
1871
+ "dtype": "torch.float16",
1872
+ "group_size": 256,
1873
+ "num_bits": 4,
1874
+ "num_sms": 128,
1875
+ "template_id": 59
1876
+ },
1877
+ "model.layers.30.mlp.down_proj": {
1878
+ "K": 13824,
1879
+ "M": 1,
1880
+ "N": 5120,
1881
+ "device": "cuda:3",
1882
+ "dtype": "torch.float16",
1883
+ "group_size": 256,
1884
+ "num_bits": 4,
1885
+ "num_sms": 128,
1886
+ "template_id": 37
1887
+ },
1888
+ "model.layers.30.mlp.gate_proj": {
1889
+ "K": 5120,
1890
+ "M": 1,
1891
+ "N": 13824,
1892
+ "device": "cuda:3",
1893
+ "dtype": "torch.float16",
1894
+ "group_size": 256,
1895
+ "num_bits": 4,
1896
+ "num_sms": 128,
1897
+ "template_id": 54
1898
+ },
1899
+ "model.layers.30.mlp.up_proj": {
1900
+ "K": 5120,
1901
+ "M": 1,
1902
+ "N": 13824,
1903
+ "device": "cuda:3",
1904
+ "dtype": "torch.float16",
1905
+ "group_size": 256,
1906
+ "num_bits": 4,
1907
+ "num_sms": 128,
1908
+ "template_id": 54
1909
+ },
1910
+ "model.layers.30.self_attn.k_proj": {
1911
+ "K": 5120,
1912
+ "M": 1,
1913
+ "N": 1024,
1914
+ "device": "cuda:3",
1915
+ "dtype": "torch.float16",
1916
+ "group_size": 256,
1917
+ "num_bits": 4,
1918
+ "num_sms": 128,
1919
+ "template_id": 59
1920
+ },
1921
+ "model.layers.30.self_attn.o_proj": {
1922
+ "K": 5120,
1923
+ "M": 1,
1924
+ "N": 5120,
1925
+ "device": "cuda:3",
1926
+ "dtype": "torch.float16",
1927
+ "group_size": 256,
1928
+ "num_bits": 4,
1929
+ "num_sms": 128,
1930
+ "template_id": 46
1931
+ },
1932
+ "model.layers.30.self_attn.q_proj": {
1933
+ "K": 5120,
1934
+ "M": 1,
1935
+ "N": 5120,
1936
+ "device": "cuda:3",
1937
+ "dtype": "torch.float16",
1938
+ "group_size": 256,
1939
+ "num_bits": 4,
1940
+ "num_sms": 128,
1941
+ "template_id": 46
1942
+ },
1943
+ "model.layers.30.self_attn.v_proj": {
1944
+ "K": 5120,
1945
+ "M": 1,
1946
+ "N": 1024,
1947
+ "device": "cuda:3",
1948
+ "dtype": "torch.float16",
1949
+ "group_size": 256,
1950
+ "num_bits": 4,
1951
+ "num_sms": 128,
1952
+ "template_id": 59
1953
+ },
1954
+ "model.layers.31.mlp.down_proj": {
1955
+ "K": 13824,
1956
+ "M": 1,
1957
+ "N": 5120,
1958
+ "device": "cuda:3",
1959
+ "dtype": "torch.float16",
1960
+ "group_size": 256,
1961
+ "num_bits": 4,
1962
+ "num_sms": 128,
1963
+ "template_id": 37
1964
+ },
1965
+ "model.layers.31.mlp.gate_proj": {
1966
+ "K": 5120,
1967
+ "M": 1,
1968
+ "N": 13824,
1969
+ "device": "cuda:3",
1970
+ "dtype": "torch.float16",
1971
+ "group_size": 256,
1972
+ "num_bits": 4,
1973
+ "num_sms": 128,
1974
+ "template_id": 54
1975
+ },
1976
+ "model.layers.31.mlp.up_proj": {
1977
+ "K": 5120,
1978
+ "M": 1,
1979
+ "N": 13824,
1980
+ "device": "cuda:3",
1981
+ "dtype": "torch.float16",
1982
+ "group_size": 256,
1983
+ "num_bits": 4,
1984
+ "num_sms": 128,
1985
+ "template_id": 54
1986
+ },
1987
+ "model.layers.31.self_attn.k_proj": {
1988
+ "K": 5120,
1989
+ "M": 1,
1990
+ "N": 1024,
1991
+ "device": "cuda:3",
1992
+ "dtype": "torch.float16",
1993
+ "group_size": 256,
1994
+ "num_bits": 4,
1995
+ "num_sms": 128,
1996
+ "template_id": 59
1997
+ },
1998
+ "model.layers.31.self_attn.o_proj": {
1999
+ "K": 5120,
2000
+ "M": 1,
2001
+ "N": 5120,
2002
+ "device": "cuda:3",
2003
+ "dtype": "torch.float16",
2004
+ "group_size": 256,
2005
+ "num_bits": 4,
2006
+ "num_sms": 128,
2007
+ "template_id": 46
2008
+ },
2009
+ "model.layers.31.self_attn.q_proj": {
2010
+ "K": 5120,
2011
+ "M": 1,
2012
+ "N": 5120,
2013
+ "device": "cuda:3",
2014
+ "dtype": "torch.float16",
2015
+ "group_size": 256,
2016
+ "num_bits": 4,
2017
+ "num_sms": 128,
2018
+ "template_id": 46
2019
+ },
2020
+ "model.layers.31.self_attn.v_proj": {
2021
+ "K": 5120,
2022
+ "M": 1,
2023
+ "N": 1024,
2024
+ "device": "cuda:3",
2025
+ "dtype": "torch.float16",
2026
+ "group_size": 256,
2027
+ "num_bits": 4,
2028
+ "num_sms": 128,
2029
+ "template_id": 59
2030
+ },
2031
+ "model.layers.32.mlp.down_proj": {
2032
+ "K": 13824,
2033
+ "M": 1,
2034
+ "N": 5120,
2035
+ "device": "cuda:3",
2036
+ "dtype": "torch.float16",
2037
+ "group_size": 256,
2038
+ "num_bits": 4,
2039
+ "num_sms": 128,
2040
+ "template_id": 37
2041
+ },
2042
+ "model.layers.32.mlp.gate_proj": {
2043
+ "K": 5120,
2044
+ "M": 1,
2045
+ "N": 13824,
2046
+ "device": "cuda:3",
2047
+ "dtype": "torch.float16",
2048
+ "group_size": 256,
2049
+ "num_bits": 4,
2050
+ "num_sms": 128,
2051
+ "template_id": 54
2052
+ },
2053
+ "model.layers.32.mlp.up_proj": {
2054
+ "K": 5120,
2055
+ "M": 1,
2056
+ "N": 13824,
2057
+ "device": "cuda:3",
2058
+ "dtype": "torch.float16",
2059
+ "group_size": 256,
2060
+ "num_bits": 4,
2061
+ "num_sms": 128,
2062
+ "template_id": 54
2063
+ },
2064
+ "model.layers.32.self_attn.k_proj": {
2065
+ "K": 5120,
2066
+ "M": 1,
2067
+ "N": 1024,
2068
+ "device": "cuda:3",
2069
+ "dtype": "torch.float16",
2070
+ "group_size": 256,
2071
+ "num_bits": 4,
2072
+ "num_sms": 128,
2073
+ "template_id": 59
2074
+ },
2075
+ "model.layers.32.self_attn.o_proj": {
2076
+ "K": 5120,
2077
+ "M": 1,
2078
+ "N": 5120,
2079
+ "device": "cuda:3",
2080
+ "dtype": "torch.float16",
2081
+ "group_size": 256,
2082
+ "num_bits": 4,
2083
+ "num_sms": 128,
2084
+ "template_id": 46
2085
+ },
2086
+ "model.layers.32.self_attn.q_proj": {
2087
+ "K": 5120,
2088
+ "M": 1,
2089
+ "N": 5120,
2090
+ "device": "cuda:3",
2091
+ "dtype": "torch.float16",
2092
+ "group_size": 256,
2093
+ "num_bits": 4,
2094
+ "num_sms": 128,
2095
+ "template_id": 46
2096
+ },
2097
+ "model.layers.32.self_attn.v_proj": {
2098
+ "K": 5120,
2099
+ "M": 1,
2100
+ "N": 1024,
2101
+ "device": "cuda:3",
2102
+ "dtype": "torch.float16",
2103
+ "group_size": 256,
2104
+ "num_bits": 4,
2105
+ "num_sms": 128,
2106
+ "template_id": 59
2107
+ },
2108
+ "model.layers.33.mlp.down_proj": {
2109
+ "K": 13824,
2110
+ "M": 1,
2111
+ "N": 5120,
2112
+ "device": "cuda:3",
2113
+ "dtype": "torch.float16",
2114
+ "group_size": 256,
2115
+ "num_bits": 4,
2116
+ "num_sms": 128,
2117
+ "template_id": 37
2118
+ },
2119
+ "model.layers.33.mlp.gate_proj": {
2120
+ "K": 5120,
2121
+ "M": 1,
2122
+ "N": 13824,
2123
+ "device": "cuda:3",
2124
+ "dtype": "torch.float16",
2125
+ "group_size": 256,
2126
+ "num_bits": 4,
2127
+ "num_sms": 128,
2128
+ "template_id": 54
2129
+ },
2130
+ "model.layers.33.mlp.up_proj": {
2131
+ "K": 5120,
2132
+ "M": 1,
2133
+ "N": 13824,
2134
+ "device": "cuda:3",
2135
+ "dtype": "torch.float16",
2136
+ "group_size": 256,
2137
+ "num_bits": 4,
2138
+ "num_sms": 128,
2139
+ "template_id": 54
2140
+ },
2141
+ "model.layers.33.self_attn.k_proj": {
2142
+ "K": 5120,
2143
+ "M": 1,
2144
+ "N": 1024,
2145
+ "device": "cuda:3",
2146
+ "dtype": "torch.float16",
2147
+ "group_size": 256,
2148
+ "num_bits": 4,
2149
+ "num_sms": 128,
2150
+ "template_id": 59
2151
+ },
2152
+ "model.layers.33.self_attn.o_proj": {
2153
+ "K": 5120,
2154
+ "M": 1,
2155
+ "N": 5120,
2156
+ "device": "cuda:3",
2157
+ "dtype": "torch.float16",
2158
+ "group_size": 256,
2159
+ "num_bits": 4,
2160
+ "num_sms": 128,
2161
+ "template_id": 46
2162
+ },
2163
+ "model.layers.33.self_attn.q_proj": {
2164
+ "K": 5120,
2165
+ "M": 1,
2166
+ "N": 5120,
2167
+ "device": "cuda:3",
2168
+ "dtype": "torch.float16",
2169
+ "group_size": 256,
2170
+ "num_bits": 4,
2171
+ "num_sms": 128,
2172
+ "template_id": 46
2173
+ },
2174
+ "model.layers.33.self_attn.v_proj": {
2175
+ "K": 5120,
2176
+ "M": 1,
2177
+ "N": 1024,
2178
+ "device": "cuda:3",
2179
+ "dtype": "torch.float16",
2180
+ "group_size": 256,
2181
+ "num_bits": 4,
2182
+ "num_sms": 128,
2183
+ "template_id": 59
2184
+ },
2185
+ "model.layers.34.mlp.down_proj": {
2186
+ "K": 13824,
2187
+ "M": 1,
2188
+ "N": 5120,
2189
+ "device": "cuda:3",
2190
+ "dtype": "torch.float16",
2191
+ "group_size": 256,
2192
+ "num_bits": 4,
2193
+ "num_sms": 128,
2194
+ "template_id": 37
2195
+ },
2196
+ "model.layers.34.mlp.gate_proj": {
2197
+ "K": 5120,
2198
+ "M": 1,
2199
+ "N": 13824,
2200
+ "device": "cuda:3",
2201
+ "dtype": "torch.float16",
2202
+ "group_size": 256,
2203
+ "num_bits": 4,
2204
+ "num_sms": 128,
2205
+ "template_id": 54
2206
+ },
2207
+ "model.layers.34.mlp.up_proj": {
2208
+ "K": 5120,
2209
+ "M": 1,
2210
+ "N": 13824,
2211
+ "device": "cuda:3",
2212
+ "dtype": "torch.float16",
2213
+ "group_size": 256,
2214
+ "num_bits": 4,
2215
+ "num_sms": 128,
2216
+ "template_id": 54
2217
+ },
2218
+ "model.layers.34.self_attn.k_proj": {
2219
+ "K": 5120,
2220
+ "M": 1,
2221
+ "N": 1024,
2222
+ "device": "cuda:3",
2223
+ "dtype": "torch.float16",
2224
+ "group_size": 256,
2225
+ "num_bits": 4,
2226
+ "num_sms": 128,
2227
+ "template_id": 59
2228
+ },
2229
+ "model.layers.34.self_attn.o_proj": {
2230
+ "K": 5120,
2231
+ "M": 1,
2232
+ "N": 5120,
2233
+ "device": "cuda:3",
2234
+ "dtype": "torch.float16",
2235
+ "group_size": 256,
2236
+ "num_bits": 4,
2237
+ "num_sms": 128,
2238
+ "template_id": 46
2239
+ },
2240
+ "model.layers.34.self_attn.q_proj": {
2241
+ "K": 5120,
2242
+ "M": 1,
2243
+ "N": 5120,
2244
+ "device": "cuda:3",
2245
+ "dtype": "torch.float16",
2246
+ "group_size": 256,
2247
+ "num_bits": 4,
2248
+ "num_sms": 128,
2249
+ "template_id": 46
2250
+ },
2251
+ "model.layers.34.self_attn.v_proj": {
2252
+ "K": 5120,
2253
+ "M": 1,
2254
+ "N": 1024,
2255
+ "device": "cuda:3",
2256
+ "dtype": "torch.float16",
2257
+ "group_size": 256,
2258
+ "num_bits": 4,
2259
+ "num_sms": 128,
2260
+ "template_id": 59
2261
+ },
2262
+ "model.layers.35.mlp.down_proj": {
2263
+ "K": 13824,
2264
+ "M": 1,
2265
+ "N": 5120,
2266
+ "device": "cuda:3",
2267
+ "dtype": "torch.float16",
2268
+ "group_size": 256,
2269
+ "num_bits": 4,
2270
+ "num_sms": 128,
2271
+ "template_id": 37
2272
+ },
2273
+ "model.layers.35.mlp.gate_proj": {
2274
+ "K": 5120,
2275
+ "M": 1,
2276
+ "N": 13824,
2277
+ "device": "cuda:3",
2278
+ "dtype": "torch.float16",
2279
+ "group_size": 256,
2280
+ "num_bits": 4,
2281
+ "num_sms": 128,
2282
+ "template_id": 54
2283
+ },
2284
+ "model.layers.35.mlp.up_proj": {
2285
+ "K": 5120,
2286
+ "M": 1,
2287
+ "N": 13824,
2288
+ "device": "cuda:3",
2289
+ "dtype": "torch.float16",
2290
+ "group_size": 256,
2291
+ "num_bits": 4,
2292
+ "num_sms": 128,
2293
+ "template_id": 54
2294
+ },
2295
+ "model.layers.35.self_attn.k_proj": {
2296
+ "K": 5120,
2297
+ "M": 1,
2298
+ "N": 1024,
2299
+ "device": "cuda:3",
2300
+ "dtype": "torch.float16",
2301
+ "group_size": 256,
2302
+ "num_bits": 4,
2303
+ "num_sms": 128,
2304
+ "template_id": 59
2305
+ },
2306
+ "model.layers.35.self_attn.o_proj": {
2307
+ "K": 5120,
2308
+ "M": 1,
2309
+ "N": 5120,
2310
+ "device": "cuda:3",
2311
+ "dtype": "torch.float16",
2312
+ "group_size": 256,
2313
+ "num_bits": 4,
2314
+ "num_sms": 128,
2315
+ "template_id": 46
2316
+ },
2317
+ "model.layers.35.self_attn.q_proj": {
2318
+ "K": 5120,
2319
+ "M": 1,
2320
+ "N": 5120,
2321
+ "device": "cuda:3",
2322
+ "dtype": "torch.float16",
2323
+ "group_size": 256,
2324
+ "num_bits": 4,
2325
+ "num_sms": 128,
2326
+ "template_id": 46
2327
+ },
2328
+ "model.layers.35.self_attn.v_proj": {
2329
+ "K": 5120,
2330
+ "M": 1,
2331
+ "N": 1024,
2332
+ "device": "cuda:3",
2333
+ "dtype": "torch.float16",
2334
+ "group_size": 256,
2335
+ "num_bits": 4,
2336
+ "num_sms": 128,
2337
+ "template_id": 59
2338
+ },
2339
+ "model.layers.36.mlp.down_proj": {
2340
+ "K": 13824,
2341
+ "M": 1,
2342
+ "N": 5120,
2343
+ "device": "cuda:3",
2344
+ "dtype": "torch.float16",
2345
+ "group_size": 256,
2346
+ "num_bits": 4,
2347
+ "num_sms": 128,
2348
+ "template_id": 37
2349
+ },
2350
+ "model.layers.36.mlp.gate_proj": {
2351
+ "K": 5120,
2352
+ "M": 1,
2353
+ "N": 13824,
2354
+ "device": "cuda:3",
2355
+ "dtype": "torch.float16",
2356
+ "group_size": 256,
2357
+ "num_bits": 4,
2358
+ "num_sms": 128,
2359
+ "template_id": 54
2360
+ },
2361
+ "model.layers.36.mlp.up_proj": {
2362
+ "K": 5120,
2363
+ "M": 1,
2364
+ "N": 13824,
2365
+ "device": "cuda:3",
2366
+ "dtype": "torch.float16",
2367
+ "group_size": 256,
2368
+ "num_bits": 4,
2369
+ "num_sms": 128,
2370
+ "template_id": 54
2371
+ },
2372
+ "model.layers.36.self_attn.k_proj": {
2373
+ "K": 5120,
2374
+ "M": 1,
2375
+ "N": 1024,
2376
+ "device": "cuda:3",
2377
+ "dtype": "torch.float16",
2378
+ "group_size": 256,
2379
+ "num_bits": 4,
2380
+ "num_sms": 128,
2381
+ "template_id": 59
2382
+ },
2383
+ "model.layers.36.self_attn.o_proj": {
2384
+ "K": 5120,
2385
+ "M": 1,
2386
+ "N": 5120,
2387
+ "device": "cuda:3",
2388
+ "dtype": "torch.float16",
2389
+ "group_size": 256,
2390
+ "num_bits": 4,
2391
+ "num_sms": 128,
2392
+ "template_id": 46
2393
+ },
2394
+ "model.layers.36.self_attn.q_proj": {
2395
+ "K": 5120,
2396
+ "M": 1,
2397
+ "N": 5120,
2398
+ "device": "cuda:3",
2399
+ "dtype": "torch.float16",
2400
+ "group_size": 256,
2401
+ "num_bits": 4,
2402
+ "num_sms": 128,
2403
+ "template_id": 46
2404
+ },
2405
+ "model.layers.36.self_attn.v_proj": {
2406
+ "K": 5120,
2407
+ "M": 1,
2408
+ "N": 1024,
2409
+ "device": "cuda:3",
2410
+ "dtype": "torch.float16",
2411
+ "group_size": 256,
2412
+ "num_bits": 4,
2413
+ "num_sms": 128,
2414
+ "template_id": 59
2415
+ },
2416
+ "model.layers.37.mlp.down_proj": {
2417
+ "K": 13824,
2418
+ "M": 1,
2419
+ "N": 5120,
2420
+ "device": "cuda:3",
2421
+ "dtype": "torch.float16",
2422
+ "group_size": 256,
2423
+ "num_bits": 4,
2424
+ "num_sms": 128,
2425
+ "template_id": 37
2426
+ },
2427
+ "model.layers.37.mlp.gate_proj": {
2428
+ "K": 5120,
2429
+ "M": 1,
2430
+ "N": 13824,
2431
+ "device": "cuda:3",
2432
+ "dtype": "torch.float16",
2433
+ "group_size": 256,
2434
+ "num_bits": 4,
2435
+ "num_sms": 128,
2436
+ "template_id": 54
2437
+ },
2438
+ "model.layers.37.mlp.up_proj": {
2439
+ "K": 5120,
2440
+ "M": 1,
2441
+ "N": 13824,
2442
+ "device": "cuda:3",
2443
+ "dtype": "torch.float16",
2444
+ "group_size": 256,
2445
+ "num_bits": 4,
2446
+ "num_sms": 128,
2447
+ "template_id": 54
2448
+ },
2449
+ "model.layers.37.self_attn.k_proj": {
2450
+ "K": 5120,
2451
+ "M": 1,
2452
+ "N": 1024,
2453
+ "device": "cuda:3",
2454
+ "dtype": "torch.float16",
2455
+ "group_size": 256,
2456
+ "num_bits": 4,
2457
+ "num_sms": 128,
2458
+ "template_id": 59
2459
+ },
2460
+ "model.layers.37.self_attn.o_proj": {
2461
+ "K": 5120,
2462
+ "M": 1,
2463
+ "N": 5120,
2464
+ "device": "cuda:3",
2465
+ "dtype": "torch.float16",
2466
+ "group_size": 256,
2467
+ "num_bits": 4,
2468
+ "num_sms": 128,
2469
+ "template_id": 46
2470
+ },
2471
+ "model.layers.37.self_attn.q_proj": {
2472
+ "K": 5120,
2473
+ "M": 1,
2474
+ "N": 5120,
2475
+ "device": "cuda:3",
2476
+ "dtype": "torch.float16",
2477
+ "group_size": 256,
2478
+ "num_bits": 4,
2479
+ "num_sms": 128,
2480
+ "template_id": 46
2481
+ },
2482
+ "model.layers.37.self_attn.v_proj": {
2483
+ "K": 5120,
2484
+ "M": 1,
2485
+ "N": 1024,
2486
+ "device": "cuda:3",
2487
+ "dtype": "torch.float16",
2488
+ "group_size": 256,
2489
+ "num_bits": 4,
2490
+ "num_sms": 128,
2491
+ "template_id": 59
2492
+ },
2493
+ "model.layers.38.mlp.down_proj": {
2494
+ "K": 13824,
2495
+ "M": 1,
2496
+ "N": 5120,
2497
+ "device": "cuda:3",
2498
+ "dtype": "torch.float16",
2499
+ "group_size": 256,
2500
+ "num_bits": 4,
2501
+ "num_sms": 128,
2502
+ "template_id": 37
2503
+ },
2504
+ "model.layers.38.mlp.gate_proj": {
2505
+ "K": 5120,
2506
+ "M": 1,
2507
+ "N": 13824,
2508
+ "device": "cuda:3",
2509
+ "dtype": "torch.float16",
2510
+ "group_size": 256,
2511
+ "num_bits": 4,
2512
+ "num_sms": 128,
2513
+ "template_id": 54
2514
+ },
2515
+ "model.layers.38.mlp.up_proj": {
2516
+ "K": 5120,
2517
+ "M": 1,
2518
+ "N": 13824,
2519
+ "device": "cuda:3",
2520
+ "dtype": "torch.float16",
2521
+ "group_size": 256,
2522
+ "num_bits": 4,
2523
+ "num_sms": 128,
2524
+ "template_id": 54
2525
+ },
2526
+ "model.layers.38.self_attn.k_proj": {
2527
+ "K": 5120,
2528
+ "M": 1,
2529
+ "N": 1024,
2530
+ "device": "cuda:3",
2531
+ "dtype": "torch.float16",
2532
+ "group_size": 256,
2533
+ "num_bits": 4,
2534
+ "num_sms": 128,
2535
+ "template_id": 59
2536
+ },
2537
+ "model.layers.38.self_attn.o_proj": {
2538
+ "K": 5120,
2539
+ "M": 1,
2540
+ "N": 5120,
2541
+ "device": "cuda:3",
2542
+ "dtype": "torch.float16",
2543
+ "group_size": 256,
2544
+ "num_bits": 4,
2545
+ "num_sms": 128,
2546
+ "template_id": 46
2547
+ },
2548
+ "model.layers.38.self_attn.q_proj": {
2549
+ "K": 5120,
2550
+ "M": 1,
2551
+ "N": 5120,
2552
+ "device": "cuda:3",
2553
+ "dtype": "torch.float16",
2554
+ "group_size": 256,
2555
+ "num_bits": 4,
2556
+ "num_sms": 128,
2557
+ "template_id": 46
2558
+ },
2559
+ "model.layers.38.self_attn.v_proj": {
2560
+ "K": 5120,
2561
+ "M": 1,
2562
+ "N": 1024,
2563
+ "device": "cuda:3",
2564
+ "dtype": "torch.float16",
2565
+ "group_size": 256,
2566
+ "num_bits": 4,
2567
+ "num_sms": 128,
2568
+ "template_id": 59
2569
+ },
2570
+ "model.layers.39.mlp.down_proj": {
2571
+ "K": 13824,
2572
+ "M": 1,
2573
+ "N": 5120,
2574
+ "device": "cuda:3",
2575
+ "dtype": "torch.float16",
2576
+ "group_size": 256,
2577
+ "num_bits": 4,
2578
+ "num_sms": 128,
2579
+ "template_id": 37
2580
+ },
2581
+ "model.layers.39.mlp.gate_proj": {
2582
+ "K": 5120,
2583
+ "M": 1,
2584
+ "N": 13824,
2585
+ "device": "cuda:3",
2586
+ "dtype": "torch.float16",
2587
+ "group_size": 256,
2588
+ "num_bits": 4,
2589
+ "num_sms": 128,
2590
+ "template_id": 54
2591
+ },
2592
+ "model.layers.39.mlp.up_proj": {
2593
+ "K": 5120,
2594
+ "M": 1,
2595
+ "N": 13824,
2596
+ "device": "cuda:3",
2597
+ "dtype": "torch.float16",
2598
+ "group_size": 256,
2599
+ "num_bits": 4,
2600
+ "num_sms": 128,
2601
+ "template_id": 54
2602
+ },
2603
+ "model.layers.39.self_attn.k_proj": {
2604
+ "K": 5120,
2605
+ "M": 1,
2606
+ "N": 1024,
2607
+ "device": "cuda:3",
2608
+ "dtype": "torch.float16",
2609
+ "group_size": 256,
2610
+ "num_bits": 4,
2611
+ "num_sms": 128,
2612
+ "template_id": 59
2613
+ },
2614
+ "model.layers.39.self_attn.o_proj": {
2615
+ "K": 5120,
2616
+ "M": 1,
2617
+ "N": 5120,
2618
+ "device": "cuda:3",
2619
+ "dtype": "torch.float16",
2620
+ "group_size": 256,
2621
+ "num_bits": 4,
2622
+ "num_sms": 128,
2623
+ "template_id": 46
2624
+ },
2625
+ "model.layers.39.self_attn.q_proj": {
2626
+ "K": 5120,
2627
+ "M": 1,
2628
+ "N": 5120,
2629
+ "device": "cuda:3",
2630
+ "dtype": "torch.float16",
2631
+ "group_size": 256,
2632
+ "num_bits": 4,
2633
+ "num_sms": 128,
2634
+ "template_id": 46
2635
+ },
2636
+ "model.layers.39.self_attn.v_proj": {
2637
+ "K": 5120,
2638
+ "M": 1,
2639
+ "N": 1024,
2640
+ "device": "cuda:3",
2641
+ "dtype": "torch.float16",
2642
+ "group_size": 256,
2643
+ "num_bits": 4,
2644
+ "num_sms": 128,
2645
+ "template_id": 59
2646
+ },
2647
+ "model.layers.4.mlp.down_proj": {
2648
+ "K": 13824,
2649
+ "M": 1,
2650
+ "N": 5120,
2651
+ "device": "cuda:1",
2652
+ "dtype": "torch.float16",
2653
+ "group_size": 256,
2654
+ "num_bits": 4,
2655
+ "num_sms": 128,
2656
+ "template_id": 37
2657
+ },
2658
+ "model.layers.4.mlp.gate_proj": {
2659
+ "K": 5120,
2660
+ "M": 1,
2661
+ "N": 13824,
2662
+ "device": "cuda:1",
2663
+ "dtype": "torch.float16",
2664
+ "group_size": 256,
2665
+ "num_bits": 4,
2666
+ "num_sms": 128,
2667
+ "template_id": 54
2668
+ },
2669
+ "model.layers.4.mlp.up_proj": {
2670
+ "K": 5120,
2671
+ "M": 1,
2672
+ "N": 13824,
2673
+ "device": "cuda:1",
2674
+ "dtype": "torch.float16",
2675
+ "group_size": 256,
2676
+ "num_bits": 4,
2677
+ "num_sms": 128,
2678
+ "template_id": 54
2679
+ },
2680
+ "model.layers.4.self_attn.k_proj": {
2681
+ "K": 5120,
2682
+ "M": 1,
2683
+ "N": 1024,
2684
+ "device": "cuda:1",
2685
+ "dtype": "torch.float16",
2686
+ "group_size": 256,
2687
+ "num_bits": 4,
2688
+ "num_sms": 128,
2689
+ "template_id": 59
2690
+ },
2691
+ "model.layers.4.self_attn.o_proj": {
2692
+ "K": 5120,
2693
+ "M": 1,
2694
+ "N": 5120,
2695
+ "device": "cuda:1",
2696
+ "dtype": "torch.float16",
2697
+ "group_size": 256,
2698
+ "num_bits": 4,
2699
+ "num_sms": 128,
2700
+ "template_id": 46
2701
+ },
2702
+ "model.layers.4.self_attn.q_proj": {
2703
+ "K": 5120,
2704
+ "M": 1,
2705
+ "N": 5120,
2706
+ "device": "cuda:1",
2707
+ "dtype": "torch.float16",
2708
+ "group_size": 256,
2709
+ "num_bits": 4,
2710
+ "num_sms": 128,
2711
+ "template_id": 46
2712
+ },
2713
+ "model.layers.4.self_attn.v_proj": {
2714
+ "K": 5120,
2715
+ "M": 1,
2716
+ "N": 1024,
2717
+ "device": "cuda:1",
2718
+ "dtype": "torch.float16",
2719
+ "group_size": 256,
2720
+ "num_bits": 4,
2721
+ "num_sms": 128,
2722
+ "template_id": 59
2723
+ },
2724
+ "model.layers.40.mlp.down_proj": {
2725
+ "K": 13824,
2726
+ "M": 1,
2727
+ "N": 5120,
2728
+ "device": "cuda:3",
2729
+ "dtype": "torch.float16",
2730
+ "group_size": 256,
2731
+ "num_bits": 4,
2732
+ "num_sms": 128,
2733
+ "template_id": 37
2734
+ },
2735
+ "model.layers.40.mlp.gate_proj": {
2736
+ "K": 5120,
2737
+ "M": 1,
2738
+ "N": 13824,
2739
+ "device": "cuda:3",
2740
+ "dtype": "torch.float16",
2741
+ "group_size": 256,
2742
+ "num_bits": 4,
2743
+ "num_sms": 128,
2744
+ "template_id": 54
2745
+ },
2746
+ "model.layers.40.mlp.up_proj": {
2747
+ "K": 5120,
2748
+ "M": 1,
2749
+ "N": 13824,
2750
+ "device": "cuda:3",
2751
+ "dtype": "torch.float16",
2752
+ "group_size": 256,
2753
+ "num_bits": 4,
2754
+ "num_sms": 128,
2755
+ "template_id": 54
2756
+ },
2757
+ "model.layers.40.self_attn.k_proj": {
2758
+ "K": 5120,
2759
+ "M": 1,
2760
+ "N": 1024,
2761
+ "device": "cuda:3",
2762
+ "dtype": "torch.float16",
2763
+ "group_size": 256,
2764
+ "num_bits": 4,
2765
+ "num_sms": 128,
2766
+ "template_id": 59
2767
+ },
2768
+ "model.layers.40.self_attn.o_proj": {
2769
+ "K": 5120,
2770
+ "M": 1,
2771
+ "N": 5120,
2772
+ "device": "cuda:3",
2773
+ "dtype": "torch.float16",
2774
+ "group_size": 256,
2775
+ "num_bits": 4,
2776
+ "num_sms": 128,
2777
+ "template_id": 46
2778
+ },
2779
+ "model.layers.40.self_attn.q_proj": {
2780
+ "K": 5120,
2781
+ "M": 1,
2782
+ "N": 5120,
2783
+ "device": "cuda:3",
2784
+ "dtype": "torch.float16",
2785
+ "group_size": 256,
2786
+ "num_bits": 4,
2787
+ "num_sms": 128,
2788
+ "template_id": 46
2789
+ },
2790
+ "model.layers.40.self_attn.v_proj": {
2791
+ "K": 5120,
2792
+ "M": 1,
2793
+ "N": 1024,
2794
+ "device": "cuda:3",
2795
+ "dtype": "torch.float16",
2796
+ "group_size": 256,
2797
+ "num_bits": 4,
2798
+ "num_sms": 128,
2799
+ "template_id": 59
2800
+ },
2801
+ "model.layers.41.mlp.down_proj": {
2802
+ "K": 13824,
2803
+ "M": 1,
2804
+ "N": 5120,
2805
+ "device": "cuda:3",
2806
+ "dtype": "torch.float16",
2807
+ "group_size": 256,
2808
+ "num_bits": 4,
2809
+ "num_sms": 128,
2810
+ "template_id": 37
2811
+ },
2812
+ "model.layers.41.mlp.gate_proj": {
2813
+ "K": 5120,
2814
+ "M": 1,
2815
+ "N": 13824,
2816
+ "device": "cuda:3",
2817
+ "dtype": "torch.float16",
2818
+ "group_size": 256,
2819
+ "num_bits": 4,
2820
+ "num_sms": 128,
2821
+ "template_id": 54
2822
+ },
2823
+ "model.layers.41.mlp.up_proj": {
2824
+ "K": 5120,
2825
+ "M": 1,
2826
+ "N": 13824,
2827
+ "device": "cuda:3",
2828
+ "dtype": "torch.float16",
2829
+ "group_size": 256,
2830
+ "num_bits": 4,
2831
+ "num_sms": 128,
2832
+ "template_id": 54
2833
+ },
2834
+ "model.layers.41.self_attn.k_proj": {
2835
+ "K": 5120,
2836
+ "M": 1,
2837
+ "N": 1024,
2838
+ "device": "cuda:3",
2839
+ "dtype": "torch.float16",
2840
+ "group_size": 256,
2841
+ "num_bits": 4,
2842
+ "num_sms": 128,
2843
+ "template_id": 59
2844
+ },
2845
+ "model.layers.41.self_attn.o_proj": {
2846
+ "K": 5120,
2847
+ "M": 1,
2848
+ "N": 5120,
2849
+ "device": "cuda:3",
2850
+ "dtype": "torch.float16",
2851
+ "group_size": 256,
2852
+ "num_bits": 4,
2853
+ "num_sms": 128,
2854
+ "template_id": 46
2855
+ },
2856
+ "model.layers.41.self_attn.q_proj": {
2857
+ "K": 5120,
2858
+ "M": 1,
2859
+ "N": 5120,
2860
+ "device": "cuda:3",
2861
+ "dtype": "torch.float16",
2862
+ "group_size": 256,
2863
+ "num_bits": 4,
2864
+ "num_sms": 128,
2865
+ "template_id": 46
2866
+ },
2867
+ "model.layers.41.self_attn.v_proj": {
2868
+ "K": 5120,
2869
+ "M": 1,
2870
+ "N": 1024,
2871
+ "device": "cuda:3",
2872
+ "dtype": "torch.float16",
2873
+ "group_size": 256,
2874
+ "num_bits": 4,
2875
+ "num_sms": 128,
2876
+ "template_id": 59
2877
+ },
2878
+ "model.layers.42.mlp.down_proj": {
2879
+ "K": 13824,
2880
+ "M": 1,
2881
+ "N": 5120,
2882
+ "device": "cuda:3",
2883
+ "dtype": "torch.float16",
2884
+ "group_size": 256,
2885
+ "num_bits": 4,
2886
+ "num_sms": 128,
2887
+ "template_id": 37
2888
+ },
2889
+ "model.layers.42.mlp.gate_proj": {
2890
+ "K": 5120,
2891
+ "M": 1,
2892
+ "N": 13824,
2893
+ "device": "cuda:3",
2894
+ "dtype": "torch.float16",
2895
+ "group_size": 256,
2896
+ "num_bits": 4,
2897
+ "num_sms": 128,
2898
+ "template_id": 54
2899
+ },
2900
+ "model.layers.42.mlp.up_proj": {
2901
+ "K": 5120,
2902
+ "M": 1,
2903
+ "N": 13824,
2904
+ "device": "cuda:3",
2905
+ "dtype": "torch.float16",
2906
+ "group_size": 256,
2907
+ "num_bits": 4,
2908
+ "num_sms": 128,
2909
+ "template_id": 54
2910
+ },
2911
+ "model.layers.42.self_attn.k_proj": {
2912
+ "K": 5120,
2913
+ "M": 1,
2914
+ "N": 1024,
2915
+ "device": "cuda:3",
2916
+ "dtype": "torch.float16",
2917
+ "group_size": 256,
2918
+ "num_bits": 4,
2919
+ "num_sms": 128,
2920
+ "template_id": 59
2921
+ },
2922
+ "model.layers.42.self_attn.o_proj": {
2923
+ "K": 5120,
2924
+ "M": 1,
2925
+ "N": 5120,
2926
+ "device": "cuda:3",
2927
+ "dtype": "torch.float16",
2928
+ "group_size": 256,
2929
+ "num_bits": 4,
2930
+ "num_sms": 128,
2931
+ "template_id": 46
2932
+ },
2933
+ "model.layers.42.self_attn.q_proj": {
2934
+ "K": 5120,
2935
+ "M": 1,
2936
+ "N": 5120,
2937
+ "device": "cuda:3",
2938
+ "dtype": "torch.float16",
2939
+ "group_size": 256,
2940
+ "num_bits": 4,
2941
+ "num_sms": 128,
2942
+ "template_id": 46
2943
+ },
2944
+ "model.layers.42.self_attn.v_proj": {
2945
+ "K": 5120,
2946
+ "M": 1,
2947
+ "N": 1024,
2948
+ "device": "cuda:3",
2949
+ "dtype": "torch.float16",
2950
+ "group_size": 256,
2951
+ "num_bits": 4,
2952
+ "num_sms": 128,
2953
+ "template_id": 59
2954
+ },
2955
+ "model.layers.43.mlp.down_proj": {
2956
+ "K": 13824,
2957
+ "M": 1,
2958
+ "N": 5120,
2959
+ "device": "cuda:3",
2960
+ "dtype": "torch.float16",
2961
+ "group_size": 256,
2962
+ "num_bits": 4,
2963
+ "num_sms": 128,
2964
+ "template_id": 37
2965
+ },
2966
+ "model.layers.43.mlp.gate_proj": {
2967
+ "K": 5120,
2968
+ "M": 1,
2969
+ "N": 13824,
2970
+ "device": "cuda:3",
2971
+ "dtype": "torch.float16",
2972
+ "group_size": 256,
2973
+ "num_bits": 4,
2974
+ "num_sms": 128,
2975
+ "template_id": 54
2976
+ },
2977
+ "model.layers.43.mlp.up_proj": {
2978
+ "K": 5120,
2979
+ "M": 1,
2980
+ "N": 13824,
2981
+ "device": "cuda:3",
2982
+ "dtype": "torch.float16",
2983
+ "group_size": 256,
2984
+ "num_bits": 4,
2985
+ "num_sms": 128,
2986
+ "template_id": 54
2987
+ },
2988
+ "model.layers.43.self_attn.k_proj": {
2989
+ "K": 5120,
2990
+ "M": 1,
2991
+ "N": 1024,
2992
+ "device": "cuda:3",
2993
+ "dtype": "torch.float16",
2994
+ "group_size": 256,
2995
+ "num_bits": 4,
2996
+ "num_sms": 128,
2997
+ "template_id": 59
2998
+ },
2999
+ "model.layers.43.self_attn.o_proj": {
3000
+ "K": 5120,
3001
+ "M": 1,
3002
+ "N": 5120,
3003
+ "device": "cuda:3",
3004
+ "dtype": "torch.float16",
3005
+ "group_size": 256,
3006
+ "num_bits": 4,
3007
+ "num_sms": 128,
3008
+ "template_id": 46
3009
+ },
3010
+ "model.layers.43.self_attn.q_proj": {
3011
+ "K": 5120,
3012
+ "M": 1,
3013
+ "N": 5120,
3014
+ "device": "cuda:3",
3015
+ "dtype": "torch.float16",
3016
+ "group_size": 256,
3017
+ "num_bits": 4,
3018
+ "num_sms": 128,
3019
+ "template_id": 46
3020
+ },
3021
+ "model.layers.43.self_attn.v_proj": {
3022
+ "K": 5120,
3023
+ "M": 1,
3024
+ "N": 1024,
3025
+ "device": "cuda:3",
3026
+ "dtype": "torch.float16",
3027
+ "group_size": 256,
3028
+ "num_bits": 4,
3029
+ "num_sms": 128,
3030
+ "template_id": 59
3031
+ },
3032
+ "model.layers.44.mlp.down_proj": {
3033
+ "K": 13824,
3034
+ "M": 1,
3035
+ "N": 5120,
3036
+ "device": "cuda:3",
3037
+ "dtype": "torch.float16",
3038
+ "group_size": 256,
3039
+ "num_bits": 4,
3040
+ "num_sms": 128,
3041
+ "template_id": 37
3042
+ },
3043
+ "model.layers.44.mlp.gate_proj": {
3044
+ "K": 5120,
3045
+ "M": 1,
3046
+ "N": 13824,
3047
+ "device": "cuda:3",
3048
+ "dtype": "torch.float16",
3049
+ "group_size": 256,
3050
+ "num_bits": 4,
3051
+ "num_sms": 128,
3052
+ "template_id": 54
3053
+ },
3054
+ "model.layers.44.mlp.up_proj": {
3055
+ "K": 5120,
3056
+ "M": 1,
3057
+ "N": 13824,
3058
+ "device": "cuda:3",
3059
+ "dtype": "torch.float16",
3060
+ "group_size": 256,
3061
+ "num_bits": 4,
3062
+ "num_sms": 128,
3063
+ "template_id": 54
3064
+ },
3065
+ "model.layers.44.self_attn.k_proj": {
3066
+ "K": 5120,
3067
+ "M": 1,
3068
+ "N": 1024,
3069
+ "device": "cuda:3",
3070
+ "dtype": "torch.float16",
3071
+ "group_size": 256,
3072
+ "num_bits": 4,
3073
+ "num_sms": 128,
3074
+ "template_id": 59
3075
+ },
3076
+ "model.layers.44.self_attn.o_proj": {
3077
+ "K": 5120,
3078
+ "M": 1,
3079
+ "N": 5120,
3080
+ "device": "cuda:3",
3081
+ "dtype": "torch.float16",
3082
+ "group_size": 256,
3083
+ "num_bits": 4,
3084
+ "num_sms": 128,
3085
+ "template_id": 46
3086
+ },
3087
+ "model.layers.44.self_attn.q_proj": {
3088
+ "K": 5120,
3089
+ "M": 1,
3090
+ "N": 5120,
3091
+ "device": "cuda:3",
3092
+ "dtype": "torch.float16",
3093
+ "group_size": 256,
3094
+ "num_bits": 4,
3095
+ "num_sms": 128,
3096
+ "template_id": 46
3097
+ },
3098
+ "model.layers.44.self_attn.v_proj": {
3099
+ "K": 5120,
3100
+ "M": 1,
3101
+ "N": 1024,
3102
+ "device": "cuda:3",
3103
+ "dtype": "torch.float16",
3104
+ "group_size": 256,
3105
+ "num_bits": 4,
3106
+ "num_sms": 128,
3107
+ "template_id": 59
3108
+ },
3109
+ "model.layers.45.mlp.down_proj": {
3110
+ "K": 13824,
3111
+ "M": 1,
3112
+ "N": 5120,
3113
+ "device": "cuda:3",
3114
+ "dtype": "torch.float16",
3115
+ "group_size": 256,
3116
+ "num_bits": 4,
3117
+ "num_sms": 128,
3118
+ "template_id": 37
3119
+ },
3120
+ "model.layers.45.mlp.gate_proj": {
3121
+ "K": 5120,
3122
+ "M": 1,
3123
+ "N": 13824,
3124
+ "device": "cuda:3",
3125
+ "dtype": "torch.float16",
3126
+ "group_size": 256,
3127
+ "num_bits": 4,
3128
+ "num_sms": 128,
3129
+ "template_id": 54
3130
+ },
3131
+ "model.layers.45.mlp.up_proj": {
3132
+ "K": 5120,
3133
+ "M": 1,
3134
+ "N": 13824,
3135
+ "device": "cuda:3",
3136
+ "dtype": "torch.float16",
3137
+ "group_size": 256,
3138
+ "num_bits": 4,
3139
+ "num_sms": 128,
3140
+ "template_id": 54
3141
+ },
3142
+ "model.layers.45.self_attn.k_proj": {
3143
+ "K": 5120,
3144
+ "M": 1,
3145
+ "N": 1024,
3146
+ "device": "cuda:3",
3147
+ "dtype": "torch.float16",
3148
+ "group_size": 256,
3149
+ "num_bits": 4,
3150
+ "num_sms": 128,
3151
+ "template_id": 59
3152
+ },
3153
+ "model.layers.45.self_attn.o_proj": {
3154
+ "K": 5120,
3155
+ "M": 1,
3156
+ "N": 5120,
3157
+ "device": "cuda:3",
3158
+ "dtype": "torch.float16",
3159
+ "group_size": 256,
3160
+ "num_bits": 4,
3161
+ "num_sms": 128,
3162
+ "template_id": 46
3163
+ },
3164
+ "model.layers.45.self_attn.q_proj": {
3165
+ "K": 5120,
3166
+ "M": 1,
3167
+ "N": 5120,
3168
+ "device": "cuda:3",
3169
+ "dtype": "torch.float16",
3170
+ "group_size": 256,
3171
+ "num_bits": 4,
3172
+ "num_sms": 128,
3173
+ "template_id": 46
3174
+ },
3175
+ "model.layers.45.self_attn.v_proj": {
3176
+ "K": 5120,
3177
+ "M": 1,
3178
+ "N": 1024,
3179
+ "device": "cuda:3",
3180
+ "dtype": "torch.float16",
3181
+ "group_size": 256,
3182
+ "num_bits": 4,
3183
+ "num_sms": 128,
3184
+ "template_id": 59
3185
+ },
3186
+ "model.layers.46.mlp.down_proj": {
3187
+ "K": 13824,
3188
+ "M": 1,
3189
+ "N": 5120,
3190
+ "device": "cuda:3",
3191
+ "dtype": "torch.float16",
3192
+ "group_size": 256,
3193
+ "num_bits": 4,
3194
+ "num_sms": 128,
3195
+ "template_id": 37
3196
+ },
3197
+ "model.layers.46.mlp.gate_proj": {
3198
+ "K": 5120,
3199
+ "M": 1,
3200
+ "N": 13824,
3201
+ "device": "cuda:3",
3202
+ "dtype": "torch.float16",
3203
+ "group_size": 256,
3204
+ "num_bits": 4,
3205
+ "num_sms": 128,
3206
+ "template_id": 54
3207
+ },
3208
+ "model.layers.46.mlp.up_proj": {
3209
+ "K": 5120,
3210
+ "M": 1,
3211
+ "N": 13824,
3212
+ "device": "cuda:3",
3213
+ "dtype": "torch.float16",
3214
+ "group_size": 256,
3215
+ "num_bits": 4,
3216
+ "num_sms": 128,
3217
+ "template_id": 54
3218
+ },
3219
+ "model.layers.46.self_attn.k_proj": {
3220
+ "K": 5120,
3221
+ "M": 1,
3222
+ "N": 1024,
3223
+ "device": "cuda:3",
3224
+ "dtype": "torch.float16",
3225
+ "group_size": 256,
3226
+ "num_bits": 4,
3227
+ "num_sms": 128,
3228
+ "template_id": 59
3229
+ },
3230
+ "model.layers.46.self_attn.o_proj": {
3231
+ "K": 5120,
3232
+ "M": 1,
3233
+ "N": 5120,
3234
+ "device": "cuda:3",
3235
+ "dtype": "torch.float16",
3236
+ "group_size": 256,
3237
+ "num_bits": 4,
3238
+ "num_sms": 128,
3239
+ "template_id": 46
3240
+ },
3241
+ "model.layers.46.self_attn.q_proj": {
3242
+ "K": 5120,
3243
+ "M": 1,
3244
+ "N": 5120,
3245
+ "device": "cuda:3",
3246
+ "dtype": "torch.float16",
3247
+ "group_size": 256,
3248
+ "num_bits": 4,
3249
+ "num_sms": 128,
3250
+ "template_id": 46
3251
+ },
3252
+ "model.layers.46.self_attn.v_proj": {
3253
+ "K": 5120,
3254
+ "M": 1,
3255
+ "N": 1024,
3256
+ "device": "cuda:3",
3257
+ "dtype": "torch.float16",
3258
+ "group_size": 256,
3259
+ "num_bits": 4,
3260
+ "num_sms": 128,
3261
+ "template_id": 59
3262
+ },
3263
+ "model.layers.47.mlp.down_proj": {
3264
+ "K": 13824,
3265
+ "M": 1,
3266
+ "N": 5120,
3267
+ "device": "cuda:3",
3268
+ "dtype": "torch.float16",
3269
+ "group_size": 256,
3270
+ "num_bits": 4,
3271
+ "num_sms": 128,
3272
+ "template_id": 37
3273
+ },
3274
+ "model.layers.47.mlp.gate_proj": {
3275
+ "K": 5120,
3276
+ "M": 1,
3277
+ "N": 13824,
3278
+ "device": "cuda:3",
3279
+ "dtype": "torch.float16",
3280
+ "group_size": 256,
3281
+ "num_bits": 4,
3282
+ "num_sms": 128,
3283
+ "template_id": 54
3284
+ },
3285
+ "model.layers.47.mlp.up_proj": {
3286
+ "K": 5120,
3287
+ "M": 1,
3288
+ "N": 13824,
3289
+ "device": "cuda:3",
3290
+ "dtype": "torch.float16",
3291
+ "group_size": 256,
3292
+ "num_bits": 4,
3293
+ "num_sms": 128,
3294
+ "template_id": 54
3295
+ },
3296
+ "model.layers.47.self_attn.k_proj": {
3297
+ "K": 5120,
3298
+ "M": 1,
3299
+ "N": 1024,
3300
+ "device": "cuda:3",
3301
+ "dtype": "torch.float16",
3302
+ "group_size": 256,
3303
+ "num_bits": 4,
3304
+ "num_sms": 128,
3305
+ "template_id": 59
3306
+ },
3307
+ "model.layers.47.self_attn.o_proj": {
3308
+ "K": 5120,
3309
+ "M": 1,
3310
+ "N": 5120,
3311
+ "device": "cuda:3",
3312
+ "dtype": "torch.float16",
3313
+ "group_size": 256,
3314
+ "num_bits": 4,
3315
+ "num_sms": 128,
3316
+ "template_id": 46
3317
+ },
3318
+ "model.layers.47.self_attn.q_proj": {
3319
+ "K": 5120,
3320
+ "M": 1,
3321
+ "N": 5120,
3322
+ "device": "cuda:3",
3323
+ "dtype": "torch.float16",
3324
+ "group_size": 256,
3325
+ "num_bits": 4,
3326
+ "num_sms": 128,
3327
+ "template_id": 46
3328
+ },
3329
+ "model.layers.47.self_attn.v_proj": {
3330
+ "K": 5120,
3331
+ "M": 1,
3332
+ "N": 1024,
3333
+ "device": "cuda:3",
3334
+ "dtype": "torch.float16",
3335
+ "group_size": 256,
3336
+ "num_bits": 4,
3337
+ "num_sms": 128,
3338
+ "template_id": 59
3339
+ },
3340
+ "model.layers.5.mlp.down_proj": {
3341
+ "K": 13824,
3342
+ "M": 1,
3343
+ "N": 5120,
3344
+ "device": "cuda:1",
3345
+ "dtype": "torch.float16",
3346
+ "group_size": 256,
3347
+ "num_bits": 4,
3348
+ "num_sms": 128,
3349
+ "template_id": 37
3350
+ },
3351
+ "model.layers.5.mlp.gate_proj": {
3352
+ "K": 5120,
3353
+ "M": 1,
3354
+ "N": 13824,
3355
+ "device": "cuda:1",
3356
+ "dtype": "torch.float16",
3357
+ "group_size": 256,
3358
+ "num_bits": 4,
3359
+ "num_sms": 128,
3360
+ "template_id": 54
3361
+ },
3362
+ "model.layers.5.mlp.up_proj": {
3363
+ "K": 5120,
3364
+ "M": 1,
3365
+ "N": 13824,
3366
+ "device": "cuda:1",
3367
+ "dtype": "torch.float16",
3368
+ "group_size": 256,
3369
+ "num_bits": 4,
3370
+ "num_sms": 128,
3371
+ "template_id": 54
3372
+ },
3373
+ "model.layers.5.self_attn.k_proj": {
3374
+ "K": 5120,
3375
+ "M": 1,
3376
+ "N": 1024,
3377
+ "device": "cuda:1",
3378
+ "dtype": "torch.float16",
3379
+ "group_size": 256,
3380
+ "num_bits": 4,
3381
+ "num_sms": 128,
3382
+ "template_id": 59
3383
+ },
3384
+ "model.layers.5.self_attn.o_proj": {
3385
+ "K": 5120,
3386
+ "M": 1,
3387
+ "N": 5120,
3388
+ "device": "cuda:1",
3389
+ "dtype": "torch.float16",
3390
+ "group_size": 256,
3391
+ "num_bits": 4,
3392
+ "num_sms": 128,
3393
+ "template_id": 46
3394
+ },
3395
+ "model.layers.5.self_attn.q_proj": {
3396
+ "K": 5120,
3397
+ "M": 1,
3398
+ "N": 5120,
3399
+ "device": "cuda:1",
3400
+ "dtype": "torch.float16",
3401
+ "group_size": 256,
3402
+ "num_bits": 4,
3403
+ "num_sms": 128,
3404
+ "template_id": 46
3405
+ },
3406
+ "model.layers.5.self_attn.v_proj": {
3407
+ "K": 5120,
3408
+ "M": 1,
3409
+ "N": 1024,
3410
+ "device": "cuda:1",
3411
+ "dtype": "torch.float16",
3412
+ "group_size": 256,
3413
+ "num_bits": 4,
3414
+ "num_sms": 128,
3415
+ "template_id": 59
3416
+ },
3417
+ "model.layers.6.mlp.down_proj": {
3418
+ "K": 13824,
3419
+ "M": 1,
3420
+ "N": 5120,
3421
+ "device": "cuda:1",
3422
+ "dtype": "torch.float16",
3423
+ "group_size": 256,
3424
+ "num_bits": 4,
3425
+ "num_sms": 128,
3426
+ "template_id": 37
3427
+ },
3428
+ "model.layers.6.mlp.gate_proj": {
3429
+ "K": 5120,
3430
+ "M": 1,
3431
+ "N": 13824,
3432
+ "device": "cuda:1",
3433
+ "dtype": "torch.float16",
3434
+ "group_size": 256,
3435
+ "num_bits": 4,
3436
+ "num_sms": 128,
3437
+ "template_id": 54
3438
+ },
3439
+ "model.layers.6.mlp.up_proj": {
3440
+ "K": 5120,
3441
+ "M": 1,
3442
+ "N": 13824,
3443
+ "device": "cuda:1",
3444
+ "dtype": "torch.float16",
3445
+ "group_size": 256,
3446
+ "num_bits": 4,
3447
+ "num_sms": 128,
3448
+ "template_id": 54
3449
+ },
3450
+ "model.layers.6.self_attn.k_proj": {
3451
+ "K": 5120,
3452
+ "M": 1,
3453
+ "N": 1024,
3454
+ "device": "cuda:1",
3455
+ "dtype": "torch.float16",
3456
+ "group_size": 256,
3457
+ "num_bits": 4,
3458
+ "num_sms": 128,
3459
+ "template_id": 59
3460
+ },
3461
+ "model.layers.6.self_attn.o_proj": {
3462
+ "K": 5120,
3463
+ "M": 1,
3464
+ "N": 5120,
3465
+ "device": "cuda:1",
3466
+ "dtype": "torch.float16",
3467
+ "group_size": 256,
3468
+ "num_bits": 4,
3469
+ "num_sms": 128,
3470
+ "template_id": 46
3471
+ },
3472
+ "model.layers.6.self_attn.q_proj": {
3473
+ "K": 5120,
3474
+ "M": 1,
3475
+ "N": 5120,
3476
+ "device": "cuda:1",
3477
+ "dtype": "torch.float16",
3478
+ "group_size": 256,
3479
+ "num_bits": 4,
3480
+ "num_sms": 128,
3481
+ "template_id": 46
3482
+ },
3483
+ "model.layers.6.self_attn.v_proj": {
3484
+ "K": 5120,
3485
+ "M": 1,
3486
+ "N": 1024,
3487
+ "device": "cuda:1",
3488
+ "dtype": "torch.float16",
3489
+ "group_size": 256,
3490
+ "num_bits": 4,
3491
+ "num_sms": 128,
3492
+ "template_id": 59
3493
+ },
3494
+ "model.layers.7.mlp.down_proj": {
3495
+ "K": 13824,
3496
+ "M": 1,
3497
+ "N": 5120,
3498
+ "device": "cuda:2",
3499
+ "dtype": "torch.float16",
3500
+ "group_size": 256,
3501
+ "num_bits": 4,
3502
+ "num_sms": 128,
3503
+ "template_id": 37
3504
+ },
3505
+ "model.layers.7.mlp.gate_proj": {
3506
+ "K": 5120,
3507
+ "M": 1,
3508
+ "N": 13824,
3509
+ "device": "cuda:2",
3510
+ "dtype": "torch.float16",
3511
+ "group_size": 256,
3512
+ "num_bits": 4,
3513
+ "num_sms": 128,
3514
+ "template_id": 54
3515
+ },
3516
+ "model.layers.7.mlp.up_proj": {
3517
+ "K": 5120,
3518
+ "M": 1,
3519
+ "N": 13824,
3520
+ "device": "cuda:2",
3521
+ "dtype": "torch.float16",
3522
+ "group_size": 256,
3523
+ "num_bits": 4,
3524
+ "num_sms": 128,
3525
+ "template_id": 54
3526
+ },
3527
+ "model.layers.7.self_attn.k_proj": {
3528
+ "K": 5120,
3529
+ "M": 1,
3530
+ "N": 1024,
3531
+ "device": "cuda:2",
3532
+ "dtype": "torch.float16",
3533
+ "group_size": 256,
3534
+ "num_bits": 4,
3535
+ "num_sms": 128,
3536
+ "template_id": 59
3537
+ },
3538
+ "model.layers.7.self_attn.o_proj": {
3539
+ "K": 5120,
3540
+ "M": 1,
3541
+ "N": 5120,
3542
+ "device": "cuda:2",
3543
+ "dtype": "torch.float16",
3544
+ "group_size": 256,
3545
+ "num_bits": 4,
3546
+ "num_sms": 128,
3547
+ "template_id": 46
3548
+ },
3549
+ "model.layers.7.self_attn.q_proj": {
3550
+ "K": 5120,
3551
+ "M": 1,
3552
+ "N": 5120,
3553
+ "device": "cuda:2",
3554
+ "dtype": "torch.float16",
3555
+ "group_size": 256,
3556
+ "num_bits": 4,
3557
+ "num_sms": 128,
3558
+ "template_id": 46
3559
+ },
3560
+ "model.layers.7.self_attn.v_proj": {
3561
+ "K": 5120,
3562
+ "M": 1,
3563
+ "N": 1024,
3564
+ "device": "cuda:2",
3565
+ "dtype": "torch.float16",
3566
+ "group_size": 256,
3567
+ "num_bits": 4,
3568
+ "num_sms": 128,
3569
+ "template_id": 59
3570
+ },
3571
+ "model.layers.8.mlp.down_proj": {
3572
+ "K": 13824,
3573
+ "M": 1,
3574
+ "N": 5120,
3575
+ "device": "cuda:2",
3576
+ "dtype": "torch.float16",
3577
+ "group_size": 256,
3578
+ "num_bits": 4,
3579
+ "num_sms": 128,
3580
+ "template_id": 37
3581
+ },
3582
+ "model.layers.8.mlp.gate_proj": {
3583
+ "K": 5120,
3584
+ "M": 1,
3585
+ "N": 13824,
3586
+ "device": "cuda:2",
3587
+ "dtype": "torch.float16",
3588
+ "group_size": 256,
3589
+ "num_bits": 4,
3590
+ "num_sms": 128,
3591
+ "template_id": 54
3592
+ },
3593
+ "model.layers.8.mlp.up_proj": {
3594
+ "K": 5120,
3595
+ "M": 1,
3596
+ "N": 13824,
3597
+ "device": "cuda:2",
3598
+ "dtype": "torch.float16",
3599
+ "group_size": 256,
3600
+ "num_bits": 4,
3601
+ "num_sms": 128,
3602
+ "template_id": 54
3603
+ },
3604
+ "model.layers.8.self_attn.k_proj": {
3605
+ "K": 5120,
3606
+ "M": 1,
3607
+ "N": 1024,
3608
+ "device": "cuda:2",
3609
+ "dtype": "torch.float16",
3610
+ "group_size": 256,
3611
+ "num_bits": 4,
3612
+ "num_sms": 128,
3613
+ "template_id": 59
3614
+ },
3615
+ "model.layers.8.self_attn.o_proj": {
3616
+ "K": 5120,
3617
+ "M": 1,
3618
+ "N": 5120,
3619
+ "device": "cuda:2",
3620
+ "dtype": "torch.float16",
3621
+ "group_size": 256,
3622
+ "num_bits": 4,
3623
+ "num_sms": 128,
3624
+ "template_id": 46
3625
+ },
3626
+ "model.layers.8.self_attn.q_proj": {
3627
+ "K": 5120,
3628
+ "M": 1,
3629
+ "N": 5120,
3630
+ "device": "cuda:2",
3631
+ "dtype": "torch.float16",
3632
+ "group_size": 256,
3633
+ "num_bits": 4,
3634
+ "num_sms": 128,
3635
+ "template_id": 46
3636
+ },
3637
+ "model.layers.8.self_attn.v_proj": {
3638
+ "K": 5120,
3639
+ "M": 1,
3640
+ "N": 1024,
3641
+ "device": "cuda:2",
3642
+ "dtype": "torch.float16",
3643
+ "group_size": 256,
3644
+ "num_bits": 4,
3645
+ "num_sms": 128,
3646
+ "template_id": 59
3647
+ },
3648
+ "model.layers.9.mlp.down_proj": {
3649
+ "K": 13824,
3650
+ "M": 1,
3651
+ "N": 5120,
3652
+ "device": "cuda:2",
3653
+ "dtype": "torch.float16",
3654
+ "group_size": 256,
3655
+ "num_bits": 4,
3656
+ "num_sms": 128,
3657
+ "template_id": 37
3658
+ },
3659
+ "model.layers.9.mlp.gate_proj": {
3660
+ "K": 5120,
3661
+ "M": 1,
3662
+ "N": 13824,
3663
+ "device": "cuda:2",
3664
+ "dtype": "torch.float16",
3665
+ "group_size": 256,
3666
+ "num_bits": 4,
3667
+ "num_sms": 128,
3668
+ "template_id": 54
3669
+ },
3670
+ "model.layers.9.mlp.up_proj": {
3671
+ "K": 5120,
3672
+ "M": 1,
3673
+ "N": 13824,
3674
+ "device": "cuda:2",
3675
+ "dtype": "torch.float16",
3676
+ "group_size": 256,
3677
+ "num_bits": 4,
3678
+ "num_sms": 128,
3679
+ "template_id": 54
3680
+ },
3681
+ "model.layers.9.self_attn.k_proj": {
3682
+ "K": 5120,
3683
+ "M": 1,
3684
+ "N": 1024,
3685
+ "device": "cuda:2",
3686
+ "dtype": "torch.float16",
3687
+ "group_size": 256,
3688
+ "num_bits": 4,
3689
+ "num_sms": 128,
3690
+ "template_id": 59
3691
+ },
3692
+ "model.layers.9.self_attn.o_proj": {
3693
+ "K": 5120,
3694
+ "M": 1,
3695
+ "N": 5120,
3696
+ "device": "cuda:2",
3697
+ "dtype": "torch.float16",
3698
+ "group_size": 256,
3699
+ "num_bits": 4,
3700
+ "num_sms": 128,
3701
+ "template_id": 46
3702
+ },
3703
+ "model.layers.9.self_attn.q_proj": {
3704
+ "K": 5120,
3705
+ "M": 1,
3706
+ "N": 5120,
3707
+ "device": "cuda:2",
3708
+ "dtype": "torch.float16",
3709
+ "group_size": 256,
3710
+ "num_bits": 4,
3711
+ "num_sms": 128,
3712
+ "template_id": 46
3713
+ },
3714
+ "model.layers.9.self_attn.v_proj": {
3715
+ "K": 5120,
3716
+ "M": 1,
3717
+ "N": 1024,
3718
+ "device": "cuda:2",
3719
+ "dtype": "torch.float16",
3720
+ "group_size": 256,
3721
+ "num_bits": 4,
3722
+ "num_sms": 128,
3723
+ "template_id": 59
3724
+ }
+ }
+ },
+ "rms_norm_eps": 1e-05,
+ "rope_scaling": null,
+ "rope_theta": 1000000.0,
+ "sliding_window": null,
+ "tie_word_embeddings": false,
+ "torch_dtype": "float16",
+ "transformers_version": "4.49.0.dev0",
+ "use_cache": true,
+ "use_sliding_window": false,
+ "vocab_size": 152064
+ }
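
The per-module tuning entries above follow one regular pattern for this checkpoint: decode-shaped GEMMs (M = 1) at 4 bits with group size 256 on 128-SM GPUs, using template_id 37 for down_proj (K = 13824, N = 5120), 54 for gate_proj/up_proj (K = 5120, N = 13824), 46 for q_proj/o_proj (5120 × 5120) and 59 for k_proj/v_proj (K = 5120, N = 1024), with layers spread over cuda:1 to cuda:3. Below is a minimal sketch of loading the checkpoint and reading these config fields with 🤗 transformers; the repository id is a placeholder (not part of this commit), and the quantized linear layers may additionally require the matching quantization backend to be installed.

```python
# Minimal loading sketch (placeholder repo id, not taken from this commit).
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

repo_id = "your-namespace/your-model"  # placeholder

config = AutoConfig.from_pretrained(repo_id)
print(config.torch_dtype, config.rope_theta, config.vocab_size)  # float16, 1000000.0, 152064

# device_map="auto" spreads the layers across available GPUs, matching the
# per-module "device" entries (cuda:1 / cuda:2 / cuda:3) recorded above.
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(repo_id)
```
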
generation_config.json ADDED
@@ -0,0 +1,9 @@
+ {
+ "_from_model_config": true,
+ "bos_token_id": 151646,
+ "do_sample": true,
+ "eos_token_id": 151643,
+ "temperature": 0.6,
+ "top_p": 0.95,
+ "transformers_version": "4.49.0.dev0"
+ }
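
The generation defaults above enable sampling with temperature 0.6 and top_p 0.95 and pin the special token ids (bos 151646, eos 151643). A short usage sketch follows; the repository id and prompt are placeholders, and the sampling arguments are passed explicitly only to make the file's defaults visible.

```python
# Generation sketch using the defaults from generation_config.json (placeholder repo id).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-namespace/your-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float16, device_map="auto")

inputs = tokenizer("Explain rotary position embeddings briefly.", return_tensors="pt").to(model.device)
# These values mirror generation_config.json; omitting them would give the same behaviour.
out = model.generate(**inputs, do_sample=True, temperature=0.6, top_p=0.95, max_new_tokens=256)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
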
model-00001-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:261d01fa113e5f617b4734621ab28170af477c5bd532a86dd4945bb944fd50ed
+ size 4980774960
model-00002-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:09965be090f0ec83cf4f2de8d78696feb8f0548928acf6ed7cabf067290f0652
+ size 4844958760
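
The two entries above are git-lfs pointer files, so the commit records only the sha256 digest and byte size of each weight shard. Below is a small sketch for verifying downloaded shards against those values; the local paths are assumptions.

```python
# Verify downloaded shards against the LFS pointers recorded in this commit.
import hashlib
from pathlib import Path

expected = {
    "model-00001-of-00002.safetensors": ("261d01fa113e5f617b4734621ab28170af477c5bd532a86dd4945bb944fd50ed", 4980774960),
    "model-00002-of-00002.safetensors": ("09965be090f0ec83cf4f2de8d78696feb8f0548928acf6ed7cabf067290f0652", 4844958760),
}

for name, (digest, size) in expected.items():
    path = Path(name)  # adjust to the local download directory
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    assert path.stat().st_size == size, f"{name}: unexpected size"
    assert h.hexdigest() == digest, f"{name}: sha256 mismatch"
    print(f"{name}: OK")
```
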
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
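
The index file is not rendered here, but sharded 🤗 transformers checkpoints conventionally store a `metadata.total_size` field and a `weight_map` from tensor name to shard file. A sketch of inspecting it under that assumption:

```python
# Inspect model.safetensors.index.json, assuming the usual sharded-checkpoint layout.
import json
from collections import Counter

with open("model.safetensors.index.json") as f:
    index = json.load(f)

print(index.get("metadata", {}).get("total_size"))    # total checkpoint size in bytes
shard_counts = Counter(index["weight_map"].values())  # how many tensors each shard holds
for shard, count in sorted(shard_counts.items()):
    print(f"{shard}: {count} tensors")
```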