Model card side-bar missing?
Heya mradermacher, thanks for all the amazing quants. I was doing a comparison across the most likely top four V3-0324
quants in the ~230GiB size class and wanted to add your info to the table. However, for some reason the gguf details are not showing up?
I might be able to download the Q2_K and use gguf-py/gguf/gguf_reader.py or similar to print out the tensor data to add to my table.
I have some discussion on it here on the ik_llama.cpp fork.
Cheers and happy cookin'!
hf only supports non-split quants, that's probably why they don't show up. that sidebar is not provided by us, we have no influence over it.
Thanks, yeah makes sense. I wonder if it is because the split names are DeepSeek-V3-0324.i1-Q2_K.gguf.part1of5 instead of DeepSeek-V3-0324.i1-Q2_K.part1of5.gguf, as maybe the huggingface webapp is looking at the file extension...
I think I can get the info with:
$ python gguf-py/scripts/gguf_dump.py --markdown some_model.gguf
Cheers!
I don't think hf supports split gguf files at all. Would probably be easy to improve on their side - since they only parse the header, they could go with the .gguf.part\d*1of\d+ file (which contains the header). Clearly, it's not a priority for them, and that's fine with me, too. I don't think it's the file extension alone, though, as they do not get confused by the multi-header llama format (which uses .gguf).
PS: part1of5.gguf wouldn't be correct
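something like this (just a sketch of the idea - the exact rule would be up to them) would pick out the header-bearing part from the naming scheme:

import re

# sketch: select the ".gguf.part1ofN" file, which holds the gguf header
pattern = re.compile(r"\.gguf\.part1of\d+$")
parts = ["DeepSeek-V3-0324.i1-Q2_K.gguf.part1of5",
         "DeepSeek-V3-0324.i1-Q2_K.gguf.part2of5"]
print([p for p in parts if pattern.search(p)])  # only part1of5 matches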
Ahh, thanks for the details. I guess I don't fully understand what a "split gguf" is exactly. In my testing I can run this:
$ du -h DeepSeek-V3-0324-IQ2_K_R4.gguf
227G DeepSeek-V3-0324-IQ2_K_R4.gguf
$ ./build/bin/llama-gguf-split \
--split \
--split-max-size 50G \
./DeepSeek-V3-0324-IQ2_K_R4.gguf \
/models/DeepSeek-V3-0324-IQ2_K_R4/DeepSeek-V3-0324-IQ2_K_R4
$ du -hc /models/DeepSeek-V3-0324-IQ2_K_R4/*.gguf
46G DeepSeek-V3-0324-IQ2_K_R4-00001-of-00005.gguf
47G DeepSeek-V3-0324-IQ2_K_R4-00002-of-00005.gguf
47G DeepSeek-V3-0324-IQ2_K_R4-00003-of-00005.gguf
47G DeepSeek-V3-0324-IQ2_K_R4-00004-of-00005.gguf
43G DeepSeek-V3-0324-IQ2_K_R4-00005-of-00005.gguf
227G total
And huggingface works fine with this, as seen in ubergarm/DeepSeek-V3-0324-GGUF as well as bartowski's and unsloth's repos.
Anyway, I'm just trying to see what exactly is in your quant before downloading the entire thing. I downloaded just the first part and gguf-py/scripts/gguf_dump.py isn't working on it, so I'll try hexedit or find another tool that can at least print out the header information.
Or if you have the time, you could run this on the folder containing your splits and copy-paste the output here. No pressure at all, I know y'all keep busy!
pip install 'numpy<2.0.0'
python llama.cpp/gguf-py/scripts/gguf_dump.py \
--markdown \
DeepSeek-V3-0324-i1-GGUF/DeepSeek-V3-0324.i1-Q2_K.gguf.part1of5
Thanks again!
@ubergarm You only need the first few megabytes of the first part to get the metadata. Just use gguf-parser-windows-amd64.exe as I did in https://huggingface.co/mradermacher/model_requests/discussions/797#67e2c2975baf8e70d6e63d99
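For example, the fixed-size header fields at the very start of part 1 can be read with a few lines of python (a minimal sketch - the filename is just an example):

import struct

# minimal sketch: a GGUF file starts with magic, version,
# tensor count and key-value count, all little-endian
with open("DeepSeek-V3-0324.i1-Q2_K.gguf.part1of5", "rb") as f:
    magic = f.read(4)                          # b"GGUF"
    version, = struct.unpack("<I", f.read(4))  # 3 for current files
    n_tensors, = struct.unpack("<Q", f.read(8))
    n_kv, = struct.unpack("<Q", f.read(8))
print(magic, version, n_tensors, n_kv)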
> I guess I don't fully understand what a "split gguf" is exactly.
Right - a split gguf is simply a single gguf file split into multiple parts; only all the parts together form a valid gguf. That is opposed to a model in multi-part format, where each part is itself a complete gguf file - which is, unfortunately, also something like a gguf split into multiple gguf files. It's just not a split gguf file, but multiple ones.
It is very confusing.
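a quick way to tell the two apart (a sketch, assuming the parts sit in the current directory): in a split gguf only part 1 starts with the GGUF magic, while in the multi-part format every file does.

from pathlib import Path

# sketch: check which files start with the GGUF magic bytes
for p in sorted(Path(".").glob("DeepSeek-V3-0324.i1-Q2_K.gguf.part*")):
    with open(p, "rb") as f:
        print(p.name, f.read(4) == b"GGUF")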
> Or if you have the time, you could run this on the folder containing your splits and copy-paste the output here.
I don't have such a folder, but if you tell me what info you need, I might be able to provide it, if you can't get nico's method to work.
No pressure to look at this, I know u busy cooking! haha...
> It is very confusing.
Ahh, I see now. It is literally the original .gguf binary data cut into pieces of somewhat arbitrary length. Perhaps this precedent was set by TheBloke, as suggested by this gist script.
I would just download it and merge it myself, but I have to check with my server guy about bandwidth usage haha...
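If I understand it right, the parts should just cat back together into the original file, something like (untested on my end):

$ cat DeepSeek-V3-0324.i1-Q2_K.gguf.part?of5 > DeepSeek-V3-0324.i1-Q2_K.gguf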
Basically I want to find out if you use different quantizations for different layers similar to how unsloth is doing it, e.g. Q6_0 for attention and Q2_K for routed expert layers etc. As of now I believe bartowski is using the same smaller quantization across all the layers except keeping token embedding at Q8_0.
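Once I have a merged file I should be able to dump just the per-tensor types with llama.cpp's gguf python package (rough sketch):

from gguf import GGUFReader  # llama.cpp's gguf-py, also on pypi as "gguf"

# sketch: print the quantization type of every tensor to compare recipes
reader = GGUFReader("DeepSeek-V3-0324.i1-Q2_K.gguf")
for t in reader.tensors:
    print(t.name, t.tensor_type.name, list(t.shape))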
Assuming you have a merged gguf (and a WSL or Linux shell), this would show the information missing from the huggingface model card side-bar (since it can't handle the "split gguf"):
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
# curl -LsSf https://astral.sh/uv/install.sh | sh # install uv if needed
uv venv ./venv --python 3.12 --python-preference=only-managed
source venv/bin/activate
uv pip install 'numpy<2.0.0' sentencepiece pyyaml
python gguf-py/gguf/scripts/gguf_dump.py \
--markdown \
DeepSeek-V3-0324-i1-GGUF/DeepSeek-V3-0324.i1-Q2_K.gguf
It would print something out like this (taken from unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-UD-Q2_K_XL-00001-of-00005.gguf) to show the various quantizations. The first output and token embedding part, block 0 (dense layers), and for example block 14 (experts) would be plenty.
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
0 | output.weight | Output (W) | (~927M) 926679040 | 7168 x 129280 x 1 x 1 | Q6_K |
1 | output_norm.weight | Output Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
2 | token_embd.weight | Token Embedding (W) | (~927M) 926679040 | 7168 x 129280 x 1 x 1 | Q4_K |
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
3 | blk.0.attn_kv_a_mqa.weight | Block 0 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | Q6_K |
4 | blk.0.attn_kv_a_norm.weight | Block 0 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
5 | blk.0.attn_kv_b.weight | Block 0 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | Q6_K |
6 | blk.0.attn_norm.weight | Block 0 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
7 | blk.0.attn_output.weight | Block 0 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | Q4_K |
8 | blk.0.attn_q_a.weight | Block 0 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | Q4_K |
9 | blk.0.attn_q_a_norm.weight | Block 0 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
10 | blk.0.attn_q_b.weight | Block 0 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | Q4_K |
11 | blk.0.ffn_down.weight | Block 0 Feed-Forward Network "Down" (W) | (~132M) 132120576 | 18432 x 7168 x 1 x 1 | Q6_K |
12 | blk.0.ffn_gate.weight | Block 0 Feed-Forward Network "Gate" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | Q4_K |
13 | blk.0.ffn_norm.weight | Block 0 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
14 | blk.0.ffn_up.weight | Block 0 Feed-Forward Network "Up" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | Q4_K |
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
226 | blk.14.attn_kv_a_mqa.weight | Block 14 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | Q6_K |
227 | blk.14.attn_kv_a_norm.weight | Block 14 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
228 | blk.14.attn_kv_b.weight | Block 14 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | Q6_K |
229 | blk.14.attn_norm.weight | Block 14 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
230 | blk.14.attn_output.weight | Block 14 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | Q4_K |
231 | blk.14.attn_q_a.weight | Block 14 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | Q4_K |
232 | blk.14.attn_q_a_norm.weight | Block 14 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
233 | blk.14.attn_q_b.weight | Block 14 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | Q4_K |
234 | blk.14.exp_probs_b.bias | Block 14 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
235 | blk.14.ffn_down_exps.weight | Block 14 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | Q2_K |
236 | blk.14.ffn_down_shexp.weight | Block 14 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | Q6_K |
237 | blk.14.ffn_gate_exps.weight | Block 14 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | Q2_K |
238 | blk.14.ffn_gate_inp.weight | Block 14 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
239 | blk.14.ffn_gate_shexp.weight | Block 14 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | Q4_K |
240 | blk.14.ffn_norm.weight | Block 14 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
241 | blk.14.ffn_up_exps.weight | Block 14 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | Q2_K |
242 | blk.14.ffn_up_shexp.weight | Block 14 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | Q4_K |
@ubergarm I ran it for DeepSeek-V3-0324.i1-IQ2_S.gguf
(venv) root@AI:/apool/Meta/llama.cpp# python gguf-py/gguf/scripts/gguf_dump.py \
--markdown \
/mradermacher/root/DeepSeek-V3-0324.i1-IQ2_S.gguf
/mradermacher/root/DeepSeek-V3-0324.i1-IQ2_S.gguf - GGUF Internal File Dump
- Endian: LITTLE endian
Key Value Metadata Store
There are 58 key-value pairs in this file
POS | TYPE | Count | Key | Value |
---|---|---|---|---|
1 | UINT32 | 1 | GGUF.version | 3 |
2 | UINT64 | 1 | GGUF.tensor_count | 1025 |
3 | UINT64 | 1 | GGUF.kv_count | 55 |
4 | STRING | 1 | general.architecture | deepseek2 |
5 | STRING | 1 | general.type | model |
6 | STRING | 1 | general.name | DeepSeek V3 0324 Bf16 |
7 | STRING | 1 | general.size_label | 256x20B |
8 | STRING | 1 | general.license | mit |
9 | UINT32 | 1 | deepseek2.block_count | 61 |
10 | UINT32 | 1 | deepseek2.context_length | 163840 |
11 | UINT32 | 1 | deepseek2.embedding_length | 7168 |
12 | UINT32 | 1 | deepseek2.feed_forward_length | 18432 |
13 | UINT32 | 1 | deepseek2.attention.head_count | 128 |
14 | UINT32 | 1 | deepseek2.attention.head_count_kv | 128 |
15 | FLOAT32 | 1 | deepseek2.rope.freq_base | 10000.0 |
16 | FLOAT32 | 1 | deepseek2.attention.layer_norm_rms_epsilon | 1e-06 |
17 | UINT32 | 1 | deepseek2.expert_used_count | 8 |
18 | UINT32 | 1 | deepseek2.leading_dense_block_count | 3 |
19 | UINT32 | 1 | deepseek2.vocab_size | 129280 |
20 | UINT32 | 1 | deepseek2.attention.q_lora_rank | 1536 |
21 | UINT32 | 1 | deepseek2.attention.kv_lora_rank | 512 |
22 | UINT32 | 1 | deepseek2.attention.key_length | 192 |
23 | UINT32 | 1 | deepseek2.attention.value_length | 128 |
24 | UINT32 | 1 | deepseek2.expert_feed_forward_length | 2048 |
25 | UINT32 | 1 | deepseek2.expert_count | 256 |
26 | UINT32 | 1 | deepseek2.expert_shared_count | 1 |
27 | FLOAT32 | 1 | deepseek2.expert_weights_scale | 2.5 |
28 | BOOL | 1 | deepseek2.expert_weights_norm | True |
29 | UINT32 | 1 | deepseek2.expert_gating_func | 2 |
30 | UINT32 | 1 | deepseek2.rope.dimension_count | 64 |
31 | STRING | 1 | deepseek2.rope.scaling.type | yarn |
32 | FLOAT32 | 1 | deepseek2.rope.scaling.factor | 40.0 |
33 | UINT32 | 1 | deepseek2.rope.scaling.original_context_length | 4096 |
34 | FLOAT32 | 1 | deepseek2.rope.scaling.yarn_log_multiplier | 0.1 |
35 | STRING | 1 | tokenizer.ggml.model | gpt2 |
36 | STRING | 1 | tokenizer.ggml.pre | deepseek-v3 |
37 | [STRING] | 129280 | tokenizer.ggml.tokens | [ <|begin▁of▁sentence|> , <|end▁of▁sentence|> , <|▁pad▁|> , ! , " , ... ] |
38 | [INT32] | 129280 | tokenizer.ggml.token_type | [ 3, 3, 3, 1, 1, 1, 1, ... ] |
39 | [STRING] | 127741 | tokenizer.ggml.merges | [ Ġ t , Ġ a , i n , Ġ Ġ , h e , ... ] |
40 | UINT32 | 1 | tokenizer.ggml.bos_token_id | 0 |
41 | UINT32 | 1 | tokenizer.ggml.eos_token_id | 1 |
42 | UINT32 | 1 | tokenizer.ggml.padding_token_id | 1 |
43 | BOOL | 1 | tokenizer.ggml.add_bos_token | True |
44 | BOOL | 1 | tokenizer.ggml.add_eos_token | False |
45 | STRING | 1 | tokenizer.chat_template | {% if not add_generation_promp ...{{'<|Assistant|>'}}{% endif %} |
46 | UINT32 | 1 | general.quantization_version | 2 |
47 | UINT32 | 1 | general.file_type | 28 |
48 | STRING | 1 | general.url | https://huggingface.co/mradermacher/DeepSeek-V3-0324-i1-GGUF |
49 | STRING | 1 | mradermacher.quantize_version | 2 |
50 | STRING | 1 | mradermacher.quantized_by | mradermacher |
51 | STRING | 1 | mradermacher.quantized_at | 2025-03-31T15:31:56+02:00 |
52 | STRING | 1 | mradermacher.quantized_on | nico1 |
53 | STRING | 1 | general.source.url | https://huggingface.co/deepseek-ai/DeepSeek-V3-0324 |
54 | STRING | 1 | mradermacher.convert_type | hf |
55 | STRING | 1 | quantize.imatrix.file | DeepSeek-V3-0324-i1-GGUF/imatrix.dat |
56 | STRING | 1 | quantize.imatrix.dataset | imatrix-training-full-3 |
57 | INT32 | 1 | quantize.imatrix.entries_count | 720 |
58 | INT32 | 1 | quantize.imatrix.chunks_count | 315 |
Base Tensor Group : ~2B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
0 | output.weight | Output (W) | (~927M) 926679040 | 7168 x 129280 x 1 x 1 | Q5_K |
1 | output_norm.weight | Output Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
2 | token_embd.weight | Token Embedding (W) | (~927M) 926679040 | 7168 x 129280 x 1 x 1 | IQ3_S |
- Total elements in base: ( ~2B) 1853365248
- Percentage of total elements: 0.28%
Block 0 Tensor Group : ~583M Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
3 | blk.0.attn_kv_a_mqa.weight | Block 0 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
4 | blk.0.attn_kv_a_norm.weight | Block 0 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
5 | blk.0.attn_kv_b.weight | Block 0 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
6 | blk.0.attn_norm.weight | Block 0 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
7 | blk.0.attn_output.weight | Block 0 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
8 | blk.0.attn_q_a.weight | Block 0 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
9 | blk.0.attn_q_a_norm.weight | Block 0 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
10 | blk.0.attn_q_b.weight | Block 0 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
11 | blk.0.ffn_down.weight | Block 0 Feed-Forward Network "Down" (W) | (~132M) 132120576 | 18432 x 7168 x 1 x 1 | IQ3_S |
12 | blk.0.ffn_gate.weight | Block 0 Feed-Forward Network "Gate" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | IQ2_XS |
13 | blk.0.ffn_norm.weight | Block 0 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
14 | blk.0.ffn_up.weight | Block 0 Feed-Forward Network "Up" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | IQ2_XS |
- Total elements in blk.0: (~583M) 583483392
- Percentage of total elements: 0.09%
Block 1 Tensor Group : ~583M Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
15 | blk.1.attn_kv_a_mqa.weight | Block 1 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
16 | blk.1.attn_kv_a_norm.weight | Block 1 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
17 | blk.1.attn_kv_b.weight | Block 1 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
18 | blk.1.attn_norm.weight | Block 1 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
19 | blk.1.attn_output.weight | Block 1 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
20 | blk.1.attn_q_a.weight | Block 1 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
21 | blk.1.attn_q_a_norm.weight | Block 1 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
22 | blk.1.attn_q_b.weight | Block 1 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
23 | blk.1.ffn_down.weight | Block 1 Feed-Forward Network "Down" (W) | (~132M) 132120576 | 18432 x 7168 x 1 x 1 | IQ3_S |
24 | blk.1.ffn_gate.weight | Block 1 Feed-Forward Network "Gate" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | IQ2_XS |
25 | blk.1.ffn_norm.weight | Block 1 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
26 | blk.1.ffn_up.weight | Block 1 Feed-Forward Network "Up" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | IQ2_XS |
- Total elements in blk.1: (~583M) 583483392
- Percentage of total elements: 0.09%
Block 2 Tensor Group : ~583M Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
27 | blk.2.attn_kv_a_mqa.weight | Block 2 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
28 | blk.2.attn_kv_a_norm.weight | Block 2 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
29 | blk.2.attn_kv_b.weight | Block 2 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
30 | blk.2.attn_norm.weight | Block 2 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
31 | blk.2.attn_output.weight | Block 2 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
32 | blk.2.attn_q_a.weight | Block 2 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
33 | blk.2.attn_q_a_norm.weight | Block 2 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
34 | blk.2.attn_q_b.weight | Block 2 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
35 | blk.2.ffn_down.weight | Block 2 Feed-Forward Network "Down" (W) | (~132M) 132120576 | 18432 x 7168 x 1 x 1 | IQ3_S |
36 | blk.2.ffn_gate.weight | Block 2 Feed-Forward Network "Gate" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | IQ2_XS |
37 | blk.2.ffn_norm.weight | Block 2 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
38 | blk.2.ffn_up.weight | Block 2 Feed-Forward Network "Up" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | IQ2_XS |
- Total elements in blk.2: (~583M) 583483392
- Percentage of total elements: 0.09%
Block 3 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
39 | blk.3.attn_kv_a_mqa.weight | Block 3 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
40 | blk.3.attn_kv_a_norm.weight | Block 3 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
41 | blk.3.attn_kv_b.weight | Block 3 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
42 | blk.3.attn_norm.weight | Block 3 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
43 | blk.3.attn_output.weight | Block 3 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
44 | blk.3.attn_q_a.weight | Block 3 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
45 | blk.3.attn_q_a_norm.weight | Block 3 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
46 | blk.3.attn_q_b.weight | Block 3 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
47 | blk.3.exp_probs_b.bias | Block 3 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
48 | blk.3.ffn_down_exps.weight | Block 3 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ3_S |
49 | blk.3.ffn_down_shexp.weight | Block 3 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ3_S |
50 | blk.3.ffn_gate_exps.weight | Block 3 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
51 | blk.3.ffn_gate_inp.weight | Block 3 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
52 | blk.3.ffn_gate_shexp.weight | Block 3 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
53 | blk.3.ffn_norm.weight | Block 3 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
54 | blk.3.ffn_up_exps.weight | Block 3 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
55 | blk.3.ffn_up_shexp.weight | Block 3 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.3: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 4 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
56 | blk.4.attn_kv_a_mqa.weight | Block 4 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
57 | blk.4.attn_kv_a_norm.weight | Block 4 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
58 | blk.4.attn_kv_b.weight | Block 4 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
59 | blk.4.attn_norm.weight | Block 4 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
60 | blk.4.attn_output.weight | Block 4 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
61 | blk.4.attn_q_a.weight | Block 4 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
62 | blk.4.attn_q_a_norm.weight | Block 4 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
63 | blk.4.attn_q_b.weight | Block 4 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
64 | blk.4.exp_probs_b.bias | Block 4 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
65 | blk.4.ffn_down_exps.weight | Block 4 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ3_S |
66 | blk.4.ffn_down_shexp.weight | Block 4 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ3_S |
67 | blk.4.ffn_gate_exps.weight | Block 4 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
68 | blk.4.ffn_gate_inp.weight | Block 4 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
69 | blk.4.ffn_gate_shexp.weight | Block 4 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
70 | blk.4.ffn_norm.weight | Block 4 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
71 | blk.4.ffn_up_exps.weight | Block 4 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
72 | blk.4.ffn_up_shexp.weight | Block 4 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.4: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 5 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
73 | blk.5.attn_kv_a_mqa.weight | Block 5 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
74 | blk.5.attn_kv_a_norm.weight | Block 5 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
75 | blk.5.attn_kv_b.weight | Block 5 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
76 | blk.5.attn_norm.weight | Block 5 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
77 | blk.5.attn_output.weight | Block 5 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
78 | blk.5.attn_q_a.weight | Block 5 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
79 | blk.5.attn_q_a_norm.weight | Block 5 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
80 | blk.5.attn_q_b.weight | Block 5 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
81 | blk.5.exp_probs_b.bias | Block 5 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
82 | blk.5.ffn_down_exps.weight | Block 5 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
83 | blk.5.ffn_down_shexp.weight | Block 5 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
84 | blk.5.ffn_gate_exps.weight | Block 5 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
85 | blk.5.ffn_gate_inp.weight | Block 5 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
86 | blk.5.ffn_gate_shexp.weight | Block 5 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
87 | blk.5.ffn_norm.weight | Block 5 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
88 | blk.5.ffn_up_exps.weight | Block 5 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
89 | blk.5.ffn_up_shexp.weight | Block 5 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.5: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 6 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
90 | blk.6.attn_kv_a_mqa.weight | Block 6 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
91 | blk.6.attn_kv_a_norm.weight | Block 6 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
92 | blk.6.attn_kv_b.weight | Block 6 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
93 | blk.6.attn_norm.weight | Block 6 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
94 | blk.6.attn_output.weight | Block 6 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
95 | blk.6.attn_q_a.weight | Block 6 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
96 | blk.6.attn_q_a_norm.weight | Block 6 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
97 | blk.6.attn_q_b.weight | Block 6 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
98 | blk.6.exp_probs_b.bias | Block 6 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
99 | blk.6.ffn_down_exps.weight | Block 6 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
100 | blk.6.ffn_down_shexp.weight | Block 6 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
101 | blk.6.ffn_gate_exps.weight | Block 6 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
102 | blk.6.ffn_gate_inp.weight | Block 6 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
103 | blk.6.ffn_gate_shexp.weight | Block 6 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
104 | blk.6.ffn_norm.weight | Block 6 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
105 | blk.6.ffn_up_exps.weight | Block 6 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
106 | blk.6.ffn_up_shexp.weight | Block 6 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.6: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 7 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
107 | blk.7.attn_kv_a_mqa.weight | Block 7 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
108 | blk.7.attn_kv_a_norm.weight | Block 7 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
109 | blk.7.attn_kv_b.weight | Block 7 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
110 | blk.7.attn_norm.weight | Block 7 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
111 | blk.7.attn_output.weight | Block 7 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
112 | blk.7.attn_q_a.weight | Block 7 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
113 | blk.7.attn_q_a_norm.weight | Block 7 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
114 | blk.7.attn_q_b.weight | Block 7 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
115 | blk.7.exp_probs_b.bias | Block 7 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
116 | blk.7.ffn_down_exps.weight | Block 7 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
117 | blk.7.ffn_down_shexp.weight | Block 7 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
118 | blk.7.ffn_gate_exps.weight | Block 7 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
119 | blk.7.ffn_gate_inp.weight | Block 7 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
120 | blk.7.ffn_gate_shexp.weight | Block 7 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
121 | blk.7.ffn_norm.weight | Block 7 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
122 | blk.7.ffn_up_exps.weight | Block 7 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
123 | blk.7.ffn_up_shexp.weight | Block 7 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.7: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 8 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
124 | blk.8.attn_kv_a_mqa.weight | Block 8 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
125 | blk.8.attn_kv_a_norm.weight | Block 8 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
126 | blk.8.attn_kv_b.weight | Block 8 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
127 | blk.8.attn_norm.weight | Block 8 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
128 | blk.8.attn_output.weight | Block 8 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
129 | blk.8.attn_q_a.weight | Block 8 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
130 | blk.8.attn_q_a_norm.weight | Block 8 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
131 | blk.8.attn_q_b.weight | Block 8 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
132 | blk.8.exp_probs_b.bias | Block 8 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
133 | blk.8.ffn_down_exps.weight | Block 8 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
134 | blk.8.ffn_down_shexp.weight | Block 8 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
135 | blk.8.ffn_gate_exps.weight | Block 8 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
136 | blk.8.ffn_gate_inp.weight | Block 8 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
137 | blk.8.ffn_gate_shexp.weight | Block 8 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
138 | blk.8.ffn_norm.weight | Block 8 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
139 | blk.8.ffn_up_exps.weight | Block 8 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
140 | blk.8.ffn_up_shexp.weight | Block 8 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.8: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 9 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
141 | blk.9.attn_kv_a_mqa.weight | Block 9 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
142 | blk.9.attn_kv_a_norm.weight | Block 9 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
143 | blk.9.attn_kv_b.weight | Block 9 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
144 | blk.9.attn_norm.weight | Block 9 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
145 | blk.9.attn_output.weight | Block 9 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
146 | blk.9.attn_q_a.weight | Block 9 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
147 | blk.9.attn_q_a_norm.weight | Block 9 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
148 | blk.9.attn_q_b.weight | Block 9 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
149 | blk.9.exp_probs_b.bias | Block 9 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
150 | blk.9.ffn_down_exps.weight | Block 9 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
151 | blk.9.ffn_down_shexp.weight | Block 9 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
152 | blk.9.ffn_gate_exps.weight | Block 9 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
153 | blk.9.ffn_gate_inp.weight | Block 9 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
154 | blk.9.ffn_gate_shexp.weight | Block 9 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
155 | blk.9.ffn_norm.weight | Block 9 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
156 | blk.9.ffn_up_exps.weight | Block 9 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
157 | blk.9.ffn_up_shexp.weight | Block 9 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.9: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 10 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
158 | blk.10.attn_kv_a_mqa.weight | Block 10 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
159 | blk.10.attn_kv_a_norm.weight | Block 10 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
160 | blk.10.attn_kv_b.weight | Block 10 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
161 | blk.10.attn_norm.weight | Block 10 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
162 | blk.10.attn_output.weight | Block 10 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
163 | blk.10.attn_q_a.weight | Block 10 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
164 | blk.10.attn_q_a_norm.weight | Block 10 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
165 | blk.10.attn_q_b.weight | Block 10 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
166 | blk.10.exp_probs_b.bias | Block 10 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
167 | blk.10.ffn_down_exps.weight | Block 10 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
168 | blk.10.ffn_down_shexp.weight | Block 10 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
169 | blk.10.ffn_gate_exps.weight | Block 10 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
170 | blk.10.ffn_gate_inp.weight | Block 10 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
171 | blk.10.ffn_gate_shexp.weight | Block 10 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
172 | blk.10.ffn_norm.weight | Block 10 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
173 | blk.10.ffn_up_exps.weight | Block 10 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
174 | blk.10.ffn_up_shexp.weight | Block 10 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.10: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 11 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
175 | blk.11.attn_kv_a_mqa.weight | Block 11 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
176 | blk.11.attn_kv_a_norm.weight | Block 11 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
177 | blk.11.attn_kv_b.weight | Block 11 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
178 | blk.11.attn_norm.weight | Block 11 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
179 | blk.11.attn_output.weight | Block 11 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
180 | blk.11.attn_q_a.weight | Block 11 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
181 | blk.11.attn_q_a_norm.weight | Block 11 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
182 | blk.11.attn_q_b.weight | Block 11 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
183 | blk.11.exp_probs_b.bias | Block 11 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
184 | blk.11.ffn_down_exps.weight | Block 11 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
185 | blk.11.ffn_down_shexp.weight | Block 11 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
186 | blk.11.ffn_gate_exps.weight | Block 11 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
187 | blk.11.ffn_gate_inp.weight | Block 11 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
188 | blk.11.ffn_gate_shexp.weight | Block 11 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
189 | blk.11.ffn_norm.weight | Block 11 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
190 | blk.11.ffn_up_exps.weight | Block 11 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
191 | blk.11.ffn_up_shexp.weight | Block 11 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.11: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 12 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
192 | blk.12.attn_kv_a_mqa.weight | Block 12 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
193 | blk.12.attn_kv_a_norm.weight | Block 12 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
194 | blk.12.attn_kv_b.weight | Block 12 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
195 | blk.12.attn_norm.weight | Block 12 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
196 | blk.12.attn_output.weight | Block 12 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
197 | blk.12.attn_q_a.weight | Block 12 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
198 | blk.12.attn_q_a_norm.weight | Block 12 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
199 | blk.12.attn_q_b.weight | Block 12 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
200 | blk.12.exp_probs_b.bias | Block 12 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
201 | blk.12.ffn_down_exps.weight | Block 12 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
202 | blk.12.ffn_down_shexp.weight | Block 12 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
203 | blk.12.ffn_gate_exps.weight | Block 12 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
204 | blk.12.ffn_gate_inp.weight | Block 12 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
205 | blk.12.ffn_gate_shexp.weight | Block 12 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
206 | blk.12.ffn_norm.weight | Block 12 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
207 | blk.12.ffn_up_exps.weight | Block 12 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
208 | blk.12.ffn_up_shexp.weight | Block 12 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.12: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 13 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
209 | blk.13.attn_kv_a_mqa.weight | Block 13 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
210 | blk.13.attn_kv_a_norm.weight | Block 13 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
211 | blk.13.attn_kv_b.weight | Block 13 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
212 | blk.13.attn_norm.weight | Block 13 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
213 | blk.13.attn_output.weight | Block 13 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
214 | blk.13.attn_q_a.weight | Block 13 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
215 | blk.13.attn_q_a_norm.weight | Block 13 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
216 | blk.13.attn_q_b.weight | Block 13 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
217 | blk.13.exp_probs_b.bias | Block 13 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
218 | blk.13.ffn_down_exps.weight | Block 13 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
219 | blk.13.ffn_down_shexp.weight | Block 13 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
220 | blk.13.ffn_gate_exps.weight | Block 13 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
221 | blk.13.ffn_gate_inp.weight | Block 13 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
222 | blk.13.ffn_gate_shexp.weight | Block 13 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
223 | blk.13.ffn_norm.weight | Block 13 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
224 | blk.13.ffn_up_exps.weight | Block 13 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
225 | blk.13.ffn_up_shexp.weight | Block 13 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.13: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 14 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
226 | blk.14.attn_kv_a_mqa.weight | Block 14 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
227 | blk.14.attn_kv_a_norm.weight | Block 14 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
228 | blk.14.attn_kv_b.weight | Block 14 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
229 | blk.14.attn_norm.weight | Block 14 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
230 | blk.14.attn_output.weight | Block 14 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
231 | blk.14.attn_q_a.weight | Block 14 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
232 | blk.14.attn_q_a_norm.weight | Block 14 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
233 | blk.14.attn_q_b.weight | Block 14 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
234 | blk.14.exp_probs_b.bias | Block 14 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
235 | blk.14.ffn_down_exps.weight | Block 14 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
236 | blk.14.ffn_down_shexp.weight | Block 14 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
237 | blk.14.ffn_gate_exps.weight | Block 14 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
238 | blk.14.ffn_gate_inp.weight | Block 14 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
239 | blk.14.ffn_gate_shexp.weight | Block 14 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
240 | blk.14.ffn_norm.weight | Block 14 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
241 | blk.14.ffn_up_exps.weight | Block 14 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
242 | blk.14.ffn_up_shexp.weight | Block 14 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.14: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 15 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
243 | blk.15.attn_kv_a_mqa.weight | Block 15 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
244 | blk.15.attn_kv_a_norm.weight | Block 15 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
245 | blk.15.attn_kv_b.weight | Block 15 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
246 | blk.15.attn_norm.weight | Block 15 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
247 | blk.15.attn_output.weight | Block 15 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
248 | blk.15.attn_q_a.weight | Block 15 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
249 | blk.15.attn_q_a_norm.weight | Block 15 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
250 | blk.15.attn_q_b.weight | Block 15 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
251 | blk.15.exp_probs_b.bias | Block 15 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
252 | blk.15.ffn_down_exps.weight | Block 15 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
253 | blk.15.ffn_down_shexp.weight | Block 15 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
254 | blk.15.ffn_gate_exps.weight | Block 15 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
255 | blk.15.ffn_gate_inp.weight | Block 15 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
256 | blk.15.ffn_gate_shexp.weight | Block 15 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
257 | blk.15.ffn_norm.weight | Block 15 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
258 | blk.15.ffn_up_exps.weight | Block 15 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
259 | blk.15.ffn_up_shexp.weight | Block 15 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.15: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 16 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
260 | blk.16.attn_kv_a_mqa.weight | Block 16 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
261 | blk.16.attn_kv_a_norm.weight | Block 16 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
262 | blk.16.attn_kv_b.weight | Block 16 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
263 | blk.16.attn_norm.weight | Block 16 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
264 | blk.16.attn_output.weight | Block 16 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
265 | blk.16.attn_q_a.weight | Block 16 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
266 | blk.16.attn_q_a_norm.weight | Block 16 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
267 | blk.16.attn_q_b.weight | Block 16 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
268 | blk.16.exp_probs_b.bias | Block 16 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
269 | blk.16.ffn_down_exps.weight | Block 16 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
270 | blk.16.ffn_down_shexp.weight | Block 16 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
271 | blk.16.ffn_gate_exps.weight | Block 16 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
272 | blk.16.ffn_gate_inp.weight | Block 16 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
273 | blk.16.ffn_gate_shexp.weight | Block 16 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
274 | blk.16.ffn_norm.weight | Block 16 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
275 | blk.16.ffn_up_exps.weight | Block 16 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
276 | blk.16.ffn_up_shexp.weight | Block 16 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.16: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 17 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
277 | blk.17.attn_kv_a_mqa.weight | Block 17 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
278 | blk.17.attn_kv_a_norm.weight | Block 17 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
279 | blk.17.attn_kv_b.weight | Block 17 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
280 | blk.17.attn_norm.weight | Block 17 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
281 | blk.17.attn_output.weight | Block 17 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
282 | blk.17.attn_q_a.weight | Block 17 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
283 | blk.17.attn_q_a_norm.weight | Block 17 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
284 | blk.17.attn_q_b.weight | Block 17 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
285 | blk.17.exp_probs_b.bias | Block 17 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
286 | blk.17.ffn_down_exps.weight | Block 17 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
287 | blk.17.ffn_down_shexp.weight | Block 17 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
288 | blk.17.ffn_gate_exps.weight | Block 17 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
289 | blk.17.ffn_gate_inp.weight | Block 17 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
290 | blk.17.ffn_gate_shexp.weight | Block 17 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
291 | blk.17.ffn_norm.weight | Block 17 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
292 | blk.17.ffn_up_exps.weight | Block 17 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
293 | blk.17.ffn_up_shexp.weight | Block 17 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.17: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 18 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
294 | blk.18.attn_kv_a_mqa.weight | Block 18 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
295 | blk.18.attn_kv_a_norm.weight | Block 18 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
296 | blk.18.attn_kv_b.weight | Block 18 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
297 | blk.18.attn_norm.weight | Block 18 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
298 | blk.18.attn_output.weight | Block 18 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
299 | blk.18.attn_q_a.weight | Block 18 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
300 | blk.18.attn_q_a_norm.weight | Block 18 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
301 | blk.18.attn_q_b.weight | Block 18 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
302 | blk.18.exp_probs_b.bias | Block 18 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
303 | blk.18.ffn_down_exps.weight | Block 18 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
304 | blk.18.ffn_down_shexp.weight | Block 18 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
305 | blk.18.ffn_gate_exps.weight | Block 18 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
306 | blk.18.ffn_gate_inp.weight | Block 18 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
307 | blk.18.ffn_gate_shexp.weight | Block 18 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
308 | blk.18.ffn_norm.weight | Block 18 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
309 | blk.18.ffn_up_exps.weight | Block 18 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
310 | blk.18.ffn_up_shexp.weight | Block 18 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.18: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 19 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
311 | blk.19.attn_kv_a_mqa.weight | Block 19 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
312 | blk.19.attn_kv_a_norm.weight | Block 19 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
313 | blk.19.attn_kv_b.weight | Block 19 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
314 | blk.19.attn_norm.weight | Block 19 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
315 | blk.19.attn_output.weight | Block 19 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
316 | blk.19.attn_q_a.weight | Block 19 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
317 | blk.19.attn_q_a_norm.weight | Block 19 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
318 | blk.19.attn_q_b.weight | Block 19 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
319 | blk.19.exp_probs_b.bias | Block 19 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
320 | blk.19.ffn_down_exps.weight | Block 19 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
321 | blk.19.ffn_down_shexp.weight | Block 19 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
322 | blk.19.ffn_gate_exps.weight | Block 19 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
323 | blk.19.ffn_gate_inp.weight | Block 19 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
324 | blk.19.ffn_gate_shexp.weight | Block 19 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
325 | blk.19.ffn_norm.weight | Block 19 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
326 | blk.19.ffn_up_exps.weight | Block 19 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
327 | blk.19.ffn_up_shexp.weight | Block 19 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.19: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 20 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
328 | blk.20.attn_kv_a_mqa.weight | Block 20 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
329 | blk.20.attn_kv_a_norm.weight | Block 20 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
330 | blk.20.attn_kv_b.weight | Block 20 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
331 | blk.20.attn_norm.weight | Block 20 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
332 | blk.20.attn_output.weight | Block 20 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
333 | blk.20.attn_q_a.weight | Block 20 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
334 | blk.20.attn_q_a_norm.weight | Block 20 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
335 | blk.20.attn_q_b.weight | Block 20 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
336 | blk.20.exp_probs_b.bias | Block 20 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
337 | blk.20.ffn_down_exps.weight | Block 20 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
338 | blk.20.ffn_down_shexp.weight | Block 20 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
339 | blk.20.ffn_gate_exps.weight | Block 20 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
340 | blk.20.ffn_gate_inp.weight | Block 20 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
341 | blk.20.ffn_gate_shexp.weight | Block 20 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
342 | blk.20.ffn_norm.weight | Block 20 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
343 | blk.20.ffn_up_exps.weight | Block 20 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
344 | blk.20.ffn_up_shexp.weight | Block 20 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.20: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 21 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
345 | blk.21.attn_kv_a_mqa.weight | Block 21 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
346 | blk.21.attn_kv_a_norm.weight | Block 21 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
347 | blk.21.attn_kv_b.weight | Block 21 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
348 | blk.21.attn_norm.weight | Block 21 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
349 | blk.21.attn_output.weight | Block 21 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
350 | blk.21.attn_q_a.weight | Block 21 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
351 | blk.21.attn_q_a_norm.weight | Block 21 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
352 | blk.21.attn_q_b.weight | Block 21 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
353 | blk.21.exp_probs_b.bias | Block 21 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
354 | blk.21.ffn_down_exps.weight | Block 21 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
355 | blk.21.ffn_down_shexp.weight | Block 21 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
356 | blk.21.ffn_gate_exps.weight | Block 21 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
357 | blk.21.ffn_gate_inp.weight | Block 21 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
358 | blk.21.ffn_gate_shexp.weight | Block 21 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
359 | blk.21.ffn_norm.weight | Block 21 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
360 | blk.21.ffn_up_exps.weight | Block 21 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
361 | blk.21.ffn_up_shexp.weight | Block 21 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.21: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 22 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
362 | blk.22.attn_kv_a_mqa.weight | Block 22 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
363 | blk.22.attn_kv_a_norm.weight | Block 22 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
364 | blk.22.attn_kv_b.weight | Block 22 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
365 | blk.22.attn_norm.weight | Block 22 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
366 | blk.22.attn_output.weight | Block 22 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
367 | blk.22.attn_q_a.weight | Block 22 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
368 | blk.22.attn_q_a_norm.weight | Block 22 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
369 | blk.22.attn_q_b.weight | Block 22 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
370 | blk.22.exp_probs_b.bias | Block 22 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
371 | blk.22.ffn_down_exps.weight | Block 22 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
372 | blk.22.ffn_down_shexp.weight | Block 22 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
373 | blk.22.ffn_gate_exps.weight | Block 22 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
374 | blk.22.ffn_gate_inp.weight | Block 22 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
375 | blk.22.ffn_gate_shexp.weight | Block 22 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
376 | blk.22.ffn_norm.weight | Block 22 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
377 | blk.22.ffn_up_exps.weight | Block 22 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
378 | blk.22.ffn_up_shexp.weight | Block 22 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.22: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 23 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
379 | blk.23.attn_kv_a_mqa.weight | Block 23 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
380 | blk.23.attn_kv_a_norm.weight | Block 23 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
381 | blk.23.attn_kv_b.weight | Block 23 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
382 | blk.23.attn_norm.weight | Block 23 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
383 | blk.23.attn_output.weight | Block 23 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
384 | blk.23.attn_q_a.weight | Block 23 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
385 | blk.23.attn_q_a_norm.weight | Block 23 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
386 | blk.23.attn_q_b.weight | Block 23 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
387 | blk.23.exp_probs_b.bias | Block 23 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
388 | blk.23.ffn_down_exps.weight | Block 23 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
389 | blk.23.ffn_down_shexp.weight | Block 23 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
390 | blk.23.ffn_gate_exps.weight | Block 23 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
391 | blk.23.ffn_gate_inp.weight | Block 23 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
392 | blk.23.ffn_gate_shexp.weight | Block 23 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
393 | blk.23.ffn_norm.weight | Block 23 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
394 | blk.23.ffn_up_exps.weight | Block 23 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
395 | blk.23.ffn_up_shexp.weight | Block 23 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.23: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 24 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
396 | blk.24.attn_kv_a_mqa.weight | Block 24 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
397 | blk.24.attn_kv_a_norm.weight | Block 24 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
398 | blk.24.attn_kv_b.weight | Block 24 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
399 | blk.24.attn_norm.weight | Block 24 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
400 | blk.24.attn_output.weight | Block 24 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
401 | blk.24.attn_q_a.weight | Block 24 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
402 | blk.24.attn_q_a_norm.weight | Block 24 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
403 | blk.24.attn_q_b.weight | Block 24 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
404 | blk.24.exp_probs_b.bias | Block 24 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
405 | blk.24.ffn_down_exps.weight | Block 24 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
406 | blk.24.ffn_down_shexp.weight | Block 24 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
407 | blk.24.ffn_gate_exps.weight | Block 24 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
408 | blk.24.ffn_gate_inp.weight | Block 24 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
409 | blk.24.ffn_gate_shexp.weight | Block 24 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
410 | blk.24.ffn_norm.weight | Block 24 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
411 | blk.24.ffn_up_exps.weight | Block 24 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
412 | blk.24.ffn_up_shexp.weight | Block 24 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.24: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 25 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
413 | blk.25.attn_kv_a_mqa.weight | Block 25 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
414 | blk.25.attn_kv_a_norm.weight | Block 25 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
415 | blk.25.attn_kv_b.weight | Block 25 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
416 | blk.25.attn_norm.weight | Block 25 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
417 | blk.25.attn_output.weight | Block 25 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
418 | blk.25.attn_q_a.weight | Block 25 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
419 | blk.25.attn_q_a_norm.weight | Block 25 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
420 | blk.25.attn_q_b.weight | Block 25 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
421 | blk.25.exp_probs_b.bias | Block 25 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
422 | blk.25.ffn_down_exps.weight | Block 25 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
423 | blk.25.ffn_down_shexp.weight | Block 25 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
424 | blk.25.ffn_gate_exps.weight | Block 25 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
425 | blk.25.ffn_gate_inp.weight | Block 25 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
426 | blk.25.ffn_gate_shexp.weight | Block 25 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
427 | blk.25.ffn_norm.weight | Block 25 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
428 | blk.25.ffn_up_exps.weight | Block 25 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
429 | blk.25.ffn_up_shexp.weight | Block 25 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.25: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 26 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
430 | blk.26.attn_kv_a_mqa.weight | Block 26 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
431 | blk.26.attn_kv_a_norm.weight | Block 26 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
432 | blk.26.attn_kv_b.weight | Block 26 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
433 | blk.26.attn_norm.weight | Block 26 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
434 | blk.26.attn_output.weight | Block 26 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
435 | blk.26.attn_q_a.weight | Block 26 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
436 | blk.26.attn_q_a_norm.weight | Block 26 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
437 | blk.26.attn_q_b.weight | Block 26 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
438 | blk.26.exp_probs_b.bias | Block 26 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
439 | blk.26.ffn_down_exps.weight | Block 26 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
440 | blk.26.ffn_down_shexp.weight | Block 26 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
441 | blk.26.ffn_gate_exps.weight | Block 26 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
442 | blk.26.ffn_gate_inp.weight | Block 26 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
443 | blk.26.ffn_gate_shexp.weight | Block 26 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
444 | blk.26.ffn_norm.weight | Block 26 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
445 | blk.26.ffn_up_exps.weight | Block 26 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
446 | blk.26.ffn_up_shexp.weight | Block 26 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.26: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 27 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
447 | blk.27.attn_kv_a_mqa.weight | Block 27 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
448 | blk.27.attn_kv_a_norm.weight | Block 27 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
449 | blk.27.attn_kv_b.weight | Block 27 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
450 | blk.27.attn_norm.weight | Block 27 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
451 | blk.27.attn_output.weight | Block 27 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
452 | blk.27.attn_q_a.weight | Block 27 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
453 | blk.27.attn_q_a_norm.weight | Block 27 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
454 | blk.27.attn_q_b.weight | Block 27 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
455 | blk.27.exp_probs_b.bias | Block 27 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
456 | blk.27.ffn_down_exps.weight | Block 27 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
457 | blk.27.ffn_down_shexp.weight | Block 27 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
458 | blk.27.ffn_gate_exps.weight | Block 27 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
459 | blk.27.ffn_gate_inp.weight | Block 27 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
460 | blk.27.ffn_gate_shexp.weight | Block 27 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
461 | blk.27.ffn_norm.weight | Block 27 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
462 | blk.27.ffn_up_exps.weight | Block 27 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
463 | blk.27.ffn_up_shexp.weight | Block 27 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.27: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 28 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
464 | blk.28.attn_kv_a_mqa.weight | Block 28 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
465 | blk.28.attn_kv_a_norm.weight | Block 28 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
466 | blk.28.attn_kv_b.weight | Block 28 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
467 | blk.28.attn_norm.weight | Block 28 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
468 | blk.28.attn_output.weight | Block 28 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
469 | blk.28.attn_q_a.weight | Block 28 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
470 | blk.28.attn_q_a_norm.weight | Block 28 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
471 | blk.28.attn_q_b.weight | Block 28 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
472 | blk.28.exp_probs_b.bias | Block 28 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
473 | blk.28.ffn_down_exps.weight | Block 28 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
474 | blk.28.ffn_down_shexp.weight | Block 28 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
475 | blk.28.ffn_gate_exps.weight | Block 28 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
476 | blk.28.ffn_gate_inp.weight | Block 28 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
477 | blk.28.ffn_gate_shexp.weight | Block 28 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
478 | blk.28.ffn_norm.weight | Block 28 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
479 | blk.28.ffn_up_exps.weight | Block 28 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
480 | blk.28.ffn_up_shexp.weight | Block 28 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.28: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 29 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
481 | blk.29.attn_kv_a_mqa.weight | Block 29 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
482 | blk.29.attn_kv_a_norm.weight | Block 29 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
483 | blk.29.attn_kv_b.weight | Block 29 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
484 | blk.29.attn_norm.weight | Block 29 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
485 | blk.29.attn_output.weight | Block 29 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
486 | blk.29.attn_q_a.weight | Block 29 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
487 | blk.29.attn_q_a_norm.weight | Block 29 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
488 | blk.29.attn_q_b.weight | Block 29 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
489 | blk.29.exp_probs_b.bias | Block 29 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
490 | blk.29.ffn_down_exps.weight | Block 29 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
491 | blk.29.ffn_down_shexp.weight | Block 29 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
492 | blk.29.ffn_gate_exps.weight | Block 29 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
493 | blk.29.ffn_gate_inp.weight | Block 29 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
494 | blk.29.ffn_gate_shexp.weight | Block 29 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
495 | blk.29.ffn_norm.weight | Block 29 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
496 | blk.29.ffn_up_exps.weight | Block 29 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
497 | blk.29.ffn_up_shexp.weight | Block 29 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.29: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 30 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
498 | blk.30.attn_kv_a_mqa.weight | Block 30 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
499 | blk.30.attn_kv_a_norm.weight | Block 30 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
500 | blk.30.attn_kv_b.weight | Block 30 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
501 | blk.30.attn_norm.weight | Block 30 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
502 | blk.30.attn_output.weight | Block 30 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
503 | blk.30.attn_q_a.weight | Block 30 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
504 | blk.30.attn_q_a_norm.weight | Block 30 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
505 | blk.30.attn_q_b.weight | Block 30 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
506 | blk.30.exp_probs_b.bias | Block 30 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
507 | blk.30.ffn_down_exps.weight | Block 30 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
508 | blk.30.ffn_down_shexp.weight | Block 30 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
509 | blk.30.ffn_gate_exps.weight | Block 30 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
510 | blk.30.ffn_gate_inp.weight | Block 30 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
511 | blk.30.ffn_gate_shexp.weight | Block 30 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
512 | blk.30.ffn_norm.weight | Block 30 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
513 | blk.30.ffn_up_exps.weight | Block 30 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
514 | blk.30.ffn_up_shexp.weight | Block 30 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.30: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 31 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
515 | blk.31.attn_kv_a_mqa.weight | Block 31 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
516 | blk.31.attn_kv_a_norm.weight | Block 31 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
517 | blk.31.attn_kv_b.weight | Block 31 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
518 | blk.31.attn_norm.weight | Block 31 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
519 | blk.31.attn_output.weight | Block 31 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
520 | blk.31.attn_q_a.weight | Block 31 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
521 | blk.31.attn_q_a_norm.weight | Block 31 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
522 | blk.31.attn_q_b.weight | Block 31 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
523 | blk.31.exp_probs_b.bias | Block 31 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
524 | blk.31.ffn_down_exps.weight | Block 31 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
525 | blk.31.ffn_down_shexp.weight | Block 31 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
526 | blk.31.ffn_gate_exps.weight | Block 31 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
527 | blk.31.ffn_gate_inp.weight | Block 31 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
528 | blk.31.ffn_gate_shexp.weight | Block 31 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
529 | blk.31.ffn_norm.weight | Block 31 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
530 | blk.31.ffn_up_exps.weight | Block 31 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
531 | blk.31.ffn_up_shexp.weight | Block 31 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.31: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 32 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
532 | blk.32.attn_kv_a_mqa.weight | Block 32 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
533 | blk.32.attn_kv_a_norm.weight | Block 32 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
534 | blk.32.attn_kv_b.weight | Block 32 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
535 | blk.32.attn_norm.weight | Block 32 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
536 | blk.32.attn_output.weight | Block 32 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
537 | blk.32.attn_q_a.weight | Block 32 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
538 | blk.32.attn_q_a_norm.weight | Block 32 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
539 | blk.32.attn_q_b.weight | Block 32 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
540 | blk.32.exp_probs_b.bias | Block 32 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
541 | blk.32.ffn_down_exps.weight | Block 32 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
542 | blk.32.ffn_down_shexp.weight | Block 32 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
543 | blk.32.ffn_gate_exps.weight | Block 32 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
544 | blk.32.ffn_gate_inp.weight | Block 32 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
545 | blk.32.ffn_gate_shexp.weight | Block 32 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
546 | blk.32.ffn_norm.weight | Block 32 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
547 | blk.32.ffn_up_exps.weight | Block 32 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
548 | blk.32.ffn_up_shexp.weight | Block 32 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.32: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 33 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
549 | blk.33.attn_kv_a_mqa.weight | Block 33 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
550 | blk.33.attn_kv_a_norm.weight | Block 33 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
551 | blk.33.attn_kv_b.weight | Block 33 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
552 | blk.33.attn_norm.weight | Block 33 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
553 | blk.33.attn_output.weight | Block 33 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
554 | blk.33.attn_q_a.weight | Block 33 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
555 | blk.33.attn_q_a_norm.weight | Block 33 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
556 | blk.33.attn_q_b.weight | Block 33 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
557 | blk.33.exp_probs_b.bias | Block 33 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
558 | blk.33.ffn_down_exps.weight | Block 33 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
559 | blk.33.ffn_down_shexp.weight | Block 33 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
560 | blk.33.ffn_gate_exps.weight | Block 33 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
561 | blk.33.ffn_gate_inp.weight | Block 33 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
562 | blk.33.ffn_gate_shexp.weight | Block 33 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
563 | blk.33.ffn_norm.weight | Block 33 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
564 | blk.33.ffn_up_exps.weight | Block 33 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
565 | blk.33.ffn_up_shexp.weight | Block 33 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.33: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 34 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
566 | blk.34.attn_kv_a_mqa.weight | Block 34 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
567 | blk.34.attn_kv_a_norm.weight | Block 34 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
568 | blk.34.attn_kv_b.weight | Block 34 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
569 | blk.34.attn_norm.weight | Block 34 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
570 | blk.34.attn_output.weight | Block 34 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
571 | blk.34.attn_q_a.weight | Block 34 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
572 | blk.34.attn_q_a_norm.weight | Block 34 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
573 | blk.34.attn_q_b.weight | Block 34 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
574 | blk.34.exp_probs_b.bias | Block 34 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
575 | blk.34.ffn_down_exps.weight | Block 34 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
576 | blk.34.ffn_down_shexp.weight | Block 34 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
577 | blk.34.ffn_gate_exps.weight | Block 34 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
578 | blk.34.ffn_gate_inp.weight | Block 34 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
579 | blk.34.ffn_gate_shexp.weight | Block 34 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
580 | blk.34.ffn_norm.weight | Block 34 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
581 | blk.34.ffn_up_exps.weight | Block 34 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
582 | blk.34.ffn_up_shexp.weight | Block 34 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.34: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 35 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
583 | blk.35.attn_kv_a_mqa.weight | Block 35 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
584 | blk.35.attn_kv_a_norm.weight | Block 35 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
585 | blk.35.attn_kv_b.weight | Block 35 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
586 | blk.35.attn_norm.weight | Block 35 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
587 | blk.35.attn_output.weight | Block 35 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
588 | blk.35.attn_q_a.weight | Block 35 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
589 | blk.35.attn_q_a_norm.weight | Block 35 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
590 | blk.35.attn_q_b.weight | Block 35 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
591 | blk.35.exp_probs_b.bias | Block 35 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
592 | blk.35.ffn_down_exps.weight | Block 35 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
593 | blk.35.ffn_down_shexp.weight | Block 35 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
594 | blk.35.ffn_gate_exps.weight | Block 35 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
595 | blk.35.ffn_gate_inp.weight | Block 35 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
596 | blk.35.ffn_gate_shexp.weight | Block 35 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
597 | blk.35.ffn_norm.weight | Block 35 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
598 | blk.35.ffn_up_exps.weight | Block 35 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
599 | blk.35.ffn_up_shexp.weight | Block 35 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.35: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 36 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
600 | blk.36.attn_kv_a_mqa.weight | Block 36 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
601 | blk.36.attn_kv_a_norm.weight | Block 36 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
602 | blk.36.attn_kv_b.weight | Block 36 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
603 | blk.36.attn_norm.weight | Block 36 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
604 | blk.36.attn_output.weight | Block 36 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
605 | blk.36.attn_q_a.weight | Block 36 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
606 | blk.36.attn_q_a_norm.weight | Block 36 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
607 | blk.36.attn_q_b.weight | Block 36 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
608 | blk.36.exp_probs_b.bias | Block 36 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
609 | blk.36.ffn_down_exps.weight | Block 36 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
610 | blk.36.ffn_down_shexp.weight | Block 36 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
611 | blk.36.ffn_gate_exps.weight | Block 36 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
612 | blk.36.ffn_gate_inp.weight | Block 36 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
613 | blk.36.ffn_gate_shexp.weight | Block 36 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
614 | blk.36.ffn_norm.weight | Block 36 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
615 | blk.36.ffn_up_exps.weight | Block 36 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
616 | blk.36.ffn_up_shexp.weight | Block 36 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.36: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 37 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
617 | blk.37.attn_kv_a_mqa.weight | Block 37 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
618 | blk.37.attn_kv_a_norm.weight | Block 37 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
619 | blk.37.attn_kv_b.weight | Block 37 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
620 | blk.37.attn_norm.weight | Block 37 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
621 | blk.37.attn_output.weight | Block 37 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
622 | blk.37.attn_q_a.weight | Block 37 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
623 | blk.37.attn_q_a_norm.weight | Block 37 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
624 | blk.37.attn_q_b.weight | Block 37 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
625 | blk.37.exp_probs_b.bias | Block 37 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
626 | blk.37.ffn_down_exps.weight | Block 37 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
627 | blk.37.ffn_down_shexp.weight | Block 37 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
628 | blk.37.ffn_gate_exps.weight | Block 37 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
629 | blk.37.ffn_gate_inp.weight | Block 37 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
630 | blk.37.ffn_gate_shexp.weight | Block 37 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
631 | blk.37.ffn_norm.weight | Block 37 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
632 | blk.37.ffn_up_exps.weight | Block 37 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
633 | blk.37.ffn_up_shexp.weight | Block 37 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.37: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 38 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
634 | blk.38.attn_kv_a_mqa.weight | Block 38 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
635 | blk.38.attn_kv_a_norm.weight | Block 38 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
636 | blk.38.attn_kv_b.weight | Block 38 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
637 | blk.38.attn_norm.weight | Block 38 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
638 | blk.38.attn_output.weight | Block 38 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
639 | blk.38.attn_q_a.weight | Block 38 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
640 | blk.38.attn_q_a_norm.weight | Block 38 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
641 | blk.38.attn_q_b.weight | Block 38 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
642 | blk.38.exp_probs_b.bias | Block 38 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
643 | blk.38.ffn_down_exps.weight | Block 38 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
644 | blk.38.ffn_down_shexp.weight | Block 38 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
645 | blk.38.ffn_gate_exps.weight | Block 38 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
646 | blk.38.ffn_gate_inp.weight | Block 38 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
647 | blk.38.ffn_gate_shexp.weight | Block 38 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
648 | blk.38.ffn_norm.weight | Block 38 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
649 | blk.38.ffn_up_exps.weight | Block 38 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
650 | blk.38.ffn_up_shexp.weight | Block 38 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.38: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 39 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
651 | blk.39.attn_kv_a_mqa.weight | Block 39 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
652 | blk.39.attn_kv_a_norm.weight | Block 39 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
653 | blk.39.attn_kv_b.weight | Block 39 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
654 | blk.39.attn_norm.weight | Block 39 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
655 | blk.39.attn_output.weight | Block 39 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
656 | blk.39.attn_q_a.weight | Block 39 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
657 | blk.39.attn_q_a_norm.weight | Block 39 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
658 | blk.39.attn_q_b.weight | Block 39 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
659 | blk.39.exp_probs_b.bias | Block 39 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
660 | blk.39.ffn_down_exps.weight | Block 39 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
661 | blk.39.ffn_down_shexp.weight | Block 39 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
662 | blk.39.ffn_gate_exps.weight | Block 39 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
663 | blk.39.ffn_gate_inp.weight | Block 39 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
664 | blk.39.ffn_gate_shexp.weight | Block 39 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
665 | blk.39.ffn_norm.weight | Block 39 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
666 | blk.39.ffn_up_exps.weight | Block 39 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
667 | blk.39.ffn_up_shexp.weight | Block 39 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.39: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 40 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
668 | blk.40.attn_kv_a_mqa.weight | Block 40 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
669 | blk.40.attn_kv_a_norm.weight | Block 40 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
670 | blk.40.attn_kv_b.weight | Block 40 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
671 | blk.40.attn_norm.weight | Block 40 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
672 | blk.40.attn_output.weight | Block 40 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
673 | blk.40.attn_q_a.weight | Block 40 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
674 | blk.40.attn_q_a_norm.weight | Block 40 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
675 | blk.40.attn_q_b.weight | Block 40 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
676 | blk.40.exp_probs_b.bias | Block 40 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
677 | blk.40.ffn_down_exps.weight | Block 40 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
678 | blk.40.ffn_down_shexp.weight | Block 40 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
679 | blk.40.ffn_gate_exps.weight | Block 40 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
680 | blk.40.ffn_gate_inp.weight | Block 40 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
681 | blk.40.ffn_gate_shexp.weight | Block 40 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
682 | blk.40.ffn_norm.weight | Block 40 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
683 | blk.40.ffn_up_exps.weight | Block 40 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
684 | blk.40.ffn_up_shexp.weight | Block 40 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.40: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 41 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
685 | blk.41.attn_kv_a_mqa.weight | Block 41 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
686 | blk.41.attn_kv_a_norm.weight | Block 41 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
687 | blk.41.attn_kv_b.weight | Block 41 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
688 | blk.41.attn_norm.weight | Block 41 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
689 | blk.41.attn_output.weight | Block 41 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
690 | blk.41.attn_q_a.weight | Block 41 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
691 | blk.41.attn_q_a_norm.weight | Block 41 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
692 | blk.41.attn_q_b.weight | Block 41 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
693 | blk.41.exp_probs_b.bias | Block 41 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
694 | blk.41.ffn_down_exps.weight | Block 41 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
695 | blk.41.ffn_down_shexp.weight | Block 41 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
696 | blk.41.ffn_gate_exps.weight | Block 41 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
697 | blk.41.ffn_gate_inp.weight | Block 41 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
698 | blk.41.ffn_gate_shexp.weight | Block 41 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
699 | blk.41.ffn_norm.weight | Block 41 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
700 | blk.41.ffn_up_exps.weight | Block 41 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
701 | blk.41.ffn_up_shexp.weight | Block 41 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.41: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 42 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
702 | blk.42.attn_kv_a_mqa.weight | Block 42 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
703 | blk.42.attn_kv_a_norm.weight | Block 42 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
704 | blk.42.attn_kv_b.weight | Block 42 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
705 | blk.42.attn_norm.weight | Block 42 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
706 | blk.42.attn_output.weight | Block 42 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
707 | blk.42.attn_q_a.weight | Block 42 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
708 | blk.42.attn_q_a_norm.weight | Block 42 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
709 | blk.42.attn_q_b.weight | Block 42 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
710 | blk.42.exp_probs_b.bias | Block 42 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
711 | blk.42.ffn_down_exps.weight | Block 42 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
712 | blk.42.ffn_down_shexp.weight | Block 42 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
713 | blk.42.ffn_gate_exps.weight | Block 42 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
714 | blk.42.ffn_gate_inp.weight | Block 42 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
715 | blk.42.ffn_gate_shexp.weight | Block 42 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
716 | blk.42.ffn_norm.weight | Block 42 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
717 | blk.42.ffn_up_exps.weight | Block 42 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
718 | blk.42.ffn_up_shexp.weight | Block 42 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.42: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 43 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
719 | blk.43.attn_kv_a_mqa.weight | Block 43 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
720 | blk.43.attn_kv_a_norm.weight | Block 43 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
721 | blk.43.attn_kv_b.weight | Block 43 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
722 | blk.43.attn_norm.weight | Block 43 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
723 | blk.43.attn_output.weight | Block 43 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
724 | blk.43.attn_q_a.weight | Block 43 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
725 | blk.43.attn_q_a_norm.weight | Block 43 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
726 | blk.43.attn_q_b.weight | Block 43 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
727 | blk.43.exp_probs_b.bias | Block 43 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
728 | blk.43.ffn_down_exps.weight | Block 43 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
729 | blk.43.ffn_down_shexp.weight | Block 43 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
730 | blk.43.ffn_gate_exps.weight | Block 43 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
731 | blk.43.ffn_gate_inp.weight | Block 43 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
732 | blk.43.ffn_gate_shexp.weight | Block 43 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
733 | blk.43.ffn_norm.weight | Block 43 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
734 | blk.43.ffn_up_exps.weight | Block 43 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
735 | blk.43.ffn_up_shexp.weight | Block 43 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.43: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 44 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
736 | blk.44.attn_kv_a_mqa.weight | Block 44 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
737 | blk.44.attn_kv_a_norm.weight | Block 44 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
738 | blk.44.attn_kv_b.weight | Block 44 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
739 | blk.44.attn_norm.weight | Block 44 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
740 | blk.44.attn_output.weight | Block 44 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
741 | blk.44.attn_q_a.weight | Block 44 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
742 | blk.44.attn_q_a_norm.weight | Block 44 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
743 | blk.44.attn_q_b.weight | Block 44 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
744 | blk.44.exp_probs_b.bias | Block 44 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
745 | blk.44.ffn_down_exps.weight | Block 44 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
746 | blk.44.ffn_down_shexp.weight | Block 44 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
747 | blk.44.ffn_gate_exps.weight | Block 44 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
748 | blk.44.ffn_gate_inp.weight | Block 44 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
749 | blk.44.ffn_gate_shexp.weight | Block 44 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
750 | blk.44.ffn_norm.weight | Block 44 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
751 | blk.44.ffn_up_exps.weight | Block 44 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
752 | blk.44.ffn_up_shexp.weight | Block 44 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.44: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 45 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
753 | blk.45.attn_kv_a_mqa.weight | Block 45 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
754 | blk.45.attn_kv_a_norm.weight | Block 45 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
755 | blk.45.attn_kv_b.weight | Block 45 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
756 | blk.45.attn_norm.weight | Block 45 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
757 | blk.45.attn_output.weight | Block 45 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
758 | blk.45.attn_q_a.weight | Block 45 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
759 | blk.45.attn_q_a_norm.weight | Block 45 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
760 | blk.45.attn_q_b.weight | Block 45 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
761 | blk.45.exp_probs_b.bias | Block 45 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
762 | blk.45.ffn_down_exps.weight | Block 45 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
763 | blk.45.ffn_down_shexp.weight | Block 45 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
764 | blk.45.ffn_gate_exps.weight | Block 45 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
765 | blk.45.ffn_gate_inp.weight | Block 45 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
766 | blk.45.ffn_gate_shexp.weight | Block 45 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
767 | blk.45.ffn_norm.weight | Block 45 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
768 | blk.45.ffn_up_exps.weight | Block 45 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
769 | blk.45.ffn_up_shexp.weight | Block 45 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.45: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 46 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
770 | blk.46.attn_kv_a_mqa.weight | Block 46 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
771 | blk.46.attn_kv_a_norm.weight | Block 46 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
772 | blk.46.attn_kv_b.weight | Block 46 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
773 | blk.46.attn_norm.weight | Block 46 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
774 | blk.46.attn_output.weight | Block 46 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
775 | blk.46.attn_q_a.weight | Block 46 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
776 | blk.46.attn_q_a_norm.weight | Block 46 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
777 | blk.46.attn_q_b.weight | Block 46 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
778 | blk.46.exp_probs_b.bias | Block 46 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
779 | blk.46.ffn_down_exps.weight | Block 46 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
780 | blk.46.ffn_down_shexp.weight | Block 46 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
781 | blk.46.ffn_gate_exps.weight | Block 46 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
782 | blk.46.ffn_gate_inp.weight | Block 46 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
783 | blk.46.ffn_gate_shexp.weight | Block 46 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
784 | blk.46.ffn_norm.weight | Block 46 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
785 | blk.46.ffn_up_exps.weight | Block 46 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
786 | blk.46.ffn_up_shexp.weight | Block 46 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.46: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 47 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
787 | blk.47.attn_kv_a_mqa.weight | Block 47 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
788 | blk.47.attn_kv_a_norm.weight | Block 47 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
789 | blk.47.attn_kv_b.weight | Block 47 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
790 | blk.47.attn_norm.weight | Block 47 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
791 | blk.47.attn_output.weight | Block 47 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
792 | blk.47.attn_q_a.weight | Block 47 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
793 | blk.47.attn_q_a_norm.weight | Block 47 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
794 | blk.47.attn_q_b.weight | Block 47 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
795 | blk.47.exp_probs_b.bias | Block 47 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
796 | blk.47.ffn_down_exps.weight | Block 47 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
797 | blk.47.ffn_down_shexp.weight | Block 47 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
798 | blk.47.ffn_gate_exps.weight | Block 47 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
799 | blk.47.ffn_gate_inp.weight | Block 47 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
800 | blk.47.ffn_gate_shexp.weight | Block 47 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
801 | blk.47.ffn_norm.weight | Block 47 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
802 | blk.47.ffn_up_exps.weight | Block 47 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
803 | blk.47.ffn_up_shexp.weight | Block 47 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.47: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 48 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
804 | blk.48.attn_kv_a_mqa.weight | Block 48 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
805 | blk.48.attn_kv_a_norm.weight | Block 48 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
806 | blk.48.attn_kv_b.weight | Block 48 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
807 | blk.48.attn_norm.weight | Block 48 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
808 | blk.48.attn_output.weight | Block 48 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
809 | blk.48.attn_q_a.weight | Block 48 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
810 | blk.48.attn_q_a_norm.weight | Block 48 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
811 | blk.48.attn_q_b.weight | Block 48 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
812 | blk.48.exp_probs_b.bias | Block 48 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
813 | blk.48.ffn_down_exps.weight | Block 48 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
814 | blk.48.ffn_down_shexp.weight | Block 48 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
815 | blk.48.ffn_gate_exps.weight | Block 48 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
816 | blk.48.ffn_gate_inp.weight | Block 48 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
817 | blk.48.ffn_gate_shexp.weight | Block 48 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
818 | blk.48.ffn_norm.weight | Block 48 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
819 | blk.48.ffn_up_exps.weight | Block 48 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
820 | blk.48.ffn_up_shexp.weight | Block 48 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.48: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 49 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
821 | blk.49.attn_kv_a_mqa.weight | Block 49 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
822 | blk.49.attn_kv_a_norm.weight | Block 49 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
823 | blk.49.attn_kv_b.weight | Block 49 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
824 | blk.49.attn_norm.weight | Block 49 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
825 | blk.49.attn_output.weight | Block 49 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
826 | blk.49.attn_q_a.weight | Block 49 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
827 | blk.49.attn_q_a_norm.weight | Block 49 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
828 | blk.49.attn_q_b.weight | Block 49 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
829 | blk.49.exp_probs_b.bias | Block 49 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
830 | blk.49.ffn_down_exps.weight | Block 49 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
831 | blk.49.ffn_down_shexp.weight | Block 49 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
832 | blk.49.ffn_gate_exps.weight | Block 49 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
833 | blk.49.ffn_gate_inp.weight | Block 49 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
834 | blk.49.ffn_gate_shexp.weight | Block 49 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
835 | blk.49.ffn_norm.weight | Block 49 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
836 | blk.49.ffn_up_exps.weight | Block 49 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
837 | blk.49.ffn_up_shexp.weight | Block 49 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.49: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 50 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
838 | blk.50.attn_kv_a_mqa.weight | Block 50 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
839 | blk.50.attn_kv_a_norm.weight | Block 50 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
840 | blk.50.attn_kv_b.weight | Block 50 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
841 | blk.50.attn_norm.weight | Block 50 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
842 | blk.50.attn_output.weight | Block 50 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
843 | blk.50.attn_q_a.weight | Block 50 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
844 | blk.50.attn_q_a_norm.weight | Block 50 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
845 | blk.50.attn_q_b.weight | Block 50 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
846 | blk.50.exp_probs_b.bias | Block 50 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
847 | blk.50.ffn_down_exps.weight | Block 50 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
848 | blk.50.ffn_down_shexp.weight | Block 50 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
849 | blk.50.ffn_gate_exps.weight | Block 50 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
850 | blk.50.ffn_gate_inp.weight | Block 50 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
851 | blk.50.ffn_gate_shexp.weight | Block 50 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
852 | blk.50.ffn_norm.weight | Block 50 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
853 | blk.50.ffn_up_exps.weight | Block 50 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
854 | blk.50.ffn_up_shexp.weight | Block 50 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.50: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 51 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
855 | blk.51.attn_kv_a_mqa.weight | Block 51 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
856 | blk.51.attn_kv_a_norm.weight | Block 51 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
857 | blk.51.attn_kv_b.weight | Block 51 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
858 | blk.51.attn_norm.weight | Block 51 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
859 | blk.51.attn_output.weight | Block 51 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
860 | blk.51.attn_q_a.weight | Block 51 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
861 | blk.51.attn_q_a_norm.weight | Block 51 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
862 | blk.51.attn_q_b.weight | Block 51 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
863 | blk.51.exp_probs_b.bias | Block 51 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
864 | blk.51.ffn_down_exps.weight | Block 51 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
865 | blk.51.ffn_down_shexp.weight | Block 51 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
866 | blk.51.ffn_gate_exps.weight | Block 51 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
867 | blk.51.ffn_gate_inp.weight | Block 51 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
868 | blk.51.ffn_gate_shexp.weight | Block 51 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
869 | blk.51.ffn_norm.weight | Block 51 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
870 | blk.51.ffn_up_exps.weight | Block 51 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
871 | blk.51.ffn_up_shexp.weight | Block 51 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.51: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 52 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
872 | blk.52.attn_kv_a_mqa.weight | Block 52 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
873 | blk.52.attn_kv_a_norm.weight | Block 52 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
874 | blk.52.attn_kv_b.weight | Block 52 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
875 | blk.52.attn_norm.weight | Block 52 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
876 | blk.52.attn_output.weight | Block 52 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
877 | blk.52.attn_q_a.weight | Block 52 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
878 | blk.52.attn_q_a_norm.weight | Block 52 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
879 | blk.52.attn_q_b.weight | Block 52 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
880 | blk.52.exp_probs_b.bias | Block 52 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
881 | blk.52.ffn_down_exps.weight | Block 52 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
882 | blk.52.ffn_down_shexp.weight | Block 52 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
883 | blk.52.ffn_gate_exps.weight | Block 52 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
884 | blk.52.ffn_gate_inp.weight | Block 52 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
885 | blk.52.ffn_gate_shexp.weight | Block 52 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
886 | blk.52.ffn_norm.weight | Block 52 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
887 | blk.52.ffn_up_exps.weight | Block 52 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
888 | blk.52.ffn_up_shexp.weight | Block 52 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.52: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 53 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
889 | blk.53.attn_kv_a_mqa.weight | Block 53 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
890 | blk.53.attn_kv_a_norm.weight | Block 53 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
891 | blk.53.attn_kv_b.weight | Block 53 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
892 | blk.53.attn_norm.weight | Block 53 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
893 | blk.53.attn_output.weight | Block 53 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
894 | blk.53.attn_q_a.weight | Block 53 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
895 | blk.53.attn_q_a_norm.weight | Block 53 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
896 | blk.53.attn_q_b.weight | Block 53 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
897 | blk.53.exp_probs_b.bias | Block 53 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
898 | blk.53.ffn_down_exps.weight | Block 53 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
899 | blk.53.ffn_down_shexp.weight | Block 53 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
900 | blk.53.ffn_gate_exps.weight | Block 53 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
901 | blk.53.ffn_gate_inp.weight | Block 53 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
902 | blk.53.ffn_gate_shexp.weight | Block 53 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
903 | blk.53.ffn_norm.weight | Block 53 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
904 | blk.53.ffn_up_exps.weight | Block 53 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
905 | blk.53.ffn_up_shexp.weight | Block 53 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.53: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 54 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
906 | blk.54.attn_kv_a_mqa.weight | Block 54 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
907 | blk.54.attn_kv_a_norm.weight | Block 54 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
908 | blk.54.attn_kv_b.weight | Block 54 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
909 | blk.54.attn_norm.weight | Block 54 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
910 | blk.54.attn_output.weight | Block 54 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
911 | blk.54.attn_q_a.weight | Block 54 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
912 | blk.54.attn_q_a_norm.weight | Block 54 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
913 | blk.54.attn_q_b.weight | Block 54 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
914 | blk.54.exp_probs_b.bias | Block 54 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
915 | blk.54.ffn_down_exps.weight | Block 54 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
916 | blk.54.ffn_down_shexp.weight | Block 54 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
917 | blk.54.ffn_gate_exps.weight | Block 54 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
918 | blk.54.ffn_gate_inp.weight | Block 54 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
919 | blk.54.ffn_gate_shexp.weight | Block 54 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
920 | blk.54.ffn_norm.weight | Block 54 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
921 | blk.54.ffn_up_exps.weight | Block 54 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
922 | blk.54.ffn_up_shexp.weight | Block 54 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.54: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 55 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
923 | blk.55.attn_kv_a_mqa.weight | Block 55 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
924 | blk.55.attn_kv_a_norm.weight | Block 55 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
925 | blk.55.attn_kv_b.weight | Block 55 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
926 | blk.55.attn_norm.weight | Block 55 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
927 | blk.55.attn_output.weight | Block 55 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
928 | blk.55.attn_q_a.weight | Block 55 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
929 | blk.55.attn_q_a_norm.weight | Block 55 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
930 | blk.55.attn_q_b.weight | Block 55 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
931 | blk.55.exp_probs_b.bias | Block 55 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
932 | blk.55.ffn_down_exps.weight | Block 55 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
933 | blk.55.ffn_down_shexp.weight | Block 55 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
934 | blk.55.ffn_gate_exps.weight | Block 55 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
935 | blk.55.ffn_gate_inp.weight | Block 55 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
936 | blk.55.ffn_gate_shexp.weight | Block 55 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
937 | blk.55.ffn_norm.weight | Block 55 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
938 | blk.55.ffn_up_exps.weight | Block 55 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
939 | blk.55.ffn_up_shexp.weight | Block 55 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.55: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 56 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
940 | blk.56.attn_kv_a_mqa.weight | Block 56 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
941 | blk.56.attn_kv_a_norm.weight | Block 56 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
942 | blk.56.attn_kv_b.weight | Block 56 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
943 | blk.56.attn_norm.weight | Block 56 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
944 | blk.56.attn_output.weight | Block 56 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
945 | blk.56.attn_q_a.weight | Block 56 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
946 | blk.56.attn_q_a_norm.weight | Block 56 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
947 | blk.56.attn_q_b.weight | Block 56 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
948 | blk.56.exp_probs_b.bias | Block 56 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
949 | blk.56.ffn_down_exps.weight | Block 56 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
950 | blk.56.ffn_down_shexp.weight | Block 56 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
951 | blk.56.ffn_gate_exps.weight | Block 56 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
952 | blk.56.ffn_gate_inp.weight | Block 56 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
953 | blk.56.ffn_gate_shexp.weight | Block 56 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
954 | blk.56.ffn_norm.weight | Block 56 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
955 | blk.56.ffn_up_exps.weight | Block 56 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
956 | blk.56.ffn_up_shexp.weight | Block 56 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.56: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 57 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
957 | blk.57.attn_kv_a_mqa.weight | Block 57 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
958 | blk.57.attn_kv_a_norm.weight | Block 57 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
959 | blk.57.attn_kv_b.weight | Block 57 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
960 | blk.57.attn_norm.weight | Block 57 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
961 | blk.57.attn_output.weight | Block 57 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
962 | blk.57.attn_q_a.weight | Block 57 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
963 | blk.57.attn_q_a_norm.weight | Block 57 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
964 | blk.57.attn_q_b.weight | Block 57 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
965 | blk.57.exp_probs_b.bias | Block 57 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
966 | blk.57.ffn_down_exps.weight | Block 57 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
967 | blk.57.ffn_down_shexp.weight | Block 57 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
968 | blk.57.ffn_gate_exps.weight | Block 57 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
969 | blk.57.ffn_gate_inp.weight | Block 57 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
970 | blk.57.ffn_gate_shexp.weight | Block 57 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
971 | blk.57.ffn_norm.weight | Block 57 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
972 | blk.57.ffn_up_exps.weight | Block 57 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
973 | blk.57.ffn_up_shexp.weight | Block 57 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.57: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 58 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
974 | blk.58.attn_kv_a_mqa.weight | Block 58 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
975 | blk.58.attn_kv_a_norm.weight | Block 58 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
976 | blk.58.attn_kv_b.weight | Block 58 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
977 | blk.58.attn_norm.weight | Block 58 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
978 | blk.58.attn_output.weight | Block 58 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
979 | blk.58.attn_q_a.weight | Block 58 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
980 | blk.58.attn_q_a_norm.weight | Block 58 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
981 | blk.58.attn_q_b.weight | Block 58 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
982 | blk.58.exp_probs_b.bias | Block 58 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
983 | blk.58.ffn_down_exps.weight | Block 58 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
984 | blk.58.ffn_down_shexp.weight | Block 58 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
985 | blk.58.ffn_gate_exps.weight | Block 58 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
986 | blk.58.ffn_gate_inp.weight | Block 58 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
987 | blk.58.ffn_gate_shexp.weight | Block 58 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
988 | blk.58.ffn_norm.weight | Block 58 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
989 | blk.58.ffn_up_exps.weight | Block 58 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
990 | blk.58.ffn_up_shexp.weight | Block 58 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.58: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 59 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
991 | blk.59.attn_kv_a_mqa.weight | Block 59 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
992 | blk.59.attn_kv_a_norm.weight | Block 59 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
993 | blk.59.attn_kv_b.weight | Block 59 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
994 | blk.59.attn_norm.weight | Block 59 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
995 | blk.59.attn_output.weight | Block 59 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
996 | blk.59.attn_q_a.weight | Block 59 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
997 | blk.59.attn_q_a_norm.weight | Block 59 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
998 | blk.59.attn_q_b.weight | Block 59 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
999 | blk.59.exp_probs_b.bias | Block 59 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
1000 | blk.59.ffn_down_exps.weight | Block 59 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
1001 | blk.59.ffn_down_shexp.weight | Block 59 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
1002 | blk.59.ffn_gate_exps.weight | Block 59 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
1003 | blk.59.ffn_gate_inp.weight | Block 59 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
1004 | blk.59.ffn_gate_shexp.weight | Block 59 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
1005 | blk.59.ffn_norm.weight | Block 59 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
1006 | blk.59.ffn_up_exps.weight | Block 59 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
1007 | blk.59.ffn_up_shexp.weight | Block 59 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.59: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 60 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
1008 | blk.60.attn_kv_a_mqa.weight | Block 60 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
1009 | blk.60.attn_kv_a_norm.weight | Block 60 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
1010 | blk.60.attn_kv_b.weight | Block 60 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
1011 | blk.60.attn_norm.weight | Block 60 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
1012 | blk.60.attn_output.weight | Block 60 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
1013 | blk.60.attn_q_a.weight | Block 60 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
1014 | blk.60.attn_q_a_norm.weight | Block 60 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
1015 | blk.60.attn_q_b.weight | Block 60 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
1016 | blk.60.exp_probs_b.bias | Block 60 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
1017 | blk.60.ffn_down_exps.weight | Block 60 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
1018 | blk.60.ffn_down_shexp.weight | Block 60 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
1019 | blk.60.ffn_gate_exps.weight | Block 60 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
1020 | blk.60.ffn_gate_inp.weight | Block 60 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
1021 | blk.60.ffn_gate_shexp.weight | Block 60 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
1022 | blk.60.ffn_norm.weight | Block 60 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
1023 | blk.60.ffn_up_exps.weight | Block 60 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
1024 | blk.60.ffn_up_shexp.weight | Block 60 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.60: (~12B) 11507286272
- Percentage of total elements: 1.71%
Perhaps this precedent was set by TheBloke?
No, it's forced by the 50GB file size limit on hf. I don't know how TheBloke split his files, but we split them so you could mmap the parts directly. I think it's the most common format on hf as well. We tried hard to provide the newer format (we even tried to patch gguf-split, but its use of C++ iostreams makes that pretty much impossible). We simply don't have the resources for this format on most servers.
It's not an issue for most models and users, fortunately, but your use case of course is such an example.
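Since a split quant in this format is just one gguf cut into raw byte parts, it can be reassembled into a single file if you ever need one. A shell sketch with placeholder names (brace expansion keeps the parts in order):

$ cat model.i1-Q2_K.gguf.part{1,2,3,4,5}of5 > model.i1-Q2_K.gguf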
As of now I believe bartowski is using the same smaller quantization across all the layers, except keeping the token embedding at Q8_0.
How could they even differ? If bartowski's quants are in a nicer format for you, there should not be a reason not to use those. The mix should be the same (unless I am mistaken).
It would print something out like this
Good that nico apparently did it. I could have provided the info from the quants (as json), but not the extra info the tool prints.
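For what it's worth, once you have a complete gguf (a non-split quant, or reassembled parts), pulling the same per-tensor table out of it with gguf-py's GGUFReader is a few lines; a minimal sketch, with the path as a placeholder. Note the reader wants the whole file, which matches gguf_dump.py choking on part1 alone:

from gguf import GGUFReader  # pip install gguf

reader = GGUFReader("model.i1-Q2_K.gguf")  # placeholder path
for i, t in enumerate(reader.tensors):
    shape = " x ".join(str(d) for d in t.shape)
    # t.tensor_type is a GGMLQuantizationType enum, e.g. IQ2_XS or F32
    print(f"{i} | {t.name} | {t.n_elements} | {shape} | {t.tensor_type.name}")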
Heya, really appreciate it, both of you! Okay, yes, now I can see exactly which quant was used for each layer! That helps me compare across the available quants in this size class. Maybe sometime I can try to compare perplexity across each model to get a rough estimate of "Perplexity per GiB" or something haha...
How could they even differ? If bartowski's quants are in a nicer format for you, there should not be a reason not to use those. The mix should be the same (unless I am mistaken).
So I made a comparison chart here and yes, your mix has the same ratios as bartowski's. You also seem to use a high-quality imatrix mix. unsloth has a custom fork they use which changes a few layers to be higher quality. I'm using the ik_llama.cpp fork and a convenient bash script to map each layer to a desired quantization level (a rough sketch of the idea follows below).
I'm still experimenting with how the various mixes perform in terms of quality (perplexity) and speed (llama-bench) for prompt processing and token generation. Amazing how fast this stuff is moving! Thanks for all your help and all the quants!
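For reference, the layer-mapping idea mentioned above looks roughly like this. A sketch only: --custom-q is the per-tensor override option in the ik_llama.cpp fork's llama-quantize, and the regexes, types, and file names here are illustrative (the types mirror the IQ3_S attn_output / IQ2_XS experts mix visible in the dump above); double-check the exact syntax against the fork:

$ ./build/bin/llama-quantize \
    --imatrix imatrix.dat \
    --custom-q "token_embd\.weight=Q8_0,blk\..*\.attn_output\.weight=IQ3_S,blk\..*\.ffn_.*_exps\.weight=IQ2_XS" \
    DeepSeek-V3-0324-F16.gguf DeepSeek-V3-0324-custom-mix.gguf IQ2_XS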
So I made a comparison chart here
Awesome. Thanks a lot for collecting and visualizing all this data!
your mix has the same ratios as bartowski.
That was expected, as we use the standardized llama.cpp mix and so does bartowski.
You also seem to use a high quality imatrix mix.
Yes we do. Awesome that you figured this out. Our imatrix training is superior to bartowski's. Our imatrix dataset is around double the size: the first half is bartowski's imatrix dataset, while the other half consists of proprietary high-quality data covering common LLM use cases that are missing from bartowski's dataset, like story writing and roleplay. mradermacher put a lot of effort into creating the best imatrix dataset possible last spring, before we scaled our quantization throughput up to its current, almost industrial scale. We also compute our imatrix in F16 for all models other than R1, for which we use Q8, while many other quant makers use less precision for imatrix computation. We are perfectionists and value quality above almost everything. @ubergarm Did you actually measure any real-world difference between our and bartowski's imatrix quants? I don't think you would see one unless you test all kinds of different real-world use cases, and even then only if having a larger imatrix dataset (and thus more imatrix training) has a measurable effect.
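For context, the imatrix training referred to here is llama.cpp's llama-imatrix tool, which runs the model over a calibration text and records per-tensor activation statistics; the F16-vs-Q8 distinction above is about the precision of the model it is run on. A minimal sketch with placeholder file names:

$ ./build/bin/llama-imatrix \
    -m DeepSeek-V3-0324-F16.gguf \
    -f calibration-data.txt \
    -o imatrix.dat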
I'm still experimenting with how the various mixes perform in terms of quality (perplexity) and speed (llama-bench) for prompt processing and token generation. Amazing how fast this stuff is moving!
I highly recommend you measure KL-divergence, top-token probability, and same-token probability instead of perplexity to get much better data.
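llama.cpp's llama-perplexity can compute these directly: save the baseline logits from a high-precision quant first, then compare the small quant against them (model and data file names are illustrative). The second run reports mean KL-divergence plus the top-token and same-token agreement statistics:

$ ./build/bin/llama-perplexity -m model-Q8_0.gguf -f wiki.test.raw \
    --kl-divergence-base logits.kld
$ ./build/bin/llama-perplexity -m model-IQ2_XS.gguf -f wiki.test.raw \
    --kl-divergence-base logits.kld --kl-divergence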
Thanks for all your help and all the quants!
No problem. Glad I was able to help. If you need anything else please just let me know.