Model card side-bar missing?
Heya mradermacher, thanks for all the amazing quants. I was doing a comparison across the most likely top four V3-0324
quants in the ~230GiB size class and wanted to add your info to the table. However, for some reason the gguf details are not showing up?
I might be able to download the Q2_K and use gguf-py/gguf/gguf_reader.py or similar to print out the tensor data to add to my table.
I have some discussion on it here on the ik_llama.cpp fork.
Cheers and happy cookin'!
hf only supports non-split quants, that's probably why they don't show up. that sidebar is not provided by us, we have no influence over it.
Thanks, yeah makes sense. I wonder if it is because the split names are DeepSeek-V3-0324.i1-Q2_K.gguf.part1of5 instead of DeepSeek-V3-0324.i1-Q2_K.part1of5.gguf, as maybe the huggingface webapp is looking at the file extension...
I think I can get the info with:
$ python gguf-py/scripts/gguf_dump.py --markdown some_model.gguf
Cheers!
I don't think hf supports split gguf files at all. Would probably be easy to improve on their side - since they only parse the header, they could go with the .gguf.part\d*1of\d+ file (which contains the header). Clearly, it's not a priority for them, and that's fine with me, too. I don't think it's the file extension alone, though, as they do not get confused by the multi-header llama format (which uses .gguf).
PS: part1of5.gguf wouldn't be correct
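something like this (just a sketch of the idea - the exact rule would be up to them) would pick out the header-bearing part from the naming scheme:

import re

# sketch: select the ".gguf.part1ofN" file, which holds the gguf header
pattern = re.compile(r"\.gguf\.part1of\d+$")
parts = ["DeepSeek-V3-0324.i1-Q2_K.gguf.part1of5",
         "DeepSeek-V3-0324.i1-Q2_K.gguf.part2of5"]
print([p for p in parts if pattern.search(p)])  # only part1of5 matches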
Ahh, thanks for the details. I guess I don't fully understand what a "split gguf" is exactly. In my testing I can run this:
$ du -h DeepSeek-V3-0324-IQ2_K_R4.gguf
227G DeepSeek-V3-0324-IQ2_K_R4.gguf
$ ./build/bin/llama-gguf-split \
--split \
--split-max-size 50G \
./DeepSeek-V3-0324-IQ2_K_R4.gguf \
/models/DeepSeek-V3-0324-IQ2_K_R4/DeepSeek-V3-0324-IQ2_K_R4
$ du -hc /models/DeepSeek-V3-0324-IQ2_K_R4/*.gguf
46G DeepSeek-V3-0324-IQ2_K_R4-00001-of-00005.gguf
47G DeepSeek-V3-0324-IQ2_K_R4-00002-of-00005.gguf
47G DeepSeek-V3-0324-IQ2_K_R4-00003-of-00005.gguf
47G DeepSeek-V3-0324-IQ2_K_R4-00004-of-00005.gguf
43G DeepSeek-V3-0324-IQ2_K_R4-00005-of-00005.gguf
227G total
And huggingface works fine with this, as seen in ubergarm/DeepSeek-V3-0324-GGUF as well as bartowski's and unsloth's repos.
Anyway, I'm just trying to see what exactly is in your quant before downloading the entire thing. I downloaded just the first part and gguf-py/scripts/gguf_dump.py isn't working on it, so I'll try hexedit or find another tool that can at least print out the header information.
Or if you have the time, you could run this on the folder containing your splits and copy-paste the output here. No pressure at all, I know y'all keep busy!
pip install 'numpy<2.0.0'
python llama.cpp/gguf-py/scripts/gguf_dump.py \
--markdown \
DeepSeek-V3-0324-i1-GGUF/DeepSeek-V3-0324.i1-Q2_K.gguf.part1of5
Thanks again!
@ubergarm You only need the first few megabytes of the first part to get the metadata. Just use gguf-parser-windows-amd64.exe as I did in https://huggingface.co/mradermacher/model_requests/discussions/797#67e2c2975baf8e70d6e63d99
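For example, the fixed-size header fields at the very start of part 1 can be read with a few lines of python (a minimal sketch - the filename is just an example):

import struct

# minimal sketch: a GGUF file starts with magic, version,
# tensor count and key-value count, all little-endian
with open("DeepSeek-V3-0324.i1-Q2_K.gguf.part1of5", "rb") as f:
    magic = f.read(4)                          # b"GGUF"
    version, = struct.unpack("<I", f.read(4))  # 3 for current files
    n_tensors, = struct.unpack("<Q", f.read(8))
    n_kv, = struct.unpack("<Q", f.read(8))
print(magic, version, n_tensors, n_kv)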
> I guess I don't fully understand what a "split gguf" is exactly.
Right - a split gguf is simply a single gguf file split into multiple parts; only all the parts together form a valid gguf. That is opposed to a model in multi-part format, where each part is itself a complete gguf file - which is, unfortunately, also something like a gguf split into multiple gguf files. It's just not a split gguf file, but multiple ones.
It is very confusing.
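a quick way to tell the two apart (a sketch, assuming the parts sit in the current directory): in a split gguf only part 1 starts with the GGUF magic, while in the multi-part format every file does.

from pathlib import Path

# sketch: check which files start with the GGUF magic bytes
for p in sorted(Path(".").glob("DeepSeek-V3-0324.i1-Q2_K.gguf.part*")):
    with open(p, "rb") as f:
        print(p.name, f.read(4) == b"GGUF")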
> Or if you have the time, you could run this on the folder containing your splits and copy-paste the output here.
I don't have such a folder, but if you tell me what info you need, I might be able to provide it, if you can't get nico's method to work.
No pressure to look at this, I know u busy cooking! haha...
> It is very confusing.
Ahh, I see now. It is literally the original .gguf binary data cut into pieces of somewhat arbitrary length. Perhaps this precedent was set by TheBloke, as suggested by this gist script.
I would just download it and merge it myself, but I have to check with my server guy about bandwidth usage haha...
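If I understand it right, the parts should just cat back together into the original file, something like (untested on my end):

$ cat DeepSeek-V3-0324.i1-Q2_K.gguf.part?of5 > DeepSeek-V3-0324.i1-Q2_K.gguf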
Basically I want to find out if you use different quantizations for different layers similar to how unsloth is doing it, e.g. Q6_0 for attention and Q2_K for routed expert layers etc. As of now I believe bartowski is using the same smaller quantization across all the layers except keeping token embedding at Q8_0.
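Once I have a merged file I should be able to dump just the per-tensor types with llama.cpp's gguf python package (rough sketch):

from gguf import GGUFReader  # llama.cpp's gguf-py, also on pypi as "gguf"

# sketch: print the quantization type of every tensor to compare recipes
reader = GGUFReader("DeepSeek-V3-0324.i1-Q2_K.gguf")
for t in reader.tensors:
    print(t.name, t.tensor_type.name, list(t.shape))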
Assuming you have a merged gguf (and a WSL or Linux shell), this would show the information missing from the huggingface model card side-bar (since it can't handle the "split gguf"):
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
# curl -LsSf https://astral.sh/uv/install.sh | sh # install uv if needed
uv venv ./venv --python 3.12 --python-preference=only-managed
source venv/bin/activate
uv pip install 'numpy<2.0.0' sentencepiece pyyaml
python gguf-py/gguf/scripts/gguf_dump.py \
--markdown \
DeepSeek-V3-0324-i1-GGUF/DeepSeek-V3-0324.i1-Q2_K.gguf
It would print something out like this (taken from unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-UD-Q2_K_XL-00001-of-00005.gguf) to show the various quantizations. The first output and token embedding part, block 0 (dense layers), and for example block 14 (experts) would be plenty.
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
0 | output.weight | Output (W) | (~927M) 926679040 | 7168 x 129280 x 1 x 1 | Q6_K |
1 | output_norm.weight | Output Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
2 | token_embd.weight | Token Embedding (W) | (~927M) 926679040 | 7168 x 129280 x 1 x 1 | Q4_K |
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
3 | blk.0.attn_kv_a_mqa.weight | Block 0 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | Q6_K |
4 | blk.0.attn_kv_a_norm.weight | Block 0 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
5 | blk.0.attn_kv_b.weight | Block 0 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | Q6_K |
6 | blk.0.attn_norm.weight | Block 0 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
7 | blk.0.attn_output.weight | Block 0 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | Q4_K |
8 | blk.0.attn_q_a.weight | Block 0 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | Q4_K |
9 | blk.0.attn_q_a_norm.weight | Block 0 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
10 | blk.0.attn_q_b.weight | Block 0 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | Q4_K |
11 | blk.0.ffn_down.weight | Block 0 Feed-Forward Network "Down" (W) | (~132M) 132120576 | 18432 x 7168 x 1 x 1 | Q6_K |
12 | blk.0.ffn_gate.weight | Block 0 Feed-Forward Network "Gate" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | Q4_K |
13 | blk.0.ffn_norm.weight | Block 0 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
14 | blk.0.ffn_up.weight | Block 0 Feed-Forward Network "Up" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | Q4_K |
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
226 | blk.14.attn_kv_a_mqa.weight | Block 14 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | Q6_K |
227 | blk.14.attn_kv_a_norm.weight | Block 14 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
228 | blk.14.attn_kv_b.weight | Block 14 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | Q6_K |
229 | blk.14.attn_norm.weight | Block 14 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
230 | blk.14.attn_output.weight | Block 14 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | Q4_K |
231 | blk.14.attn_q_a.weight | Block 14 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | Q4_K |
232 | blk.14.attn_q_a_norm.weight | Block 14 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
233 | blk.14.attn_q_b.weight | Block 14 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | Q4_K |
234 | blk.14.exp_probs_b.bias | Block 14 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
235 | blk.14.ffn_down_exps.weight | Block 14 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | Q2_K |
236 | blk.14.ffn_down_shexp.weight | Block 14 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | Q6_K |
237 | blk.14.ffn_gate_exps.weight | Block 14 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | Q2_K |
238 | blk.14.ffn_gate_inp.weight | Block 14 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
239 | blk.14.ffn_gate_shexp.weight | Block 14 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | Q4_K |
240 | blk.14.ffn_norm.weight | Block 14 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
241 | blk.14.ffn_up_exps.weight | Block 14 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | Q2_K |
242 | blk.14.ffn_up_shexp.weight | Block 14 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | Q4_K |
@ubergarm I ran it for DeepSeek-V3-0324.i1-IQ2_S.gguf
(venv) root@AI:/apool/Meta/llama.cpp# python gguf-py/gguf/scripts/gguf_dump.py \
--markdown \
/mradermacher/root/DeepSeek-V3-0324.i1-IQ2_S.gguf
/mradermacher/root/DeepSeek-V3-0324.i1-IQ2_S.gguf - GGUF Internal File Dump
- Endian: LITTLE endian
Key Value Metadata Store
There are 58 key-value pairs in this file
POS | TYPE | Count | Key | Value |
---|---|---|---|---|
1 | UINT32 | 1 | GGUF.version | 3 |
2 | UINT64 | 1 | GGUF.tensor_count | 1025 |
3 | UINT64 | 1 | GGUF.kv_count | 55 |
4 | STRING | 1 | general.architecture | deepseek2 |
5 | STRING | 1 | general.type | model |
6 | STRING | 1 | general.name | DeepSeek V3 0324 Bf16 |
7 | STRING | 1 | general.size_label | 256x20B |
8 | STRING | 1 | general.license | mit |
9 | UINT32 | 1 | deepseek2.block_count | 61 |
10 | UINT32 | 1 | deepseek2.context_length | 163840 |
11 | UINT32 | 1 | deepseek2.embedding_length | 7168 |
12 | UINT32 | 1 | deepseek2.feed_forward_length | 18432 |
13 | UINT32 | 1 | deepseek2.attention.head_count | 128 |
14 | UINT32 | 1 | deepseek2.attention.head_count_kv | 128 |
15 | FLOAT32 | 1 | deepseek2.rope.freq_base | 10000.0 |
16 | FLOAT32 | 1 | deepseek2.attention.layer_norm_rms_epsilon | 1e-06 |
17 | UINT32 | 1 | deepseek2.expert_used_count | 8 |
18 | UINT32 | 1 | deepseek2.leading_dense_block_count | 3 |
19 | UINT32 | 1 | deepseek2.vocab_size | 129280 |
20 | UINT32 | 1 | deepseek2.attention.q_lora_rank | 1536 |
21 | UINT32 | 1 | deepseek2.attention.kv_lora_rank | 512 |
22 | UINT32 | 1 | deepseek2.attention.key_length | 192 |
23 | UINT32 | 1 | deepseek2.attention.value_length | 128 |
24 | UINT32 | 1 | deepseek2.expert_feed_forward_length | 2048 |
25 | UINT32 | 1 | deepseek2.expert_count | 256 |
26 | UINT32 | 1 | deepseek2.expert_shared_count | 1 |
27 | FLOAT32 | 1 | deepseek2.expert_weights_scale | 2.5 |
28 | BOOL | 1 | deepseek2.expert_weights_norm | True |
29 | UINT32 | 1 | deepseek2.expert_gating_func | 2 |
30 | UINT32 | 1 | deepseek2.rope.dimension_count | 64 |
31 | STRING | 1 | deepseek2.rope.scaling.type | yarn |
32 | FLOAT32 | 1 | deepseek2.rope.scaling.factor | 40.0 |
33 | UINT32 | 1 | deepseek2.rope.scaling.original_context_length | 4096 |
34 | FLOAT32 | 1 | deepseek2.rope.scaling.yarn_log_multiplier | 0.1 |
35 | STRING | 1 | tokenizer.ggml.model | gpt2 |
36 | STRING | 1 | tokenizer.ggml.pre | deepseek-v3 |
37 | [STRING] | 129280 | tokenizer.ggml.tokens | [ <|begin▁of▁sentence|> , <|end▁of▁sentence|> , <|▁pad▁|> , ! , " , ... ] |
38 | [INT32] | 129280 | tokenizer.ggml.token_type | [ 3, 3, 3, 1, 1, 1, 1, ... ] |
39 | [STRING] | 127741 | tokenizer.ggml.merges | [ Ġ t , Ġ a , i n , Ġ Ġ , h e , ... ] |
40 | UINT32 | 1 | tokenizer.ggml.bos_token_id | 0 |
41 | UINT32 | 1 | tokenizer.ggml.eos_token_id | 1 |
42 | UINT32 | 1 | tokenizer.ggml.padding_token_id | 1 |
43 | BOOL | 1 | tokenizer.ggml.add_bos_token | True |
44 | BOOL | 1 | tokenizer.ggml.add_eos_token | False |
45 | STRING | 1 | tokenizer.chat_template | {% if not add_generation_promp ...{{'<|Assistant|>'}}{% endif %} |
46 | UINT32 | 1 | general.quantization_version | 2 |
47 | UINT32 | 1 | general.file_type | 28 |
48 | STRING | 1 | general.url | https://huggingface.co/mradermacher/DeepSeek-V3-0324-i1-GGUF |
49 | STRING | 1 | mradermacher.quantize_version | 2 |
50 | STRING | 1 | mradermacher.quantized_by | mradermacher |
51 | STRING | 1 | mradermacher.quantized_at | 2025-03-31T15:31:56+02:00 |
52 | STRING | 1 | mradermacher.quantized_on | nico1 |
53 | STRING | 1 | general.source.url | https://huggingface.co/deepseek-ai/DeepSeek-V3-0324 |
54 | STRING | 1 | mradermacher.convert_type | hf |
55 | STRING | 1 | quantize.imatrix.file | DeepSeek-V3-0324-i1-GGUF/imatrix.dat |
56 | STRING | 1 | quantize.imatrix.dataset | imatrix-training-full-3 |
57 | INT32 | 1 | quantize.imatrix.entries_count | 720 |
58 | INT32 | 1 | quantize.imatrix.chunks_count | 315 |
Base Tensor Group : ~2B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
0 | output.weight | Output (W) | (~927M) 926679040 | 7168 x 129280 x 1 x 1 | Q5_K |
1 | output_norm.weight | Output Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
2 | token_embd.weight | Token Embedding (W) | (~927M) 926679040 | 7168 x 129280 x 1 x 1 | IQ3_S |
- Total elements in base: ( ~2B) 1853365248
- Percentage of total elements: 0.28%
Block 0 Tensor Group : ~583M Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
3 | blk.0.attn_kv_a_mqa.weight | Block 0 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
4 | blk.0.attn_kv_a_norm.weight | Block 0 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
5 | blk.0.attn_kv_b.weight | Block 0 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
6 | blk.0.attn_norm.weight | Block 0 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
7 | blk.0.attn_output.weight | Block 0 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
8 | blk.0.attn_q_a.weight | Block 0 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
9 | blk.0.attn_q_a_norm.weight | Block 0 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
10 | blk.0.attn_q_b.weight | Block 0 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
11 | blk.0.ffn_down.weight | Block 0 Feed-Forward Network "Down" (W) | (~132M) 132120576 | 18432 x 7168 x 1 x 1 | IQ3_S |
12 | blk.0.ffn_gate.weight | Block 0 Feed-Forward Network "Gate" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | IQ2_XS |
13 | blk.0.ffn_norm.weight | Block 0 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
14 | blk.0.ffn_up.weight | Block 0 Feed-Forward Network "Up" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | IQ2_XS |
- Total elements in blk.0: (~583M) 583483392
- Percentage of total elements: 0.09%
Block 1 Tensor Group : ~583M Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
15 | blk.1.attn_kv_a_mqa.weight | Block 1 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
16 | blk.1.attn_kv_a_norm.weight | Block 1 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
17 | blk.1.attn_kv_b.weight | Block 1 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
18 | blk.1.attn_norm.weight | Block 1 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
19 | blk.1.attn_output.weight | Block 1 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
20 | blk.1.attn_q_a.weight | Block 1 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
21 | blk.1.attn_q_a_norm.weight | Block 1 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
22 | blk.1.attn_q_b.weight | Block 1 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
23 | blk.1.ffn_down.weight | Block 1 Feed-Forward Network "Down" (W) | (~132M) 132120576 | 18432 x 7168 x 1 x 1 | IQ3_S |
24 | blk.1.ffn_gate.weight | Block 1 Feed-Forward Network "Gate" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | IQ2_XS |
25 | blk.1.ffn_norm.weight | Block 1 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
26 | blk.1.ffn_up.weight | Block 1 Feed-Forward Network "Up" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | IQ2_XS |
- Total elements in blk.1: (~583M) 583483392
- Percentage of total elements: 0.09%
Block 2 Tensor Group : ~583M Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
27 | blk.2.attn_kv_a_mqa.weight | Block 2 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
28 | blk.2.attn_kv_a_norm.weight | Block 2 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
29 | blk.2.attn_kv_b.weight | Block 2 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
30 | blk.2.attn_norm.weight | Block 2 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
31 | blk.2.attn_output.weight | Block 2 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
32 | blk.2.attn_q_a.weight | Block 2 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
33 | blk.2.attn_q_a_norm.weight | Block 2 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
34 | blk.2.attn_q_b.weight | Block 2 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
35 | blk.2.ffn_down.weight | Block 2 Feed-Forward Network "Down" (W) | (~132M) 132120576 | 18432 x 7168 x 1 x 1 | IQ3_S |
36 | blk.2.ffn_gate.weight | Block 2 Feed-Forward Network "Gate" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | IQ2_XS |
37 | blk.2.ffn_norm.weight | Block 2 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
38 | blk.2.ffn_up.weight | Block 2 Feed-Forward Network "Up" (W) | (~132M) 132120576 | 7168 x 18432 x 1 x 1 | IQ2_XS |
- Total elements in blk.2: (~583M) 583483392
- Percentage of total elements: 0.09%
Block 3 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
39 | blk.3.attn_kv_a_mqa.weight | Block 3 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
40 | blk.3.attn_kv_a_norm.weight | Block 3 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
41 | blk.3.attn_kv_b.weight | Block 3 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
42 | blk.3.attn_norm.weight | Block 3 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
43 | blk.3.attn_output.weight | Block 3 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
44 | blk.3.attn_q_a.weight | Block 3 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
45 | blk.3.attn_q_a_norm.weight | Block 3 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
46 | blk.3.attn_q_b.weight | Block 3 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
47 | blk.3.exp_probs_b.bias | Block 3 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
48 | blk.3.ffn_down_exps.weight | Block 3 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ3_S |
49 | blk.3.ffn_down_shexp.weight | Block 3 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ3_S |
50 | blk.3.ffn_gate_exps.weight | Block 3 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
51 | blk.3.ffn_gate_inp.weight | Block 3 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
52 | blk.3.ffn_gate_shexp.weight | Block 3 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
53 | blk.3.ffn_norm.weight | Block 3 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
54 | blk.3.ffn_up_exps.weight | Block 3 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
55 | blk.3.ffn_up_shexp.weight | Block 3 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.3: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 4 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
56 | blk.4.attn_kv_a_mqa.weight | Block 4 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
57 | blk.4.attn_kv_a_norm.weight | Block 4 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
58 | blk.4.attn_kv_b.weight | Block 4 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
59 | blk.4.attn_norm.weight | Block 4 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
60 | blk.4.attn_output.weight | Block 4 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
61 | blk.4.attn_q_a.weight | Block 4 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
62 | blk.4.attn_q_a_norm.weight | Block 4 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
63 | blk.4.attn_q_b.weight | Block 4 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
64 | blk.4.exp_probs_b.bias | Block 4 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
65 | blk.4.ffn_down_exps.weight | Block 4 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ3_S |
66 | blk.4.ffn_down_shexp.weight | Block 4 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ3_S |
67 | blk.4.ffn_gate_exps.weight | Block 4 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
68 | blk.4.ffn_gate_inp.weight | Block 4 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
69 | blk.4.ffn_gate_shexp.weight | Block 4 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
70 | blk.4.ffn_norm.weight | Block 4 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
71 | blk.4.ffn_up_exps.weight | Block 4 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
72 | blk.4.ffn_up_shexp.weight | Block 4 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.4: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 5 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
73 | blk.5.attn_kv_a_mqa.weight | Block 5 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
74 | blk.5.attn_kv_a_norm.weight | Block 5 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
75 | blk.5.attn_kv_b.weight | Block 5 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
76 | blk.5.attn_norm.weight | Block 5 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
77 | blk.5.attn_output.weight | Block 5 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
78 | blk.5.attn_q_a.weight | Block 5 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
79 | blk.5.attn_q_a_norm.weight | Block 5 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
80 | blk.5.attn_q_b.weight | Block 5 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
81 | blk.5.exp_probs_b.bias | Block 5 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
82 | blk.5.ffn_down_exps.weight | Block 5 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
83 | blk.5.ffn_down_shexp.weight | Block 5 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
84 | blk.5.ffn_gate_exps.weight | Block 5 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
85 | blk.5.ffn_gate_inp.weight | Block 5 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
86 | blk.5.ffn_gate_shexp.weight | Block 5 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
87 | blk.5.ffn_norm.weight | Block 5 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
88 | blk.5.ffn_up_exps.weight | Block 5 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
89 | blk.5.ffn_up_shexp.weight | Block 5 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.5: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 6 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
90 | blk.6.attn_kv_a_mqa.weight | Block 6 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
91 | blk.6.attn_kv_a_norm.weight | Block 6 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
92 | blk.6.attn_kv_b.weight | Block 6 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
93 | blk.6.attn_norm.weight | Block 6 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
94 | blk.6.attn_output.weight | Block 6 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
95 | blk.6.attn_q_a.weight | Block 6 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
96 | blk.6.attn_q_a_norm.weight | Block 6 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
97 | blk.6.attn_q_b.weight | Block 6 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
98 | blk.6.exp_probs_b.bias | Block 6 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
99 | blk.6.ffn_down_exps.weight | Block 6 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
100 | blk.6.ffn_down_shexp.weight | Block 6 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
101 | blk.6.ffn_gate_exps.weight | Block 6 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
102 | blk.6.ffn_gate_inp.weight | Block 6 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
103 | blk.6.ffn_gate_shexp.weight | Block 6 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
104 | blk.6.ffn_norm.weight | Block 6 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
105 | blk.6.ffn_up_exps.weight | Block 6 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
106 | blk.6.ffn_up_shexp.weight | Block 6 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.6: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 7 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
107 | blk.7.attn_kv_a_mqa.weight | Block 7 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
108 | blk.7.attn_kv_a_norm.weight | Block 7 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
109 | blk.7.attn_kv_b.weight | Block 7 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
110 | blk.7.attn_norm.weight | Block 7 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
111 | blk.7.attn_output.weight | Block 7 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
112 | blk.7.attn_q_a.weight | Block 7 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
113 | blk.7.attn_q_a_norm.weight | Block 7 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
114 | blk.7.attn_q_b.weight | Block 7 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
115 | blk.7.exp_probs_b.bias | Block 7 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
116 | blk.7.ffn_down_exps.weight | Block 7 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
117 | blk.7.ffn_down_shexp.weight | Block 7 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
118 | blk.7.ffn_gate_exps.weight | Block 7 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
119 | blk.7.ffn_gate_inp.weight | Block 7 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
120 | blk.7.ffn_gate_shexp.weight | Block 7 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
121 | blk.7.ffn_norm.weight | Block 7 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
122 | blk.7.ffn_up_exps.weight | Block 7 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
123 | blk.7.ffn_up_shexp.weight | Block 7 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.7: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 8 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
124 | blk.8.attn_kv_a_mqa.weight | Block 8 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
125 | blk.8.attn_kv_a_norm.weight | Block 8 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
126 | blk.8.attn_kv_b.weight | Block 8 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
127 | blk.8.attn_norm.weight | Block 8 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
128 | blk.8.attn_output.weight | Block 8 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
129 | blk.8.attn_q_a.weight | Block 8 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
130 | blk.8.attn_q_a_norm.weight | Block 8 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
131 | blk.8.attn_q_b.weight | Block 8 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
132 | blk.8.exp_probs_b.bias | Block 8 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
133 | blk.8.ffn_down_exps.weight | Block 8 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
134 | blk.8.ffn_down_shexp.weight | Block 8 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
135 | blk.8.ffn_gate_exps.weight | Block 8 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
136 | blk.8.ffn_gate_inp.weight | Block 8 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
137 | blk.8.ffn_gate_shexp.weight | Block 8 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
138 | blk.8.ffn_norm.weight | Block 8 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
139 | blk.8.ffn_up_exps.weight | Block 8 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
140 | blk.8.ffn_up_shexp.weight | Block 8 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.8: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 9 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
141 | blk.9.attn_kv_a_mqa.weight | Block 9 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
142 | blk.9.attn_kv_a_norm.weight | Block 9 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
143 | blk.9.attn_kv_b.weight | Block 9 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
144 | blk.9.attn_norm.weight | Block 9 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
145 | blk.9.attn_output.weight | Block 9 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
146 | blk.9.attn_q_a.weight | Block 9 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
147 | blk.9.attn_q_a_norm.weight | Block 9 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
148 | blk.9.attn_q_b.weight | Block 9 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
149 | blk.9.exp_probs_b.bias | Block 9 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
150 | blk.9.ffn_down_exps.weight | Block 9 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
151 | blk.9.ffn_down_shexp.weight | Block 9 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
152 | blk.9.ffn_gate_exps.weight | Block 9 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
153 | blk.9.ffn_gate_inp.weight | Block 9 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
154 | blk.9.ffn_gate_shexp.weight | Block 9 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
155 | blk.9.ffn_norm.weight | Block 9 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
156 | blk.9.ffn_up_exps.weight | Block 9 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
157 | blk.9.ffn_up_shexp.weight | Block 9 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.9: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 10 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
158 | blk.10.attn_kv_a_mqa.weight | Block 10 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
159 | blk.10.attn_kv_a_norm.weight | Block 10 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
160 | blk.10.attn_kv_b.weight | Block 10 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
161 | blk.10.attn_norm.weight | Block 10 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
162 | blk.10.attn_output.weight | Block 10 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
163 | blk.10.attn_q_a.weight | Block 10 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
164 | blk.10.attn_q_a_norm.weight | Block 10 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
165 | blk.10.attn_q_b.weight | Block 10 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
166 | blk.10.exp_probs_b.bias | Block 10 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
167 | blk.10.ffn_down_exps.weight | Block 10 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
168 | blk.10.ffn_down_shexp.weight | Block 10 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
169 | blk.10.ffn_gate_exps.weight | Block 10 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
170 | blk.10.ffn_gate_inp.weight | Block 10 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
171 | blk.10.ffn_gate_shexp.weight | Block 10 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
172 | blk.10.ffn_norm.weight | Block 10 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
173 | blk.10.ffn_up_exps.weight | Block 10 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
174 | blk.10.ffn_up_shexp.weight | Block 10 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.10: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 11 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
175 | blk.11.attn_kv_a_mqa.weight | Block 11 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
176 | blk.11.attn_kv_a_norm.weight | Block 11 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
177 | blk.11.attn_kv_b.weight | Block 11 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
178 | blk.11.attn_norm.weight | Block 11 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
179 | blk.11.attn_output.weight | Block 11 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
180 | blk.11.attn_q_a.weight | Block 11 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
181 | blk.11.attn_q_a_norm.weight | Block 11 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
182 | blk.11.attn_q_b.weight | Block 11 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
183 | blk.11.exp_probs_b.bias | Block 11 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
184 | blk.11.ffn_down_exps.weight | Block 11 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
185 | blk.11.ffn_down_shexp.weight | Block 11 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
186 | blk.11.ffn_gate_exps.weight | Block 11 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
187 | blk.11.ffn_gate_inp.weight | Block 11 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
188 | blk.11.ffn_gate_shexp.weight | Block 11 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
189 | blk.11.ffn_norm.weight | Block 11 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
190 | blk.11.ffn_up_exps.weight | Block 11 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
191 | blk.11.ffn_up_shexp.weight | Block 11 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.11: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 12 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
192 | blk.12.attn_kv_a_mqa.weight | Block 12 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
193 | blk.12.attn_kv_a_norm.weight | Block 12 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
194 | blk.12.attn_kv_b.weight | Block 12 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
195 | blk.12.attn_norm.weight | Block 12 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
196 | blk.12.attn_output.weight | Block 12 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
197 | blk.12.attn_q_a.weight | Block 12 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
198 | blk.12.attn_q_a_norm.weight | Block 12 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
199 | blk.12.attn_q_b.weight | Block 12 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
200 | blk.12.exp_probs_b.bias | Block 12 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
201 | blk.12.ffn_down_exps.weight | Block 12 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
202 | blk.12.ffn_down_shexp.weight | Block 12 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
203 | blk.12.ffn_gate_exps.weight | Block 12 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
204 | blk.12.ffn_gate_inp.weight | Block 12 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
205 | blk.12.ffn_gate_shexp.weight | Block 12 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
206 | blk.12.ffn_norm.weight | Block 12 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
207 | blk.12.ffn_up_exps.weight | Block 12 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
208 | blk.12.ffn_up_shexp.weight | Block 12 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.12: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 13 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
209 | blk.13.attn_kv_a_mqa.weight | Block 13 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
210 | blk.13.attn_kv_a_norm.weight | Block 13 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
211 | blk.13.attn_kv_b.weight | Block 13 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
212 | blk.13.attn_norm.weight | Block 13 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
213 | blk.13.attn_output.weight | Block 13 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
214 | blk.13.attn_q_a.weight | Block 13 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
215 | blk.13.attn_q_a_norm.weight | Block 13 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
216 | blk.13.attn_q_b.weight | Block 13 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
217 | blk.13.exp_probs_b.bias | Block 13 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
218 | blk.13.ffn_down_exps.weight | Block 13 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
219 | blk.13.ffn_down_shexp.weight | Block 13 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
220 | blk.13.ffn_gate_exps.weight | Block 13 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
221 | blk.13.ffn_gate_inp.weight | Block 13 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
222 | blk.13.ffn_gate_shexp.weight | Block 13 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
223 | blk.13.ffn_norm.weight | Block 13 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
224 | blk.13.ffn_up_exps.weight | Block 13 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
225 | blk.13.ffn_up_shexp.weight | Block 13 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.13: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 14 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
226 | blk.14.attn_kv_a_mqa.weight | Block 14 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
227 | blk.14.attn_kv_a_norm.weight | Block 14 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
228 | blk.14.attn_kv_b.weight | Block 14 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
229 | blk.14.attn_norm.weight | Block 14 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
230 | blk.14.attn_output.weight | Block 14 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
231 | blk.14.attn_q_a.weight | Block 14 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
232 | blk.14.attn_q_a_norm.weight | Block 14 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
233 | blk.14.attn_q_b.weight | Block 14 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
234 | blk.14.exp_probs_b.bias | Block 14 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
235 | blk.14.ffn_down_exps.weight | Block 14 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
236 | blk.14.ffn_down_shexp.weight | Block 14 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
237 | blk.14.ffn_gate_exps.weight | Block 14 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
238 | blk.14.ffn_gate_inp.weight | Block 14 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
239 | blk.14.ffn_gate_shexp.weight | Block 14 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
240 | blk.14.ffn_norm.weight | Block 14 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
241 | blk.14.ffn_up_exps.weight | Block 14 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
242 | blk.14.ffn_up_shexp.weight | Block 14 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.14: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 15 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
243 | blk.15.attn_kv_a_mqa.weight | Block 15 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
244 | blk.15.attn_kv_a_norm.weight | Block 15 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
245 | blk.15.attn_kv_b.weight | Block 15 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
246 | blk.15.attn_norm.weight | Block 15 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
247 | blk.15.attn_output.weight | Block 15 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
248 | blk.15.attn_q_a.weight | Block 15 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
249 | blk.15.attn_q_a_norm.weight | Block 15 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
250 | blk.15.attn_q_b.weight | Block 15 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
251 | blk.15.exp_probs_b.bias | Block 15 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
252 | blk.15.ffn_down_exps.weight | Block 15 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
253 | blk.15.ffn_down_shexp.weight | Block 15 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
254 | blk.15.ffn_gate_exps.weight | Block 15 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
255 | blk.15.ffn_gate_inp.weight | Block 15 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
256 | blk.15.ffn_gate_shexp.weight | Block 15 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
257 | blk.15.ffn_norm.weight | Block 15 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
258 | blk.15.ffn_up_exps.weight | Block 15 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
259 | blk.15.ffn_up_shexp.weight | Block 15 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.15: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 16 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
260 | blk.16.attn_kv_a_mqa.weight | Block 16 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
261 | blk.16.attn_kv_a_norm.weight | Block 16 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
262 | blk.16.attn_kv_b.weight | Block 16 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
263 | blk.16.attn_norm.weight | Block 16 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
264 | blk.16.attn_output.weight | Block 16 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
265 | blk.16.attn_q_a.weight | Block 16 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
266 | blk.16.attn_q_a_norm.weight | Block 16 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
267 | blk.16.attn_q_b.weight | Block 16 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
268 | blk.16.exp_probs_b.bias | Block 16 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
269 | blk.16.ffn_down_exps.weight | Block 16 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
270 | blk.16.ffn_down_shexp.weight | Block 16 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
271 | blk.16.ffn_gate_exps.weight | Block 16 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
272 | blk.16.ffn_gate_inp.weight | Block 16 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
273 | blk.16.ffn_gate_shexp.weight | Block 16 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
274 | blk.16.ffn_norm.weight | Block 16 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
275 | blk.16.ffn_up_exps.weight | Block 16 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
276 | blk.16.ffn_up_shexp.weight | Block 16 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.16: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 17 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
277 | blk.17.attn_kv_a_mqa.weight | Block 17 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
278 | blk.17.attn_kv_a_norm.weight | Block 17 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
279 | blk.17.attn_kv_b.weight | Block 17 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
280 | blk.17.attn_norm.weight | Block 17 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
281 | blk.17.attn_output.weight | Block 17 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
282 | blk.17.attn_q_a.weight | Block 17 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
283 | blk.17.attn_q_a_norm.weight | Block 17 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
284 | blk.17.attn_q_b.weight | Block 17 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
285 | blk.17.exp_probs_b.bias | Block 17 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
286 | blk.17.ffn_down_exps.weight | Block 17 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
287 | blk.17.ffn_down_shexp.weight | Block 17 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
288 | blk.17.ffn_gate_exps.weight | Block 17 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
289 | blk.17.ffn_gate_inp.weight | Block 17 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
290 | blk.17.ffn_gate_shexp.weight | Block 17 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
291 | blk.17.ffn_norm.weight | Block 17 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
292 | blk.17.ffn_up_exps.weight | Block 17 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
293 | blk.17.ffn_up_shexp.weight | Block 17 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.17: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 18 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
294 | blk.18.attn_kv_a_mqa.weight | Block 18 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
295 | blk.18.attn_kv_a_norm.weight | Block 18 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
296 | blk.18.attn_kv_b.weight | Block 18 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
297 | blk.18.attn_norm.weight | Block 18 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
298 | blk.18.attn_output.weight | Block 18 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
299 | blk.18.attn_q_a.weight | Block 18 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
300 | blk.18.attn_q_a_norm.weight | Block 18 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
301 | blk.18.attn_q_b.weight | Block 18 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
302 | blk.18.exp_probs_b.bias | Block 18 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
303 | blk.18.ffn_down_exps.weight | Block 18 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
304 | blk.18.ffn_down_shexp.weight | Block 18 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
305 | blk.18.ffn_gate_exps.weight | Block 18 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
306 | blk.18.ffn_gate_inp.weight | Block 18 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
307 | blk.18.ffn_gate_shexp.weight | Block 18 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
308 | blk.18.ffn_norm.weight | Block 18 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
309 | blk.18.ffn_up_exps.weight | Block 18 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
310 | blk.18.ffn_up_shexp.weight | Block 18 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.18: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 19 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
311 | blk.19.attn_kv_a_mqa.weight | Block 19 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
312 | blk.19.attn_kv_a_norm.weight | Block 19 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
313 | blk.19.attn_kv_b.weight | Block 19 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
314 | blk.19.attn_norm.weight | Block 19 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
315 | blk.19.attn_output.weight | Block 19 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
316 | blk.19.attn_q_a.weight | Block 19 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
317 | blk.19.attn_q_a_norm.weight | Block 19 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
318 | blk.19.attn_q_b.weight | Block 19 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
319 | blk.19.exp_probs_b.bias | Block 19 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
320 | blk.19.ffn_down_exps.weight | Block 19 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
321 | blk.19.ffn_down_shexp.weight | Block 19 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
322 | blk.19.ffn_gate_exps.weight | Block 19 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
323 | blk.19.ffn_gate_inp.weight | Block 19 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
324 | blk.19.ffn_gate_shexp.weight | Block 19 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
325 | blk.19.ffn_norm.weight | Block 19 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
326 | blk.19.ffn_up_exps.weight | Block 19 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
327 | blk.19.ffn_up_shexp.weight | Block 19 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.19: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 20 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
328 | blk.20.attn_kv_a_mqa.weight | Block 20 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
329 | blk.20.attn_kv_a_norm.weight | Block 20 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
330 | blk.20.attn_kv_b.weight | Block 20 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
331 | blk.20.attn_norm.weight | Block 20 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
332 | blk.20.attn_output.weight | Block 20 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
333 | blk.20.attn_q_a.weight | Block 20 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
334 | blk.20.attn_q_a_norm.weight | Block 20 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
335 | blk.20.attn_q_b.weight | Block 20 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
336 | blk.20.exp_probs_b.bias | Block 20 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
337 | blk.20.ffn_down_exps.weight | Block 20 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
338 | blk.20.ffn_down_shexp.weight | Block 20 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
339 | blk.20.ffn_gate_exps.weight | Block 20 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
340 | blk.20.ffn_gate_inp.weight | Block 20 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
341 | blk.20.ffn_gate_shexp.weight | Block 20 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
342 | blk.20.ffn_norm.weight | Block 20 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
343 | blk.20.ffn_up_exps.weight | Block 20 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
344 | blk.20.ffn_up_shexp.weight | Block 20 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.20: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 21 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
345 | blk.21.attn_kv_a_mqa.weight | Block 21 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
346 | blk.21.attn_kv_a_norm.weight | Block 21 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
347 | blk.21.attn_kv_b.weight | Block 21 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
348 | blk.21.attn_norm.weight | Block 21 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
349 | blk.21.attn_output.weight | Block 21 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
350 | blk.21.attn_q_a.weight | Block 21 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
351 | blk.21.attn_q_a_norm.weight | Block 21 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
352 | blk.21.attn_q_b.weight | Block 21 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
353 | blk.21.exp_probs_b.bias | Block 21 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
354 | blk.21.ffn_down_exps.weight | Block 21 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
355 | blk.21.ffn_down_shexp.weight | Block 21 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
356 | blk.21.ffn_gate_exps.weight | Block 21 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
357 | blk.21.ffn_gate_inp.weight | Block 21 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
358 | blk.21.ffn_gate_shexp.weight | Block 21 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
359 | blk.21.ffn_norm.weight | Block 21 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
360 | blk.21.ffn_up_exps.weight | Block 21 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
361 | blk.21.ffn_up_shexp.weight | Block 21 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.21: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 22 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
362 | blk.22.attn_kv_a_mqa.weight | Block 22 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
363 | blk.22.attn_kv_a_norm.weight | Block 22 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
364 | blk.22.attn_kv_b.weight | Block 22 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
365 | blk.22.attn_norm.weight | Block 22 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
366 | blk.22.attn_output.weight | Block 22 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
367 | blk.22.attn_q_a.weight | Block 22 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
368 | blk.22.attn_q_a_norm.weight | Block 22 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
369 | blk.22.attn_q_b.weight | Block 22 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
370 | blk.22.exp_probs_b.bias | Block 22 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
371 | blk.22.ffn_down_exps.weight | Block 22 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
372 | blk.22.ffn_down_shexp.weight | Block 22 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
373 | blk.22.ffn_gate_exps.weight | Block 22 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
374 | blk.22.ffn_gate_inp.weight | Block 22 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
375 | blk.22.ffn_gate_shexp.weight | Block 22 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
376 | blk.22.ffn_norm.weight | Block 22 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
377 | blk.22.ffn_up_exps.weight | Block 22 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
378 | blk.22.ffn_up_shexp.weight | Block 22 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.22: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 23 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
379 | blk.23.attn_kv_a_mqa.weight | Block 23 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
380 | blk.23.attn_kv_a_norm.weight | Block 23 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
381 | blk.23.attn_kv_b.weight | Block 23 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
382 | blk.23.attn_norm.weight | Block 23 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
383 | blk.23.attn_output.weight | Block 23 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
384 | blk.23.attn_q_a.weight | Block 23 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
385 | blk.23.attn_q_a_norm.weight | Block 23 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
386 | blk.23.attn_q_b.weight | Block 23 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
387 | blk.23.exp_probs_b.bias | Block 23 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
388 | blk.23.ffn_down_exps.weight | Block 23 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
389 | blk.23.ffn_down_shexp.weight | Block 23 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
390 | blk.23.ffn_gate_exps.weight | Block 23 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
391 | blk.23.ffn_gate_inp.weight | Block 23 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
392 | blk.23.ffn_gate_shexp.weight | Block 23 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
393 | blk.23.ffn_norm.weight | Block 23 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
394 | blk.23.ffn_up_exps.weight | Block 23 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
395 | blk.23.ffn_up_shexp.weight | Block 23 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.23: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 24 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
396 | blk.24.attn_kv_a_mqa.weight | Block 24 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
397 | blk.24.attn_kv_a_norm.weight | Block 24 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
398 | blk.24.attn_kv_b.weight | Block 24 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
399 | blk.24.attn_norm.weight | Block 24 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
400 | blk.24.attn_output.weight | Block 24 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
401 | blk.24.attn_q_a.weight | Block 24 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
402 | blk.24.attn_q_a_norm.weight | Block 24 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
403 | blk.24.attn_q_b.weight | Block 24 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
404 | blk.24.exp_probs_b.bias | Block 24 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
405 | blk.24.ffn_down_exps.weight | Block 24 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
406 | blk.24.ffn_down_shexp.weight | Block 24 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
407 | blk.24.ffn_gate_exps.weight | Block 24 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
408 | blk.24.ffn_gate_inp.weight | Block 24 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
409 | blk.24.ffn_gate_shexp.weight | Block 24 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
410 | blk.24.ffn_norm.weight | Block 24 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
411 | blk.24.ffn_up_exps.weight | Block 24 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
412 | blk.24.ffn_up_shexp.weight | Block 24 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.24: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 25 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
413 | blk.25.attn_kv_a_mqa.weight | Block 25 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
414 | blk.25.attn_kv_a_norm.weight | Block 25 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
415 | blk.25.attn_kv_b.weight | Block 25 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
416 | blk.25.attn_norm.weight | Block 25 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
417 | blk.25.attn_output.weight | Block 25 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
418 | blk.25.attn_q_a.weight | Block 25 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
419 | blk.25.attn_q_a_norm.weight | Block 25 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
420 | blk.25.attn_q_b.weight | Block 25 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
421 | blk.25.exp_probs_b.bias | Block 25 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
422 | blk.25.ffn_down_exps.weight | Block 25 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
423 | blk.25.ffn_down_shexp.weight | Block 25 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
424 | blk.25.ffn_gate_exps.weight | Block 25 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
425 | blk.25.ffn_gate_inp.weight | Block 25 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
426 | blk.25.ffn_gate_shexp.weight | Block 25 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
427 | blk.25.ffn_norm.weight | Block 25 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
428 | blk.25.ffn_up_exps.weight | Block 25 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
429 | blk.25.ffn_up_shexp.weight | Block 25 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.25: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 26 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
430 | blk.26.attn_kv_a_mqa.weight | Block 26 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
431 | blk.26.attn_kv_a_norm.weight | Block 26 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
432 | blk.26.attn_kv_b.weight | Block 26 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
433 | blk.26.attn_norm.weight | Block 26 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
434 | blk.26.attn_output.weight | Block 26 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
435 | blk.26.attn_q_a.weight | Block 26 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
436 | blk.26.attn_q_a_norm.weight | Block 26 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
437 | blk.26.attn_q_b.weight | Block 26 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
438 | blk.26.exp_probs_b.bias | Block 26 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
439 | blk.26.ffn_down_exps.weight | Block 26 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
440 | blk.26.ffn_down_shexp.weight | Block 26 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
441 | blk.26.ffn_gate_exps.weight | Block 26 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
442 | blk.26.ffn_gate_inp.weight | Block 26 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
443 | blk.26.ffn_gate_shexp.weight | Block 26 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
444 | blk.26.ffn_norm.weight | Block 26 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
445 | blk.26.ffn_up_exps.weight | Block 26 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
446 | blk.26.ffn_up_shexp.weight | Block 26 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.26: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 27 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
447 | blk.27.attn_kv_a_mqa.weight | Block 27 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
448 | blk.27.attn_kv_a_norm.weight | Block 27 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
449 | blk.27.attn_kv_b.weight | Block 27 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
450 | blk.27.attn_norm.weight | Block 27 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
451 | blk.27.attn_output.weight | Block 27 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
452 | blk.27.attn_q_a.weight | Block 27 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
453 | blk.27.attn_q_a_norm.weight | Block 27 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
454 | blk.27.attn_q_b.weight | Block 27 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
455 | blk.27.exp_probs_b.bias | Block 27 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
456 | blk.27.ffn_down_exps.weight | Block 27 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
457 | blk.27.ffn_down_shexp.weight | Block 27 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
458 | blk.27.ffn_gate_exps.weight | Block 27 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
459 | blk.27.ffn_gate_inp.weight | Block 27 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
460 | blk.27.ffn_gate_shexp.weight | Block 27 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
461 | blk.27.ffn_norm.weight | Block 27 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
462 | blk.27.ffn_up_exps.weight | Block 27 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
463 | blk.27.ffn_up_shexp.weight | Block 27 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.27: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 28 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
464 | blk.28.attn_kv_a_mqa.weight | Block 28 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
465 | blk.28.attn_kv_a_norm.weight | Block 28 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
466 | blk.28.attn_kv_b.weight | Block 28 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
467 | blk.28.attn_norm.weight | Block 28 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
468 | blk.28.attn_output.weight | Block 28 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
469 | blk.28.attn_q_a.weight | Block 28 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
470 | blk.28.attn_q_a_norm.weight | Block 28 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
471 | blk.28.attn_q_b.weight | Block 28 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
472 | blk.28.exp_probs_b.bias | Block 28 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
473 | blk.28.ffn_down_exps.weight | Block 28 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
474 | blk.28.ffn_down_shexp.weight | Block 28 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
475 | blk.28.ffn_gate_exps.weight | Block 28 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
476 | blk.28.ffn_gate_inp.weight | Block 28 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
477 | blk.28.ffn_gate_shexp.weight | Block 28 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
478 | blk.28.ffn_norm.weight | Block 28 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
479 | blk.28.ffn_up_exps.weight | Block 28 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
480 | blk.28.ffn_up_shexp.weight | Block 28 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.28: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 29 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
481 | blk.29.attn_kv_a_mqa.weight | Block 29 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
482 | blk.29.attn_kv_a_norm.weight | Block 29 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
483 | blk.29.attn_kv_b.weight | Block 29 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
484 | blk.29.attn_norm.weight | Block 29 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
485 | blk.29.attn_output.weight | Block 29 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
486 | blk.29.attn_q_a.weight | Block 29 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
487 | blk.29.attn_q_a_norm.weight | Block 29 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
488 | blk.29.attn_q_b.weight | Block 29 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
489 | blk.29.exp_probs_b.bias | Block 29 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
490 | blk.29.ffn_down_exps.weight | Block 29 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
491 | blk.29.ffn_down_shexp.weight | Block 29 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
492 | blk.29.ffn_gate_exps.weight | Block 29 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
493 | blk.29.ffn_gate_inp.weight | Block 29 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
494 | blk.29.ffn_gate_shexp.weight | Block 29 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
495 | blk.29.ffn_norm.weight | Block 29 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
496 | blk.29.ffn_up_exps.weight | Block 29 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
497 | blk.29.ffn_up_shexp.weight | Block 29 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.29: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 30 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
498 | blk.30.attn_kv_a_mqa.weight | Block 30 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
499 | blk.30.attn_kv_a_norm.weight | Block 30 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
500 | blk.30.attn_kv_b.weight | Block 30 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
501 | blk.30.attn_norm.weight | Block 30 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
502 | blk.30.attn_output.weight | Block 30 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
503 | blk.30.attn_q_a.weight | Block 30 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
504 | blk.30.attn_q_a_norm.weight | Block 30 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
505 | blk.30.attn_q_b.weight | Block 30 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
506 | blk.30.exp_probs_b.bias | Block 30 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
507 | blk.30.ffn_down_exps.weight | Block 30 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
508 | blk.30.ffn_down_shexp.weight | Block 30 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
509 | blk.30.ffn_gate_exps.weight | Block 30 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
510 | blk.30.ffn_gate_inp.weight | Block 30 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
511 | blk.30.ffn_gate_shexp.weight | Block 30 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
512 | blk.30.ffn_norm.weight | Block 30 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
513 | blk.30.ffn_up_exps.weight | Block 30 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
514 | blk.30.ffn_up_shexp.weight | Block 30 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.30: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 31 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
515 | blk.31.attn_kv_a_mqa.weight | Block 31 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
516 | blk.31.attn_kv_a_norm.weight | Block 31 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
517 | blk.31.attn_kv_b.weight | Block 31 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
518 | blk.31.attn_norm.weight | Block 31 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
519 | blk.31.attn_output.weight | Block 31 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
520 | blk.31.attn_q_a.weight | Block 31 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
521 | blk.31.attn_q_a_norm.weight | Block 31 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
522 | blk.31.attn_q_b.weight | Block 31 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
523 | blk.31.exp_probs_b.bias | Block 31 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
524 | blk.31.ffn_down_exps.weight | Block 31 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
525 | blk.31.ffn_down_shexp.weight | Block 31 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
526 | blk.31.ffn_gate_exps.weight | Block 31 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
527 | blk.31.ffn_gate_inp.weight | Block 31 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
528 | blk.31.ffn_gate_shexp.weight | Block 31 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
529 | blk.31.ffn_norm.weight | Block 31 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
530 | blk.31.ffn_up_exps.weight | Block 31 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
531 | blk.31.ffn_up_shexp.weight | Block 31 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.31: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 32 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
532 | blk.32.attn_kv_a_mqa.weight | Block 32 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
533 | blk.32.attn_kv_a_norm.weight | Block 32 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
534 | blk.32.attn_kv_b.weight | Block 32 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
535 | blk.32.attn_norm.weight | Block 32 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
536 | blk.32.attn_output.weight | Block 32 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
537 | blk.32.attn_q_a.weight | Block 32 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
538 | blk.32.attn_q_a_norm.weight | Block 32 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
539 | blk.32.attn_q_b.weight | Block 32 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
540 | blk.32.exp_probs_b.bias | Block 32 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
541 | blk.32.ffn_down_exps.weight | Block 32 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
542 | blk.32.ffn_down_shexp.weight | Block 32 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
543 | blk.32.ffn_gate_exps.weight | Block 32 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
544 | blk.32.ffn_gate_inp.weight | Block 32 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
545 | blk.32.ffn_gate_shexp.weight | Block 32 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
546 | blk.32.ffn_norm.weight | Block 32 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
547 | blk.32.ffn_up_exps.weight | Block 32 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
548 | blk.32.ffn_up_shexp.weight | Block 32 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.32: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 33 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
549 | blk.33.attn_kv_a_mqa.weight | Block 33 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
550 | blk.33.attn_kv_a_norm.weight | Block 33 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
551 | blk.33.attn_kv_b.weight | Block 33 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
552 | blk.33.attn_norm.weight | Block 33 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
553 | blk.33.attn_output.weight | Block 33 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
554 | blk.33.attn_q_a.weight | Block 33 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
555 | blk.33.attn_q_a_norm.weight | Block 33 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
556 | blk.33.attn_q_b.weight | Block 33 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
557 | blk.33.exp_probs_b.bias | Block 33 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
558 | blk.33.ffn_down_exps.weight | Block 33 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
559 | blk.33.ffn_down_shexp.weight | Block 33 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
560 | blk.33.ffn_gate_exps.weight | Block 33 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
561 | blk.33.ffn_gate_inp.weight | Block 33 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
562 | blk.33.ffn_gate_shexp.weight | Block 33 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
563 | blk.33.ffn_norm.weight | Block 33 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
564 | blk.33.ffn_up_exps.weight | Block 33 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
565 | blk.33.ffn_up_shexp.weight | Block 33 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.33: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 34 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
566 | blk.34.attn_kv_a_mqa.weight | Block 34 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
567 | blk.34.attn_kv_a_norm.weight | Block 34 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
568 | blk.34.attn_kv_b.weight | Block 34 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
569 | blk.34.attn_norm.weight | Block 34 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
570 | blk.34.attn_output.weight | Block 34 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
571 | blk.34.attn_q_a.weight | Block 34 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
572 | blk.34.attn_q_a_norm.weight | Block 34 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
573 | blk.34.attn_q_b.weight | Block 34 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
574 | blk.34.exp_probs_b.bias | Block 34 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
575 | blk.34.ffn_down_exps.weight | Block 34 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
576 | blk.34.ffn_down_shexp.weight | Block 34 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
577 | blk.34.ffn_gate_exps.weight | Block 34 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
578 | blk.34.ffn_gate_inp.weight | Block 34 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
579 | blk.34.ffn_gate_shexp.weight | Block 34 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
580 | blk.34.ffn_norm.weight | Block 34 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
581 | blk.34.ffn_up_exps.weight | Block 34 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
582 | blk.34.ffn_up_shexp.weight | Block 34 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.34: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 35 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
583 | blk.35.attn_kv_a_mqa.weight | Block 35 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
584 | blk.35.attn_kv_a_norm.weight | Block 35 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
585 | blk.35.attn_kv_b.weight | Block 35 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
586 | blk.35.attn_norm.weight | Block 35 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
587 | blk.35.attn_output.weight | Block 35 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
588 | blk.35.attn_q_a.weight | Block 35 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
589 | blk.35.attn_q_a_norm.weight | Block 35 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
590 | blk.35.attn_q_b.weight | Block 35 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
591 | blk.35.exp_probs_b.bias | Block 35 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
592 | blk.35.ffn_down_exps.weight | Block 35 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
593 | blk.35.ffn_down_shexp.weight | Block 35 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
594 | blk.35.ffn_gate_exps.weight | Block 35 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
595 | blk.35.ffn_gate_inp.weight | Block 35 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
596 | blk.35.ffn_gate_shexp.weight | Block 35 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
597 | blk.35.ffn_norm.weight | Block 35 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
598 | blk.35.ffn_up_exps.weight | Block 35 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
599 | blk.35.ffn_up_shexp.weight | Block 35 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.35: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 36 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
600 | blk.36.attn_kv_a_mqa.weight | Block 36 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
601 | blk.36.attn_kv_a_norm.weight | Block 36 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
602 | blk.36.attn_kv_b.weight | Block 36 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
603 | blk.36.attn_norm.weight | Block 36 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
604 | blk.36.attn_output.weight | Block 36 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
605 | blk.36.attn_q_a.weight | Block 36 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
606 | blk.36.attn_q_a_norm.weight | Block 36 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
607 | blk.36.attn_q_b.weight | Block 36 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
608 | blk.36.exp_probs_b.bias | Block 36 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
609 | blk.36.ffn_down_exps.weight | Block 36 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
610 | blk.36.ffn_down_shexp.weight | Block 36 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
611 | blk.36.ffn_gate_exps.weight | Block 36 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
612 | blk.36.ffn_gate_inp.weight | Block 36 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
613 | blk.36.ffn_gate_shexp.weight | Block 36 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
614 | blk.36.ffn_norm.weight | Block 36 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
615 | blk.36.ffn_up_exps.weight | Block 36 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
616 | blk.36.ffn_up_shexp.weight | Block 36 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.36: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 37 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
617 | blk.37.attn_kv_a_mqa.weight | Block 37 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
618 | blk.37.attn_kv_a_norm.weight | Block 37 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
619 | blk.37.attn_kv_b.weight | Block 37 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
620 | blk.37.attn_norm.weight | Block 37 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
621 | blk.37.attn_output.weight | Block 37 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
622 | blk.37.attn_q_a.weight | Block 37 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
623 | blk.37.attn_q_a_norm.weight | Block 37 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
624 | blk.37.attn_q_b.weight | Block 37 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
625 | blk.37.exp_probs_b.bias | Block 37 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
626 | blk.37.ffn_down_exps.weight | Block 37 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
627 | blk.37.ffn_down_shexp.weight | Block 37 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
628 | blk.37.ffn_gate_exps.weight | Block 37 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
629 | blk.37.ffn_gate_inp.weight | Block 37 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
630 | blk.37.ffn_gate_shexp.weight | Block 37 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
631 | blk.37.ffn_norm.weight | Block 37 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
632 | blk.37.ffn_up_exps.weight | Block 37 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
633 | blk.37.ffn_up_shexp.weight | Block 37 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.37: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 38 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
634 | blk.38.attn_kv_a_mqa.weight | Block 38 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
635 | blk.38.attn_kv_a_norm.weight | Block 38 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
636 | blk.38.attn_kv_b.weight | Block 38 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
637 | blk.38.attn_norm.weight | Block 38 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
638 | blk.38.attn_output.weight | Block 38 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
639 | blk.38.attn_q_a.weight | Block 38 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
640 | blk.38.attn_q_a_norm.weight | Block 38 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
641 | blk.38.attn_q_b.weight | Block 38 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
642 | blk.38.exp_probs_b.bias | Block 38 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
643 | blk.38.ffn_down_exps.weight | Block 38 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
644 | blk.38.ffn_down_shexp.weight | Block 38 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
645 | blk.38.ffn_gate_exps.weight | Block 38 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
646 | blk.38.ffn_gate_inp.weight | Block 38 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
647 | blk.38.ffn_gate_shexp.weight | Block 38 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
648 | blk.38.ffn_norm.weight | Block 38 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
649 | blk.38.ffn_up_exps.weight | Block 38 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
650 | blk.38.ffn_up_shexp.weight | Block 38 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.38: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 39 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
651 | blk.39.attn_kv_a_mqa.weight | Block 39 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
652 | blk.39.attn_kv_a_norm.weight | Block 39 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
653 | blk.39.attn_kv_b.weight | Block 39 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
654 | blk.39.attn_norm.weight | Block 39 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
655 | blk.39.attn_output.weight | Block 39 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
656 | blk.39.attn_q_a.weight | Block 39 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
657 | blk.39.attn_q_a_norm.weight | Block 39 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
658 | blk.39.attn_q_b.weight | Block 39 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
659 | blk.39.exp_probs_b.bias | Block 39 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
660 | blk.39.ffn_down_exps.weight | Block 39 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
661 | blk.39.ffn_down_shexp.weight | Block 39 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
662 | blk.39.ffn_gate_exps.weight | Block 39 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
663 | blk.39.ffn_gate_inp.weight | Block 39 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
664 | blk.39.ffn_gate_shexp.weight | Block 39 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
665 | blk.39.ffn_norm.weight | Block 39 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
666 | blk.39.ffn_up_exps.weight | Block 39 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
667 | blk.39.ffn_up_shexp.weight | Block 39 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.39: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 40 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
668 | blk.40.attn_kv_a_mqa.weight | Block 40 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
669 | blk.40.attn_kv_a_norm.weight | Block 40 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
670 | blk.40.attn_kv_b.weight | Block 40 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
671 | blk.40.attn_norm.weight | Block 40 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
672 | blk.40.attn_output.weight | Block 40 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
673 | blk.40.attn_q_a.weight | Block 40 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
674 | blk.40.attn_q_a_norm.weight | Block 40 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
675 | blk.40.attn_q_b.weight | Block 40 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
676 | blk.40.exp_probs_b.bias | Block 40 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
677 | blk.40.ffn_down_exps.weight | Block 40 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
678 | blk.40.ffn_down_shexp.weight | Block 40 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
679 | blk.40.ffn_gate_exps.weight | Block 40 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
680 | blk.40.ffn_gate_inp.weight | Block 40 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
681 | blk.40.ffn_gate_shexp.weight | Block 40 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
682 | blk.40.ffn_norm.weight | Block 40 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
683 | blk.40.ffn_up_exps.weight | Block 40 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
684 | blk.40.ffn_up_shexp.weight | Block 40 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.40: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 41 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
685 | blk.41.attn_kv_a_mqa.weight | Block 41 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
686 | blk.41.attn_kv_a_norm.weight | Block 41 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
687 | blk.41.attn_kv_b.weight | Block 41 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
688 | blk.41.attn_norm.weight | Block 41 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
689 | blk.41.attn_output.weight | Block 41 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
690 | blk.41.attn_q_a.weight | Block 41 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
691 | blk.41.attn_q_a_norm.weight | Block 41 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
692 | blk.41.attn_q_b.weight | Block 41 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
693 | blk.41.exp_probs_b.bias | Block 41 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
694 | blk.41.ffn_down_exps.weight | Block 41 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
695 | blk.41.ffn_down_shexp.weight | Block 41 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
696 | blk.41.ffn_gate_exps.weight | Block 41 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
697 | blk.41.ffn_gate_inp.weight | Block 41 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
698 | blk.41.ffn_gate_shexp.weight | Block 41 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
699 | blk.41.ffn_norm.weight | Block 41 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
700 | blk.41.ffn_up_exps.weight | Block 41 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
701 | blk.41.ffn_up_shexp.weight | Block 41 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.41: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 42 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
702 | blk.42.attn_kv_a_mqa.weight | Block 42 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
703 | blk.42.attn_kv_a_norm.weight | Block 42 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
704 | blk.42.attn_kv_b.weight | Block 42 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
705 | blk.42.attn_norm.weight | Block 42 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
706 | blk.42.attn_output.weight | Block 42 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
707 | blk.42.attn_q_a.weight | Block 42 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
708 | blk.42.attn_q_a_norm.weight | Block 42 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
709 | blk.42.attn_q_b.weight | Block 42 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
710 | blk.42.exp_probs_b.bias | Block 42 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
711 | blk.42.ffn_down_exps.weight | Block 42 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
712 | blk.42.ffn_down_shexp.weight | Block 42 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
713 | blk.42.ffn_gate_exps.weight | Block 42 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
714 | blk.42.ffn_gate_inp.weight | Block 42 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
715 | blk.42.ffn_gate_shexp.weight | Block 42 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
716 | blk.42.ffn_norm.weight | Block 42 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
717 | blk.42.ffn_up_exps.weight | Block 42 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
718 | blk.42.ffn_up_shexp.weight | Block 42 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.42: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 43 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
719 | blk.43.attn_kv_a_mqa.weight | Block 43 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
720 | blk.43.attn_kv_a_norm.weight | Block 43 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
721 | blk.43.attn_kv_b.weight | Block 43 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
722 | blk.43.attn_norm.weight | Block 43 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
723 | blk.43.attn_output.weight | Block 43 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
724 | blk.43.attn_q_a.weight | Block 43 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
725 | blk.43.attn_q_a_norm.weight | Block 43 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
726 | blk.43.attn_q_b.weight | Block 43 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
727 | blk.43.exp_probs_b.bias | Block 43 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
728 | blk.43.ffn_down_exps.weight | Block 43 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
729 | blk.43.ffn_down_shexp.weight | Block 43 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
730 | blk.43.ffn_gate_exps.weight | Block 43 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
731 | blk.43.ffn_gate_inp.weight | Block 43 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
732 | blk.43.ffn_gate_shexp.weight | Block 43 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
733 | blk.43.ffn_norm.weight | Block 43 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
734 | blk.43.ffn_up_exps.weight | Block 43 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
735 | blk.43.ffn_up_shexp.weight | Block 43 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.43: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 44 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
736 | blk.44.attn_kv_a_mqa.weight | Block 44 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
737 | blk.44.attn_kv_a_norm.weight | Block 44 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
738 | blk.44.attn_kv_b.weight | Block 44 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
739 | blk.44.attn_norm.weight | Block 44 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
740 | blk.44.attn_output.weight | Block 44 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
741 | blk.44.attn_q_a.weight | Block 44 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
742 | blk.44.attn_q_a_norm.weight | Block 44 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
743 | blk.44.attn_q_b.weight | Block 44 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
744 | blk.44.exp_probs_b.bias | Block 44 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
745 | blk.44.ffn_down_exps.weight | Block 44 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
746 | blk.44.ffn_down_shexp.weight | Block 44 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
747 | blk.44.ffn_gate_exps.weight | Block 44 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
748 | blk.44.ffn_gate_inp.weight | Block 44 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
749 | blk.44.ffn_gate_shexp.weight | Block 44 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
750 | blk.44.ffn_norm.weight | Block 44 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
751 | blk.44.ffn_up_exps.weight | Block 44 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
752 | blk.44.ffn_up_shexp.weight | Block 44 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.44: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 45 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
753 | blk.45.attn_kv_a_mqa.weight | Block 45 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
754 | blk.45.attn_kv_a_norm.weight | Block 45 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
755 | blk.45.attn_kv_b.weight | Block 45 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
756 | blk.45.attn_norm.weight | Block 45 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
757 | blk.45.attn_output.weight | Block 45 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
758 | blk.45.attn_q_a.weight | Block 45 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
759 | blk.45.attn_q_a_norm.weight | Block 45 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
760 | blk.45.attn_q_b.weight | Block 45 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
761 | blk.45.exp_probs_b.bias | Block 45 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
762 | blk.45.ffn_down_exps.weight | Block 45 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
763 | blk.45.ffn_down_shexp.weight | Block 45 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
764 | blk.45.ffn_gate_exps.weight | Block 45 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
765 | blk.45.ffn_gate_inp.weight | Block 45 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
766 | blk.45.ffn_gate_shexp.weight | Block 45 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
767 | blk.45.ffn_norm.weight | Block 45 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
768 | blk.45.ffn_up_exps.weight | Block 45 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
769 | blk.45.ffn_up_shexp.weight | Block 45 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.45: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 46 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
770 | blk.46.attn_kv_a_mqa.weight | Block 46 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
771 | blk.46.attn_kv_a_norm.weight | Block 46 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
772 | blk.46.attn_kv_b.weight | Block 46 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
773 | blk.46.attn_norm.weight | Block 46 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
774 | blk.46.attn_output.weight | Block 46 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
775 | blk.46.attn_q_a.weight | Block 46 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
776 | blk.46.attn_q_a_norm.weight | Block 46 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
777 | blk.46.attn_q_b.weight | Block 46 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
778 | blk.46.exp_probs_b.bias | Block 46 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
779 | blk.46.ffn_down_exps.weight | Block 46 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
780 | blk.46.ffn_down_shexp.weight | Block 46 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
781 | blk.46.ffn_gate_exps.weight | Block 46 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
782 | blk.46.ffn_gate_inp.weight | Block 46 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
783 | blk.46.ffn_gate_shexp.weight | Block 46 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
784 | blk.46.ffn_norm.weight | Block 46 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
785 | blk.46.ffn_up_exps.weight | Block 46 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
786 | blk.46.ffn_up_shexp.weight | Block 46 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.46: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 47 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
787 | blk.47.attn_kv_a_mqa.weight | Block 47 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
788 | blk.47.attn_kv_a_norm.weight | Block 47 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
789 | blk.47.attn_kv_b.weight | Block 47 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
790 | blk.47.attn_norm.weight | Block 47 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
791 | blk.47.attn_output.weight | Block 47 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
792 | blk.47.attn_q_a.weight | Block 47 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
793 | blk.47.attn_q_a_norm.weight | Block 47 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
794 | blk.47.attn_q_b.weight | Block 47 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
795 | blk.47.exp_probs_b.bias | Block 47 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
796 | blk.47.ffn_down_exps.weight | Block 47 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
797 | blk.47.ffn_down_shexp.weight | Block 47 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
798 | blk.47.ffn_gate_exps.weight | Block 47 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
799 | blk.47.ffn_gate_inp.weight | Block 47 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
800 | blk.47.ffn_gate_shexp.weight | Block 47 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
801 | blk.47.ffn_norm.weight | Block 47 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
802 | blk.47.ffn_up_exps.weight | Block 47 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
803 | blk.47.ffn_up_shexp.weight | Block 47 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.47: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 48 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
804 | blk.48.attn_kv_a_mqa.weight | Block 48 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
805 | blk.48.attn_kv_a_norm.weight | Block 48 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
806 | blk.48.attn_kv_b.weight | Block 48 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
807 | blk.48.attn_norm.weight | Block 48 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
808 | blk.48.attn_output.weight | Block 48 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
809 | blk.48.attn_q_a.weight | Block 48 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
810 | blk.48.attn_q_a_norm.weight | Block 48 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
811 | blk.48.attn_q_b.weight | Block 48 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
812 | blk.48.exp_probs_b.bias | Block 48 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
813 | blk.48.ffn_down_exps.weight | Block 48 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
814 | blk.48.ffn_down_shexp.weight | Block 48 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
815 | blk.48.ffn_gate_exps.weight | Block 48 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
816 | blk.48.ffn_gate_inp.weight | Block 48 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
817 | blk.48.ffn_gate_shexp.weight | Block 48 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
818 | blk.48.ffn_norm.weight | Block 48 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
819 | blk.48.ffn_up_exps.weight | Block 48 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
820 | blk.48.ffn_up_shexp.weight | Block 48 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.48: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 49 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
821 | blk.49.attn_kv_a_mqa.weight | Block 49 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
822 | blk.49.attn_kv_a_norm.weight | Block 49 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
823 | blk.49.attn_kv_b.weight | Block 49 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
824 | blk.49.attn_norm.weight | Block 49 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
825 | blk.49.attn_output.weight | Block 49 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
826 | blk.49.attn_q_a.weight | Block 49 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
827 | blk.49.attn_q_a_norm.weight | Block 49 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
828 | blk.49.attn_q_b.weight | Block 49 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
829 | blk.49.exp_probs_b.bias | Block 49 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
830 | blk.49.ffn_down_exps.weight | Block 49 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
831 | blk.49.ffn_down_shexp.weight | Block 49 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
832 | blk.49.ffn_gate_exps.weight | Block 49 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
833 | blk.49.ffn_gate_inp.weight | Block 49 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
834 | blk.49.ffn_gate_shexp.weight | Block 49 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
835 | blk.49.ffn_norm.weight | Block 49 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
836 | blk.49.ffn_up_exps.weight | Block 49 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
837 | blk.49.ffn_up_shexp.weight | Block 49 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.49: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 50 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
838 | blk.50.attn_kv_a_mqa.weight | Block 50 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
839 | blk.50.attn_kv_a_norm.weight | Block 50 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
840 | blk.50.attn_kv_b.weight | Block 50 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
841 | blk.50.attn_norm.weight | Block 50 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
842 | blk.50.attn_output.weight | Block 50 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
843 | blk.50.attn_q_a.weight | Block 50 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
844 | blk.50.attn_q_a_norm.weight | Block 50 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
845 | blk.50.attn_q_b.weight | Block 50 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
846 | blk.50.exp_probs_b.bias | Block 50 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
847 | blk.50.ffn_down_exps.weight | Block 50 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
848 | blk.50.ffn_down_shexp.weight | Block 50 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
849 | blk.50.ffn_gate_exps.weight | Block 50 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
850 | blk.50.ffn_gate_inp.weight | Block 50 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
851 | blk.50.ffn_gate_shexp.weight | Block 50 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
852 | blk.50.ffn_norm.weight | Block 50 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
853 | blk.50.ffn_up_exps.weight | Block 50 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
854 | blk.50.ffn_up_shexp.weight | Block 50 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.50: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 51 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
855 | blk.51.attn_kv_a_mqa.weight | Block 51 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
856 | blk.51.attn_kv_a_norm.weight | Block 51 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
857 | blk.51.attn_kv_b.weight | Block 51 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
858 | blk.51.attn_norm.weight | Block 51 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
859 | blk.51.attn_output.weight | Block 51 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
860 | blk.51.attn_q_a.weight | Block 51 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
861 | blk.51.attn_q_a_norm.weight | Block 51 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
862 | blk.51.attn_q_b.weight | Block 51 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
863 | blk.51.exp_probs_b.bias | Block 51 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
864 | blk.51.ffn_down_exps.weight | Block 51 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
865 | blk.51.ffn_down_shexp.weight | Block 51 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
866 | blk.51.ffn_gate_exps.weight | Block 51 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
867 | blk.51.ffn_gate_inp.weight | Block 51 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
868 | blk.51.ffn_gate_shexp.weight | Block 51 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
869 | blk.51.ffn_norm.weight | Block 51 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
870 | blk.51.ffn_up_exps.weight | Block 51 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
871 | blk.51.ffn_up_shexp.weight | Block 51 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.51: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 52 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
872 | blk.52.attn_kv_a_mqa.weight | Block 52 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
873 | blk.52.attn_kv_a_norm.weight | Block 52 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
874 | blk.52.attn_kv_b.weight | Block 52 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
875 | blk.52.attn_norm.weight | Block 52 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
876 | blk.52.attn_output.weight | Block 52 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
877 | blk.52.attn_q_a.weight | Block 52 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
878 | blk.52.attn_q_a_norm.weight | Block 52 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
879 | blk.52.attn_q_b.weight | Block 52 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
880 | blk.52.exp_probs_b.bias | Block 52 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
881 | blk.52.ffn_down_exps.weight | Block 52 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
882 | blk.52.ffn_down_shexp.weight | Block 52 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
883 | blk.52.ffn_gate_exps.weight | Block 52 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
884 | blk.52.ffn_gate_inp.weight | Block 52 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
885 | blk.52.ffn_gate_shexp.weight | Block 52 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
886 | blk.52.ffn_norm.weight | Block 52 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
887 | blk.52.ffn_up_exps.weight | Block 52 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
888 | blk.52.ffn_up_shexp.weight | Block 52 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.52: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 53 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
889 | blk.53.attn_kv_a_mqa.weight | Block 53 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
890 | blk.53.attn_kv_a_norm.weight | Block 53 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
891 | blk.53.attn_kv_b.weight | Block 53 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
892 | blk.53.attn_norm.weight | Block 53 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
893 | blk.53.attn_output.weight | Block 53 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
894 | blk.53.attn_q_a.weight | Block 53 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
895 | blk.53.attn_q_a_norm.weight | Block 53 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
896 | blk.53.attn_q_b.weight | Block 53 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
897 | blk.53.exp_probs_b.bias | Block 53 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
898 | blk.53.ffn_down_exps.weight | Block 53 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
899 | blk.53.ffn_down_shexp.weight | Block 53 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
900 | blk.53.ffn_gate_exps.weight | Block 53 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
901 | blk.53.ffn_gate_inp.weight | Block 53 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
902 | blk.53.ffn_gate_shexp.weight | Block 53 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
903 | blk.53.ffn_norm.weight | Block 53 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
904 | blk.53.ffn_up_exps.weight | Block 53 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
905 | blk.53.ffn_up_shexp.weight | Block 53 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.53: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 54 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
906 | blk.54.attn_kv_a_mqa.weight | Block 54 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
907 | blk.54.attn_kv_a_norm.weight | Block 54 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
908 | blk.54.attn_kv_b.weight | Block 54 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
909 | blk.54.attn_norm.weight | Block 54 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
910 | blk.54.attn_output.weight | Block 54 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
911 | blk.54.attn_q_a.weight | Block 54 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
912 | blk.54.attn_q_a_norm.weight | Block 54 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
913 | blk.54.attn_q_b.weight | Block 54 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
914 | blk.54.exp_probs_b.bias | Block 54 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
915 | blk.54.ffn_down_exps.weight | Block 54 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
916 | blk.54.ffn_down_shexp.weight | Block 54 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
917 | blk.54.ffn_gate_exps.weight | Block 54 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
918 | blk.54.ffn_gate_inp.weight | Block 54 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
919 | blk.54.ffn_gate_shexp.weight | Block 54 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
920 | blk.54.ffn_norm.weight | Block 54 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
921 | blk.54.ffn_up_exps.weight | Block 54 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
922 | blk.54.ffn_up_shexp.weight | Block 54 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.54: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 55 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
923 | blk.55.attn_kv_a_mqa.weight | Block 55 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
924 | blk.55.attn_kv_a_norm.weight | Block 55 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
925 | blk.55.attn_kv_b.weight | Block 55 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
926 | blk.55.attn_norm.weight | Block 55 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
927 | blk.55.attn_output.weight | Block 55 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
928 | blk.55.attn_q_a.weight | Block 55 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
929 | blk.55.attn_q_a_norm.weight | Block 55 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
930 | blk.55.attn_q_b.weight | Block 55 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
931 | blk.55.exp_probs_b.bias | Block 55 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
932 | blk.55.ffn_down_exps.weight | Block 55 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
933 | blk.55.ffn_down_shexp.weight | Block 55 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
934 | blk.55.ffn_gate_exps.weight | Block 55 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
935 | blk.55.ffn_gate_inp.weight | Block 55 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
936 | blk.55.ffn_gate_shexp.weight | Block 55 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
937 | blk.55.ffn_norm.weight | Block 55 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
938 | blk.55.ffn_up_exps.weight | Block 55 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
939 | blk.55.ffn_up_shexp.weight | Block 55 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.55: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 56 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
940 | blk.56.attn_kv_a_mqa.weight | Block 56 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
941 | blk.56.attn_kv_a_norm.weight | Block 56 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
942 | blk.56.attn_kv_b.weight | Block 56 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
943 | blk.56.attn_norm.weight | Block 56 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
944 | blk.56.attn_output.weight | Block 56 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
945 | blk.56.attn_q_a.weight | Block 56 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
946 | blk.56.attn_q_a_norm.weight | Block 56 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
947 | blk.56.attn_q_b.weight | Block 56 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
948 | blk.56.exp_probs_b.bias | Block 56 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
949 | blk.56.ffn_down_exps.weight | Block 56 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
950 | blk.56.ffn_down_shexp.weight | Block 56 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
951 | blk.56.ffn_gate_exps.weight | Block 56 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
952 | blk.56.ffn_gate_inp.weight | Block 56 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
953 | blk.56.ffn_gate_shexp.weight | Block 56 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
954 | blk.56.ffn_norm.weight | Block 56 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
955 | blk.56.ffn_up_exps.weight | Block 56 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
956 | blk.56.ffn_up_shexp.weight | Block 56 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.56: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 57 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
957 | blk.57.attn_kv_a_mqa.weight | Block 57 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
958 | blk.57.attn_kv_a_norm.weight | Block 57 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
959 | blk.57.attn_kv_b.weight | Block 57 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
960 | blk.57.attn_norm.weight | Block 57 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
961 | blk.57.attn_output.weight | Block 57 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
962 | blk.57.attn_q_a.weight | Block 57 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
963 | blk.57.attn_q_a_norm.weight | Block 57 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
964 | blk.57.attn_q_b.weight | Block 57 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
965 | blk.57.exp_probs_b.bias | Block 57 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
966 | blk.57.ffn_down_exps.weight | Block 57 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
967 | blk.57.ffn_down_shexp.weight | Block 57 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
968 | blk.57.ffn_gate_exps.weight | Block 57 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
969 | blk.57.ffn_gate_inp.weight | Block 57 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
970 | blk.57.ffn_gate_shexp.weight | Block 57 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
971 | blk.57.ffn_norm.weight | Block 57 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
972 | blk.57.ffn_up_exps.weight | Block 57 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
973 | blk.57.ffn_up_shexp.weight | Block 57 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.57: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 58 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
974 | blk.58.attn_kv_a_mqa.weight | Block 58 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
975 | blk.58.attn_kv_a_norm.weight | Block 58 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
976 | blk.58.attn_kv_b.weight | Block 58 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
977 | blk.58.attn_norm.weight | Block 58 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
978 | blk.58.attn_output.weight | Block 58 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
979 | blk.58.attn_q_a.weight | Block 58 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
980 | blk.58.attn_q_a_norm.weight | Block 58 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
981 | blk.58.attn_q_b.weight | Block 58 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
982 | blk.58.exp_probs_b.bias | Block 58 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
983 | blk.58.ffn_down_exps.weight | Block 58 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
984 | blk.58.ffn_down_shexp.weight | Block 58 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
985 | blk.58.ffn_gate_exps.weight | Block 58 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
986 | blk.58.ffn_gate_inp.weight | Block 58 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
987 | blk.58.ffn_gate_shexp.weight | Block 58 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
988 | blk.58.ffn_norm.weight | Block 58 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
989 | blk.58.ffn_up_exps.weight | Block 58 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
990 | blk.58.ffn_up_shexp.weight | Block 58 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.58: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 59 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
991 | blk.59.attn_kv_a_mqa.weight | Block 59 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
992 | blk.59.attn_kv_a_norm.weight | Block 59 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
993 | blk.59.attn_kv_b.weight | Block 59 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
994 | blk.59.attn_norm.weight | Block 59 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
995 | blk.59.attn_output.weight | Block 59 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
996 | blk.59.attn_q_a.weight | Block 59 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
997 | blk.59.attn_q_a_norm.weight | Block 59 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
998 | blk.59.attn_q_b.weight | Block 59 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
999 | blk.59.exp_probs_b.bias | Block 59 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
1000 | blk.59.ffn_down_exps.weight | Block 59 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
1001 | blk.59.ffn_down_shexp.weight | Block 59 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
1002 | blk.59.ffn_gate_exps.weight | Block 59 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
1003 | blk.59.ffn_gate_inp.weight | Block 59 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
1004 | blk.59.ffn_gate_shexp.weight | Block 59 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
1005 | blk.59.ffn_norm.weight | Block 59 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
1006 | blk.59.ffn_up_exps.weight | Block 59 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
1007 | blk.59.ffn_up_shexp.weight | Block 59 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.59: (~12B) 11507286272
- Percentage of total elements: 1.71%
Block 60 Tensor Group : ~12B Elements
T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
---|---|---|---|---|---|
1008 | blk.60.attn_kv_a_mqa.weight | Block 60 Attn_Kv_A_Mqa (W) | ( ~4M) 4128768 | 7168 x 576 x 1 x 1 | IQ2_XS |
1009 | blk.60.attn_kv_a_norm.weight | Block 60 Attn_Kv_A_Norm (W) | ( 512) 512 | 512 x 1 x 1 x 1 | F32 |
1010 | blk.60.attn_kv_b.weight | Block 60 Attn_Kv_B (W) | ( ~17M) 16777216 | 512 x 32768 x 1 x 1 | IQ2_XS |
1011 | blk.60.attn_norm.weight | Block 60 Attention Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
1012 | blk.60.attn_output.weight | Block 60 Attention Output (W) | (~117M) 117440512 | 16384 x 7168 x 1 x 1 | IQ3_S |
1013 | blk.60.attn_q_a.weight | Block 60 Attn_Q_A (W) | ( ~11M) 11010048 | 7168 x 1536 x 1 x 1 | IQ2_XS |
1014 | blk.60.attn_q_a_norm.weight | Block 60 Attn_Q_A_Norm (W) | ( ~2K) 1536 | 1536 x 1 x 1 x 1 | F32 |
1015 | blk.60.attn_q_b.weight | Block 60 Attn_Q_B (W) | ( ~38M) 37748736 | 1536 x 24576 x 1 x 1 | IQ2_XS |
1016 | blk.60.exp_probs_b.bias | Block 60 Exp_Probs_B (B) | ( 256) 256 | 256 x 1 x 1 x 1 | F32 |
1017 | blk.60.ffn_down_exps.weight | Block 60 Ffn_Down_Exps (W) | ( ~4B) 3758096384 | 2048 x 7168 x 256 x 1 | IQ2_XS |
1018 | blk.60.ffn_down_shexp.weight | Block 60 Ffn_Down_Shexp (W) | ( ~15M) 14680064 | 2048 x 7168 x 1 x 1 | IQ2_XS |
1019 | blk.60.ffn_gate_exps.weight | Block 60 Ffn_Gate_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
1020 | blk.60.ffn_gate_inp.weight | Block 60 Expert-Routing Layer For The Feed-Forward Network In Mixture Of Expert Models (W) | ( ~2M) 1835008 | 7168 x 256 x 1 x 1 | F32 |
1021 | blk.60.ffn_gate_shexp.weight | Block 60 Ffn_Gate_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
1022 | blk.60.ffn_norm.weight | Block 60 Feed-Forward Network Normalization (W) | ( ~7K) 7168 | 7168 x 1 x 1 x 1 | F32 |
1023 | blk.60.ffn_up_exps.weight | Block 60 Ffn_Up_Exps (W) | ( ~4B) 3758096384 | 7168 x 2048 x 256 x 1 | IQ2_XS |
1024 | blk.60.ffn_up_shexp.weight | Block 60 Ffn_Up_Shexp (W) | ( ~15M) 14680064 | 7168 x 2048 x 1 x 1 | IQ2_XS |
- Total elements in blk.60: (~12B) 11507286272
- Percentage of total elements: 1.71%
Perhaps this precedent was set by TheBloke?
No, it's forced by the 50GB file size limit on hf. I don't know how TheBloke split his files, but we split them so you could mmap the parts directly. I think it's the most common format on hf as well. We tried hard to provide the newer format (we even tried to patch gguf-split, but its use of C++ iostreams makes that pretty much impossible). We simply don't have the resources for this format on most servers.
It's not an issue for most models and users, fortunately, but your use case of course is such an example.
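Since a split quant in this format is just one gguf cut into raw byte parts, it can be reassembled into a single file if you ever need one. A shell sketch with placeholder names (brace expansion keeps the parts in order):

$ cat model.i1-Q2_K.gguf.part{1,2,3,4,5}of5 > model.i1-Q2_K.gguf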
As of now I believe bartowski is using the same smaller quantization across all the layers, except keeping the token embedding at Q8_0.
How could they even differ? If bartowski's quants are in a nicer format for you, there should not be a reason not to use those. The mix should be the same (unless I am mistaken).
It would print something out like this
Good that nico apparently did it. I could have provided the info from the quants (as json), but not the extra info the tool prints.
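For what it's worth, once you have a complete gguf (a non-split quant, or reassembled parts), pulling the same per-tensor table out of it with gguf-py's GGUFReader is a few lines; a minimal sketch, with the path as a placeholder. Note the reader wants the whole file, which matches gguf_dump.py choking on part1 alone:

from gguf import GGUFReader  # pip install gguf

reader = GGUFReader("model.i1-Q2_K.gguf")  # placeholder path
for i, t in enumerate(reader.tensors):
    shape = " x ".join(str(d) for d in t.shape)
    # t.tensor_type is a GGMLQuantizationType enum, e.g. IQ2_XS or F32
    print(f"{i} | {t.name} | {t.n_elements} | {shape} | {t.tensor_type.name}")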
Heya, really appreciate it, both of you! Okay, yes, now I can see exactly which quant was used for each layer! That helps me compare across the available quants in this size class. Maybe sometime I can try to compare perplexity across each model to get a rough estimate of "Perplexity per GiB" or something haha...
How could they even differ? If bartowski's quants are in a nicer format for you, there should not be a reason not to use those. The mix should be the same (unless I am mistaken).
So I made a comparison chart here and yes, your mix has the same ratios as bartowski's. You also seem to use a high-quality imatrix mix. unsloth has a custom fork they use which changes a few layers to be higher quality. I'm using the ik_llama.cpp fork and a convenient bash script to map each layer to a desired quantization level (a rough sketch of the idea follows below).
I'm still experimenting with how the various mixes perform in terms of quality (perplexity) and speed (llama-bench) for prompt processing and token generation. Amazing how fast this stuff is moving! Thanks for all your help and all the quants!
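For reference, the layer-mapping idea mentioned above looks roughly like this. A sketch only: --custom-q is the per-tensor override option in the ik_llama.cpp fork's llama-quantize, and the regexes, types, and file names here are illustrative (the types mirror the IQ3_S attn_output / IQ2_XS experts mix visible in the dump above); double-check the exact syntax against the fork:

$ ./build/bin/llama-quantize \
    --imatrix imatrix.dat \
    --custom-q "token_embd\.weight=Q8_0,blk\..*\.attn_output\.weight=IQ3_S,blk\..*\.ffn_.*_exps\.weight=IQ2_XS" \
    DeepSeek-V3-0324-F16.gguf DeepSeek-V3-0324-custom-mix.gguf IQ2_XS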
So I made a comparison chart here
Awesome. Thanks a lot for collecting and visualizing all this data!
your mix has the same ratios as bartowski.
That was expected, as we use the standardized llama.cpp mix and so does bartowski.
You also seem to use a high quality imatrix mix.
Yes we do. Awesome that you figured this out. Our imatrix training is superior to bartowski's. Our imatrix dataset is around double the size: the first half is bartowski's imatrix dataset, while the other half consists of proprietary high-quality data covering common LLM use cases that are missing from bartowski's dataset, like story writing and roleplay. mradermacher put a lot of effort into creating the best imatrix dataset possible last spring, before we scaled our quantization throughput up to its current, almost industrial scale. We also compute our imatrix in F16 for all models other than R1, for which we use Q8, while many other quant makers use less precision for imatrix computation. We are perfectionists and value quality above almost everything. @ubergarm Did you actually measure any real-world difference between our and bartowski's imatrix quants? I don't think you would see one unless you test all kinds of different real-world use cases, and even then only if having a larger imatrix dataset (and thus more imatrix training) has a measurable effect.
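For context, the imatrix training referred to here is llama.cpp's llama-imatrix tool, which runs the model over a calibration text and records per-tensor activation statistics; the F16-vs-Q8 distinction above is about the precision of the model it is run on. A minimal sketch with placeholder file names:

$ ./build/bin/llama-imatrix \
    -m DeepSeek-V3-0324-F16.gguf \
    -f calibration-data.txt \
    -o imatrix.dat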
I'm still experimenting with how the various mixes perform in terms of quality (perplexity) and speed (llama-bench) for prompt processing and token generation. Amazing how fast this stuff is moving!
I highly recommend you measure KL-divergence, top-token probability, and same-token probability instead of perplexity to get much better data.
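llama.cpp's llama-perplexity can compute these directly: save the baseline logits from a high-precision quant first, then compare the small quant against them (model and data file names are illustrative). The second run reports mean KL-divergence plus the top-token and same-token agreement statistics:

$ ./build/bin/llama-perplexity -m model-Q8_0.gguf -f wiki.test.raw \
    --kl-divergence-base logits.kld
$ ./build/bin/llama-perplexity -m model-IQ2_XS.gguf -f wiki.test.raw \
    --kl-divergence-base logits.kld --kl-divergence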
Thanks for all your help and all the quants!
No problem. Glad I was able to help. If you need anything else please just let me know.