YAML Metadata
Warning:
empty or missing yaml metadata in repo card
(https://huggingface.co/docs/hub/model-cards#model-card-metadata)
This repo contains sparsity report for each of the pruned model in the table below.
The report (csv) shows layer-wise sparsity, sparsity by tile of 128x16, sparsity by col and row global to its layers.
Pruning meta-llama/Meta-Llama-3.1-8B
with Wanda
Weight Target Sparsity | Perplexity (lower is better) |
---|---|
0 (dense, baseline) | 5.8393 |
10 | 5.8781 |
20 | 6.0102 |
30 | 6.3076 |
40 | 7.0094 |
50 | 9.0642 |
60 | 20.2265 |
70 | 103.5209 |
For a more granular sparsity report within a given tile, pls continue below.
Install
pip install torch ipython pandas
Interative look up a specific tile of a layer
# pls make sure git lfs is installed at your end
git clone https://huggingface.co/vuiseng9/24-0830-wanda-llama3.1-8B
cd 24-0830-wanda-llama3.1-8B
./interactive_sparsity.sh
Expected outcome in as follows, it will be in ipython console with the needed functionality loaded.
$ ./interactive_sparsity.sh
Python 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.26.0 -- An enhanced Interactive Python. Type '?' for help.
- Help ------------------
h = SparseBlob("path to sparsity blob")
SparseBlob.preview:
preview sparsity dataframe, intend to show row id, short id for look up
eg. h.preview()
SparseBlob.ls_layers:
list all available layer ids for look up
eg. h.ls_layers()
SparseBlob.get_sparsity_by_short_id:
return a sparsity stats of a layer via short_id lookup.
eg. h.get_sparsity_by_short_id('tx.0.attn.v')
SparseBlob.get_sparsity_by_row_id:
return a sparsity stats of a layer via row id lookup.
eg. h.get_sparsity_by_row_id(36)
SparseBlob.get_sparsity_of_tile:
zoom into a specific layer and a specific tile,
return the sparsity stats of the tile down to col, row granularity
eg. h.get_sparsity_by_row_id(36, (5, 6))
SparseBlob.show_help:
print help for available function of SparseBlob
eg. h.show_help()
- End of Help ------------------
In [1]:
Sample usage:
In [1]: ls blob*
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.0
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.1
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.2
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.3
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.4
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.5
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.6
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.7
In [2]: h = SparseBlob("blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.5")
In [3]: h.preview()
layer_id short_id ... row_med row_max
0 model.layers.0.self_attn.q_proj tx.0.attn.q ... 0.5000 1.0000
1 model.layers.0.self_attn.k_proj tx.0.attn.k ... 0.5000 1.0000
2 model.layers.0.self_attn.v_proj tx.0.attn.v ... 0.5000 1.0000
3 model.layers.0.self_attn.o_proj tx.0.attn.o ... 0.5000 1.0000
4 model.layers.0.mlp.gate_proj tx.0.mlp.gate ... 0.5000 1.0000
5 model.layers.0.mlp.up_proj tx.0.mlp.up ... 0.5000 1.0000
6 model.layers.0.mlp.down_proj tx.0.mlp.down ... 0.5000 1.0000
7 model.layers.1.self_attn.q_proj tx.1.attn.q ... 0.5000 1.0000
8 model.layers.1.self_attn.k_proj tx.1.attn.k ... 0.5000 1.0000
9 model.layers.1.self_attn.v_proj tx.1.attn.v ... 0.5000 1.0000
10 model.layers.1.self_attn.o_proj tx.1.attn.o ... 0.5000 1.0000
11 model.layers.1.mlp.gate_proj tx.1.mlp.gate ... 0.5000 1.0000
.
.
.
222 model.layers.31.mlp.up_proj tx.31.mlp.up ... 0.5000 1.0000
223 model.layers.31.mlp.down_proj tx.31.mlp.down ... 0.5000 1.0000
224 lm_head lm_head ... 0.0000 0.0000
[225 rows x 23 columns]
In [4]: h.get_sparsity_by_row_id(10)
Out[4]:
layer_id model.layers.1.self_attn.o_proj
short_id tx.1.attn.o
layer_type Linear
param_type weight
shape [4096, 4096]
nparam 16777216
nnz 8388608
sparsity 0.5000
tile_shape (128, 16)
n_tile 32 x 256
n_tile_total 8192
tile_avg 0.5000
tile_min 0.2197
tile_med 0.5073
tile_max 0.9678
col_avg 0.5000
col_min 0.0312
col_med 0.4609
col_max 1.0000
row_avg 0.5000
row_min 0.0000
row_med 0.5000
row_max 1.0000
Name: 10, dtype: object
In [5]: h.get_sparsity_of_tile(10, (30, 245))
(30, 245) : tile_id
model.layers.1.self_attn.o_proj : layer_id
(128, 16) : tiled by
0.2861 : tile_sparsity
16 : col_count
0.2861 : col_avg
0.2266 : col_min
0.2734 : col_med
0.3594 : col_max
128 : row_count
0.2861 : row_avg
0.0000 : row_min
0.2500 : row_med
0.6250 : row_max
Internal notes
see patch wanda branch here. see the raw!