This repo contains a sparsity report for each of the pruned models in the table below. Each report (CSV) shows layer-wise sparsity, sparsity by 128x16 tile, and sparsity by column and row across the whole layer.

![Perplexity over Sparsity](./perplexity_vs_sparsity.png)

Pruning ```meta-llama/Meta-Llama-3.1-8B``` with Wanda

| Weight Target Sparsity (%) | Perplexity (lower is better) |
|----------------------------|------------------------------|
| 0 (dense, baseline)        | 5.8393                       |
| 10                         | 5.8781                       |
| 20                         | 6.0102                       |
| 30                         | 6.3076                       |
| 40                         | 7.0094                       |
| 50                         | 9.0642                       |
| 60                         | 20.2265                      |
| 70                         | 103.5209                     |

> For a more granular sparsity report within a given tile, please continue below.

# Install

```pip install torch ipython pandas```

# Interactively look up a specific tile of a layer

```bash
# please make sure git lfs is installed on your end
git clone https://huggingface.co/vuiseng9/24-0830-wanda-llama3.1-8B
cd 24-0830-wanda-llama3.1-8B
./interactive_sparsity.sh
```

The expected outcome is as follows: an IPython console with the needed functionality loaded.

```
$ ./interactive_sparsity.sh
Python 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.26.0 -- An enhanced Interactive Python. Type '?' for help.

- Help ------------------
h = SparseBlob("path to sparsity blob")

SparseBlob.preview: preview sparsity dataframe, intend to show row id, short id for look up
eg. h.preview()

SparseBlob.ls_layers: list all available layer ids for look up
eg. h.ls_layers()

SparseBlob.get_sparsity_by_short_id: return a sparsity stats of a layer via short_id lookup.
eg. h.get_sparsity_by_short_id('tx.0.attn.v')

SparseBlob.get_sparsity_by_row_id: return a sparsity stats of a layer via row id lookup.
eg. h.get_sparsity_by_row_id(36)

SparseBlob.get_sparsity_of_tile: zoom into a specific layer and a specific tile, return the sparsity stats of the tile down to col, row granularity
eg. h.get_sparsity_by_row_id(36, (5, 6))

SparseBlob.show_help: print help for available function of SparseBlob
eg. h.show_help()
- End of Help ------------------

In [1]:
```

Sample usage:

```
In [1]: ls blob*
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.0
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.1
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.2
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.3
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.4
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.5
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.6
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.7

In [2]: h = SparseBlob("blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.5")

In [3]: h.preview()
                            layer_id        short_id  ...  row_med  row_max
0    model.layers.0.self_attn.q_proj     tx.0.attn.q  ...   0.5000   1.0000
1    model.layers.0.self_attn.k_proj     tx.0.attn.k  ...   0.5000   1.0000
2    model.layers.0.self_attn.v_proj     tx.0.attn.v  ...   0.5000   1.0000
3    model.layers.0.self_attn.o_proj     tx.0.attn.o  ...   0.5000   1.0000
4       model.layers.0.mlp.gate_proj   tx.0.mlp.gate  ...   0.5000   1.0000
5         model.layers.0.mlp.up_proj     tx.0.mlp.up  ...   0.5000   1.0000
6       model.layers.0.mlp.down_proj   tx.0.mlp.down  ...   0.5000   1.0000
7    model.layers.1.self_attn.q_proj     tx.1.attn.q  ...   0.5000   1.0000
8    model.layers.1.self_attn.k_proj     tx.1.attn.k  ...   0.5000   1.0000
9    model.layers.1.self_attn.v_proj     tx.1.attn.v  ...   0.5000   1.0000
10   model.layers.1.self_attn.o_proj     tx.1.attn.o  ...   0.5000   1.0000
11      model.layers.1.mlp.gate_proj   tx.1.mlp.gate  ...   0.5000   1.0000
.
.
.
222       model.layers.31.mlp.up_proj    tx.31.mlp.up  ...   0.5000   1.0000
223     model.layers.31.mlp.down_proj  tx.31.mlp.down  ...   0.5000   1.0000
224                           lm_head         lm_head  ...   0.0000   0.0000

[225 rows x 23 columns]

In [4]: h.get_sparsity_by_row_id(10)
Out[4]:
layer_id        model.layers.1.self_attn.o_proj
short_id                            tx.1.attn.o
layer_type                               Linear
param_type                               weight
shape                              [4096, 4096]
nparam                                 16777216
nnz                                     8388608
sparsity                                 0.5000
tile_shape                            (128, 16)
n_tile                                 32 x 256
n_tile_total                               8192
tile_avg                                 0.5000
tile_min                                 0.2197
tile_med                                 0.5073
tile_max                                 0.9678
col_avg                                  0.5000
col_min                                  0.0312
col_med                                  0.4609
col_max                                  1.0000
row_avg                                  0.5000
row_min                                  0.0000
row_med                                  0.5000
row_max                                  1.0000
Name: 10, dtype: object

In [5]: h.get_sparsity_of_tile(10, (30, 245))
(30, 245)                       : tile_id
model.layers.1.self_attn.o_proj : layer_id
(128, 16)                       : tiled by
0.2861                          : tile_sparsity
16                              : col_count
0.2861                          : col_avg
0.2266                          : col_min
0.2734                          : col_med
0.3594                          : col_max
128                             : row_count
0.2861                          : row_avg
0.0000                          : row_min
0.2500                          : row_med
0.6250                          : row_max
```

### Internal notes

See the patched wanda branch [here](https://github.com/vuiseng9/wanda/tree/240823-sparse-weight-analysis). See the raw!
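
### Sketch: reading the tile statistics

To make the fields printed by `get_sparsity_of_tile` easier to interpret, here is a minimal, dependency-free sketch of how such per-tile statistics could be computed. It is not the code used to generate the reports (that lives in the patched wanda branch). The helpers `sparsity`, `summarize` and `tile_stats` are hypothetical names, and the sketch assumes that a tile id `(r, c)` indexes tiles along the weight's rows and then its columns, and that sparsity means the fraction of entries that are exactly zero. Under those assumptions a (128, 16) tile yields 16 column fractions and 128 row fractions, which matches `col_count` and `row_count` in the sample output.

```python
from statistics import median


def sparsity(values):
    """Return the fraction of entries that are exactly zero."""
    values = list(values)
    return sum(1 for v in values if v == 0) / len(values)


def summarize(fractions):
    """Count, mean, min, median and max of a list of sparsity fractions."""
    return {
        "count": len(fractions),
        "avg": sum(fractions) / len(fractions),
        "min": min(fractions),
        "med": median(fractions),
        "max": max(fractions),
    }


def tile_stats(weight, tile_id, tile_shape=(128, 16)):
    """Sparsity stats of one tile of `weight` (a list of equal-length rows).

    Assumption: tile_id = (tile_row, tile_col) and tile_shape = (rows, cols).
    """
    tile_h, tile_w = tile_shape
    tile_row, tile_col = tile_id
    # Slice out the tile: tile_h rows by tile_w columns.
    tile = [
        row[tile_col * tile_w:(tile_col + 1) * tile_w]
        for row in weight[tile_row * tile_h:(tile_row + 1) * tile_h]
    ]
    return {
        "tile_sparsity": sparsity(v for row in tile for v in row),
        # One fraction per column of the tile (each over tile_h entries).
        "col": summarize([sparsity(col) for col in zip(*tile)]),
        # One fraction per row of the tile (each over tile_w entries).
        "row": summarize([sparsity(row) for row in tile]),
    }


# Toy 4x4 weight with 2x2 tiles, small enough to check by hand.
weight = [
    [0, 1, 0, 0],
    [0, 2, 3, 0],
    [4, 0, 0, 0],
    [5, 6, 0, 7],
]
stats = tile_stats(weight, (1, 1), tile_shape=(2, 2))
print(stats)
```

With the real (128, 16) tile shape, each column fraction is a multiple of 1/128 and each row fraction a multiple of 1/16, which is consistent with values such as `col_min` 0.2266 (29/128) and `row_max` 0.6250 (10/16) in the sample above.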