LeanQuant committed on
Commit 324a0cc · verified · 1 Parent(s): ac006b8

Upload folder using huggingface_hub
README.md ADDED
@@ -0,0 +1,51 @@
---
tags:
- sketchtune
- sketch to adapt
library_name: transformers
---

# Base Models for Fine-tuning in *(ICML 2025) Sketch to Adapt: Fine-Tunable Sketches for Efficient LLM Adaptation*

This repository hosts the compressed base models used in the fine-tuning experiments from our ICML 2025 paper, **Sketch to Adapt: Fine-Tunable Sketches for Efficient LLM Adaptation**. The available models and formats are as follows.

| Model | Bits | GPR (Groups Per Row) |
|------------|------|----------------------|
| Llama-3-8B | INT4 | 1/2/4/8 |
| Llama-2-7B | INT4 | 1/2/4/8 |
| Llama-7B | INT4 | 1/2/4/8 |
| Llama-13B | INT4 | 1/2/4/8 |

For full details on how to reproduce the experiments, please refer to our GitHub repository:

👉 [https://github.com/LeanModels/SketchTune](https://github.com/LeanModels/SketchTune)
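The table above maps directly onto the filenames hosted here: each model is stored as `<model>-int4-gpr<g>.pkl` with `g` in {1, 2, 4, 8}. As a minimal sketch, the file grid can be enumerated as follows (the commented-out download call uses `huggingface_hub`'s `hf_hub_download`; the `repo_id` placeholder is an assumption you must fill in with this repository's actual Hub path):

```python
# Enumerate the sketch files hosted in this repository.
# Filenames follow the pattern <model>-int4-gpr<g>.pkl, as seen in the file list.
models = ["llama-3-8b", "llama-2-7b", "llama-7b", "llama-13b"]
gprs = [1, 2, 4, 8]

filenames = [f"{m}-int4-gpr{g}.pkl" for m in models for g in gprs]
print(len(filenames))  # 16 files in total

# To fetch one file locally (requires network; "<this-repo-id>" is a placeholder):
# from huggingface_hub import hf_hub_download
# path = hf_hub_download(repo_id="<this-repo-id>", filename=filenames[0])
```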
### What is SketchTune?

SketchTune is a novel method for adapting large language models (LLMs) that reduces memory usage and improves speed during fine-tuning. Instead of adding low-rank adapters like LoRA or DoRA, it compresses the model's weights into compact, trainable "sketches" for downstream adaptation.

**Key benefits:**

* **Combines compression and adaptation** - SketchTune trains directly on compressed representations, removing the need for separate adapters, which saves memory and improves both model performance and speed.
* **Avoids low-rank limits** - Low-rank adapters assume weight updates follow a low-rank structure. SketchTune drops this assumption, using sketching to better capture complex changes in model weights.
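The low-rank limit mentioned above is easy to see numerically: a LoRA-style update `A @ B` can never have rank above the adapter rank `r`, so a generic full-rank weight change is out of reach at small `r`. A small illustrative NumPy check (not SketchTune code; the dimensions are arbitrary example values):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4  # hidden size and adapter rank (illustrative values only)

# LoRA-style update: delta_W = A @ B has rank at most r, by construction.
A = rng.standard_normal((d, r))
B = rng.standard_normal((r, d))
delta_W = A @ B
print(np.linalg.matrix_rank(delta_W))  # at most r = 4

# A generic dense update is typically full rank and cannot be
# represented exactly by any rank-r product.
full_update = rng.standard_normal((d, d))
print(np.linalg.matrix_rank(full_update))
```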
**Performance highlights:**

* Even with base models that are **2.6–3.5× smaller**, SketchTune **outperforms LoRA, DoRA, and S2FT** on commonsense and math reasoning benchmarks.
* On the GSM8K math dataset, SketchTune achieves **14.48% higher accuracy than LoftQ** while training **7.3× fewer parameters**.

For a deep dive into how sketching works, including mathematical details and extensive experimental results, check out our full paper: [https://arxiv.org/abs/2410.06364](https://arxiv.org/abs/2410.06364).

### Citation

If you find this work helpful, please consider citing our paper:
```bibtex
@inproceedings{
zhang2025sketch,
title={Sketch to Adapt: Fine-Tunable Sketches for Efficient {LLM} Adaptation},
author={Tianyi Zhang and Junda Su and Aditya Desai and Oscar Wu and Zhaozhuo Xu and Anshumali Shrivastava},
booktitle={Forty-second International Conference on Machine Learning},
year={2025},
url={https://openreview.net/forum?id=zZXOXhxO6I}
}
```
config.json ADDED
@@ -0,0 +1,3 @@
{
"model_type": "llama"
}
llama-13b-int4-gpr1.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:34b3840acb296dc0a591e8412c64ed9b5645d7e05a16048f8daa14932520cc7c
size 6480376649
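Each `.pkl` entry in this commit is stored via Git LFS, so the repository itself holds only a small pointer file in the three-line format shown above (`version` / `oid` / `size`). A minimal parser for that pointer format, assuming the standard key-value layout:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a git-lfs pointer file (version / oid / size lines) into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # byte count of the real file
    return fields

# Pointer contents for llama-13b-int4-gpr1.pkl, as listed in this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:34b3840acb296dc0a591e8412c64ed9b5645d7e05a16048f8daa14932520cc7c
size 6480376649"""

info = parse_lfs_pointer(pointer)
print(info["size"] / 1e9)  # ~6.48 GB
```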
llama-13b-int4-gpr2.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d446e58990eb0bc3653a948d742343fab05e06f66cbaef8dfe45300f379aa867
size 6616691527
llama-13b-int4-gpr4.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d2d9ca1e6df21127d518bf2f7271df135b2570caafe2e6c1862bde90b72b4fab
size 6889321283
llama-13b-int4-gpr8.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f9ab0f3b3ba4aca87cf71fa0edf1536eb2a1792b795eadb59609ddde043e6eb0
size 7434580771
llama-2-7b-int4-gpr1.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:96b8d6a15a0732ca49a6e8b294087ea375a1adda795883ba4cf25571ed546e05
size 3325176021
llama-2-7b-int4-gpr2.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f6f8149ac88e3f6699ebfb21a68794de391aa213d323ec21deccf5e86e93f3fd
size 3412207907
llama-2-7b-int4-gpr4.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9476e3e7e7957212fe61ac6c2eec5f59465fd0b48fa897eb20c0e748582911eb
size 3586271529
llama-2-7b-int4-gpr8.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e1e818f85e610849d7cb5b1ecc6369dec3014d62b1b407f7d8ec2c9d2dcb41e0
size 3934398841
llama-3-8b-int4-gpr1.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9f278c8e71b89ac3427268286764db6b48999cb419c04b863b45feff8bca1570
size 3577882741
llama-3-8b-int4-gpr2.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:40e49c69cad3d89d4d623735fd6fb07d764896ba8b84dfd83daeaecca6a9739a
size 3665963167
llama-3-8b-int4-gpr4.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:26f36bb735ba736137610231ca27ec7329defdf6de9de50cc4d5235985949811
size 3842124149
llama-3-8b-int4-gpr8.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d040af24cbf1fbbf4da1a28d5f1ae0b73ea205976f87e3a7e8dd77e6902e4e6b
size 4194445635
llama-7b-int4-gpr1.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8401d82ff58c5bf9d07739149b8db694ff2d38b6015555ad4e20135e93e74ce3
size 3325176045
llama-7b-int4-gpr2.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:546a0d815a8c888df74c0a3c4e32729d28873f209ae80159203afe893416f463
size 3412207905
llama-7b-int4-gpr4.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:471ed1f4107558a7e77812d3c5dba2352466a51ad8399791b3c4b5bae02ccfb5
size 3586271483
llama-7b-int4-gpr8.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c8cc80c1bb204e8332c3eb2fac58914335233096118f27cc8469b165ce19e28c
size 3934398831