Upload folder using huggingface_hub
- README.md +51 -0
- config.json +3 -0
- llama-13b-int4-gpr1.pkl +3 -0
- llama-13b-int4-gpr2.pkl +3 -0
- llama-13b-int4-gpr4.pkl +3 -0
- llama-13b-int4-gpr8.pkl +3 -0
- llama-2-7b-int4-gpr1.pkl +3 -0
- llama-2-7b-int4-gpr2.pkl +3 -0
- llama-2-7b-int4-gpr4.pkl +3 -0
- llama-2-7b-int4-gpr8.pkl +3 -0
- llama-3-8b-int4-gpr1.pkl +3 -0
- llama-3-8b-int4-gpr2.pkl +3 -0
- llama-3-8b-int4-gpr4.pkl +3 -0
- llama-3-8b-int4-gpr8.pkl +3 -0
- llama-7b-int4-gpr1.pkl +3 -0
- llama-7b-int4-gpr2.pkl +3 -0
- llama-7b-int4-gpr4.pkl +3 -0
- llama-7b-int4-gpr8.pkl +3 -0
README.md
ADDED
@@ -0,0 +1,51 @@
---
tags:
- sketchtune
- sketch to adapt
library_name: transformers
---

# Base Models for Fine-Tuning in *Sketch to Adapt: Fine-Tunable Sketches for Efficient LLM Adaptation* (ICML 2025)

This repository hosts the compressed base models used in the fine-tuning experiments of our ICML 2025 paper, **Sketch to Adapt: Fine-Tunable Sketches for Efficient LLM Adaptation**. The available models and formats are listed below.

| Model      | Bits | GPR (Groups Per Row) |
|------------|------|----------------------|
| Llama-3-8B | INT4 | 1 / 2 / 4 / 8        |
| Llama-2-7B | INT4 | 1 / 2 / 4 / 8        |
| Llama-7B   | INT4 | 1 / 2 / 4 / 8        |
| Llama-13B  | INT4 | 1 / 2 / 4 / 8        |

For full details on how to reproduce the experiments, please refer to our GitHub repository:

👉 [https://github.com/LeanModels/SketchTune](https://github.com/LeanModels/SketchTune)

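As a quick illustration of how one of these checkpoints can be fetched (this is a sketch, not part of the official instructions): the `repo_id` below is a hypothetical placeholder for this repository's Hub ID, and the snippet assumes each `.pkl` is an ordinary pickle archive consumed by the SketchTune code.

```python
import pickle

from huggingface_hub import hf_hub_download

# Hypothetical repo_id: substitute this repository's actual Hub ID.
path = hf_hub_download(
    repo_id="LeanModels/SketchTune-base-models",
    filename="llama-3-8b-int4-gpr4.pkl",
)

# Assumption: each .pkl is a plain pickle archive of sketched weights;
# the exact object layout is defined by the SketchTune codebase above.
with open(path, "rb") as f:
    sketched_weights = pickle.load(f)
```
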
### What is SketchTune?

SketchTune is a method for adapting large language models (LLMs) that reduces memory usage and improves speed during fine-tuning. Instead of adding low-rank adapters such as LoRA or DoRA, it compresses the model's weights into compact, trainable "sketches" and adapts those directly for downstream tasks.

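To make the sketching idea concrete, here is a minimal conceptual sketch in PyTorch, under our own assumptions: frozen INT4 codes index into small trainable codebooks, so fine-tuning updates only the codebooks. The `SketchedLinear` name and the per-(row, group) codebook layout are illustrative, not the paper's actual parameterization.

```python
import torch
import torch.nn as nn


class SketchedLinear(nn.Module):
    """Conceptual analogy only, not the SketchTune implementation.

    Each weight row is split into `gpr` groups; every weight stores a
    frozen 4-bit code into a small trainable codebook, and fine-tuning
    updates the codebooks instead of the full weight matrix.
    """

    def __init__(self, in_features: int, out_features: int, gpr: int = 4):
        super().__init__()
        assert in_features % gpr == 0
        group_size = in_features // gpr
        # Frozen INT4 codes, one per weight (stored as int64 for gather()).
        self.register_buffer(
            "codes", torch.randint(0, 16, (out_features, gpr, group_size))
        )
        # Trainable codebooks: 16 centroids per (row, group).
        self.codebooks = nn.Parameter(torch.randn(out_features, gpr, 16) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Rebuild the weight matrix by looking codes up in their codebooks.
        w = torch.gather(self.codebooks, 2, self.codes)  # (out, gpr, group_size)
        return x @ w.reshape(w.shape[0], -1).t()


layer = SketchedLinear(4096, 4096, gpr=4)
y = layer(torch.randn(2, 4096))
print(y.shape)  # torch.Size([2, 4096])
```

Under this toy layout, a larger GPR means more groups and codebooks per weight row, which is consistent with the GPR-8 checkpoints in this repository being larger than the GPR-1 ones.
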
**Key benefits:**

* **Combines compression and adaptation** - SketchTune trains directly on the compressed representation, removing the need for separate adapters. This saves memory and improves both model quality and speed.
* **Avoids low-rank limits** - Low-rank adapters assume weight updates follow a low-rank structure. SketchTune drops this assumption and uses sketching to capture more complex changes in the model weights.

**Performance highlights:**

* Even with base models that are **2.6–3.5× smaller**, SketchTune **outperforms LoRA, DoRA, and S2FT** on commonsense and math reasoning benchmarks.
* On the GSM8K math dataset, SketchTune achieves **14.48% higher accuracy than LoftQ** while training **7.3× fewer parameters**.

For a deep dive into how sketching works, including the mathematical details and extensive experimental results, see our full paper: [https://arxiv.org/abs/2410.06364](https://arxiv.org/abs/2410.06364).

### Citation

If you find this work helpful, please consider citing our paper:

```bibtex
@inproceedings{
  zhang2025sketch,
  title={Sketch to Adapt: Fine-Tunable Sketches for Efficient {LLM} Adaptation},
  author={Tianyi Zhang and Junda Su and Aditya Desai and Oscar Wu and Zhaozhuo Xu and Anshumali Shrivastava},
  booktitle={Forty-second International Conference on Machine Learning},
  year={2025},
  url={https://openreview.net/forum?id=zZXOXhxO6I}
}
```
config.json
ADDED
@@ -0,0 +1,3 @@
{
  "model_type": "llama"
}
llama-13b-int4-gpr1.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:34b3840acb296dc0a591e8412c64ed9b5645d7e05a16048f8daa14932520cc7c
size 6480376649
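The `.pkl` entries in this commit are Git LFS pointers: per the LFS spec linked in the `version` line, `oid sha256:` is the SHA-256 digest of the real file content and `size` is its byte count. A small sanity check, assuming the checkpoint was downloaded to the working directory, can verify a file against its pointer:

```python
import hashlib
import os

# Assumed local path of the downloaded checkpoint.
path = "llama-13b-int4-gpr1.pkl"

# Stream the file through SHA-256 in 1 MiB chunks.
h = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)

# Values copied from the LFS pointer for llama-13b-int4-gpr1.pkl above.
assert os.path.getsize(path) == 6480376649
assert h.hexdigest() == "34b3840acb296dc0a591e8412c64ed9b5645d7e05a16048f8daa14932520cc7c"
print("checksum and size match the LFS pointer")
```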
llama-13b-int4-gpr2.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d446e58990eb0bc3653a948d742343fab05e06f66cbaef8dfe45300f379aa867
size 6616691527
llama-13b-int4-gpr4.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d2d9ca1e6df21127d518bf2f7271df135b2570caafe2e6c1862bde90b72b4fab
size 6889321283
llama-13b-int4-gpr8.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f9ab0f3b3ba4aca87cf71fa0edf1536eb2a1792b795eadb59609ddde043e6eb0
size 7434580771
llama-2-7b-int4-gpr1.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:96b8d6a15a0732ca49a6e8b294087ea375a1adda795883ba4cf25571ed546e05
size 3325176021
llama-2-7b-int4-gpr2.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f6f8149ac88e3f6699ebfb21a68794de391aa213d323ec21deccf5e86e93f3fd
size 3412207907
llama-2-7b-int4-gpr4.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9476e3e7e7957212fe61ac6c2eec5f59465fd0b48fa897eb20c0e748582911eb
size 3586271529
llama-2-7b-int4-gpr8.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e1e818f85e610849d7cb5b1ecc6369dec3014d62b1b407f7d8ec2c9d2dcb41e0
size 3934398841
llama-3-8b-int4-gpr1.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9f278c8e71b89ac3427268286764db6b48999cb419c04b863b45feff8bca1570
size 3577882741
llama-3-8b-int4-gpr2.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:40e49c69cad3d89d4d623735fd6fb07d764896ba8b84dfd83daeaecca6a9739a
size 3665963167
llama-3-8b-int4-gpr4.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:26f36bb735ba736137610231ca27ec7329defdf6de9de50cc4d5235985949811
size 3842124149
llama-3-8b-int4-gpr8.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d040af24cbf1fbbf4da1a28d5f1ae0b73ea205976f87e3a7e8dd77e6902e4e6b
size 4194445635
llama-7b-int4-gpr1.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8401d82ff58c5bf9d07739149b8db694ff2d38b6015555ad4e20135e93e74ce3
size 3325176045
llama-7b-int4-gpr2.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:546a0d815a8c888df74c0a3c4e32729d28873f209ae80159203afe893416f463
size 3412207905
llama-7b-int4-gpr4.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:471ed1f4107558a7e77812d3c5dba2352466a51ad8399791b3c4b5bae02ccfb5
size 3586271483
llama-7b-int4-gpr8.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c8cc80c1bb204e8332c3eb2fac58914335233096118f27cc8469b165ce19e28c
size 3934398831