How to get these new quantized versions of the model in LM Studio?

#24
by furyzhenxi - opened

How to get these new quantized versions of the model in LM Studio?

MLX Community org

How to get these new quantized versions of the model in LM Studio?

Yes, this is exactly the thing I'm trying to figure out by making these new repos. :)

MLX Community org
edited 22 days ago

To close the loop here, it looks like the LM Studio folks confirmed that they do not support multi-quant repos for MLX at this time.

Here's one example of how you could download a quant directly to where your LM Studio models are saved. LM Studio will automatically recognize new models placed in this directory. NB: this assumes that you (1) are on a POSIX system, and (2) kept the default location that model files are loaded from.
(Please install huggingface-cli first.)

# On macOS, use Homebrew
brew install huggingface-cli

# Using pip
pip install 'huggingface-hub[cli]'

Caution with limited disk space:
The following example will download the quant files to your Hugging Face cache (default: $HOME/.cache/huggingface/), then duplicate the files into your LM Studio models directory.
If you need to save space, clean your cache afterwards and delete any files you no longer need.

WARNING: DO NOT manually add or remove files from your Hugging Face cache. It is very delicate, and I can almost guarantee you will corrupt it if you do so, requiring a fresh start.
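If you want to see what is taking up space without touching the cache by hand, here is a minimal sketch using `scan_cache_dir` from `huggingface_hub`. The helper names `largest_first` and `report_cache` are mine, purely for illustration, and this assumes the cache directory already exists (otherwise the scan raises an error).

```python
def largest_first(repo_sizes):
    """Sort (repo_id, size_in_bytes) pairs from largest to smallest."""
    return sorted(repo_sizes, key=lambda pair: pair[1], reverse=True)


def report_cache():
    """Print every cached repo with its size on disk, biggest first."""
    # Imported here so the sorting helper above works without the package.
    from huggingface_hub import scan_cache_dir

    cache = scan_cache_dir()  # read-only scan of the default cache location
    pairs = [(repo.repo_id, repo.size_on_disk) for repo in cache.repos]
    for repo_id, size in largest_first(pairs):
        print(f"{size / 1e9:8.2f} GB  {repo_id}")
    print(f"Total: {cache.size_on_disk_str}")
```

Calling `report_cache()` only reads the cache. For the actual cleanup, `huggingface-cli delete-cache` gives an interactive picker, which is a lot safer than deleting folders yourself.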

# Set these variables so you can change the target without editing the command itself
quant="4bit"        # options include: 4bit, 8bit, mixed_2_6, mixed_3_6, etc.
repoId="mlx-community/DeepSeek-R1-Distill-Qwen-32B-MLX"       # as far as I know, this is the only repo set up like this. I have held off on uploading more unless it would be helpful?

# Workhorse of this 'script'
# The repo ID goes first so it cannot be parsed as a second --include pattern
huggingface-cli download "${repoId}" --include "*${quant}/" --local-dir "${HOME}/.lmstudio/models/mlx-community/"
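To double-check that the files landed where LM Studio looks, something like this rough sketch could help. It assumes the default `~/.lmstudio/models` location and a `publisher/model` directory layout; `list_local_models` is a hypothetical helper name, not part of LM Studio or any library.

```python
from pathlib import Path


def list_local_models(models_root=None):
    """Return 'publisher/model' for each model directory under models_root."""
    # Assumption: LM Studio's default models directory is ~/.lmstudio/models
    root = Path(models_root) if models_root else Path.home() / ".lmstudio" / "models"
    if not root.is_dir():
        return []
    # Hidden directories (e.g. download metadata such as .cache) are skipped.
    return sorted(
        f"{publisher.name}/{model.name}"
        for publisher in root.iterdir()
        if publisher.is_dir() and not publisher.name.startswith(".")
        for model in publisher.iterdir()
        if model.is_dir() and not model.name.startswith(".")
    )


if __name__ == "__main__":
    for name in list_local_models():
        print(name)
```

If the quant you downloaded shows up in the output, LM Studio should pick it up the next time it scans its models directory.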

To see a list of what's available, you can check out the directories in the repo; or, because why not, let's do it programmatically. Unfortunately the Hugging Face CLI is insufficient for this, so we must use Python:

# First install the package using: pip install -U 'huggingface-hub[cli]'
from huggingface_hub import HfApi

repo_id = "mlx-community/DeepSeek-R1-Distill-Qwen-32B-MLX"

api = HfApi()

# Each quant lives in its own top-level directory, so collect the first path
# component of every file that sits inside a directory. Root-level files
# (README.md, LICENSE, .gitattributes, ...) contain no "/" and are skipped.
quant_dirs = sorted({
    path.split("/", 1)[0]
    for path in api.list_repo_files(repo_id=repo_id)
    if "/" in path and not path.startswith(".")
})

# Pair each directory with its quant label (the part after the last "-")
quant_opts = [(quant_dir.split("-")[-1], quant_dir) for quant_dir in quant_dirs]

def display_quants(quant_opts):
    print(f"Quants available for {repo_id}.\nUse the format '*{{quant}}/' as the include glob.")
    print("\nquant: repo directory")
    for quant, quant_dir in quant_opts:
        print(f"{quant}: {quant_dir}")

display_quants(quant_opts)
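Since we're in Python anyway, the download step itself can also be done from there. This is only a sketch of one way to do it with `snapshot_download` and its `allow_patterns` / `local_dir` parameters; the helper names `build_download_kwargs` and `download_quant` are made up for this example, and the default path again assumes LM Studio's default models directory.

```python
from pathlib import Path


def build_download_kwargs(repo_id, quant, models_root=None):
    """Build snapshot_download arguments for a single quant directory."""
    # Assumption: LM Studio's default models directory is ~/.lmstudio/models
    root = Path(models_root) if models_root else Path.home() / ".lmstudio" / "models"
    publisher = repo_id.split("/", 1)[0]  # e.g. "mlx-community"
    return {
        "repo_id": repo_id,
        # Only fetch files inside directories whose name ends with the quant label
        "allow_patterns": [f"*{quant}/*"],
        "local_dir": str(root / publisher),
    }


def download_quant(repo_id, quant, models_root=None):
    """Download one quant into the LM Studio models directory."""
    # Imported here so the helper above can be used without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(**build_download_kwargs(repo_id, quant, models_root))
```

For example, `download_quant("mlx-community/DeepSeek-R1-Distill-Qwen-32B-MLX", "4bit")` should be roughly equivalent to the CLI approach, with the quant label taken from the listing printed by the script.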