yanghaojin (Haojin Yang)

published a model 16 days ago

GreenBitAI/DeepSeek-R1-Distill-Llama-70B-layer-mix-bpw-4.0-mlx

Updated 17 days ago • 66

published a model 22 days ago

GreenBitAI/DeepSeek-R1-Distill-Qwen-32B-layer-mix-bpw-4.0-mlx

Updated 23 days ago • 24

upvoted an article 9 months ago

Article

GPU Poor Savior: Revolutionizing Low-Bit Open Source LLMs and Cost-Effective Edge Computing

May 25, 2024 • 10

upvoted 7 collections 9 months ago

reacted to their post with 🔥❤️🧠🚀 10 months ago

Post

900

Dear community,

Please check out our recent blog post, "GPU Poor Savior: Revolutionizing Low-Bit Open Source LLMs and Cost-Effective Edge Computing". It presents a cheaper and more efficient supervised fine-tuning (SFT) scheme for quantized LLMs.

https://huggingface.co/blog/NicoNico/green-bit-llm

posted an update 10 months ago

reacted to their post with 🚀 10 months ago

Post

2248

Full-parameter fine-tuning of the LLaMA-3 8B model on a single RTX 3090 GPU with 24 GB of graphics memory?

Please check out our tool for fine-tuning, inference, and evaluation of GreenBitAI's low-bit LLMs:
https://github.com/GreenBitAI/green-bit-llm
Model Zoo:
https://huggingface.co/GreenBitAI

3 replies

reacted to their post with 🔥 11 months ago

replied to their post 11 months ago

Command for reproducing this run 😉 :
CUDA_VISIBLE_DEVICES=0 WANDB_DISABLED=true python -m sft.finetune --model GreenBitAI/Llama-3-8B-layer-mix-bpw-2.2 --tune-qweight-only --galore --galore-rank 64 --optimizer adamw8bit --batch-size 1 --seqlen 96
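
For a rough sense of why such low-bit checkpoints fit on a 24 GB card, the sketch below estimates the storage taken by the quantized weights alone. It assumes that "bpw" in the model names stands for bits per weight and that the parameter counts are the nominal ones from the model names (8B, 32B, 70B). The function name is hypothetical, and the estimate ignores quantization metadata, activations, gradients, and optimizer state, so actual memory use during fine-tuning will be higher.

```python
def approx_weight_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Nominal parameter counts read off the model names (an assumption, not measured values).
models = [
    ("Llama-3-8B-layer-mix-bpw-2.2", 8e9, 2.2),
    ("DeepSeek-R1-Distill-Qwen-32B-layer-mix-bpw-4.0-mlx", 32e9, 4.0),
    ("DeepSeek-R1-Distill-Llama-70B-layer-mix-bpw-4.0-mlx", 70e9, 4.0),
]

for name, params, bpw in models:
    print(f"{name}: about {approx_weight_gb(params, bpw):.1f} GB of weights")
```

Under these assumptions the 2.2 bpw Llama-3 8B weights come to roughly 2.2 GB, which leaves most of the 24 GB for everything else the training run needs.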

posted an update 11 months ago

reacted to their post with 🔥 11 months ago

Post

1348

Dear all,

We are happy to share that we have just open-sourced over 200 low-bit LLMs. For the MLX community, we have prepared 2-4 bit versions of mainstream LLMs. You can visit the following collection to access them: GreenBitAI/greenbitai-mlx-llm-6614eb6ceb8da657c2b4ed58.

These low-bit models can be conveniently used through our open-source tool at https://github.com/GreenBitAI/gbx-lm.

Compared with other open-source quantization algorithms, these models retain accuracy better. Some model evaluation results are available here:
https://github.com/GreenBitAI/green-bit-llm/blob/main/green_bit_llm/evaluation/README.md

You can also evaluate the models yourself using the evaluation script we provided.

1 reply

GreenBitAI Phi-3

GreenBitAI Llama-2

GreenBitAI Mistral

GreenBitAI 01-Yi

GreenBitAI Qwen1.5

GreenBitAI Llama-3

GreenBitAI MLX LLM