---
library_name: transformers
license: gemma
base_model:
- google/gemma-3-4b-it
---

## Overview

[google/gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it) quantized to 4-bit with BitsAndBytes (0.44.1).

The code used for quantization is as follows:

~~~~python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "google/gemma-3-4b-it"
repo_id = "indiebot-community/gemma-3-4b-it-bnb-4bit"

# 4-bit NF4 quantization with bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto"
)

# Push the quantized model and tokenizer to the Hub
tokenizer.push_to_hub(repo_id)
model.push_to_hub(repo_id)
~~~~
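
Since the quantization config is serialized with the checkpoint, the 4-bit model can be loaded back from the Hub without re-specifying a `BitsAndBytesConfig`. Below is a minimal inference sketch (not part of the original card); the prompt and generation settings are illustrative assumptions, and a CUDA-capable GPU is assumed:

~~~~python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "indiebot-community/gemma-3-4b-it-bnb-4bit"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# The 4-bit quantization config is stored in the repo's config,
# so no BitsAndBytesConfig is needed at load time.
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

# Illustrative prompt; any chat-formatted input works the same way.
messages = [{"role": "user", "content": "Explain 4-bit quantization in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
~~~~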