Siddharth63/Qwen3-8B-Base-4bits-AutoRound-sym

4-bit precision

Siddharth63 committed on May 2

Commit c46e264 · verified · 1 Parent(s): 44b29f0

Update README.md

Files changed (1)

README.md +20 -3

@@ -1,3 +1,20 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+---
+```
+!pip install --upgrade auto-round transformers
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+from auto_round import AutoRoundConfig  ## must import for auto-round format
+quantized_model_path = "Siddharth63/Qwen3-8B-Base-4bits-AutoRound-sym"
+quantization_config = AutoRoundConfig(backend="auto")
+model = AutoModelForCausalLM.from_pretrained(quantized_model_path, device_map="auto",
+                                             torch_dtype=torch.float16,
+                                             quantization_config=quantization_config)
+tokenizer = AutoTokenizer.from_pretrained(quantized_model_path)
+text = "Atherosclerosis"
+inputs = tokenizer(text, return_tensors="pt").to(model.device)
+print(tokenizer.decode(model.generate(**inputs, max_new_tokens=50)[0]))
+```
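
The `4bits` and `sym` in the model name suggest 4-bit symmetric weight quantization. As a rough illustration only, the sketch below applies plain round-to-nearest symmetric quantization to a small list of weights. It is not AutoRound's algorithm, which tunes the rounding instead of always rounding to nearest. The function names, the [-7, 7] code range, and the sample weights are assumptions made for this example.

```python
def quantize_sym_int4(weights):
    """Round-to-nearest symmetric 4-bit quantization of one group of weights.

    Illustrative only: a single scale, no zero point, codes clipped to [-7, 7].
    """
    qmax = 7  # assumed symmetric signed 4-bit range
    max_abs = max(abs(w) for w in weights)
    # Avoid division by zero when every weight in the group is 0.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


# Hypothetical sample weights, chosen only for demonstration.
weights = [0.7, -0.33, 0.12, 0.0]
codes, scale = quantize_sym_int4(weights)
restored = dequantize(codes, scale)

for w, c, r in zip(weights, codes, restored):
    print(f"weight={w:+.2f}  code={c:+d}  restored={r:+.3f}")
```

Because the scheme is symmetric, zero maps exactly to code 0, and each restored weight differs from the original by at most half a quantization step (`scale / 2`).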