sk16er committed · verified
Commit 58425f2 · 1 Parent(s): e2212e9

Update README.md

Files changed (1): README.md (+28 -1)
README.md CHANGED
@@ -1,3 +1,30 @@
+ # CodeVero 7B - 4-bit Quantized
+
+ This is a 4-bit quantized version of CodeLlama 7B, prepared using `bitsandbytes` and Hugging Face Transformers.
+ It is optimized for inference and fine-tuning in low-resource environments.
+
+ ## Model Details
+ - Base: CodeLlama-7B
+ - Quantization: bitsandbytes 4-bit (bnb_4bit, NF4)
+ - Format: Hugging Face (`.safetensors`)
+ - Usage: Transformers
+
+ ## Example Usage
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model = AutoModelForCausalLM.from_pretrained("your-username/codevero-7b-4bit", device_map="auto")
+ tokenizer = AutoTokenizer.from_pretrained("your-username/codevero-7b-4bit")
+
+ prompt = "Write a Python function to calculate factorial."
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+ outputs = model.generate(**inputs, max_new_tokens=100)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
+
  ---
  license: mit
  language:
@@ -6,4 +33,4 @@ base_model:
  - codellama/CodeLlama-7b-hf
  tags:
  - co
- ---
+ ---
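
The new model card says the checkpoint was prepared with `bitsandbytes` NF4 quantization, but the commit does not show how. Below is a minimal sketch of one plausible recipe, assuming load-time quantization of the base `codellama/CodeLlama-7b-hf` model with `BitsAndBytesConfig`; the compute dtype and the local save path are illustrative assumptions, not taken from the commit.

```python
# Sketch (not from the commit): quantize CodeLlama-7B to 4-bit NF4 at load time
# and save the result. Requires `transformers`, `bitsandbytes`, and a CUDA GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize linear layers to 4-bit on load
    bnb_4bit_quant_type="nf4",              # NF4, as listed under Model Details
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype is an assumption
)

base_id = "codellama/CodeLlama-7b-hf"  # base model named in the card's metadata
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Recent transformers releases can serialize bnb 4-bit weights to .safetensors;
# a checkpoint saved this way reloads in 4-bit without a quantization_config.
model.save_pretrained("codevero-7b-4bit")      # hypothetical local path
tokenizer.save_pretrained("codevero-7b-4bit")
```

The card also mentions fine-tuning in low-resource environments. On a 4-bit base that is typically done QLoRA-style with the `peft` library; a sketch with illustrative adapter hyperparameters follows.

```python
# Sketch (not from the commit): attach LoRA adapters to the 4-bit model so only
# a small set of parameters is trained. Assumes `peft` is installed and `model`
# is the 4-bit model loaded above.
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model = prepare_model_for_kbit_training(model)  # enable gradients around 4-bit layers
lora_config = LoraConfig(
    r=16,                                   # adapter rank (illustrative)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # common choice for LLaMA-family attention
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```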