MMoshtaghi committed commit 0e0e438 (verified · Parent: e4775a7)

Upload README.md with huggingface_hub

Files changed (1): README.md (+55 −3)
README.md CHANGED
@@ -4,8 +4,9 @@ tags:
 - text-generation-inference
 - transformers
 - unsloth
-- llava
+- qwen2_vl
 - trl
+- qlora
 license: apache-2.0
 language:
 - en
@@ -16,7 +17,58 @@ language:
 - **Developed by:** MMoshtaghi
 - **License:** apache-2.0
 - **Finetuned from model:** unsloth/pixtral-12b-2409-unsloth-bnb-4bit
-
-This llava model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
-
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+- **Finetuned on dataset:** [unsloth/llava-instruct-mix-vsft-mini](https://huggingface.co/datasets/unsloth/llava-instruct-mix-vsft-mini)
+- **PEFT method:** [Quantized LoRA (QLoRA)](https://huggingface.co/papers/2305.14314)
+
+## Quick start
+
+```python
+from datasets import load_dataset
+from unsloth import FastVisionModel
+
+model, tokenizer = FastVisionModel.from_pretrained(
+    model_name = "MMoshtaghi/Pixtral-12B-2409-LoRAAdpt-General",
+    load_in_4bit = True,
+)
+FastVisionModel.for_inference(model)  # Enable inference mode
+
+dataset = load_dataset("unsloth/llava-instruct-mix-vsft-mini", split = "train")
+image = dataset[2]["images"][0]
+instruction = "Is there something interesting about this image?"
+
+messages = [
+    {"role": "user", "content": [
+        {"type": "image"},
+        {"type": "text", "text": instruction},
+    ]},
+]
+input_text = tokenizer.apply_chat_template(messages, add_generation_prompt = True)
+inputs = tokenizer(
+    image,
+    input_text,
+    add_special_tokens = False,
+    return_tensors = "pt",
+).to("cuda")
+
+from transformers import TextStreamer
+text_streamer = TextStreamer(tokenizer, skip_prompt = True)
+_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 64,
+                   use_cache = True, temperature = 1.5, min_p = 0.1)
+```
+
+### Framework versions
+
+- TRL: 0.13.0
+- Transformers: 4.47.1
+- PyTorch: 2.5.1+cu121
+- Datasets: 3.2.0
+- Tokenizers: 0.21.0
+- Unsloth: 2025.1.5
+
+## Training procedure
+(Log-in required!)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/open_ai/huggingface/runs/8juqyo5h)
+
+## Citations
+This VLM was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
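The card names Quantized LoRA (QLoRA) as the PEFT method. For readers unfamiliar with it: the pretrained weights stay frozen (QLoRA additionally stores them in 4-bit NF4), and only a small low-rank update `B @ A`, scaled by `alpha / r`, is trained on top. A minimal NumPy sketch of that update (an illustrative toy with made-up sizes, not this model's actual training code):

```python
import numpy as np

# Toy illustration of the LoRA update behind QLoRA (assumption: sizes and
# values here are invented for the demo). The frozen base weight W gains a
# trainable low-rank correction (alpha / r) * B @ A; QLoRA's 4-bit NF4
# storage of W is omitted.
rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 16               # hidden size, LoRA rank, LoRA alpha

W = rng.standard_normal((d, d))      # frozen pretrained weight
A = rng.standard_normal((r, d))      # trainable, random init
B = np.zeros((d, r))                 # trainable, zero init

# With B = 0 the adapter starts as an exact no-op on the forward pass.
W_adapted = W + (alpha / r) * (B @ A)
assert np.allclose(W_adapted, W)

# Once training nudges B away from zero, the weight change has rank <= r,
# so only 2 * d * r = 32 parameters are trained instead of d * d = 64.
B = 0.01 * rng.standard_normal((d, r))
delta = (alpha / r) * (B @ A)
```

Because the update is rank-`r`, the trainable-parameter count grows linearly in `d` rather than quadratically, which is what makes fine-tuning a 12B model feasible on a single GPU.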