ehartford committed on
Commit
e8dc7be
1 Parent(s): 67878db

Create README.md

  1. README.md +115 -0
README.md ADDED
@@ -0,0 +1,115 @@
+ ---
+ license: other
+ license_name: tongyi-qianwen
+ base_model: Qwen/Qwen2-72B
+ tags:
+ - generated_from_trainer
+ - axolotl
+ datasets:
+ - cognitivecomputations/Dolphin-2.9
+ - teknium/OpenHermes-2.5
+ - m-a-p/CodeFeedback-Filtered-Instruction
+ - cognitivecomputations/dolphin-coder
+ - cognitivecomputations/samantha-data
+ - microsoft/orca-math-word-problems-200k
+ - Locutusque/function-calling-chatml
+ - internlm/Agent-FLAN
+ ---
+
+ # DolphinVision 7b 🐬
+
+ Curated and trained by Quan Nguyen (qnguyen3/stablequan), Eric Hartford, and Cognitive Computations
+
+ [![Discord](https://img.shields.io/discord/1156064224225808488?logo=Discord&logoColor=%23ffffff&label=Discord&link=https%3A%2F%2Fdiscord.gg%2FtCMkMDDHwm)](https://discord.gg/h3K4XGj2RH)
+ Discord: https://discord.gg/h3K4XGj2RH
+
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/DBGu4dJ95RHHN3yOEuXuP.png" width="600" />
+
+ Our appreciation for the sponsors of DolphinVision:
+ - [TensorWave](https://tensorwave.com/) - provided the 8x MI300X node used for training, evaluation, and inference
+
+ DolphinVision is a multimodal model. It is uncensored and can reason about and comment on images that other popular models would refuse to discuss.
+
+ ```python
+ import torch
+ import transformers
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from PIL import Image
+ import warnings
+
+ # disable some warnings
+ transformers.logging.set_verbosity_error()
+ transformers.logging.disable_progress_bar()
+ warnings.filterwarnings('ignore')
+
+ # set device
+ torch.set_default_device('cuda')  # or 'cpu'
+
+ model_name = 'cognitivecomputations/dolphin-vision-7b'
+
+ # create model (trust_remote_code=True runs the model code shipped in the repo)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype=torch.float16,
+     device_map='auto',
+     trust_remote_code=True)
+ tokenizer = AutoTokenizer.from_pretrained(
+     model_name,
+     trust_remote_code=True)
+
+ # text prompt
+ prompt = 'Describe this image in detail'
+
+ messages = [
+     {"role": "user", "content": f'<image>\n{prompt}'}
+ ]
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+
+ print(text)
+
+ # tokenize the text on either side of <image>, then join the two halves
+ # with -200, the placeholder id that marks where the image goes
+ text_chunks = [tokenizer(chunk).input_ids for chunk in text.split('<image>')]
+ input_ids = torch.tensor(text_chunks[0] + [-200] + text_chunks[1], dtype=torch.long).unsqueeze(0)
+
+ # image, sample images can be found in images folder
+ image = Image.open('/path/to/image.png')
+ image_tensor = model.process_images([image], model.config).to(dtype=model.dtype)
+
+ # generate
+ output_ids = model.generate(
+     input_ids,
+     images=image_tensor,
+     max_new_tokens=2048,
+     use_cache=True)[0]
+
+ # decode only the newly generated tokens, skipping the prompt
+ print(tokenizer.decode(output_ids[input_ids.shape[1]:], skip_special_tokens=True).strip())
+ ```
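The usage snippet above splits the prompt on `<image>` and joins the two tokenized halves with `-200`. As a minimal sketch of that same step, the helper below does the splice for any number of `<image>` markers. The names `splice_image_tokens` and `toy_encode` are hypothetical and not part of the model's code, and the character-level encoder is only a stand-in so the example runs without downloading the model.

```python
from typing import Callable, List

IMAGE_PLACEHOLDER = "<image>"  # marker used in the prompt text of the usage snippet
IMAGE_TOKEN_INDEX = -200       # placeholder id used in the usage snippet


def splice_image_tokens(
    text: str,
    encode: Callable[[str], List[int]],
    image_token_index: int = IMAGE_TOKEN_INDEX,
) -> List[int]:
    """Tokenize the text around each <image> marker and join the pieces
    with the image placeholder id."""
    chunks = [encode(chunk) for chunk in text.split(IMAGE_PLACEHOLDER)]
    input_ids: List[int] = []
    for i, chunk in enumerate(chunks):
        if i > 0:
            # one placeholder between every pair of neighbouring chunks
            input_ids.append(image_token_index)
        input_ids.extend(chunk)
    return input_ids


def toy_encode(chunk: str) -> List[int]:
    """Hypothetical stand-in for a tokenizer: one id per character."""
    return [ord(ch) for ch in chunk]


ids = splice_image_tokens("a<image>\nb", toy_encode)
print(ids)  # [97, -200, 10, 98]
```

With the real tokenizer, `encode` could be `lambda s: tokenizer(s).input_ids`, and the result would be wrapped as `torch.tensor(ids, dtype=torch.long).unsqueeze(0)` as in the snippet. Whether the model accepts more than one image per prompt is not stated in this README, so treat the multi-marker case as untested.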
+
+ ## Examples
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/630430583926de1f7ec62c6b/TFl5SJ4o3v-hKIqA82_qA.png)
+ ```
+ <|im_start|>user
+ : do OCR on all the text
+ <|im_start|>assistant
+ : The text in the image reads as follows:
+
+ - STOP USING DOCKER
+ - Applications were not meant to be trapped in virtual jars like digital fireflies
+ - Years of software development, yet no one questioned why we were putting programs in imaginary boats
+ - Let's containerize our monolith. I'll just quickly spin up 17 microservices. Statements dreamt up by the utterly deranged
+ - Observe the whale-sized deception they fooled us with all this time
+ - These are real images used to explain Docker
+ - "Yay Matey, I'd like to run my website in a virtual sea"
+ - They have played us for absolute fools
+
+ This text is a satirical commentary on the use of Docker in software development, using humor and exaggeration to question the practice.
+ ```
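The transcript above uses `<|im_start|>` role markers, which suggests a ChatML-style prompt. The sketch below shows roughly what such a prompt looks like as a string. It is an illustration under assumptions: the `<|im_end|>` terminator and the absence of a system turn are not confirmed by this README, and `to_chatml` is a hypothetical helper. The string produced by `tokenizer.apply_chat_template` is the authoritative format for this model.

```python
from typing import Dict, List


def to_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render messages ChatML-style: each turn becomes
    <|im_start|>{role}\\n{content}<|im_end|>\\n (assumed layout)."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # open an assistant turn so the model continues from here
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


chatml_text = to_chatml(
    [{"role": "user", "content": "<image>\ndo OCR on all the text"}]
)
print(chatml_text)
```

Comparing this string with the output of `print(text)` in the usage snippet is a quick way to check whether the tokenizer's template adds anything else, such as a default system prompt.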
+
+ ## Benchmarks
+
+ TBD