Commit 67056f3 by kawchar85 (verified)
1 Parent(s): 9502f74

Update README.md

  1. README.md +230 -3
README.md CHANGED
@@ -1,3 +1,230 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ base_model:
+ - unsloth/SmolLM2-1.7B-Instruct
+ pipeline_tag: text-generation
+ tags:
+ - prompt-engineering
+ - svg-generation
+ - vector-graphics
+ - prompt-enhancement
+ - lora
+ - unsloth
+ language: en
+ ---
+
+ # SmolLM2-1.7B-Instruct-Prompt-Enhancer
+
+ ## Model Description
+
+ SmolLM2-1.7B-Instruct-Prompt-Enhancer is a fine-tuned version of [unsloth/SmolLM2-1.7B-Instruct](https://huggingface.co/unsloth/SmolLM2-1.7B-Instruct) specifically trained for **converting simple image descriptions into SVG-friendly prompts**. This model specializes in transforming basic concepts into detailed, vector-optimized descriptions that emphasize geometric shapes, flat design principles, and SVG-compatible visual elements.
+
+ ## Key Innovation: SVG-Optimized Prompt Engineering
+
+ This model addresses a critical gap in vector graphics generation:
+ - **Input**: Simple, casual image descriptions ("a lighthouse overlooking the ocean")
+ - **Output**: Detailed SVG-friendly prompts with geometric precision and flat design specifications
+ - **Purpose**: Optimize text-to-SVG generation by providing vector-appropriate prompts
+
+ ## Intended Use
+
+ This model transforms simple descriptions into SVG-friendly prompts by:
+ - **Preserving all original elements** while expanding description detail
+ - **Adding geometric precision** for complex shapes and arrangements
+ - **Specifying SVG constraints** (no gradients, no shadows, clean edges)
+ - **Emphasizing flat design** principles for vector compatibility
+ - **Providing spatial arrangements** and compositional guidance
+
+ ## Model Details
+
+ - **Base Model**: unsloth/SmolLM2-1.7B-Instruct
+ - **Model Size**: 1.7B parameters
+ - **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
+ - **Training Framework**: Transformers + TRL + PEFT + Unsloth
+ - **License**: apache-2.0
+
+ ## Training Details
+
+ ### Training Configuration
+ - **Training Method**: Supervised Fine-Tuning (SFT) with LoRA
+ - **LoRA Configuration**:
+   - r: 24
+   - lora_alpha: 48
+   - lora_dropout: 0.05
+   - Target modules: `["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]`
+
+ - **Training Parameters**:
+   - Epochs: 5
+   - Learning Rate: 8e-5
+   - Batch Size: 8 (per device)
+   - Gradient Accumulation Steps: 2
+   - Max Sequence Length: 2048
+   - LR Scheduler: Cosine
+   - NEFTune Noise Alpha: 5 (for improved generalization)
+   - Validation: 10% holdout with early stopping
+
+ ### Enhanced Dataset
+ - **Size**: 13,000 examples of simple→SVG-friendly transformations
+ - **Sources**: Generated using Claude 3.5 Sonnet and Gemini 2.0 Flash
+ - **Quality**: High-quality prompt engineering examples
+ - **Coverage**: Diverse visual concepts, geometric patterns, everyday objects, and complex compositions
+
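As a rough worked example, the figures above can be combined to estimate the scale of a training run. This is a sketch under assumptions that this README does not state: training on a single device, the 10% validation holdout being taken out of the 13,000 examples, and the standard LoRA scaling factor of `lora_alpha / r`. Early stopping may end training before the computed upper bound.

```python
import math

# Values taken from the Training Details section.
lora_r = 24
lora_alpha = 48
per_device_batch = 8
grad_accum_steps = 2
epochs = 5
dataset_size = 13_000
val_fraction = 0.10

# Assumption: one device, so no extra multiplier for data parallelism.
num_devices = 1

# Standard LoRA scales the low-rank update by alpha / r.
lora_scaling = lora_alpha / lora_r

# Number of examples consumed per optimizer step.
effective_batch = per_device_batch * grad_accum_steps * num_devices

# Assumption: the holdout is carved out of the 13,000 examples.
val_examples = round(dataset_size * val_fraction)
train_examples = dataset_size - val_examples

# Approximate upper bound on optimizer steps; early stopping can cut it short.
steps_per_epoch = math.ceil(train_examples / effective_batch)
max_steps = steps_per_epoch * epochs

print(f"LoRA scaling factor: {lora_scaling}")
print(f"Effective batch size: {effective_batch}")
print(f"Train / validation examples: {train_examples} / {val_examples}")
print(f"Steps per epoch (approx.): {steps_per_epoch}")
print(f"Maximum optimizer steps: {max_steps}")
```

Under these assumptions the scaling factor is 2.0, the effective batch size is 16, and a full 5-epoch run takes at most about 3,660 optimizer steps.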
+ ## Usage
+
+ ### Installation
+
+ ```bash
+ # accelerate is needed because the example below loads the model with device_map="auto"
+ pip install transformers torch accelerate
+ ```
+
+ ### Basic Usage
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
+ import torch
+
+ model_path = "kawchar85/SmolLM2-1.7B-Instruct-Prompt-Enhancer"
+
+ # Load model and tokenizer
+ tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_path,
+     torch_dtype=torch.float16,
+     trust_remote_code=True,
+     device_map="auto"
+ )
+
+ # Create pipeline; return_full_text=False keeps only the newly generated text
+ chat_pipe = pipeline(
+     "text-generation",
+     model=model,
+     tokenizer=tokenizer,
+     return_full_text=False,
+ )
+
+ def get_message(simple_prompt):
+     """Build the system + user chat messages for one simple description."""
+     system_msg = """\
+ You are an expert prompt engineer specializing in converting simple image concepts into SVG-friendly prompts.
+ When given a short description, output exactly one enhanced prompt that:
+ 1. PRESERVES ALL DETAILS mentioned in the simple prompt - every element must appear in the enhanced version
+ 2. NEVER omits or changes any objects, colors, or shapes mentioned in the simple prompt
+ 3. Uses geometric precision to represent complex elements
+ 4. Includes terms like "vector illustration," "flat design," "minimalist," "geometric shapes," "solid colors"
+ 5. Specifies "no gradients", "no shadows", "clean edges", "hard edges"
+ 6. Mentions "2D perspective" or "flat style" to avoid 3D rendering
+ 7. Emphasizes "solid fill colors" and "uniform stroke weight"
+ 8. Always includes specific spatial arrangement of elements (positioned, arranged, distributed)
+ 9. Keeps the background simple (solid color, transparent, or minimal)
+ 10. Does NOT just paste the simple prompt with minor additions - truly EXPANDS and DESCRIBES it
+ 11. Is written in 30-70 words as one fluid, descriptive sentence.
+ """
+
+     user_msg = f'Transform this into an SVG-friendly prompt with geometric shapes and flat design: {simple_prompt}'
+     return [
+         {"role": "system", "content": system_msg},
+         {"role": "user", "content": user_msg}
+     ]
+
+ # Generate SVG-friendly prompt
+ description = "a lighthouse overlooking the ocean"
+ messages = get_message(description)
+
+ # Greedy decoding (do_sample=False) gives deterministic output
+ output = chat_pipe(
+     messages,
+     max_new_tokens=128,
+     do_sample=False,
+ )
+
+ print(f"Original: {description}")
+ print(f"Enhanced: {output[0]['generated_text']}")
+ ```
+
+ ## Example Transformations
+
+ ### Example 1: Abstract Patterns
+ **Simple**: "overlapping magenta and cyan trapezoids"
+
+ **Enhanced**: "Flat vector design with minimalist styling featuring magenta and cyan trapezoidal shapes of varying sizes arranged to create overlapping geometric patterns, solid fill colors, no gradients, no textures, clean hard edges, 2D perspective for efficiency, uniform stroke weight, ideal for SVG format."
+
+ ### Example 2: Transport
+ **Simple**: "ambulance driving to hospital"
+
+ **Enhanced**: "Vector depiction of an ambulance defined by a rectangular body, circular wheels, and a red cross symbol displayed on the side as simple intersecting rectangles, driving towards a stylized hospital depicted with a rectangular building and a smaller cross, solid color fills, no gradients or shadows, clean hard edges, minimalist emergency vehicle design viewed in 2D."
+
+ ### Example 3: Household
+ **Simple**: "simple bed with two pillows"
+
+ **Enhanced**: "Flat vector design of a minimalist bed, constructed from a rectangular frame, supporting a rectangular mattress and two square pillows positioned symmetrically at the head, solid fill colors in calming blues and whites, clean edges, no gradients or shadows, 2D top-down view, emphasizing simple geometric forms."
+
+ ### Example 4: Landscape
+ **Simple**: "a lighthouse overlooking the ocean"
+
+ **Enhanced**: "Vector illustration of coastal scene featuring tall cylindrical structure with white light element positioned above flat blue water surface, geometric lighthouse construction with light positioning, solid fill colors with no gradients or shadows, clean hard edges, flat design aesthetic, 2D perspective, oceanic silhouette with beacon indication."
+
+ ### Example 5: Abstract Patterns
+ **Simple**: "khaki triangles and azure crescents"
+
+ **Enhanced**: "Vector illustration of geometric celestial pattern featuring triangular khaki shapes with crescent curve silhouettes, geometric crescents with triangle elements, solid fill colors with no gradients or shadows, clean hard edges, flat design aesthetic, 2D perspective, uniform stroke weight."
+
+ ### Example 6: Fashion
+ **Simple**: "burgundy corduroy pants with patch pockets and silver buttons"
+
+ **Enhanced**: "Vector illustration of minimalist clothing design featuring burgundy trapezoidal leg shapes with textured pattern indication through subtle geometric lines, rectangular patch pockets positioned symmetrically, circular silver button elements arranged vertically, flat design aesthetic with solid fill colors, no gradients or shadows, clean hard edges, 2D perspective for apparel visualization."
+
+ ### Example 7: Landscape
+ **Simple**: "three trees on a hill"
+
+ **Enhanced**: "Vector illustration of minimalist landscape featuring three geometric tree structures with triangular canopies and rectangular trunks positioned on an elevated curved hill shape, solid earth tones and greens, no gradients or shadows, clean hard edges, flat design aesthetic, 2D perspective with simplified natural forms."
+
+ ### Example 8: Technology
+ **Simple**: "a robot with gears and buttons"
+
+ **Enhanced**: "Vector design of mechanical figure composed of rectangular torso, circular head, and cylindrical limbs, featuring geometric gear wheels as circular elements with radiating teeth, square and circular button interfaces distributed across the surface, solid metallic colors, no gradients or shadows, clean hard edges, minimalist robotic aesthetic in 2D perspective."
+
+ ## Key Features
+
+ ### SVG Optimization Principles
+ - **Geometric Precision**: Complex shapes described through basic geometric forms
+ - **Flat Design Emphasis**: Consistent specification of 2D perspective and flat styling
+ - **Technical Constraints**: Always mentions "no gradients," "no shadows," "clean edges"
+ - **Vector Terminology**: Uses "vector illustration," "solid fill colors," "uniform stroke weight"
+ - **Spatial Awareness**: Detailed positioning and arrangement descriptions
+
+ ### Content Preservation
+ - **Element Fidelity**: All original objects, colors, and shapes are preserved
+ - **Detail Expansion**: Simple concepts are elaborated with geometric precision
+ - **Contextual Enhancement**: Spatial relationships and compositions are clarified
+ - **Style Consistency**: Maintains coherent SVG-friendly vocabulary throughout
+
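The principles above lend themselves to a simple check on generated prompts. The sketch below is not part of the model or its training pipeline: `check_enhanced_prompt` is a hypothetical helper, and its regular expressions are one assumed way to detect the constraint phrases listed above, together with the 30-70 word target from the system prompt.

```python
import re


def check_enhanced_prompt(text, min_words=30, max_words=70):
    """Return a list of issues found in an enhanced prompt (empty if none).

    Hypothetical helper: the phrase patterns are illustrative assumptions,
    not rules shipped with the model.
    """
    issues = []

    # Whitespace-separated word count against the 30-70 word target.
    word_count = len(text.split())
    if not min_words <= word_count <= max_words:
        issues.append(f"word count {word_count} outside {min_words}-{max_words}")

    lowered = text.lower()

    # Accept both "no gradients, no shadows" and "no gradients or shadows".
    if not re.search(r"\bno gradients\b", lowered):
        issues.append("missing 'no gradients'")
    if not re.search(r"\bno (gradients or )?shadows\b", lowered):
        issues.append("missing 'no shadows'")

    # Either "clean edges" or "hard edges" counts as an edge constraint.
    if not re.search(r"\b(clean|hard) edges\b", lowered):
        issues.append("missing 'clean edges' / 'hard edges'")

    # Some mention of 2D or flat styling.
    if not re.search(r"\b(2d|flat)\b", lowered):
        issues.append("missing 2D / flat-style wording")

    return issues


example = (
    "Vector illustration of minimalist landscape featuring three geometric "
    "tree structures with triangular canopies and rectangular trunks "
    "positioned on an elevated curved hill shape, solid earth tones and "
    "greens, no gradients or shadows, clean hard edges, flat design "
    "aesthetic, 2D perspective with simplified natural forms."
)
print(check_enhanced_prompt(example))
```

For the Example 7 text used here the helper reports no issues. A check like this could be used to retry or flag a generation, but it only inspects wording; it does not verify that every element of the simple prompt was preserved.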
+ ## Performance
+
+ - **Inference Speed**: ~2-3 seconds per transformation
+ - **Output Length**: Optimized for 30-70 words (concise yet comprehensive)
+ - **Consistency**: Reliable SVG-specific terminology and constraint specification
+ - **Quality**: High-quality prompt engineering with geometric precision
+
+ ## Limitations
+
+ - **Specialized Domain**: Optimized for SVG/vector use cases; may not suit other prompt types
+ - **Length Constraints**: Designed for concise enhancements (30-70 words)
+ - **Style Specificity**: Focused on flat design aesthetic rather than diverse art styles
+ - **Vector Focus**: May over-emphasize geometric precision for organic/natural subjects
+
+ ## Technical Specifications
+
+ - **Architecture**: Transformer-based language model (1.7B parameters)
+ - **Context Length**: 2048 tokens (supports detailed prompt transformations)
+ - **Training**: 10% validation holdout with early stopping, plus NEFTune noise (alpha 5) for improved generalization
+ - **Optimization**: LoRA fine-tuning (r=24, alpha=48) with cosine scheduling
+ - **Inference**: Optimized for short, precise outputs with deterministic (greedy, `do_sample=False`) generation
+
+ ## Citation
+
+ ```bibtex
+ @misc{smollm2-prompt-enhancer-2025,
+   title={SmolLM2-1.7B-Instruct-Prompt-Enhancer: Specialized Model for SVG-Friendly Prompt Generation},
+   author={kawchar85},
+   year={2025},
+   url={https://huggingface.co/kawchar85/SmolLM2-1.7B-Instruct-Prompt-Enhancer}
+ }
+ ```