+---
+license: apache-2.0
+base_model:
+- unsloth/SmolLM2-1.7B-Instruct
+pipeline_tag: text-generation
+tags:
+  - prompt-engineering
+  - svg-generation
+  - vector-graphics
+  - prompt-enhancement
+  - lora
+  - unsloth
+language: en
+---
+# SmolLM2-1.7B-Instruct-Prompt-Enhancer
+## Model Description
+SmolLM2-1.7B-Instruct-Prompt-Enhancer is a LoRA fine-tune of [unsloth/SmolLM2-1.7B-Instruct](https://huggingface.co/unsloth/SmolLM2-1.7B-Instruct) trained to **convert simple image descriptions into SVG-friendly prompts**. It expands basic concepts into detailed, vector-oriented descriptions that emphasize geometric shapes, flat design principles, and SVG-compatible visual elements.
+## Key Innovation: SVG-Optimized Prompt Engineering
+This model targets a gap in text-to-SVG workflows, where casual descriptions rarely spell out the vector-specific detail a generator needs:
+- **Input**: Simple, casual image descriptions ("a lighthouse overlooking the ocean")
+- **Output**: Detailed SVG-friendly prompts with geometric precision and flat design specifications
+- **Purpose**: Optimize text-to-SVG generation by providing vector-appropriate prompts
+## Intended Use
+This model transforms simple descriptions into SVG-friendly prompts by:
+- **Preserving all original elements** while expanding description detail
+- **Adding geometric precision** for complex shapes and arrangements
+- **Specifying SVG constraints** (no gradients, no shadows, clean edges)
+- **Emphasizing flat design** principles for vector compatibility
+- **Providing spatial arrangements** and compositional guidance
+## Model Details
+- **Base Model**: unsloth/SmolLM2-1.7B-Instruct
+- **Model Size**: 1.7B parameters
+- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
+- **Training Framework**: Transformers + TRL + PEFT + Unsloth
+- **License**: apache-2.0
+## Training Details
+### Training Configuration
+- **Training Method**: Supervised Fine-Tuning (SFT) with LoRA
+- **LoRA Configuration**:
+  - r: 24
+  - lora_alpha: 48
+  - lora_dropout: 0.05
+  - Target modules: `["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]`
+- **Training Parameters**:
+  - Epochs: 5
+  - Learning Rate: 8e-5
+  - Batch Size: 8 (per device)
+  - Gradient Accumulation Steps: 2
+  - Max Sequence Length: 2048
+  - LR Scheduler: Cosine
+  - NEFTune Noise Alpha: 5 (for improved generalization)
+  - Validation: 10% holdout with early stopping
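The LoRA settings above imply a scaling factor and an adapter size that can be estimated with a little arithmetic. The sketch below is illustrative only: the layer dimensions (hidden size 2048, MLP size 8192, 24 layers, full-width key/value projections) are assumptions about the base model and are not stated in this card, so check them against the base model's `config.json` before relying on the totals.

```python
# Rough estimate of LoRA adapter size for the configuration listed above.
# ASSUMPTION: the base-model dimensions below are not given in this card;
# verify them against the base model's config.json.
HIDDEN = 2048        # assumed hidden size
INTERMEDIATE = 8192  # assumed MLP (intermediate) size
NUM_LAYERS = 24      # assumed number of transformer layers

R = 24               # LoRA rank (from the card)
LORA_ALPHA = 48      # LoRA alpha (from the card)

# (in_features, out_features) for each targeted projection in one layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),   # assumes no grouped-query attention
    "v_proj": (HIDDEN, HIDDEN),   # assumes no grouped-query attention
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """A LoRA adapter adds two matrices: A (rank x in) and B (out x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, R) for i, o in TARGET_SHAPES.values())
total = per_layer * NUM_LAYERS
scaling = LORA_ALPHA / R  # standard LoRA scaling applied to the low-rank update

print(f"LoRA scaling (alpha / r): {scaling}")
print(f"Adapter parameters per layer: {per_layer:,}")
print(f"Adapter parameters in total:  {total:,}")
```

Under these assumed dimensions the adapter comes to roughly 27M trainable parameters, a small fraction of the 1.7B base weights.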
+### Enhanced Dataset
+- **Size**: 13,000 examples of simple→SVG-friendly transformations
+- **Sources**: Generated using Claude 3.5 Sonnet and Gemini 2.0 Flash
+- **Quality**: High-quality prompt engineering examples
+- **Coverage**: Diverse visual concepts, geometric patterns, everyday objects, and complex compositions
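Combining the dataset size with the training parameters gives a rough idea of the training schedule. This is a back-of-the-envelope sketch, not the author's training script: it assumes a single device and no sequence packing, neither of which is stated in this card, and early stopping may end training before the upper bound is reached.

```python
import math

# Values taken from the card.
DATASET_SIZE = 13_000
VAL_FRACTION = 0.10
PER_DEVICE_BATCH = 8
GRAD_ACCUM = 2
EPOCHS = 5

# ASSUMPTION: single-device training (not stated in the card).
NUM_DEVICES = 1

val_examples = round(DATASET_SIZE * VAL_FRACTION)
train_examples = DATASET_SIZE - val_examples

# Examples consumed per optimizer update.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES

# Upper bound on optimizer updates; a trainer may round a partial final
# batch differently, and early stopping can cut the run short.
steps_per_epoch = math.ceil(train_examples / effective_batch)
max_steps = steps_per_epoch * EPOCHS

print(f"train / val examples: {train_examples} / {val_examples}")
print(f"effective batch size: {effective_batch}")
print(f"steps per epoch:      {steps_per_epoch}")
print(f"max optimizer steps:  {max_steps}")
```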
+## Usage
+### Installation
+```bash
+pip install transformers torch accelerate
+```
+### Basic Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
+import torch
+model_path = "kawchar85/SmolLM2-1.7B-Instruct-Prompt-Enhancer"
+# Load model and tokenizer
+tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(
+    model_path,
+    torch_dtype=torch.float16,
+    trust_remote_code=True,
+    device_map="auto"
+)
+# Create pipeline
+chat_pipe = pipeline(
+    "text-generation",
+    model=model,
+    tokenizer=tokenizer,
+    return_full_text=False,
+)
+def get_message(simple_prompt):
+    """Build the system and user chat messages for one simple description."""
+    system_msg = """\
+You are an expert prompt engineer specializing in converting simple image concepts into SVG-friendly prompts.
+When given a short description, output exactly one enhanced prompt that:
+1. PRESERVES ALL DETAILS mentioned in the simple prompt - every element must appear in the enhanced version
+2. NEVER omits or changes any objects, colors, or shapes mentioned in the simple prompt
+3. Uses geometric precision to represent complex elements
+4. Includes terms like "vector illustration," "flat design," "minimalist," "geometric shapes," "solid colors"
+5. Specifies "no gradients", "no shadows", "clean edges", "hard edges"
+6. Mentions "2D perspective" or "flat style" to avoid 3D rendering
+7. Emphasizes "solid fill colors" and "uniform stroke weight"
+8. Always includes specific spatial arrangement of elements (positioned, arranged, distributed)
+9. Keeps the background simple (solid color, transparent, or minimal)
+10. Does NOT just paste the simple prompt with minor additions - truly EXPANDS and DESCRIBES it
+11. Is written in 30-70 words as one fluid, descriptive sentence.
+"""
+    user_msg = f'Transform this into an SVG-friendly prompt with geometric shapes and flat design: {simple_prompt}'
+    return [
+        {"role": "system", "content": system_msg},
+        {"role": "user", "content": user_msg}
+    ]
+# Generate SVG-friendly prompt
+description = "a lighthouse overlooking the ocean"
+messages = get_message(description)
+output = chat_pipe(
+    messages,
+    max_new_tokens=128,
+    do_sample=False,
+)
+print(f"Original: {description}")
+print(f"Enhanced: {output[0]['generated_text']}")
+```
+## Example Transformations
+### Example 1: Abstract Patterns
+**Simple**: "overlapping magenta and cyan trapezoids"
+**Enhanced**: "Flat vector design with minimalist styling featuring magenta and cyan trapezoidal shapes of varying sizes arranged to create overlapping geometric patterns, solid fill colors, no gradients, no textures, clean hard edges, 2D perspective for efficiency, uniform stroke weight, ideal for SVG format."
+### Example 2: Transport
+**Simple**: "ambulance driving to hospital"
+**Enhanced**: "Vector depiction of an ambulance defined by a rectangular body, circular wheels, and a red cross symbol displayed on the side as simple intersecting rectangles, driving towards a stylized hospital depicted with a rectangular building and a smaller cross, solid color fills, no gradients or shadows, clean hard edges, minimalist emergency vehicle design viewed in 2D."
+### Example 3: Household
+**Simple**: "simple bed with two pillows"
+**Enhanced**: "Flat vector design of a minimalist bed, constructed from a rectangular frame, supporting a rectangular mattress and two square pillows positioned symmetrically at the head, solid fill colors in calming blues and whites, clean edges, no gradients or shadows, 2D top-down view, emphasizing simple geometric forms."
+### Example 4: Landscape
+**Simple**: "a lighthouse overlooking the ocean"
+**Enhanced**: "Vector illustration of coastal scene featuring tall cylindrical structure with white light element positioned above flat blue water surface, geometric lighthouse construction with light positioning, solid fill colors with no gradients or shadows, clean hard edges, flat design aesthetic, 2D perspective, oceanic silhouette with beacon indication."
+### Example 5: Abstract Patterns
+**Simple**: "khaki triangles and azure crescents"
+**Enhanced**: "Vector illustration of geometric celestial pattern featuring triangular khaki shapes with crescent curve silhouettes, geometric crescents with triangle elements, solid fill colors with no gradients or shadows, clean hard edges, flat design aesthetic, 2D perspective, uniform stroke weight."
+### Example 6: Fashion
+**Simple**: "burgundy corduroy pants with patch pockets and silver buttons"
+**Enhanced**: "Vector illustration of minimalist clothing design featuring burgundy trapezoidal leg shapes with textured pattern indication through subtle geometric lines, rectangular patch pockets positioned symmetrically, circular silver button elements arranged vertically, flat design aesthetic with solid fill colors, no gradients or shadows, clean hard edges, 2D perspective for apparel visualization."
+### Example 7: Landscape
+**Simple**: "three trees on a hill"
+**Enhanced**: "Vector illustration of minimalist landscape featuring three geometric tree structures with triangular canopies and rectangular trunks positioned on an elevated curved hill shape, solid earth tones and greens, no gradients or shadows, clean hard edges, flat design aesthetic, 2D perspective with simplified natural forms."
+### Example 8: Technology
+**Simple**: "a robot with gears and buttons"
+**Enhanced**: "Vector design of mechanical figure composed of rectangular torso, circular head, and cylindrical limbs, featuring geometric gear wheels as circular elements with radiating teeth, square and circular button interfaces distributed across the surface, solid metallic colors, no gradients or shadows, clean hard edges, minimalist robotic aesthetic in 2D perspective."
+## Key Features
+### SVG Optimization Principles
+- **Geometric Precision**: Complex shapes described through basic geometric forms
+- **Flat Design Emphasis**: Consistent specification of 2D perspective and flat styling
+- **Technical Constraints**: Typically specifies "no gradients," "no shadows," and "clean edges" (Example 1 above omits "no shadows")
+- **Vector Terminology**: Uses "vector illustration," "solid fill colors," "uniform stroke weight"
+- **Spatial Awareness**: Detailed positioning and arrangement descriptions
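Because these principles are expressed as recurring phrases and a target length, an output can be sanity-checked with a few string tests. The helper below is a hypothetical post-processing sketch, not part of the model or its training pipeline; the function name, the regular expressions, and the 30-70 word bounds (taken from the system prompt) are illustrative choices.

```python
import re

# Phrases the card says enhanced prompts should contain (matched lowercase).
# These patterns are illustrative, not an official specification.
CONSTRAINT_PATTERNS = {
    "no_gradients": r"\bno gradients\b",
    "no_shadows": r"\bno (gradients or )?shadows\b",
    "clean_edges": r"\b(clean|hard) edges\b",
}


def check_enhanced_prompt(text, min_words=30, max_words=70):
    """Report which SVG constraints appear and whether the length fits."""
    lowered = text.lower()
    report = {
        name: re.search(pattern, lowered) is not None
        for name, pattern in CONSTRAINT_PATTERNS.items()
    }
    word_count = len(text.split())
    report["word_count"] = word_count
    report["length_ok"] = min_words <= word_count <= max_words
    return report


# Enhanced prompt from Example 3 above.
example_3 = (
    "Flat vector design of a minimalist bed, constructed from a rectangular "
    "frame, supporting a rectangular mattress and two square pillows "
    "positioned symmetrically at the head, solid fill colors in calming "
    "blues and whites, clean edges, no gradients or shadows, 2D top-down "
    "view, emphasizing simple geometric forms."
)

report = check_enhanced_prompt(example_3)
print(report)
```

A check like this only confirms that the expected vocabulary is present; it says nothing about whether the description is a good basis for an SVG.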
+### Content Preservation
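+- **Element Fidelity**: The model is trained to preserve all original objects, colors, and shapes, though individual outputs can still drop a term (Example 5 above omits "azure")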
+- **Detail Expansion**: Simple concepts are elaborated with geometric precision
+- **Contextual Enhancement**: Spatial relationships and compositions are clarified
+- **Style Consistency**: Maintains coherent SVG-friendly vocabulary throughout
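Element preservation can be spot-checked with a crude lexical comparison between the simple prompt and the enhanced one. The function below is a hypothetical sketch, assuming that matching the first few letters of each content word is a good enough proxy for "the element is still mentioned"; the stopword list and the 4-letter prefix are arbitrary illustrative choices, and the check will miss synonyms or paraphrases.

```python
import re

# Small illustrative stopword list; extend as needed.
STOPWORDS = {"a", "an", "the", "and", "or", "of", "on", "in", "to", "with"}


def missing_terms(simple_prompt, enhanced_prompt, stem_len=4):
    """Return content words of the simple prompt not found in the enhanced one.

    A word counts as present if any word in the enhanced prompt starts with
    its first `stem_len` letters (so "triangles" matches "triangular").
    """
    enhanced = enhanced_prompt.lower()
    missing = []
    for word in re.findall(r"[a-z0-9]+", simple_prompt.lower()):
        if word in STOPWORDS:
            continue
        stem = word[:stem_len]
        if re.search(r"\b" + re.escape(stem), enhanced) is None:
            missing.append(word)
    return missing


# Simple and enhanced prompts from Example 5 above.
simple_5 = "khaki triangles and azure crescents"
enhanced_5 = (
    "Vector illustration of geometric celestial pattern featuring triangular "
    "khaki shapes with crescent curve silhouettes, geometric crescents with "
    "triangle elements, solid fill colors with no gradients or shadows, "
    "clean hard edges, flat design aesthetic, 2D perspective, uniform "
    "stroke weight."
)

print(missing_terms(simple_5, enhanced_5))  # ['azure']
```

Applied to Example 5, the check flags the missing color, which suggests running a comparison like this before passing an enhanced prompt to a text-to-SVG generator.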
+## Performance
+- **Inference Speed**: ~2-3 seconds per transformation
+- **Output Length**: Optimized for 30-70 words (concise yet comprehensive)
+- **Consistency**: Reliable SVG-specific terminology and constraint specification
+- **Quality**: High-quality prompt engineering with geometric precision
+## Limitations
+- **Specialized Domain**: Optimized for SVG/vector use cases; it may not suit other prompt types
+- **Length Constraints**: Designed for concise enhancements (30-70 words)
+- **Style Specificity**: Focused on flat design aesthetic rather than diverse art styles
+- **Vector Focus**: May over-emphasize geometric precision for organic/natural subjects
+## Technical Specifications
+- **Architecture**: Transformer-based language model (1.7B parameters)
+- **Context Length**: 2048 tokens (the maximum sequence length used during fine-tuning)
+- **Training**: Early stopping on a 10% validation holdout, with NEFTune noise (alpha 5) for improved generalization
+- **Optimization**: LoRA fine-tuning (r=24, alpha=48) with cosine scheduling
+- **Inference**: Intended for short outputs (30-70 words) with greedy decoding (`do_sample=False`), which makes generation deterministic
+## Citation
+```bibtex
+@misc{smollm2-prompt-enhancer-2025,
+  title={SmolLM2-1.7B-Instruct-Prompt-Enhancer: Specialized Model for SVG-Friendly Prompt Generation},
+  author={kawchar85},
+  year={2025},
+  url={https://huggingface.co/kawchar85/SmolLM2-1.7B-Instruct-Prompt-Enhancer}
+}
+```