Keeby-smilyai committed
Commit 139c1d6 · verified · 1 Parent(s): 8ea8006

Update README.md

Files changed (1)
  1. README.md +107 -3
README.md CHANGED
@@ -1,3 +1,107 @@
- ---
- license: mit
- ---

---
license: mit
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

# 🧠 Model Card: Sam-2.5-4 (Pro Version)

## 📌 Overview

**Sam-2.5-4** is the Pro continuation of the Sam-2.5 architecture series, designed for modular, multi-domain reasoning across math, dialogue, code, and open-domain tasks. It builds directly on **Sam-2.5-3**, continuing training for four additional epochs to deepen convergence, reduce domain bias, and improve generalization.

This model is optimized for transparency, ablation-readiness, and deployment on both high-resource and low-resource devices (including Raspberry Pi).

---

## 🧬 Model Lineage

| Version       | Description                                                                    |
|---------------|--------------------------------------------------------------------------------|
| **Sam-2.5-2** | GSM8K-heavy fine-tune; overfit to math; lacked domain balance                  |
| **Sam-2.5-3** | Emergency patch; retrained from scratch on 4 datasets; balanced capabilities   |
| **Sam-2.5-4** | Pro version; continued training for 4 epochs; refined convergence and fluency  |

---

## 🧠 Architecture

- Transformer-based, modular design
- Registry-driven domain tagging and ablation toggles
- Shape-adaptive loss functions with domain-aware diagnostics
- Quantization-ready for Pi deployment
- Verbose logging for batch-level feedback and anomaly tracing
- Memory-safe serialization via `safetensors` (see the loading sketch after this list)
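
Since the card declares `transformers` as the library and `safetensors` as the serialization format, loading should follow the standard `AutoModel` path. This is a minimal sketch, assuming a repo id like `Keeby-smilyai/Sam-2.5-4` (hypothetical; substitute the actual Hub path):

```python
# Minimal loading sketch; the repo id is an assumption, not confirmed by the card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Keeby-smilyai/Sam-2.5-4"  # hypothetical Hub path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    use_safetensors=True,  # the card notes memory-safe serialization via safetensors
)

prompt = "Q: A train travels 60 km in 45 minutes. What is its average speed in km/h?\nA:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```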

---

## 📚 Training Datasets

| Dataset                 | Domain Focus                     |
|-------------------------|----------------------------------|
| **GSM8K**               | Mathematical reasoning           |
| **MultiWOZ**            | Multi-turn dialogue & task flow  |
| **Alpaca-Code-Cleaned** | Code generation & logic          |
| **UltraChat-200k**      | Open-domain conversation         |

- Datasets were concatenated, shuffled, and tagged for domain awareness (sketched after this list)
- Replay and mixing strategies were used to balance underrepresented domains
- Training spanned **9 total epochs** (5 in Sam-2.5-3, 4 in Sam-2.5-4)
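
The preparation script itself is not published, so the following is only a sketch of the concatenate-shuffle-tag step described above, using the `datasets` library; the shared `{text, domain}` schema and the `tag` helper are illustrative assumptions:

```python
# Illustrative data mixing, not the project's actual pipeline.
from datasets import Dataset, concatenate_datasets, load_dataset

def tag(ds, domain, text_fn):
    # Reduce each source to a shared {text, domain} schema so sources concatenate cleanly.
    return Dataset.from_dict({
        "text": [text_fn(ex) for ex in ds],
        "domain": [domain] * len(ds),
    })

gsm8k = tag(
    load_dataset("gsm8k", "main", split="train"),
    "math",
    lambda ex: ex["question"] + "\n" + ex["answer"],
)
# ...repeat for MultiWOZ, Alpaca-Code-Cleaned, and UltraChat-200k...

mixed = concatenate_datasets([gsm8k]).shuffle(seed=42)
print(mixed[0]["domain"], mixed[0]["text"][:80])
```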

---

## 📈 Performance Summary

| Metric                  | Value (Epochs 8–9)                        |
|-------------------------|-------------------------------------------|
| **Validation Loss**     | ↓ 2.95 (average across domains)           |
| **Max Domain Loss**     | < 3.4 (no domain exceeded this bound)     |
| **Math Bias**           | Resolved (loss spikes absorbed)           |
| **Dialogue Coherence**  | Improved (MultiWOZ eval)                  |
| **Code Determinism**    | Increased (Alpaca eval)                   |
| **Open-Domain Fluency** | Fewer hallucinations, better grounding    |

---

## 🧪 Evaluation & Diagnostics

- Loss spikes in early epochs traced to GSM8K; resolved by epoch 6
- Batch-level diagnostics printed per domain and token type (illustrated after this list)
- Attention stability improved on long-context prompts
- Token transitions cleaner across dialogue and code tasks
- Validation curve shows smooth convergence post-epoch 5
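
The diagnostic code is not included in this card, so the snippet below is a hedged sketch of what per-domain, batch-level loss bookkeeping can look like; the names `domain_loss` and `log_batch` are invented for illustration:

```python
# Illustrative per-domain loss diagnostics, not the project's actual logger.
from collections import defaultdict

domain_loss = defaultdict(list)

def log_batch(domain: str, loss: float) -> None:
    # Track each domain separately so spikes (e.g. early GSM8K) are visible per domain.
    domain_loss[domain].append(loss)
    running = sum(domain_loss[domain]) / len(domain_loss[domain])
    print(f"[{domain}] batch loss={loss:.3f} | running avg={running:.3f}")

log_batch("math", 3.21)
log_batch("dialogue", 2.87)
log_batch("math", 3.05)
```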

---

## 🧩 Deployment Notes

- Compatible with Raspberry Pi (quantized + safetensors; see the sketch after this list)
- Supports CLI-based training diagnostics (loss, ETA, memory)
- Registry hooks enable domain-specific ablation and extension
- Ideal for benchmarking on GSM8K, MultiWOZ, UltraChat, and custom blends
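
The card says the model is quantization-ready but does not name a scheme, so the following shows one standard option: PyTorch dynamic int8 quantization of the linear layers, a common route to CPU inference on Pi-class hardware. The repo id is again hypothetical:

```python
# One possible quantization path; the authors' actual scheme is unspecified.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Keeby-smilyai/Sam-2.5-4")  # hypothetical id
model.eval()

# Swap nn.Linear weights for int8 dynamic-quantized equivalents (CPU inference).
quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
torch.save(quantized.state_dict(), "sam-2.5-4-int8.pt")
```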

---

## 🤝 Intended Use

- Research on modular Transformer architectures
- Benchmarking across reasoning, dialogue, and code domains
- Deployment on constrained hardware (e.g., Raspberry Pi, ARM)
- Community-driven extension and ablation testing

---

## ⚠️ Limitations

- Still sensitive to prompt phrasing in edge cases
- Long-context performance may degrade beyond 2k tokens
- Requires domain tags for optimal generalization (see the note after this list)
- Not trained on multimodal inputs (text-only)
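
The card does not specify the domain-tag format, so the example below assumes a simple inline prefix purely for illustration; check the repository's registry documentation for the real convention:

```python
# Hypothetical tag format; the actual convention is not documented in this card.
prompt = "<domain:math>\nIf 3x + 5 = 20, what is x?"
```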

---

## 🙌 Acknowledgments

Thanks to the open-source community, dataset curators, and contributors who helped shape Sam-2.5-4. This release reflects our shared commitment to transparent, inspectable, and extensible AI.