🧠 Model Card: Sam-2.5-4 (Pro Version)

πŸ“Œ Overview

Sam-2.5-4 is the Pro continuation of the Sam-2.5 architecture series, designed for modular, multi-domain reasoning across math, dialogue, code, and open-domain tasks. It builds directly on Sam-2.5-3, continuing training for four additional epochs to deepen convergence, reduce domain bias, and improve generalization.

This model is optimized for transparency, ablation-readiness, and deployment on both high-resource and low-resource devices (including Raspberry Pi).


🧬 Model Lineage

| Version | Description |
| --- | --- |
| Sam-2.5-2 | GSM8K-heavy fine-tune; overfit to math; lacked domain balance |
| Sam-2.5-3 | Emergency patch; retrained from scratch on 4 datasets; balanced capabilities |
| Sam-2.5-4 | Pro version; continued training for 4 epochs; refined convergence and fluency |

🧠 Architecture

  • Transformer-based, modular design
  • Registry-driven domain tagging and ablation toggles
  • Shape-adaptive loss functions with domain-aware diagnostics
  • Quantization-ready for Pi deployment
  • Verbose logging for batch-level feedback and anomaly tracing
  • Memory-safe serialization via safetensors (loading sketched below)
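As a minimal sketch of the safetensors path, the checkpoint can be loaded without any pickle execution; the filename `model.safetensors` is an assumption about how the weights ship, not a confirmed detail of this release.

```python
# Minimal sketch: memory-safe weight loading via safetensors.
# The filename "model.safetensors" is assumed; adjust to the actual checkpoint.
from safetensors.torch import load_file

state_dict = load_file("model.safetensors")  # no arbitrary-code pickle path

# Inspect a few tensors before binding them to a model skeleton.
for name, tensor in list(state_dict.items())[:5]:
    print(f"{name}: {tuple(tensor.shape)} ({tensor.dtype})")

# model.load_state_dict(state_dict)  # once a matching nn.Module exists
```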

πŸ“š Training Datasets

| Dataset | Domain Focus |
| --- | --- |
| GSM8K | Mathematical reasoning |
| MultiWOZ | Multi-turn dialogue & task flow |
| Alpaca-Code-Cleaned | Code generation & logic |
| UltraChat-200k | Open-domain conversation |

  • Datasets were concatenated, shuffled, and tagged for domain awareness
  • Replay and mixing strategies were used to rebalance underrepresented domains (see the sketch below)
  • Training spanned 9 total epochs (5 in Sam-2.5-3, 4 in Sam-2.5-4)
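The exact mixing code isn't published here; the following is a minimal, pure-Python sketch of the concatenate/shuffle/tag/replay recipe described above. The corpus stubs and sampling weights are illustrative assumptions, not the actual training configuration.

```python
# Minimal sketch of the concatenate/shuffle/tag/replay recipe.
# Corpus stubs and sampling weights are illustrative assumptions only.
import random

random.seed(0)

# Stand-ins for the four corpora; real examples would be tokenized records.
corpora = {
    "math":     [f"gsm8k_{i}" for i in range(1000)],      # GSM8K
    "dialogue": [f"multiwoz_{i}" for i in range(800)],    # MultiWOZ
    "code":     [f"alpaca_code_{i}" for i in range(600)], # Alpaca-Code-Cleaned
    "open":     [f"ultrachat_{i}" for i in range(2000)],  # UltraChat-200k
}

# Tag every example with its domain so losses and diagnostics can group by it.
tagged = [(domain, ex) for domain, examples in corpora.items() for ex in examples]
random.shuffle(tagged)

# Replay/mixing: oversample underrepresented domains with assumed weights.
weights = {"math": 1.0, "dialogue": 1.2, "code": 1.5, "open": 0.7}
mixed = random.choices(tagged, weights=[weights[d] for d, _ in tagged], k=len(tagged))
print(mixed[:3])
```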

πŸ“ˆ Performance Summary

| Metric | Value (Epochs 8–9) |
| --- | --- |
| Validation loss | 2.95, avg across domains (↓) |
| Max domain loss | < 3.4 (no domain exceeded) |
| Math bias | Resolved (loss spikes absorbed) |
| Dialogue coherence | Improved (MultiWOZ eval) |
| Code determinism | Increased (Alpaca eval) |
| Open-domain fluency | Fewer hallucinations, better grounding |
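For reference, the averaged and max-domain figures reduce to one line each over per-domain validation losses; the per-domain values below are placeholders chosen to be consistent with the table, not actual eval output.

```python
# Sketch: reducing per-domain validation losses to the summary metrics above.
# The per-domain numbers are placeholders consistent with the table, not real output.
domain_loss = {"math": 3.3, "dialogue": 2.8, "code": 2.7, "open": 3.0}

avg_loss = sum(domain_loss.values()) / len(domain_loss)
worst_domain, worst_loss = max(domain_loss.items(), key=lambda kv: kv[1])

print(f"avg validation loss: {avg_loss:.2f}")                 # 2.95
print(f"max domain loss: {worst_loss:.2f} ({worst_domain})")  # 3.30 < 3.4
```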

πŸ§ͺ Evaluation & Diagnostics

  • Loss spikes in early epochs traced to GSM8K; resolved by epoch 6
  • Batch-level diagnostics printed per domain and token type (format sketched below)
  • Attention stability improved on long-context prompts
  • Token transitions cleaner across dialogue and code tasks
  • Validation curve shows smooth convergence post-epoch 5
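The exact log format isn't specified in this card; a hypothetical per-domain batch log line might look like the sketch below, where all field names are assumptions.

```python
# Hypothetical batch-level diagnostic line, grouped by domain and token type.
# Field names and values are illustrative; the real logger is not published.
def log_batch(step: int, domain: str, token_type: str, loss: float, grad_norm: float) -> None:
    print(
        f"[step {step:>6}] domain={domain:<8} tokens={token_type:<7} "
        f"loss={loss:.3f} grad_norm={grad_norm:.2f}"
    )

log_batch(4200, "math", "numeric", 3.12, 0.87)
log_batch(4201, "dialogue", "text", 2.64, 0.41)
```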

🧩 Deployment Notes

  • Compatible with Raspberry Pi (quantized + safetensors; quantization sketched below)
  • Supports CLI-based training diagnostics (loss, ETA, memory)
  • Registry hooks enable domain-specific ablation and extension
  • Ideal for benchmarking on GSM8K, MultiWOZ, UltraChat, and custom blends
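For Pi-class hardware, one plausible route is PyTorch dynamic quantization of the linear layers; this is a generic sketch under that assumption, not the release's documented quantization pipeline.

```python
# Generic sketch: INT8 dynamic quantization for CPU/ARM deployment.
# The tiny Sequential below is a stand-in for the actual Sam-2.5-4 module.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512)
).eval()

quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 512)
print(quantized(x).shape)  # torch.Size([1, 512])
```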

🀝 Intended Use

  • Research on modular Transformer architectures
  • Benchmarking across reasoning, dialogue, and code domains
  • Deployment on constrained hardware (e.g. Pi, ARM)
  • Community-driven extension and ablation testing

⚠️ Limitations

  • Still sensitive to prompt phrasing in edge cases
  • Long-context performance may degrade beyond 2k tokens
  • Requires domain tags for optimal generalization
  • Not trained on multimodal inputs (text-only)

πŸ™Œ Acknowledgments

Thanks to the open-source community, dataset curators, and contributors who helped shape Sam-2.5-4. This release reflects our shared commitment to transparent, inspectable, and extensible AI.

πŸ“¦ Model Details

  • Model size: 52.8M params
  • Tensor type: F32
  • Serialization: safetensors
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support