Keeby-smilyai committed
Commit 139c1d6 · verified · 1 Parent(s): 8ea8006

Update README.md

Files changed (1)
  1. README.md +107 -3
README.md CHANGED
@@ -1,3 +1,107 @@
- ---
- license: mit
- ---

---
license: mit
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

# 🧠 Model Card: Sam-2.5-4 (Pro Version)

## 📌 Overview

**Sam-2.5-4** is the Pro continuation of the Sam-2.5 architecture series, designed for modular, multi-domain reasoning across math, dialogue, code, and open-domain tasks. It builds directly on **Sam-2.5-3**, continuing training for four additional epochs to deepen convergence, reduce domain bias, and improve generalization.

This model is optimized for transparency, ablation-readiness, and deployment on both high-resource and low-resource devices (including Raspberry Pi).

---

## 🧬 Model Lineage

| Version       | Description                                                                    |
|---------------|--------------------------------------------------------------------------------|
| **Sam-2.5-2** | GSM8K-heavy fine-tune; overfit to math; lacked domain balance                  |
| **Sam-2.5-3** | Emergency patch; retrained from scratch on 4 datasets; balanced capabilities   |
| **Sam-2.5-4** | Pro version; continued training for 4 epochs; refined convergence and fluency  |

---

## 🧠 Architecture

- Transformer-based, modular design
- Registry-driven domain tagging and ablation toggles
- Shape-adaptive loss functions with domain-aware diagnostics
- Quantization-ready for Pi deployment
- Verbose logging for batch-level feedback and anomaly tracing
- Memory-safe serialization via `safetensors` (see the loading sketch after this list)
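
Since the card declares `transformers` as the library and `safetensors` as the serialization format, loading should follow the standard `AutoModel` path. This is a minimal sketch, assuming a repo id like `Keeby-smilyai/Sam-2.5-4` (hypothetical; substitute the actual Hub path):

```python
# Minimal loading sketch; the repo id is an assumption, not confirmed by the card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Keeby-smilyai/Sam-2.5-4"  # hypothetical Hub path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    use_safetensors=True,  # the card notes memory-safe serialization via safetensors
)

prompt = "Q: A train travels 60 km in 45 minutes. What is its average speed in km/h?\nA:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```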

---

## 📚 Training Datasets

| Dataset                 | Domain Focus                     |
|-------------------------|----------------------------------|
| **GSM8K**               | Mathematical reasoning           |
| **MultiWOZ**            | Multi-turn dialogue & task flow  |
| **Alpaca-Code-Cleaned** | Code generation & logic          |
| **UltraChat-200k**      | Open-domain conversation         |

- Datasets were concatenated, shuffled, and tagged for domain awareness (sketched after this list)
- Replay and mixing strategies were used to balance underrepresented domains
- Training spanned **9 total epochs** (5 in Sam-2.5-3, 4 in Sam-2.5-4)
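
The preparation script itself is not published, so the following is only a sketch of the concatenate-shuffle-tag step described above, using the `datasets` library; the shared `{text, domain}` schema and the `tag` helper are illustrative assumptions:

```python
# Illustrative data mixing, not the project's actual pipeline.
from datasets import Dataset, concatenate_datasets, load_dataset

def tag(ds, domain, text_fn):
    # Reduce each source to a shared {text, domain} schema so sources concatenate cleanly.
    return Dataset.from_dict({
        "text": [text_fn(ex) for ex in ds],
        "domain": [domain] * len(ds),
    })

gsm8k = tag(
    load_dataset("gsm8k", "main", split="train"),
    "math",
    lambda ex: ex["question"] + "\n" + ex["answer"],
)
# ...repeat for MultiWOZ, Alpaca-Code-Cleaned, and UltraChat-200k...

mixed = concatenate_datasets([gsm8k]).shuffle(seed=42)
print(mixed[0]["domain"], mixed[0]["text"][:80])
```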

---

## 📈 Performance Summary

| Metric                  | Value (Epochs 8–9)                        |
|-------------------------|-------------------------------------------|
| **Validation Loss**     | ↓ 2.95 (average across domains)           |
| **Max Domain Loss**     | < 3.4 (no domain exceeded this bound)     |
| **Math Bias**           | Resolved (loss spikes absorbed)           |
| **Dialogue Coherence**  | Improved (MultiWOZ eval)                  |
| **Code Determinism**    | Increased (Alpaca eval)                   |
| **Open-Domain Fluency** | Fewer hallucinations, better grounding    |

---

## 🧪 Evaluation & Diagnostics

- Loss spikes in early epochs traced to GSM8K; resolved by epoch 6
- Batch-level diagnostics printed per domain and token type (illustrated after this list)
- Attention stability improved on long-context prompts
- Token transitions cleaner across dialogue and code tasks
- Validation curve shows smooth convergence post-epoch 5
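
The diagnostic code is not included in this card, so the snippet below is a hedged sketch of what per-domain, batch-level loss bookkeeping can look like; the names `domain_loss` and `log_batch` are invented for illustration:

```python
# Illustrative per-domain loss diagnostics, not the project's actual logger.
from collections import defaultdict

domain_loss = defaultdict(list)

def log_batch(domain: str, loss: float) -> None:
    # Track each domain separately so spikes (e.g. early GSM8K) are visible per domain.
    domain_loss[domain].append(loss)
    running = sum(domain_loss[domain]) / len(domain_loss[domain])
    print(f"[{domain}] batch loss={loss:.3f} | running avg={running:.3f}")

log_batch("math", 3.21)
log_batch("dialogue", 2.87)
log_batch("math", 3.05)
```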

---

## 🧩 Deployment Notes

- Compatible with Raspberry Pi (quantized + safetensors; see the sketch after this list)
- Supports CLI-based training diagnostics (loss, ETA, memory)
- Registry hooks enable domain-specific ablation and extension
- Ideal for benchmarking on GSM8K, MultiWOZ, UltraChat, and custom blends
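
The card says the model is quantization-ready but does not name a scheme, so the following shows one standard option: PyTorch dynamic int8 quantization of the linear layers, a common route to CPU inference on Pi-class hardware. The repo id is again hypothetical:

```python
# One possible quantization path; the authors' actual scheme is unspecified.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Keeby-smilyai/Sam-2.5-4")  # hypothetical id
model.eval()

# Swap nn.Linear weights for int8 dynamic-quantized equivalents (CPU inference).
quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
torch.save(quantized.state_dict(), "sam-2.5-4-int8.pt")
```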

---

## 🤝 Intended Use

- Research on modular Transformer architectures
- Benchmarking across reasoning, dialogue, and code domains
- Deployment on constrained hardware (e.g., Raspberry Pi, ARM)
- Community-driven extension and ablation testing

---

## ⚠️ Limitations

- Still sensitive to prompt phrasing in edge cases
- Long-context performance may degrade beyond 2k tokens
- Requires domain tags for optimal generalization (see the note after this list)
- Not trained on multimodal inputs (text-only)
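
The card does not specify the domain-tag format, so the example below assumes a simple inline prefix purely for illustration; check the repository's registry documentation for the real convention:

```python
# Hypothetical tag format; the actual convention is not documented in this card.
prompt = "<domain:math>\nIf 3x + 5 = 20, what is x?"
```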

---

## 🙌 Acknowledgments

Thanks to the open-source community, dataset curators, and contributors who helped shape Sam-2.5-4. This release reflects our shared commitment to transparent, inspectable, and extensible AI.