🧠 Spiking Neural Network Language Model - Training Checkpoint
Live training checkpoint from the world's first large-scale spiking language model!
Current Training Status
- Training Step: 554,000
- Tokens Processed: 5.67B
- Current Loss: 4.5783
- Spike Rate: 0.0508 (see the sketch after this list)
- Learning Rate: 8.15e-05
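The spike rate above is presumably the average fraction of neurons that fire per timestep. A minimal sketch of how such a statistic could be computed, assuming binary spike tensors of shape (batch, time, neurons); the function name and shapes are illustrative, not the repository's actual code:

```python
import torch

def spike_rate(spikes: torch.Tensor) -> float:
    """Fraction of active units in a binary spike tensor of shape
    (batch, time, neurons). Shapes and naming are assumptions."""
    return spikes.float().mean().item()

# Example: a random binary tensor with ~5% activity
spikes = (torch.rand(2, 1024, 768) < 0.05).float()
print(f"spike rate: {spike_rate(spikes):.4f}")  # ~0.05, in line with the reported 0.0508
```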
Model Architecture
- Parameters: ~54M
- Architecture: 12-layer Spiking LTC Network
- Hidden Size: 768
- Sequence Length: 1024
- Multi-timescale Processing: Fast → Medium → Slow layers (see the configuration sketch below)
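A minimal configuration sketch that collects the figures above in one place. All field names and the timescale values are assumptions for illustration and may differ from the actual model code:

```python
# Illustrative config for the described 12-layer spiking LTC stack.
# Field names and timescale values are assumptions, not the repo's API.
from dataclasses import dataclass

@dataclass
class SpikingLTCConfig:
    num_layers: int = 12          # 12-layer Spiking LTC Network
    hidden_size: int = 768
    max_seq_len: int = 1024
    # Multi-timescale processing: fast -> medium -> slow membrane time constants (assumed values)
    timescales: tuple = (1.0, 4.0, 16.0)

config = SpikingLTCConfig()
```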
Training Details
- Dataset: PatrickHaller/fineweb-5B
- Target: 3 epochs (~15B tokens total)
- Biological Dynamics: Adaptive thresholds, refractory periods (see the sketch after this list)
- Energy Efficiency: ~5% of neurons active per step vs. dense (100%) activation in standard Transformers
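A minimal sketch of the biological dynamics named above, using a simple leaky integrate-and-fire style update with an adaptive threshold and a refractory counter. This is illustrative only, under assumed parameter names and values; the actual spiking LTC dynamics in the model code may differ:

```python
import torch

def lif_step(v, threshold, refractory, x,
             tau=10.0, theta_rest=1.0, theta_adapt=0.2,
             tau_theta=50.0, refractory_steps=2):
    """One timestep for a population of neurons; all tensors have shape (neurons,)."""
    active = (refractory == 0).float()              # mask of neurons outside their refractory period
    v = v + active * (x - v / tau)                  # leaky integration of the input current
    spikes = (v >= threshold).float() * active      # emit a spike where the threshold is crossed
    v = v * (1.0 - spikes)                          # reset membrane potential after a spike
    # adaptive threshold: jumps by theta_adapt on a spike, decays back toward theta_rest
    threshold = theta_rest + (threshold - theta_rest) * (1.0 - 1.0 / tau_theta) + theta_adapt * spikes
    # refractory counter: count down, then re-arm for neurons that just fired
    refractory = torch.clamp(refractory - 1, min=0) + (spikes * refractory_steps).long()
    return v, threshold, refractory, spikes

# Example usage with 768 neurons (the hidden size listed above)
n = 768
v = torch.zeros(n)
threshold = torch.ones(n)
refractory = torch.zeros(n, dtype=torch.long)
v, threshold, refractory, spikes = lif_step(v, threshold, refractory, torch.randn(n))
print(f"spike rate this step: {spikes.mean().item():.4f}")
```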
Scientific Significance
This represents ongoing training of the first large-scale spiking neural network for language modeling, demonstrating:
- Biological neural dynamics can learn language at scale
- Energy efficiency through sparse neural firing
- Multi-timescale processing for hierarchical understanding
Usage
```python
# Download this checkpoint from the Hugging Face Hub
from huggingface_hub import hf_hub_download

checkpoint = hf_hub_download(
    repo_id="rootxhacker/piking-llm-5b-3epochs-exp",
    filename="checkpoint_554000.pt",
)

# Load with custom spiking model code
# (See full implementation for complete usage)
```
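A hedged sketch for inspecting the downloaded file: the checkpoint layout (e.g. a dict holding a model state dict) and the model class name are assumptions, so check the keys before wiring it into the custom model code:

```python
import torch

# Load the raw checkpoint onto CPU and inspect its structure first.
state = torch.load(checkpoint, map_location="cpu")
print(type(state))
if isinstance(state, dict):
    print(list(state.keys()))

# If the file stores a state dict (an assumption), it could then be loaded
# into the repository's custom spiking model class, e.g.:
# model = SpikingLTCLanguageModel(config)             # hypothetical class name
# model.load_state_dict(state["model_state_dict"])    # hypothetical key
```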
🔬 This is live research in progress! Check back for updates as training continues.
Training Progress: 37.8% complete towards 15B tokens