🧠 Spiking Neural Network Language Model - Training Checkpoint

Live training checkpoint from the world's first large-scale spiking language model!

Current Training Status

  • Training Step: 554,000
  • Tokens Processed: 5.67B
  • Current Loss: 4.5783
  • Spike Rate: 0.0508
  • Learning Rate: 8.15e-05

Model Architecture

  • Parameters: ~54M
  • Architecture: 12-layer Spiking LTC Network
  • Hidden Size: 768
  • Sequence Length: 1024
  • Multi-timescale Processing: Fast → Medium → Slow layers (see the configuration sketch below)
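
A minimal configuration sketch of the architecture listed above, written in Python. The class and field names (and the 4/4/4 split of timescales across the 12 layers) are illustrative assumptions and may not match the repository's actual code.

from dataclasses import dataclass

@dataclass
class SpikingLTCConfig:
    # Values taken from the architecture list above; names are assumed for illustration.
    num_layers: int = 12          # spiking LTC layers
    hidden_size: int = 768
    seq_len: int = 1024
    # Multi-timescale processing: earlier layers use fast membrane dynamics,
    # deeper layers progressively slower ones (fast → medium → slow).
    timescales: tuple = ("fast",) * 4 + ("medium",) * 4 + ("slow",) * 4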

Training Details

  • Dataset: PatrickHaller/fineweb-5B
  • Target: 3 epochs (~15B tokens total)
  • Biological Dynamics: Adaptive thresholds, refractory periods (illustrated in the neuron sketch below)
  • Energy Efficiency: ~5% neuron activation vs 100% in Transformers
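
To make the biological-dynamics bullet concrete, the sketch below implements a single leaky integrate-and-fire style update with an adaptive threshold and a refractory period in plain NumPy. All parameter names and values here are illustrative assumptions, not the model's actual implementation.

import numpy as np

def lif_step(v, thresh, refrac, x,
             tau=0.9, base_thresh=1.0, thresh_decay=0.95,
             thresh_bump=0.3, refrac_steps=2):
    # One timestep for a vector of neurons: leaky integration, spiking, reset.
    active = refrac == 0                     # neurons outside their refractory period
    v = np.where(active, tau * v + x, v)     # leaky integration of input current x
    spike = (v >= thresh) & active           # fire when the adaptive threshold is crossed
    v = np.where(spike, 0.0, v)              # reset membrane potential after a spike
    thresh = np.where(spike,
                      thresh + thresh_bump,  # raise the threshold after firing (adaptation)
                      base_thresh + thresh_decay * (thresh - base_thresh))
    refrac = np.where(spike, refrac_steps, np.maximum(refrac - 1, 0))
    return v, thresh, refrac, spike.astype(np.float32)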

Scientific Significance

This checkpoint represents ongoing training of the first large-scale spiking neural network for language modeling, demonstrating:

  1. Biological neural dynamics can learn language at scale
  2. Energy efficiency through sparse neural firing (see the estimate after this list)
  3. Multi-timescale processing for hierarchical understanding
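
As a rough illustration of point 2, the figures reported above imply the following per-timestep activation count. This is a back-of-envelope estimate, not a measured energy number.

hidden_size = 768
num_layers = 12
spike_rate = 0.0508                          # from the training status above

neurons_total = hidden_size * num_layers     # 9,216 hidden units across all layers
neurons_active = neurons_total * spike_rate  # ≈ 468 units spiking per timestep
print(f"~{neurons_active:.0f} of {neurons_total} hidden units active ({spike_rate:.1%}), "
      f"versus all of them in a dense Transformer block")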

Usage

# Download this checkpoint
from huggingface_hub import hf_hub_download
checkpoint = hf_hub_download(
    repo_id="rootxhacker/piking-llm-5b-3epochs-exp",
    filename="checkpoint_554000.pt"
)

# Load with custom spiking model code
# (See full implementation for complete usage)
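
A hedged loading sketch: the file is a standard PyTorch checkpoint, but its exact contents (keys, model class) depend on the repository's custom spiking model code, so the snippet below only assumes a torch.load-able dictionary.

import torch

state = torch.load(checkpoint, map_location="cpu")  # path returned by hf_hub_download above
if isinstance(state, dict):
    print(list(state.keys()))                       # e.g. model weights, optimizer state, step counter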

🔬 This is live research in progress! Check back for updates as training continues.

Training Progress: 37.8% complete toward the 15B-token target
