ss_dense

Dense transformer (weight sparsity disabled) trained with the weight-sparse transformer procedure from Gao et al. (2025).

Model Details

  • Layers: 4
  • Model Dimension: 512
  • Context Length: 512
  • Head Dimension: 16
  • Vocabulary Size: 4096

Sparsity

  • Weight Sparsity: False
  • Target L0 Fraction: 1
  • Activation Sparsity: False
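
One way to check these settings against the downloaded weights is to measure the fraction of nonzero entries in the checkpoint. The sketch below assumes "L0 fraction" means nonzero parameters divided by total parameters; the helper name `l0_fraction` is hypothetical and not part of this repo, and the original procedure may count only a subset of the weight matrices. For a dense model the result should be at or near 1.

```python
import torch


def l0_fraction(state_dict):
    """Fraction of nonzero entries across all tensors in a state dict.

    Hypothetical helper: assumes L0 fraction = nonzero count / total count.
    """
    total = 0
    nonzero = 0
    for tensor in state_dict.values():
        if not torch.is_tensor(tensor):
            continue  # skip any non-tensor entries
        total += tensor.numel()
        nonzero += int(torch.count_nonzero(tensor))
    return nonzero / total if total else 0.0


# Toy example: 6 of the 8 entries are nonzero
toy = {"w": torch.tensor([[1.0, 0.0], [0.0, 2.0]]), "b": torch.ones(4)}
print(l0_fraction(toy))
```

Passing the `state_dict` loaded in the Usage section to the same function gives the measured value for this model.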

Training

  • Dataset: SimpleStories/SimpleStories
  • Tokenizer: SimpleStories/SimpleStories-1.25M
  • Total Tokens: 2,000,000,000
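
A few derived quantities follow from the figures listed above. This is a minimal sketch of the arithmetic, assuming the attention heads evenly partition the model dimension and that every training sequence fills the full context length; neither assumption is stated in this card.

```python
# Values copied from this model card
model_dim = 512
head_dim = 16
context_length = 512
total_tokens = 2_000_000_000

# Assumption: heads evenly partition the model dimension
n_heads = model_dim // head_dim

# Assumption: every training sequence is exactly context_length tokens long
n_sequences = total_tokens // context_length

print(f"attention heads per layer: {n_heads}")
print(f"full-length training sequences: {n_sequences:,}")
```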

Usage

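import json

import torch
from huggingface_hub import hf_hub_download

# Download the weights and config from the Hub
model_path = hf_hub_download(repo_id="jacobcd52/ss_dense", filename="pytorch_model.bin")
config_path = hf_hub_download(repo_id="jacobcd52/ss_dense", filename="config.json")

# Read the model configuration
with open(config_path) as f:
    config = json.load(f)

# Load the weights on CPU (instantiating the model requires the SparseGPT model class from this repo)
state_dict = torch.load(model_path, map_location="cpu")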