upload SAEs and autointerp

Files changed (8)

ef2/autointerp_layer15_res_matryoshka_k256_ef2.csv ADDED

The diff for this file is too large to render.

ef2/config.yaml ADDED

+sae:
+  type: matryoshka_batch_topk
+  activation_dim: 4096
+  expansion_factor: 2
+  layer_id: 15
+  hookpoint: residual
+  k: 256
+  group_fractions:
+  - 0.5
+  - 0.25
+  - 0.125
+  - 0.0625
+  - 0.0625
+  group_weights: null
+trainer:
+  epochs: 1
+  lr: null
+  l1_penalty: 0.1
+  warmup_steps: 10
+  sparsity_warmup_steps: 0
+  decay_start: null
+  resample_steps: null
+  seed: 42
+  device: cuda:0
+  log_every_n_steps: 1000
+  logger_type: mlflow
+  validate: false
+  auxk_alpha: 0.03125
+  threshold_beta: 0.999
+  threshold_start_step: 1000
+  threshold_dead_features: 100000
+data:
+  dataset_names:
+  - mimic_findings_temporal
+  activations_type: per_token
+  num_workers: 18
+  batch_size: 8192
+  val_samples: 512000
+  train_samples: null
+  norm_act: true
+  norm_to_sqrt_act_dim: false
+  input_unit_norm: false
+  filter_dict: null
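The `activation_dim`, `expansion_factor`, and `group_fractions` fields together determine how many latents the SAE has and how they might be partitioned into nested groups. The following is a minimal sketch, assuming the dictionary size is `activation_dim * expansion_factor` and that each fraction claims a contiguous slice of the dictionary; `matryoshka_group_sizes` is a hypothetical helper, not a function from the training code, which may round or split differently.

```python
# Hypothetical helper; the actual trainer is not shown in this commit.
def matryoshka_group_sizes(activation_dim, expansion_factor, group_fractions):
    """Return (dictionary size, per-group sizes, cumulative prefix sizes)."""
    dict_size = activation_dim * expansion_factor
    # Assumption: each fraction maps to a whole number of latents.
    sizes = [int(f * dict_size) for f in group_fractions]
    # Give any rounding remainder to the last group so sizes sum to dict_size.
    sizes[-1] += dict_size - sum(sizes)
    prefixes = []
    total = 0
    for s in sizes:
        total += s
        prefixes.append(total)
    return dict_size, sizes, prefixes


fractions = [0.5, 0.25, 0.125, 0.0625, 0.0625]
dict_size, sizes, prefixes = matryoshka_group_sizes(4096, 2, fractions)
print(dict_size)  # 8192
print(sizes)      # [4096, 2048, 1024, 512, 512]
print(prefixes)   # [4096, 6144, 7168, 7680, 8192]
```

Under the same assumptions, `expansion_factor: 4` and `expansion_factor: 8` would give dictionaries of 16384 and 32768 latents with the same proportional split.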

ef2/layer15_res_matryoshka_k256_ef2.pt ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:ed20976e0043d310f219473b2f20e98c84a4281a5d418a4ab822501fb79e605e
+size 268487386
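The `size` field of each LFS pointer can be sanity-checked against the config. The sketch below assumes the checkpoint stores float32 parameters consisting of one encoder matrix, one decoder matrix, an encoder bias, and a decoder bias; that layout is an assumption, since the checkpoint contents are not visible in this diff. `estimated_param_bytes` is a hypothetical helper.

```python
BYTES_PER_FLOAT32 = 4


def estimated_param_bytes(activation_dim, expansion_factor):
    """Bytes for encoder + decoder weights and two bias vectors in float32.

    Assumed layout (not confirmed by the config): two weight matrices with
    activation_dim * dict_size entries each, an encoder bias of length
    dict_size, and a decoder bias of length activation_dim.
    """
    dict_size = activation_dim * expansion_factor
    weights = 2 * activation_dim * dict_size
    biases = dict_size + activation_dim
    return (weights + biases) * BYTES_PER_FLOAT32


# Sizes copied from the three Git LFS pointers in this commit.
pointer_sizes = {2: 268487386, 4: 536955610, 8: 1073892058}

for ef, size in pointer_sizes.items():
    est = estimated_param_bytes(4096, ef)
    print(ef, est, size - est)
```

With these assumptions, each pointer size exceeds the estimate by the same 2778 bytes, which would be consistent with a fixed amount of serialization metadata, though it does not prove the assumed layout.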

ef4/autointerp_layer15_res_matryoshka_k256_ef4.csv ADDED

The diff for this file is too large to render.

ef4/config.yaml ADDED

+sae:
+  type: matryoshka_batch_topk
+  activation_dim: 4096
+  expansion_factor: 4
+  layer_id: 15
+  hookpoint: residual
+  k: 256
+  group_fractions:
+  - 0.5
+  - 0.25
+  - 0.125
+  - 0.0625
+  - 0.0625
+  group_weights: null
+trainer:
+  epochs: 1
+  lr: null
+  l1_penalty: 0.1
+  warmup_steps: 10
+  sparsity_warmup_steps: 0
+  decay_start: null
+  resample_steps: null
+  seed: 42
+  device: cuda:0
+  log_every_n_steps: 1000
+  logger_type: mlflow
+  validate: false
+  auxk_alpha: 0.03125
+  threshold_beta: 0.999
+  threshold_start_step: 1000
+  threshold_dead_features: 100000
+data:
+  dataset_names:
+  - mimic_findings_temporal
+  activations_type: per_token
+  num_workers: 18
+  batch_size: 8192
+  val_samples: 512000
+  train_samples: null
+  norm_act: true
+  norm_to_sqrt_act_dim: false
+  input_unit_norm: false
+  filter_dict: null
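The `type: matryoshka_batch_topk` and `k: 256` settings suggest a batch-level top-k activation. As a rough illustration only, the sketch below assumes the selection keeps the `k * batch_size` largest positive pre-activations across the whole batch, so that `k` is an average number of active latents per token and not a per-token cap. `batch_topk` is a hypothetical pure-Python stand-in; the trainer's real implementation is not part of this commit.

```python
def batch_topk(pre_acts, k):
    """Keep the k * batch_size largest positive values across the whole batch.

    pre_acts: list of rows (one per token), each a list of floats.
    Returns a same-shaped list with all other entries set to 0.0.
    """
    batch_size = len(pre_acts)
    budget = k * batch_size
    # Flatten to (value, row, col), keeping only positive pre-activations.
    flat = [(v, r, c)
            for r, row in enumerate(pre_acts)
            for c, v in enumerate(row) if v > 0.0]
    flat.sort(key=lambda t: t[0], reverse=True)
    out = [[0.0] * len(row) for row in pre_acts]
    for v, r, c in flat[:budget]:
        out[r][c] = v
    return out


acts = [[0.9, 0.1, 0.8, 0.0],
        [0.2, 0.3, 0.0, 0.05]]
sparse = batch_topk(acts, k=1)  # budget of 2 active latents for 2 tokens
print(sparse)  # [[0.9, 0.0, 0.8, 0.0], [0.0, 0.0, 0.0, 0.0]]
```

In this toy batch both surviving activations fall on the first token, which shows how a batch-level budget can distribute active latents unevenly across tokens.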

ef4/layer15_res_matryoshka_k256_ef4.pt ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:58bde75f2707637c25bd760007647181875068f0bdffe4752fdeeb71a3034ad5
+size 536955610
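Each `.pt` entry in this commit is a Git LFS pointer (spec version, SHA-256 `oid`, and byte `size`), not the checkpoint itself. A downloaded checkpoint can be checked against its pointer using only the standard library. In the sketch below, `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path is whatever the reader downloaded the checkpoint to.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into (sha256 hex digest, size in bytes)."""
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("+").strip()  # tolerate diff '+' prefixes
        if line:
            key, _, value = line.partition(" ")
            fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return digest, int(fields["size"])


def matches_pointer(path, digest, size, chunk_size=1 << 20):
    """Return True if the file at `path` has the expected size and SHA-256."""
    hasher = hashlib.sha256()
    total = 0
    with open(path, "rb") as fh:
        # Read in chunks so large checkpoints are never fully held in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
            total += len(chunk)
    return total == size and hasher.hexdigest() == digest


pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:58bde75f2707637c25bd760007647181875068f0bdffe4752fdeeb71a3034ad5
size 536955610
"""
digest, size = parse_lfs_pointer(pointer)
print(size)  # 536955610
```

Calling `matches_pointer(local_path, digest, size)` on a local copy of `ef4/layer15_res_matryoshka_k256_ef4.pt` would then confirm that the download is complete and unmodified.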

ef8/config.yaml ADDED

+sae:
+  type: matryoshka_batch_topk
+  activation_dim: 4096
+  expansion_factor: 8
+  layer_id: 15
+  hookpoint: residual
+  k: 256
+  group_fractions:
+  - 0.5
+  - 0.25
+  - 0.125
+  - 0.0625
+  - 0.0625
+  group_weights: null
+trainer:
+  epochs: 1
+  lr: null
+  l1_penalty: 0.1
+  warmup_steps: 10
+  sparsity_warmup_steps: 0
+  decay_start: null
+  resample_steps: null
+  seed: 42
+  device: cuda:0
+  log_every_n_steps: 1000
+  logger_type: mlflow
+  validate: false
+  auxk_alpha: 0.03125
+  threshold_beta: 0.999
+  threshold_start_step: 1000
+  threshold_dead_features: 100000
+data:
+  dataset_names:
+  - mimic_findings_temporal
+  activations_type: per_token
+  num_workers: 18
+  batch_size: 8192
+  val_samples: 512000
+  train_samples: null
+  norm_act: true
+  norm_to_sqrt_act_dim: false
+  input_unit_norm: false
+  filter_dict: null

ef8/layer15_res_matryoshka_k256_ef8.pt ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:81a4f7a3b97b5c31ee1e1b53e791fd50a77cbdfed44e274abbfba3138eed4405
+size 1073892058