Hugging Face
Models
Datasets
Spaces
Community
Docs
Enterprise
Pricing
Log In
Sign Up
Nitral-Archive
/
Pinecone-Rune-12b-Token-Surgery-Chatml-v0.1a
like
3
Follow
Nitral's Archive
32
Safetensors
mistral
Model card
Files
Files and versions
xet
Community
main
Pinecone-Rune-12b-Token-Surgery-Chatml-v0.1a
/
README.md
Nitral-AI
Update README.md
3ee9e75
verified
about 1 month ago
preview
code
|
raw
Copy download link
history
blame
contribute
delete
Safe
852 Bytes
metadata
base_model:
-
Entropicengine/Pinecone-Rune-12b
Original base model
Entropicengine/Pinecone-Rune-12b
Modified base model used for this train:
Nitral-AI/Pinecone-Rune-12b-chatmlified
Only around 750 entries in rank/alpha 32 4bit-qlora at 3e-6 for 2 epochs. bs 4 grad accum 4, for ebs 16 with cosine.
Dataset here:
https://huggingface.co/datasets/Nitral-AI/antirep_sharegpt
Example Notebook using l4/t4:
https://huggingface.co/Nitral-AI/Pinecone-Rune-12b-Token-Surgery-Chatml/tree/main/TokenSurgeon-Example
Boring Training graph.
Starting loss: 1.74 Final loss 0.95