jekunz committed on
Commit f00d369 · verified · 1 Parent(s): a82a756

Create README.md

Files changed (1)
  1. README.md +22 -0
README.md ADDED
@@ -0,0 +1,22 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - HuggingFaceFW/fineweb-2
+ language:
+ - sv
+ base_model:
+ - HuggingFaceTB/SmolLM2-135M-Instruct
+ pipeline_tag: text-generation
+ ---
+ This is a SmolLM2-135M-Instruct model fine-tuned on the Swedish portion of Fineweb-2. It is intended for my research and has not been evaluated more broadly yet.
+
+ Training:
+ - 1 Epoch
+ - Learning rate: 5e-4
+ - LR scheduler: Cosine
+ - Warmup ratio: 0.05
+ - Batch size: 1
+ - 4 A100 (40GB) GPUs
+ - Gradient accumulation steps: 64
+ - Effective batch size: 256
+ - Max. context length: 8192 tokens
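
The numbers in the training list fit together as follows. This is a minimal sketch in plain Python, not the author's training script. It checks that a per-device batch size of 1 on 4 GPUs with 64 gradient accumulation steps gives the stated effective batch size of 256, and it illustrates one common reading of "cosine scheduler with warmup ratio 0.05": linear warmup to the peak learning rate, then cosine decay to zero. The linear warmup shape, the decay to zero, and the `TOTAL_STEPS` value are assumptions. The README does not state them.

```python
import math

# Values taken from the training list above.
PER_DEVICE_BATCH_SIZE = 1
NUM_GPUS = 4
GRAD_ACCUM_STEPS = 64
MAX_CONTEXT_LENGTH = 8192
PEAK_LR = 5e-4
WARMUP_RATIO = 0.05


def effective_batch_size(per_device, num_gpus, grad_accum):
    """Number of sequences that contribute to one optimizer step."""
    return per_device * num_gpus * grad_accum


def cosine_lr(step, total_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then cosine decay to zero (assumed shape)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


batch = effective_batch_size(PER_DEVICE_BATCH_SIZE, NUM_GPUS, GRAD_ACCUM_STEPS)
# Upper bound: reached only if every sequence fills the full context window.
max_tokens_per_step = batch * MAX_CONTEXT_LENGTH

print("effective batch size:", batch)
print("max tokens per optimizer step:", max_tokens_per_step)

# Hypothetical step count, used only to show the shape of the schedule.
TOTAL_STEPS = 1000
for step in (0, 25, 50, 500, 1000):
    print(step, cosine_lr(step, TOTAL_STEPS))
```

With these settings, one optimizer step sees at most 256 × 8192 tokens. The actual count depends on how sequences were packed or padded, which the README does not describe.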