Number of tokens used to train 1B
#19, opened by cz-cb
Hi! The model card says the 1B model is trained with 3T tokens, but the paper says it used 2T tokens. Which one is the correct number of training tokens?