TRI-ML/DCLM-1B



Is this model supported for fine-tuning with Flash Attention?

#4 opened about 1 month ago by

MMLU Performance After Token Training

#3 opened 11 months ago by