---
license: mit
tags:
- unsloth
- gsm8k
---

Fine-tuning experiment details are available at https://github.com/Yeok-c/grpo-gsm8k-demo