AI & ML interests

None defined yet.

Recent Activity

leonardlin 
posted an update about 2 months ago
view post
Post
372
I'm excited to announce the official release of our Shisa V2 405B model:
shisa-ai/shisa-v2-llama3.1-405b

It's the strongest model ever trained in Japan, and even goes toe-to-toe w/ GPT-4o and DeepSeek-V3 in JA MT-Bench.

For all the details, be sure to check out post and overview report here: https://shisa.ai/posts/shisa-v2-405b/
leonardlin 
posted an update about 2 months ago
view post
Post
2538
BTW, in case anyone wants to kick the tires, test their 日本語, I have our Shisa V2 405B model up and running temporarily: https://chat.shisa.ai/
  • 3 replies
·
leonardlin 
posted an update 3 months ago
view post
Post
2674
Happy to announce the release of Shisa V2, our latest generation of our bilingual Japanese-English language models. After hundreds of ablations and months of work, we're releasing some of the strongest open Japanese models at 7B, 8B, 12B, 14B, 32B and 70B! Full announcement here https://shisa.ai/posts/shisa-v2/ or visit the Shisa V2 HF collection: shisa-ai/shisa-v2-67fc98ecaf940ad6c49f5689
leonardlin 
posted an update about 1 year ago
view post
Post
2099
My weekened project ended up being doing some testing between torchtune, axolotl, and unsloth. I *think* it's a 1:1 comparison of what LoRA fine-tuning performance looks like between the different hardware I have in my dev boxes (4090, 3090, 7900 XTX, W7900) with a few other interesting tidbits.

Tonight I wrote up a WandB report (the panel editor is super broken in Firefox 😔) that sums up some of the more interesting bits from the results: https://wandb.ai/augmxnt/train-bench/reports/torchtune-vs-axolotl-vs-unsloth-Trainer-Comparison--Vmlldzo4MzU3NTAx
  • 1 reply
·