Ki No Kokoro AI Collective (機の心AI集団)

community

kinokokoro

Activity Feed

AI & ML interests

JA LLM & Data

Recent Activity

ptrdvn authored a paper about 1 month ago

ALoFTRAG: Automatic Local Fine Tuning for Retrieval Augmented Generation

leonardlin updated a model 8 months ago

kinokokoro/cyberagent-mistral-nemo-webnovels

leonardlin updated a model 8 months ago

kinokokoro/cyberagent-mistral-nemo-webnovels-cp2460

View all activity

leonardlin

posted an update 30 days ago

Post

359

I'm excited to announce the official release of our Shisa V2 405B model:
shisa-ai/shisa-v2-llama3.1-405b

It's the strongest model ever trained in Japan, and even goes toe-to-toe w/ GPT-4o and DeepSeek-V3 in JA MT-Bench.

For all the details, be sure to check out post and overview report here: https://shisa.ai/posts/shisa-v2-405b/

ptrdvn

authored a paper about 1 month ago

ALoFTRAG: Automatic Local Fine Tuning for Retrieval Augmented Generation

Paper • 2501.11929 • Published Jan 21 • 1

leonardlin

posted an update about 1 month ago

Post

2532

BTW, in case anyone wants to kick the tires, test their 日本語, I have our Shisa V2 405B model up and running temporarily: https://chat.shisa.ai/

3 replies

leonardlin

posted an update 3 months ago

Post

2670

Happy to announce the release of Shisa V2, our latest generation of our bilingual Japanese-English language models. After hundreds of ablations and months of work, we're releasing some of the strongest open Japanese models at 7B, 8B, 12B, 14B, 32B and 70B! Full announcement here https://shisa.ai/posts/shisa-v2/ or visit the Shisa V2 HF collection: shisa-ai/shisa-v2-67fc98ecaf940ad6c49f5689

leonardlin

updated 2 models 8 months ago

kinokokoro/cyberagent-mistral-nemo-webnovels

Text Generation • 12B • Updated Oct 31, 2024 • 18 • 1

kinokokoro/cyberagent-mistral-nemo-webnovels-cp2460

12B • Updated Oct 30, 2024 • 11

ptrdvn

authored 2 papers 12 months ago

Are You Sure? Rank Them Again: Repeated Ranking For Better Preference Datasets

Paper • 2405.18952 • Published May 29, 2024 • 10

Tagengo: A Multilingual Chat Dataset

Paper • 2405.12612 • Published May 21, 2024 • 3

leonardlin

posted an update about 1 year ago

Post

2097

My weekened project ended up being doing some testing between torchtune, axolotl, and unsloth. I *think* it's a 1:1 comparison of what LoRA fine-tuning performance looks like between the different hardware I have in my dev boxes (4090, 3090, 7900 XTX, W7900) with a few other interesting tidbits.

Tonight I wrote up a WandB report (the panel editor is super broken in Firefox 😔) that sums up some of the more interesting bits from the results: https://wandb.ai/augmxnt/train-bench/reports/torchtune-vs-axolotl-vs-unsloth-Trainer-Comparison--Vmlldzo4MzU3NTAx

1 reply

leonardlin

posted an update about 1 year ago

Post

2512

Maybe of interest, I just finished a long writeup of my weekend project exploring Qwen 2 7B Instruct's Chinese censorship: https://huggingface.co/blog/leonardlin/chinese-llm-censorship-analysis

I also have an accompanying model and dataset (and codebase) for those curious to poke around:

* augmxnt/Qwen2-7B-Instruct-deccp

* augmxnt/deccp

leonardlin

posted an update about 1 year ago

Post

1954

Interesting, I've just seen the my first HF spam on one of my new model uploads: shisa-ai/shisa-v1-llama3-70b - someone has an SEO spam page as a HF space attached to the model!?! Wild. Who do I report this to?

4 replies

leonardlin

posted an update about 1 year ago

Post

1622

For those with an interest in JA language models, this Llama 3 70B test ablation looks like it is the current strongest publicly released, commercially usable, open model available. A lot of caveats I know, but it also matches gpt-3.5-turbo-0125's JA performance, which is worth noting, and is tuned *exclusively* with the old shisa-v1 dataset (so it's chart position will be very short lived).

shisa-ai/shisa-v1-llama3-70b

augmxnt/ultra-orca-boros-en-ja-v1

2 replies

leonardlin

posted an update about 1 year ago

Post

1958

With slurm figured out and ablations humming along, I though I'd update and post my understanding of the legal status of training data in Japan. It is in general, much clearer in the US: https://huggingface.co/blog/leonardlin/ai-training-data-in-japan

NekoMikoReimu

updated a dataset about 1 year ago

kinokokoro/sharegpt_filtered

Viewer • Updated May 19, 2024 • 946 • 11

leonardlin

posted an update about 1 year ago

Post

1383

llm-jp-eval is currently one of the most widely used benchmarks for Japanese LLMs and is half of WandB's comprehensive Nejumi LLM Leaderboard scoring. I was seeing some weirdness in results I was getting and ended up in a bit of a rabbit hole. Here's my article on evaling llm-jp-eval: https://huggingface.co/blog/leonardlin/llm-jp-eval-eval

I've setup a fork of Lightblue's Shaberi testing framework which uses LLM-as-a-Judge style benchmarks as something probably more representative of real world LLM strength in Japanese. Here's how the new base model ablations are looking:

leonardlin

posted an update about 1 year ago

Post

1262

I've been doing some evals and tuning, and this chat template repo maintained by @chujiezheng is great: https://github.com/chujiezheng/chat_templates

Here's also a simple script for checking what the output looks like:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("augmxnt/shisa-7b-v1")
messages = [
    {'role': 'user', 'content': 'This is the first user input.'},
    {'role': 'assistant', 'content': 'This is the first assistant response.'},
    {'role': 'user', 'content': 'This is the second user input.'},
]

print()
print('Chat Template:')
print(tokenizer.chat_template)
print()
print('---')
print()

print(tokenizer.apply_chat_template(messages, tokenize=False))

leonardlin

updated a dataset about 1 year ago

kinokokoro/ichikara-instruction-003

Preview • Updated May 7, 2024 • 14

NekoMikoReimu

updated a model over 1 year ago

kinokokoro/karasu-7B-jinseisoudan-5e-5

Text Generation • 8B • Updated Mar 16, 2024 • 12

leonardlin

updated a model over 1 year ago

kinokokoro/karasu-7B-chat-plus-unleashed-jinseisoudan-2e-4

Text Generation • 8B • Updated Mar 16, 2024 • 14

AI & ML interests

Recent Activity

Team members 3

kinokokoro's activity