Yamata Zen
yamatazen
Recent Activity
Updated a collection about 7 hours ago: Grokking
Liked a model 2 days ago: bartowski/mistralai_Mistral-Small-3.2-24B-Instruct-2506-GGUF
Liked a model 2 days ago: mistralai/Mistral-Small-3.2-24B-Instruct-2506
Japanese LLMs
- mradermacher/Himeyuri-v0.1-12B-i1-GGUF
  12B • Updated • 106 • 2
- spow12/ChatWaifu_12B_v2.0
  Text Generation • 12B • Updated • 63 • 21
- Local-Novel-LLM-project/Vecteus-v1
  Text Generation • 7B • Updated • 144 • 27
- Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs
  Paper • 2412.14471 • Published
LLM leaderboards
- UGI Leaderboard
  Running • 918
  📢 Uncensored General Intelligence Leaderboard
- Open Japanese LLM Leaderboard
  Running on CPU Upgrade • 85
  🌸 Explore and compare LLM models through interactive leaderboards and submissions
- Open LLM Leaderboard
  Running on CPU Upgrade • 13.2k
  🏆 Track, rank and evaluate open LLMs and chatbots
- Chatbot Arena Leaderboard
  Running • 4.49k
  🏆 Display chatbot leaderboard and stats
Multilingual LLMs
- Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability
  Paper • 2306.06688 • Published
- Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs
  Paper • 2412.14471 • Published
- Language Models' Factuality Depends on the Language of Inquiry
  Paper • 2502.17955 • Published • 34
- Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models
  Paper • 2410.01335 • Published • 5
LLM censorship
- GuardReasoner: Towards Reasoning-based LLM Safeguards
  Paper • 2501.18492 • Published • 88
- Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging
  Paper • 2412.19512 • Published • 8
- Course-Correction: Safety Alignment Using Synthetic Preferences
  Paper • 2407.16637 • Published • 27
- Refusal in Language Models Is Mediated by a Single Direction
  Paper • 2406.11717 • Published • 3
Grokking
- Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization
  Paper • 2405.15071 • Published • 42
- Grokking at the Edge of Numerical Stability
  Paper • 2501.04697 • Published • 2
- Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test
  Paper • 2506.21551 • Published • 19
LLM merging
- Arcee's MergeKit: A Toolkit for Merging Large Language Models
  Paper • 2403.13257 • Published • 20
- Model Stock: All we need is just a few fine-tuned models
  Paper • 2403.19522 • Published • 12
- Mergenetic: a Simple Evolutionary Model Merging Library
  Paper • 2505.11427 • Published • 12
- Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models
  Paper • 2410.01335 • Published • 5