- Article: "KV Caching Explained: Optimizing Transformer Inference Efficiency" by not-lain (Jan 30)
- Collection: "Open LLM Leaderboard best models" — a daily updated list of the best-scoring models on the Open LLM Leaderboard (65 items, updated Mar 20)
- Paper: "Lost in the Middle: How Language Models Use Long Contexts" (arXiv:2307.03172, published Jul 6, 2023)