Kristaller486

kristaller486 · krist486

AI & ML interests

NLP, Machine Translation

Recent Activity

reacted to nyuuzyou's post with 👍 4 days ago

🎰 Casino Benchmark: Dataset + Space
https://huggingface.co/datasets/nyuuzyou/casino-benchmark
https://huggingface.co/spaces/nyuuzyou/casino-benchmark

14 models faced 1,400 simulations of heads-up Blackjack and European Roulette. Shared seeds locked identical cards and spins for each.

Key Stats:
- 14 models benchmarked
- 59,483 rows
- 35 MB compressed Parquet
- 35,000 scored decisions
- Full prompts, JSON responses, reasoning traces, latency
- Bankroll tracking from $1,000 start per run

Live leaderboard tracks bets, hits, stands, and risk management. Gemini 3 Flash leads at +$3,396. Claude 4.5 Haiku at -$7,788. Traces in the dataset. Leaderboard in the space.

new activity 7 days ago

kristaller486/dots.ocr-1.5: Model removed from HF

updated a collection 7 days ago

Russian TTS datasets

upvoted a paper about 1 month ago

VIBE: Visual Instruction Based Editor

Paper • 2601.02242 • Published Jan 5 • 63

upvoted a paper about 2 months ago

Gamayun's Path to Multilingual Mastery: Cost-Efficient Training of a 1.5B-Parameter LLM

Paper • 2512.21580 • Published Dec 25, 2025 • 8

upvoted 3 collections 2 months ago

T-lite-2.1

4 items • Updated Dec 23, 2025 • 3

T-pro-2.1

3 items • Updated Dec 23, 2025 • 5

Kandinsky 5.0 Video Pro Diffusers

Kandinsky 5.0 Video Pro is a 19B model that generates high-quality HD videos from English and Russian prompts with controllable camera motion. • 4 items • Updated Dec 14, 2025 • 12

upvoted a collection 3 months ago

NeMo Gym

Collection of RL verifiable data for NeMo Gym • 13 items • Updated 3 days ago • 40

upvoted a paper 3 months ago

T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground

Paper • 2512.10430 • Published Dec 11, 2025 • 116

upvoted a changelog 3 months ago

Featured Spaces are now easier to spot

Changelog • Nov 25, 2025 • 67

upvoted a paper 4 months ago

When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA

Paper • 2510.04849 • Published Oct 6, 2025 • 115

upvoted a collection 5 months ago

Nanonets-OCR2

2 items • Updated Oct 13, 2025 • 25

upvoted a collection 6 months ago

DeepSeek-V3.1

4 items • Updated Nov 27, 2025 • 261

upvoted 3 papers 7 months ago

Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning

Paper • 2508.09726 • Published Aug 13, 2025 • 15

SONAR-LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens

Paper • 2508.05305 • Published Aug 7, 2025 • 47

Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

Paper • 2507.22448 • Published Jul 30, 2025 • 70

upvoted a collection 7 months ago

T-pro-2.0

Hybrid reasoning model based on Qwen3 32B • 14 items • Updated Dec 20, 2025 • 29

upvoted a collection 8 months ago

Skywork-Reward-V2

Scaling preference data curation to the extreme • 9 items • Updated Jul 4, 2025 • 26

upvoted 3 papers 9 months ago

Geopolitical biases in LLMs: what are the "good" and the "bad" countries according to contemporary language models

Paper • 2506.06751 • Published Jun 7, 2025 • 71

Exploring the Latent Capacity of LLMs for One-Step Text Generation

Paper • 2505.21189 • Published May 27, 2025 • 61

Quartet: Native FP4 Training Can Be Optimal for Large Language Models

Paper • 2505.14669 • Published May 20, 2025 • 78

upvoted a collection 9 months ago

Falcon-H1

Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 39 items • Updated Jan 9 • 59