All HF Hub posts

dealermatt72
posted an update about 22 hours ago
Hey Hugging Face community 👋

My name is M. I'm a solo founder and self-taught developer based in Houston, TX. I build AI-powered apps: an iOS app called DeFilter currently in App Store review, a security scanning platform called Sentinel, and a job marketplace called HireHuman.fyi for connecting humans with companies that prefer non-AI workers.

I'm also a poker dealer by night, which means I think a lot about reading situations in real time, and that's exactly what sparked this idea.

I'm not the most technical person in the room. But I have a vision, I have drive, and I believe the best projects get built when people with different skills come together around a shared idea.

That's why I'm posting here. I want to build this with the community.

- M (@dealermatt)

anakin87
posted an update about 23 hours ago
How does LLM training with RL environments work?

It all starts with **Reinforcement Learning with Verifiable Rewards**
- question asked
- model generates reasoning + answer
- answer checked against ground truth
- reward drives RL training


In this setup, the environment is simple: fixed questions and answers, rollout logic, and reward(s).

Consider a more complex tic-tac-toe env ❌⭕
It adds:
- dynamic game generation/handling
- tunable opponent skill
- multi-turn interactions

(envs can also include tools)

---

What happens at training?

We use **Group Relative Policy Optimization (GRPO)** with a tic-tac-toe env

No critic model is needed: the group itself is the baseline, which makes it simpler than PPO.

1๏ธโƒฃ Rollout generation: from the same board, model plays N games via sampling
2๏ธโƒฃ Each game scored with deterministic rewards (win, format, ...)
3๏ธโƒฃ Mean score computed across the group
4๏ธโƒฃ Each rollout's advantage = its score minus the group mean
5๏ธโƒฃ Model updated to favor trajectories above baseline

๐Ÿ” Repeat


For a deep dive, check out
🌱 https://github.com/anakin87/llm-rl-environments-lil-course
a free hands-on course on RL environments for LLMs
ajibawa-2023
posted an update 3 days ago
Go-Code-Large
Dataset: ajibawa-2023/Go-Code-Large

Go-Code-Large is a large-scale corpus of Go (Golang) programming language source code, comprising 316,427 code samples stored in .jsonl format. The dataset is designed to support research and development in large language model (LLM) pretraining, static analysis, cloud-native systems, and modern backend software engineering.

By offering a focused and curated dataset for Go, this corpus enables experimentation in concurrent programming, distributed systems, and performance-oriented backend services, domains where Go is widely adopted.

Go-Code-Large addresses the relative scarcity of large, language-specific datasets for Go, enabling targeted research into idiomatic Go patterns, concurrency primitives, and scalable system design.
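Since the samples are stored in .jsonl format (one JSON object per line), a minimal loading sketch looks like the following. The `text` field name and the sample records are assumptions for illustration, not taken from the dataset card:

```python
import json

def read_jsonl(lines):
    """Parse a .jsonl corpus: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]

# Hypothetical sample lines; real records come from the dataset files.
raw = [
    '{"text": "package main\\n\\nfunc main() {}"}',
    '{"text": "package util\\n\\nfunc Add(a, b int) int { return a + b }"}',
]
samples = read_jsonl(raw)
print(len(samples))  # 2
```

In practice you would stream the dataset's .jsonl files (or use a dataset-loading library) rather than hold 316K samples in memory at once.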
ajibawa-2023
posted an update 2 days ago
Ruby-Code-Large
Dataset: ajibawa-2023/Ruby-Code-Large

Ruby-Code-Large is a large-scale corpus of Ruby programming language source code comprising 331,743 code samples stored in .jsonl format. The dataset is designed to support research and development in large language model (LLM) pretraining, static analysis, web application development, and software engineering automation within the Ruby ecosystem.

By offering a substantial, language-focused dataset, Ruby-Code-Large enables targeted experimentation in dynamic programming, object-oriented design, and rapid application development, areas where Ruby is widely used, particularly in web frameworks and scripting.

Ruby-Code-Large addresses the lack of large, curated, Ruby-specific datasets, enabling focused research on expressive syntax, metaprogramming, and high-level abstractions.
mrmanna
posted an update 3 days ago
**AI & STATE MACHINE**
*Why Production Begins Where Toy Agents End*
Published: 18 Apr 2026 | Towards AI Publication | Medium
Open Link: https://medium.com/towards-artificial-intelligence/ai-state-machine-106387406c5a?sk=047b2f064c673a0095a9e8cc011b6a92


We talk a lot about governance, accuracy, and auditability in AI agents.
But I keep seeing a gap between the words and the engineering behind them.
Many agents have tools, orchestration, memory, graphs, and impressive demos. But when you ask how governance is actually enforced, the answer is often weak.
Prompt-level control is not production governance.
A production agent needs explicit state design: legal transitions, controlled progression, recovery paths, approval boundaries, and separation between memory, decision, policy, and execution.
This article explores the silent crisis unfolding in modern AI development: the urgent need to resurrect the disciplined architecture of state machines.
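As a sketch of what "legal transitions", "approval boundaries", and "recovery paths" can look like when enforced in code rather than in a prompt (the state names here are hypothetical, not from the article):

```python
from enum import Enum, auto

class State(Enum):
    DRAFT = auto()
    PENDING_APPROVAL = auto()
    EXECUTING = auto()
    DONE = auto()
    FAILED = auto()

# Legal transitions only: anything not listed here is rejected at runtime,
# which is the enforcement layer that prompt-level control cannot provide.
TRANSITIONS = {
    State.DRAFT: {State.PENDING_APPROVAL},
    State.PENDING_APPROVAL: {State.EXECUTING, State.FAILED},  # approval boundary
    State.EXECUTING: {State.DONE, State.FAILED},
    State.FAILED: {State.DRAFT},  # recovery path
}

class AgentStateMachine:
    def __init__(self):
        self.state = State.DRAFT

    def transition(self, target: State) -> State:
        if target not in TRANSITIONS.get(self.state, set()):
            raise ValueError(f"illegal transition: {self.state} -> {target}")
        self.state = target
        return self.state

sm = AgentStateMachine()
sm.transition(State.PENDING_APPROVAL)  # must pass the approval boundary
sm.transition(State.EXECUTING)
print(sm.state)  # State.EXECUTING
```

The point of the explicit table is auditability: every reachable state and every allowed progression is declared in one place, instead of being implicit in an LLM's behavior.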
prithivMLmods
posted an update 3 days ago
HY-World-2.0, a Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds, is now available on Spaces, and it works both as native Gradio components and in Gradio server mode.

> HY-World-2.0-Demo: prithivMLmods/HY-World-2.0-Demo
> HY-World-2.0 [Server Mode]: prithivMLmods/HY-World-2.0-Demo
> Featuring 3D reconstruction and Gaussian splats with the Rerun viewer, along with camera poses, depth maps, and surface normals.
> In Server Mode, Gradio is served via FastAPI, with FastAPI remaining the top-level server.
> Model: tencent/HY-World-2.0
> GitHub: https://github.com/PRITHIVSAKTHIUR/HY-World-2.0-Demo

🤗 To learn more, visit the app page or the respective model pages.
consome2
posted an update 1 day ago
Built a small site for tracking speech-to-speech, full-duplex, and audio foundation model work.
It covers models, benchmarks, datasets, and some blog posts to organize the landscape in one place.

Still early, but sharing in case it is useful:
https://www.fullduplex.ai/

If you spot missing entries or mistakes, I would really appreciate corrections.
sequelbox
posted an update 1 day ago
NEW RELEASE: Esper 3.1 for Qwen 3.6!

- Your dedicated DevOps expert: Esper 3.1 maximizes DevOps and architecture helpfulness, powered by high-difficulty DevOps and architecture data generated with DeepSeek-V3.1-Terminus!
- Improved coding performance: challenging code-reasoning datasets stretch DeepSeek-V3.1-Terminus and DeepSeek-V3.2 to the limits, allowing Esper 3.1 to tackle harder coding tasks!
- AI to build AI: our high-difficulty AI expertise data boosts Esper 3.1's MLOps, AI architecture, AI research, and general reasoning skills.

Get it now: ValiantLabs/Qwen3.6-35B-A3B-Esper3.1

We're working on more finetunes for the newest Qwen and Gemma models, and we've also started working on the agentic-first datasets for Esper 4 :) we're going to make open source better and better for your work!

Please note that real life financial and family concerns have popped up and have imposed unfortunate limitations on our ability to devote time to our open-source work :( If you would like to see Esper 4 and our other releases speed up instead of slowing down, this is the best way you can help us: sequelbox/SupportOpenSource

No matter what, we'll keep fighting and we won't give up!

with love,
allegra
aufklarer
posted an update 2 days ago
After running extensive benchmarks across ASR, TTS, and VAD on Apple Silicon, we found some results that weren't documented anywhere.

The most counterintuitive: INT8 runs 3.3x faster than INT4 on the Neural Engine. A 332 MB CoreML model allocates 1,677 MB at runtime. And the right architecture uses both MLX and CoreML simultaneously, not one or the other.

MLX talks to the GPU: programmable, fast for large transformer inference. CoreML talks to the Neural Engine: fixed-function silicon, 135x real-time for small feedforward models like VAD, near-zero power draw.

All benchmarks are from speech-swift, our open-source Swift library for on-device speech AI: ASR, TTS, VAD, diarization, speech-to-speech, everything running locally on Apple Silicon with no API, no cloud, no data leaving the device.

Models on HF: aufklarer/Qwen3-ASR-0.6B-MLX-4bit · aufklarer/parakeet-tdt-0.6b-coreml-int8 · aufklarer/PersonaPlex-7B-MLX-4bit

Full article: https://blog.ivan.digital
Library: https://github.com/soniqo/speech-swift
omarkamali
posted an update 2 days ago
Just sharing a little breakthrough with Gherbal LID: we managed to distinguish the 15 variants of Arabic, with 6 variants above 90% accuracy and 10 above 85%, practically distinguishing Moroccan and Algerian (which overlap massively).

It also embraces the duality of MSA and Arabic variants pioneered in ALDi by @AMR-KELEG et al.

Now we're only bottlenecked by the availability of high quality data for the low scoring variants such as Iraqi, Libyan, Sudanese, Adeni ...

More on Gherbal at:
https://omneitylabs.com/models/gherbal
