
hanzlajavaid

hanzla

AI & ML interests

Direct Preference Optimization, Supervised Finetuning, Stable Diffusion

Organizations

ZeroGPU Explorers, Journalists on Hugging Face, MLX Community, ModularityAI, Social Post Explorers

hanzla's activity

posted an update about 2 months ago
Hi community,

A few days back, I posted about my ongoing research on building reasoning Mamba models, and the community shared great insights.

Today, I am announcing an update to the model weights. With the newer checkpoint, the Falcon3 Mamba R1 model now outperforms much larger transformer-based LLMs (including Gemini) on the Formal Logic questions of MMLU, scoring 60% on what is considered one of the tougher subsets of the benchmark.

I would highly appreciate your insights and suggestions on this new checkpoint.

Model Repo: hanzla/Falcon3-Mamba-R1-v0

Chat space: hanzla/Falcon3MambaReasoner
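
For anyone who wants to poke at the new checkpoint locally rather than through the Space, here is a minimal inference sketch. It assumes the repo loads through the standard transformers causal-LM interface (like its Falcon3 Mamba base) and that the tokenizer ships a chat template; the prompt is just an illustrative formal-logic-style question, not one taken from MMLU.

```python
# Minimal local inference sketch for the checkpoint above. Assumes the repo
# loads through the standard transformers causal-LM interface (like its
# Falcon3 Mamba base) and ships a chat template with its tokenizer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hanzla/Falcon3-Mamba-R1-v0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Illustrative formal-logic style question, not one from MMLU.
messages = [{"role": "user", "content": "If all A are B and no B are C, can any A be a C? Explain."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```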
replied to Jaward's post about 2 months ago
reacted to Jaward's post with 🔥 about 2 months ago
replied to their post about 2 months ago
reacted to clem's post with 🤗 about 2 months ago
We just crossed 1,500,000 public models on Hugging Face (and 500k spaces, 330k datasets, 50k papers). One new repository is created every 15 seconds. Congratulations all!
reacted to reddgr's post with 👍 about 2 months ago
reacted to AtAndDev's post with 🔥 about 2 months ago
There seem to be multiple paid apps shared here that are based on models on HF, but some people sell their wrappers as "products" and promote them here. For a long time, HF was the best and only platform for open-source model work, but with the recent AI website builders anyone can create a product (really crappy ones, btw) and try to sell it with no contribution back to open source. Please don't do this, or at least try finetuning the models you use...
Sorry for filling y'all's feed with this, but you know...
reacted to fdaudens's post with 👍 about 2 months ago
Want to build useful newsroom tools with AI? We’re launching a Hugging Face x Journalism Slack channel where journalists turn AI concepts into real newsroom solutions.

Inside the community:
✅ Build open-source AI tools for journalism
✅ Get direct help from the community
✅ Stay updated on new models and datasets
✅ Learn from other journalists’ experiments and builds

The goal? Go from "I read about AI" to "I built an AI tool that supercharged my newsroom," with no more learning in isolation.

Join us! https://join.slack.com/t/journalistson-tnd8294/shared_invite/zt-30vsmhk4w-dZpeMOoxdhCvfNsqtspPUQ (Please make sure to use a clear identity—no teddybear85, for example 😉)

(If you know people who might be interested, tag them below! The more minds we bring in, the better the tools we build.)

reacted to mrfakename's post with 🚀 about 2 months ago
reacted to their post with 👍 2 months ago
posted an update 2 months ago
Hello community,

I want to share my work on creating a reasoning Mamba model.

I used GRPO over Falcon3 Mamba Instruct to make this model. It generates blazing-fast responses while building sound logic to answer challenging questions.

Give it a try:

Model repo: hanzla/Falcon3-Mamba-R1-v0

Space: hanzla/Falcon3MambaReasoner

Looking forward to community feedback.
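
For anyone curious what "GRPO over Falcon3 Mamba Instruct" can look like in code, here is a minimal sketch using TRL's GRPOTrainer. The dataset, reward function, and hyperparameters below are illustrative placeholders, not the actual recipe behind Falcon3-Mamba-R1-v0.

```python
# Minimal GRPO fine-tuning sketch with TRL's GRPOTrainer. The dataset, reward
# function, and hyperparameters are placeholders, not the actual recipe used
# for Falcon3-Mamba-R1-v0.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

def reward_len(completions, **kwargs):
    # Toy reward: mildly prefer longer completions (a stand-in for a real
    # reasoning/correctness reward).
    return [min(len(c) / 1000.0, 1.0) for c in completions]

# Any dataset with a "prompt" column works; this one comes from the TRL examples.
dataset = load_dataset("trl-lib/tldr", split="train")

training_args = GRPOConfig(output_dir="falcon3-mamba-grpo", per_device_train_batch_size=2)
trainer = GRPOTrainer(
    model="tiiuae/Falcon3-Mamba-7B-Instruct",  # the base model named in the post
    reward_funcs=reward_len,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```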
reacted to AtAndDev's post with 🔥 2 months ago
Gemma 3 seems to be really good at human preference. Just waiting for ppl to see it.
posted an update 2 months ago
Gemma 3 is a game changer for on-device multimodal applications.

Try for yourself how good a 4-billion-parameter model can be.

Space: hanzla/PlaygroundGemma3
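
The Space's exact code isn't shown here, but as a rough local equivalent, this is a minimal sketch of running the public google/gemma-3-4b-it checkpoint with the transformers image-text-to-text pipeline (requires a transformers release with Gemma 3 support); the image URL is just a placeholder example.

```python
# Minimal on-device multimodal sketch with the public Gemma 3 4B instruct
# checkpoint; a generic transformers example, not the Space's actual code.
import torch
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",
    model="google/gemma-3-4b-it",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

messages = [
    {
        "role": "user",
        "content": [
            # Placeholder image from the transformers documentation assets.
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg"},
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }
]

output = pipe(text=messages, max_new_tokens=128)
print(output[0]["generated_text"][-1]["content"])
```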
posted an update about 1 year ago
reacted to mrm8488's post with 🚀 about 1 year ago
Working on a concept GPT-2 (small) that uses KANs instead of MLPs.
The checkpoint and training code will be on the Hub soon.
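
For a sense of what swapping KANs for MLPs in a GPT-2 block can mean, here is a simplified, self-contained sketch: a KAN-style layer where each input-output edge learns a univariate function from fixed RBF bases, dropped in for the usual fc -> GELU -> proj MLP. This is an illustrative simplification, not the author's implementation.

```python
# Simplified KAN-style layer as a stand-in for GPT-2's MLP block. An
# illustrative sketch (RBF bases instead of true B-splines), not the
# implementation mentioned in the post above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleKANLayer(nn.Module):
    """Each input-output edge applies a learnable univariate function built
    from a fixed grid of Gaussian (RBF) bases, plus a SiLU base path."""
    def __init__(self, in_features, out_features, num_bases=8, grid=(-2.0, 2.0)):
        super().__init__()
        self.register_buffer("centers", torch.linspace(grid[0], grid[1], num_bases))
        self.width = (grid[1] - grid[0]) / (num_bases - 1)
        # One coefficient per (output, input, basis): the learnable edge functions.
        self.coeffs = nn.Parameter(0.1 * torch.randn(out_features, in_features, num_bases))
        self.base = nn.Linear(in_features, out_features)

    def forward(self, x):  # x: (..., in_features)
        # Evaluate the RBF bases for every scalar input: (..., in_features, num_bases)
        phi = torch.exp(-(((x.unsqueeze(-1) - self.centers) / self.width) ** 2))
        # Sum the learned edge functions over inputs and bases: (..., out_features)
        spline = torch.einsum("...ib,oib->...o", phi, self.coeffs)
        return self.base(F.silu(x)) + spline

class KANMLP(nn.Module):
    """Drop-in replacement for GPT-2's MLP (c_fc -> GELU -> c_proj)."""
    def __init__(self, d_model=768, d_hidden=3072):
        super().__init__()
        self.fc = SimpleKANLayer(d_model, d_hidden)
        self.proj = SimpleKANLayer(d_hidden, d_model)

    def forward(self, x):
        return self.proj(self.fc(x))

# Quick shape check with GPT-2 small dimensions.
print(KANMLP()(torch.randn(1, 16, 768)).shape)  # torch.Size([1, 16, 768])
```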