
Kenneth Hamilton PRO

ZennyKenny

AI & ML interests

Building and enablement @ montebello.ai. Certified vibe coder.

Recent Activity

Organizations

scikit-learn, TorchGeo, Kornia AI, Blog-explorers, OpenLLM France, Team Tonic, ZeroGPU Explorers, Data is Better Together - Russian Language Team, The Nevsky Collective, Plan Communications, MLX Community, Social Post Explorers, Hugging Face Discord Community, Data Is Better Together Contributor, Reasoning Datasets Competition

ZennyKenny's activity

reacted to ProCreations's post with 🚀 6 days ago

Eyyyy 50 followers 🤯
reacted to hesamation's post with 🚀 17 days ago
reacted to jeffboudier's post with 🚀 19 days ago

Transcribing 1 hour of audio for less than $0.01 🤯

@mfuntowicz cooked with 8x faster Whisper speech recognition - whisper-large-v3-turbo transcribes at 100x real time on a $0.80/hr L4 GPU!

How they did it: https://huggingface.co/blog/fast-whisper-endpoints

1-click deploy with HF Inference Endpoints: https://endpoints.huggingface.co/new?repository=openai%2Fwhisper-large-v3-turbo&vendor=aws&region=us-east&accelerator=gpu&instance_id=aws-us-east-1-nvidia-l4-x1&task=automatic-speech-recognition&no_suggested_compute=true
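The sub-cent price tag follows directly from the two figures quoted in the post (100x real time, $0.80/hr for an L4); a quick back-of-the-envelope sketch:

```python
# Sanity-check the "< $0.01 per hour of audio" claim using the
# numbers quoted above: 100x real-time transcription on a $0.80/hr L4.
audio_seconds = 3600            # 1 hour of audio
speedup = 100                   # 100x real time
gpu_rate_per_hour = 0.80        # quoted L4 hourly price

gpu_seconds = audio_seconds / speedup          # 36 s of GPU time
cost = gpu_seconds / 3600 * gpu_rate_per_hour  # fraction of an hour * rate

print(f"${cost:.4f}")  # prints $0.0080 - well under a cent
```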
reacted to AdinaY's post with 🔥 20 days ago
replied to as-cle-bert's post 22 days ago
Whoa. Reliable open-source crawling software is a big win. I'll take it for a spin, but I'm optimistic: this is the kind of thing I (and every other AI builder) have been building for years to avoid paying FireCrawl.

reacted to onekq's post with 🔥 23 days ago

This time Gemini was very quick with API support for its 2.5 Pro May release. The performance is impressive too; it is now among top contenders like o4, R1, and Claude.

onekq-ai/WebApp1K-models-leaderboard
reacted to wolfram's post with 🔥 24 days ago

Finally finished my extensive **Qwen 3 evaluations** across a range of formats and quantisations, focusing on **MMLU-Pro** (Computer Science).

A few take-aways stood out - especially for those interested in local deployment and performance trade-offs:

1️⃣ **Qwen3-235B-A22B** (via Fireworks API) tops the table at **83.66%** with ~55 tok/s.
2️⃣ But the **30B-A3B Unsloth** quant delivered **82.20%** while running locally at ~45 tok/s and with zero API spend.
3️⃣ The same Unsloth build is ~5x faster than Qwen's **Qwen3-32B**, which scores **82.20%** as well yet crawls at <10 tok/s.
4️⃣ On Apple silicon, the **30B MLX** port hits **79.51%** while sustaining ~64 tok/s - arguably today's best speed/quality trade-off for Mac setups.
5️⃣ The **0.6B** micro-model races above 180 tok/s but tops out at **37.56%** - that's why it's not even on the graph (50% performance cut-off).

All local runs were done with LM Studio on an M4 MacBook Pro, using Qwen's official recommended settings.

**Conclusion:** Quantised 30B models now get you ~98% of frontier-class accuracy - at a fraction of the latency, cost, and energy. For most local RAG or agent workloads, they're not just good enough - they're the new default.

Well done, Qwen - you really whipped the llama's ass! And to OpenAI: for your upcoming open model, please make it MoE, with toggleable reasoning, and release it in many sizes. *This* is the future!
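The ~98% figure in the conclusion checks out against the scores quoted earlier in the post (82.20% for the local 30B-A3B quant vs 83.66% for the 235B model via API):

```python
# Ratio of the local 30B-A3B quant's MMLU-Pro CS score to the
# frontier Qwen3-235B-A22B score, both taken from the post above.
local_score = 82.20
frontier_score = 83.66

ratio = local_score / frontier_score
print(f"{ratio:.1%}")  # prints 98.3% - i.e. ~98% of frontier accuracy
```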
posted an update 26 days ago

Community! 💡💡💡

It's the last day to submit your datasets for the Reasoning Datasets Competition: https://www.bespokelabs.ai/blog/reasoning-datasets-competition

Here are my submissions:
- ZennyKenny/synthetic_vc_financial_decisions_reasoning_dataset
- ZennyKenny/cosa-benchmark-dataset
- ZennyKenny/tactical-military-reasoning-v.1.0
- ZennyKenny/tron-dataset-v.1.0

Have a look and drop a ❤️ or comment! Check out the entire collection of submissions here: https://huggingface.co/datasets?other=reasoning-datasets-competition
reacted to nyuuzyou's post with 🔥 27 days ago

nyuuzyou/svgfind 👀

Well, everything happens for the first time 🤗. Thank you all!
reacted to their post with 🧠 27 days ago

After hearing the news that Marc Andreessen thinks that the only job that is safe from AI replacement is venture capital: https://gizmodo.com/marc-andreessen-says-one-job-is-mostly-safe-from-ai-venture-capitalist-2000596506 🧠🧠🧠

The Reasoned Capital synthetic dataset suddenly feels much more topical: ZennyKenny/synthetic_vc_financial_decisions_reasoning_dataset 🔥🔥🔥

Really looking forward to potentially expanding this architecture and seeing how algorithmically clever investment truly is! 💰💰💰
posted an update 28 days ago
reacted to onekq's post with 👍 29 days ago

I didn't notice that Gemini 2.5 (Pro and Flash) has been silently launched for API preview. Their performance is solid, but below QwQ 32B and the latest DeepSeek v3.

onekq-ai/WebApp1K-models-leaderboard
replied to their post 29 days ago
posted an update 29 days ago

When I heard the Reasoning Dataset Competition deadline was extended to 9 May, I knew I had time to get in one more entry. 🔥🔥🔥

With the rise of Vibe Coding, and the potential risks that are introduced by humans letting LLMs build their apps for them, lots of people are (rightfully) concerned about the safety of the code that is hitting prod.

In response to that, I'm happy to present my final submission to the Reasoning Dataset Competition, an attempt to start benchmarking the ability of LLMs to identify unsafe and/or exploitable code by way of the CoSa (Code Safety) benchmark: ZennyKenny/cosa-benchmark-dataset

Currently a curated set of 200 examples, calibrated on OpenAI's standard-issue models (GPT-4.1, o4-mini, and GPT-3.5 Turbo) as "baseline performance" (70th percentile). Check it out and drop a ❤️ if you think it could be useful, or hit the Community section with suggestions / critiques.
replied to DevinGrey's post 30 days ago

Just beginning to learn AI in general and looking for an assistant to help you set up a website? Check out one of the mainline providers like OpenAI, Mistral, DeepSeek, etc. You'll be able to make plain-English requests and have it generate some code for you.

In parallel, enroll in some HF courses (https://huggingface.co/learn) to start mastering the concepts needed to work with less guided models.

Probably worth mentioning that there is a lot more to setting up a website than just the code, and there are a ton of services out there (WordPress, Webflow, Squarespace, etc.) that can help you from end (buying and registering a domain) to end (making it look and behave the way you want).

I'd politely suggest using the AI to walk you through that entire process, rather than just generating the code you need. For that, you can probably configure your own purpose-built Assistant using HuggingChat: https://huggingface.co/chat/

reacted to clem's post with 🔥 about 1 month ago
replied to merterbak's post about 1 month ago

Kind of surprised to see Microsoft moving into the reasoning space.

reacted to abidlabs's post with 🚀 about 1 month ago
Hi folks! Excited to share a new feature from the Gradio team along with a tutorial.

If you don't already know, Gradio is an open-source Python library used to build interfaces for machine learning models. Beyond just creating UIs, Gradio also exposes API capabilities, and now Gradio apps can be launched as Model Context Protocol (MCP) servers for LLMs.

If you already know how to use Gradio, there are only two additional things you need to do:
* Add standard docstrings to your function (these will be used to generate the descriptions for your tools for the LLM)
* Set mcp_server=True in launch()


Here's a complete example (make sure you already have the latest version of Gradio installed):


import gradio as gr

def letter_counter(word, letter):
    """Count the occurrences of a specific letter in a word.
    
    Args:
        word: The word or phrase to analyze
        letter: The letter to count occurrences of
        
    Returns:
        The number of times the letter appears in the word
    """
    return word.lower().count(letter.lower())

demo = gr.Interface(
    fn=letter_counter,
    inputs=["text", "text"],
    outputs="number",
    title="Letter Counter",
    description="Count how many times a letter appears in a word"
)

demo.launch(mcp_server=True)



This is a very simple example, but you can add the ability to generate Ghibli images or speak emotions to any LLM that supports MCP. Once you have an MCP running locally, you can copy-paste the same app to host it on [Hugging Face Spaces](https://huggingface.co/spaces/) as well.

All free and open-source of course! Full tutorial: https://www.gradio.app/guides/building-mcp-server-with-gradio
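Once the app is running, the MCP endpoint can be wired into a client. Per the Gradio guide linked above, the server is exposed on an SSE route under the app's URL; the snippet below assumes the default local port (7860), and the exact config shape follows the common `mcpServers` client convention, so check your client's docs for the precise format:

```json
{
  "mcpServers": {
    "gradio": {
      "url": "http://localhost:7860/gradio_api/mcp/sse"
    }
  }
}
```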
reacted to lukmanaj's post with 👍 about 1 month ago

I'm excited to share that I've completed the Hugging Face Agents Course and earned my certificate.

Over the past few months, I explored how to build intelligent, autonomous agents using cutting-edge tools like smolagents, LlamaIndex, and LangGraph. The course covered everything from the fundamentals of agents to advanced topics like fine-tuning for function-calling, observability, evaluation, and even agents in games.

Some key content included:

1. Introduction to AI Agents

2. Agentic RAG use cases

3. Multi-framework implementation: smolagents, LlamaIndex, and LangGraph

4. Building, testing, and certifying a complete agent project

This was a hands-on, practical experience that deepened my understanding of how to design reliable, tool-using LLM agents. Looking forward to leveraging these skills in real-world applications in healthcare, logistics, and beyond.

Many thanks to the Hugging Face team for putting this together.
Let's build safe and useful agents!

replied to their post about 1 month ago