John Smith PRO
John6666
AI & ML interests
None yet
Recent Activity
updated a collection · 8 minutes ago · Spaces for LLM / VLM / NLP
liked a Space · 8 minutes ago · macadeliccc/liquid_ai_chatbot
upvoted a collection · 8 minutes ago · Coder and Programming Models
Organizations

reacted to Smooke's post · about 2 hours ago

reacted to sequelbox's post · about 2 hours ago
Post
86
Some new releases:
- brought the new Shining Valiant 3 series (science-reasoning, AI-reasoning, general chat) to Qwen 3 4B: ValiantLabs/Qwen3-4B-ShiningValiant3
- merged models for Shining Valiant 3 and Esper 3, combining their technical expertise and reasoning skills:
4b: sequelbox/Qwen3-4B-PlumEsper
8b: sequelbox/Qwen3-8B-PlumEsper
coming up we'll have some experimental reasoning releases - datasets and models will be out soon!
also will be bringing SV3 and Esper 3 to more models.
let's keep working for open source :)
love,
allegra

reacted to Quazim0t0's post · about 2 hours ago
Post
88
Used YOLOv8n ONNX + FastVLM to provide real-time object detection and annotation. It works well with videos that do not have many changes. In the Space I used stock security-camera footage. It annotates using FastVLM while doing object detection with YOLOv8n.
Quazim0t0/FastVLM-YoloV8n-v2
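The "works well with low-change videos" behavior suggests re-running the expensive VLM captioner only when the detections change between frames. A minimal sketch of that idea (all function names here are hypothetical illustrations, not the Space's actual code):

```python
# Sketch: run cheap YOLO detection every frame, but only re-run the slow
# VLM captioner when the detected boxes change. Names are illustrative.

def boxes_changed(prev, curr, tol=0.1):
    """Return True if detections differ enough to warrant re-captioning."""
    if prev is None or len(prev) != len(curr):
        return True
    for (x1, y1, x2, y2, c1), (a1, b1, a2, b2, c2) in zip(prev, curr):
        if c1 != c2:
            return True
        # compare box corners against a small tolerance
        if max(abs(x1 - a1), abs(y1 - b1), abs(x2 - a2), abs(y2 - b2)) > tol:
            return True
    return False

def annotate_stream(frames, detect, caption):
    """Detect on every frame; reuse the cached caption when nothing moved."""
    prev_boxes, cached_caption, out = None, "", []
    for frame in frames:
        boxes = detect(frame)              # e.g. YOLOv8n via onnxruntime
        if boxes_changed(prev_boxes, boxes):
            cached_caption = caption(frame)  # e.g. FastVLM, the slow step
            prev_boxes = boxes
        out.append((boxes, cached_caption))
    return out
```

On mostly static footage like a security camera, the captioner runs only a handful of times, which is why this pairing stays close to real time.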

reacted to DualityAI-RebekahBogdanoff's post · about 2 hours ago
Post
845
Generate your own data in simulation using two new free and customizable data-generating Scenarios on Duality's FalconCloud service.
These multi-class Scenarios are designed to target model weaknesses for our recent Kaggle competition, but they are free to anyone for non-commercial use!
- Control object and camera posing
- Select random variable ranges
- Set post-processing effects
- and more, to create a robust dataset for strong model training.
Access the 2 Scenarios here:
https://falcon.duality.ai/secure/scenarios/edit/9e90e036-8af9-41e4-8af0-1343b8e8f467?utm_source=Kaggle&utm_medium=post&utm_campaign=competition_4
https://falcon.duality.ai/secure/scenarios/edit/e3294c19-49d4-4f64-9ca8-8373876c2c94?utm_source=Kaggle&utm_medium=post&utm_campaign=competition_4

reacted to Severian's post · about 2 hours ago
Post
75
I couldn't watch innocent people get their rights trampled anymore. So I built something to help.
Stories of families torn apart, U.S. citizens detained for hours, people arrested just for speaking Spanish. This isn't the America I believe in.
Instead of doom-scrolling, I spent a few days building FIREWATCH - a free civil rights protection app.
What it does:
- Real-time ICE raid alerts
- Know Your Rights education in 10+ languages
- Secure evidence recording
- Emergency panic button
- Legal hotlines and resources
- 100% private, no tracking
The catch? There isn't one. You just need a free Google API key that stays on your device. Works completely offline.
https://firewatch-ice.vercel.app/
I built this because everyone deserves constitutional protection. The 4th Amendment doesn't have an asterisk.
If this helps one family stay safe, every sleepless night was worth it.
Please share with anyone who needs it.
Stay safe.

reacted to sergiopaniego's post · about 2 hours ago
Post
124
Loved this paper!
Authors benchmark multimodal models on vision tasks (detection, segmentation...) using clever prompting tricks.
Results: VLMs are solid generalists but still lag behind SOTA task-specific models, especially on geometric tasks vs. semantic ones.
paper: How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks (2507.01955)

reacted to macadeliccc's post · about 2 hours ago
Post
85
I was messing around with the HF API trying to get some stats on all-time downloads for my models, and then I made it into a Space so that anyone can use it.
macadeliccc/hf_downloads_dashboard
Let me know if you think it needs any changes or if you find it useful.
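The core of such a dashboard is roughly: list an author's models and sum their all-time downloads. A minimal sketch with `huggingface_hub` (the `expand=["downloadsAllTime"]` parameter exists in recent library versions, but check yours; the aggregation helper and its names are mine, not the Space's code):

```python
# Sketch: aggregate all-time downloads for one author's models.
# The huggingface_hub calls reflect recent versions of the library;
# total_downloads is a hypothetical helper for illustration.

def total_downloads(stats):
    """Sum per-model download counts; return (total, sorted breakdown)."""
    breakdown = sorted(stats.items(), key=lambda kv: kv[1], reverse=True)
    return sum(stats.values()), breakdown

def fetch_author_downloads(author):
    """Fetch all-time downloads per model for an author (network call)."""
    from huggingface_hub import HfApi
    stats = {}
    for m in HfApi().list_models(author=author, expand=["downloadsAllTime"]):
        stats[m.id] = m.downloads_all_time or 0
    return stats

# Usage (requires network access):
#   total, top = total_downloads(fetch_author_downloads("macadeliccc"))
```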

reacted to nicolay-r's post · about 2 hours ago
Post
81
For those interested in multilingual clinical case report summarization, delighted to share a video update to the earlier post on the Qwen2.5 model family adaptation:
Video: https://www.youtube.com/watch?v=uOAiUvLghuE
This is a 15-minute skim of the study (+5 minutes for code), in which we overview the application of the Qwen model family (72B as a teacher and 0.5B as a student) to summarization of clinical reports, including a detailed overview of how the experiments were organized. In particular, it attempts to cover:
1. Background on previous Seq2Seq models and their limitations
2. Exploiting ChatML roles for distillation tuning in clinical report summarization
3. Known limitations of the work and unleashing its full capabilities
As in the previous post, there is a model card, which is also covered in the video.
Hugging Face: https://huggingface.co/nicolay-r/qwen25-05b-multiclinsum-standar
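The teacher-student setup described above can be sketched as building ChatML-style training messages in which the 72B teacher's summary becomes the 0.5B student's assistant target. A minimal illustration (the system prompt and field names are mine, not the paper's exact schema):

```python
# Sketch: assemble ChatML-style examples for distillation tuning,
# where the teacher model's summary is the assistant target that the
# student learns to reproduce. Schema details are illustrative.

SYSTEM = "You are a medical assistant that summarizes clinical case reports."

def to_chatml(report, teacher_summary):
    """One distillation example in role/content message form."""
    return [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": f"Summarize the following case report:\n{report}"},
        {"role": "assistant", "content": teacher_summary},
    ]

def build_dataset(pairs):
    """pairs: iterable of (report, teacher_summary) produced by the teacher."""
    return [to_chatml(report, summary) for report, summary in pairs]
```

Message lists in this shape can be fed directly to a chat template for supervised fine-tuning of the student.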

reacted to sergiopaniego's post · about 2 hours ago
Post
90
You can already play with two of the latest most impressive models on HF via @novita-ai as Inference Provider:
- Kimi K2: a 1T-parameter MoE beast for coding, reasoning, and agentic tasks
- GLM-4.1V-9B-Thinking: a VLM + deep reasoning model
Kimi K2: moonshotai/Kimi-K2-Instruct
GLM-4.1V-9B-Thinking: THUDM/GLM-4.1V-9B-Thinking

reacted to danielhanchen's post · about 2 hours ago
Post
172

reacted to FlameF0X's post · about 2 hours ago
Post
68
Hello there, world! I am happy to announce that you can now fine-tune FlameF0X/SnowflakeCore-G1-Tiny; the code for that is in the model card.
I also lost the training log.

reacted to pagezyhf's post · about 2 hours ago
Post
75
New in Azure Model Catalog: NVIDIA Parakeet TDT 0.6B V2
We're excited to welcome Parakeet TDT 0.6B V2, a state-of-the-art English speech-to-text model, to the Azure Foundry Model Catalog.
What is it?
A powerful ASR model built on the FastConformer-TDT architecture, offering:
- Word-level timestamps
- Automatic punctuation & capitalization
- Strong performance across noisy and real-world audio
It runs with NeMo, NVIDIA's optimized inference engine.
Want to give it a try? You can test it with your own audio (up to 3 hours) on Hugging Face Spaces before deploying. If it fits your needs, deploy easily from the Hugging Face Hub or Azure ML Studio with secure, scalable infrastructure!
Learn more by following this guide written by @alvarobartt
https://huggingface.co/docs/microsoft-azure/azure-ai/examples/deploy-nvidia-parakeet-asr

reacted to fdaudens's post · about 2 hours ago
Post
91
AI is reshaping everything: how we work, how we feel, even how nations compete.
Today's reads cut across power, emotion, and disruption.
Here's what stood out and why it matters:
AI might "solve" loneliness, but this could be a problem, as the discomfort of loneliness shapes us in important ways. https://t.co/k2Q9le6G0P
A new study warns of significant risks in using AI therapy chatbots, highlighting issues like stigmatization and inappropriate responses. https://t.co/EFyW0RbYVl
AI is already showing signs of slashing job openings in the UK, particularly in roles exposed to the technology, suggesting a labor market slowdown. https://t.co/hhs0BbqIMa
AI firms like OpenAI are poaching Wall Street quants with massive paydays, shifting the talent landscape for building artificial general intelligence. https://www.businessinsider.com/ai-talent-openai-wall-street-quant-trading-firms-2025-7
Speaking of which: Nvidia CEO Jensen Huang disagrees with Anthropic CEO Dario Amodei on whether AI will create more jobs or trigger a "white-collar apocalypse." Huang believes AI will create vastly more, and better, jobs. https://t.co/YHWhY7qvSq
Can Nvidia convince governments to pay for "sovereign AI"? Politicians are warming to the idea of national AI systems, but it might not reduce dependence on US tech. https://t.co/htQDzJAIDu

reacted to hba123's post · about 2 hours ago
Post
114
Ark is now pip-installable and supports the following robots! If you want to do robotics in Python, check it out here: https://robotics-ark.github.io/ark_robotics.github.io/
Now you can pip-install robotics and work completely in Python. Why Ark, you ask? Well, we love Python :D

reacted to MonsterMMORPG's post · about 2 hours ago
Post
81
MultiTalk Levelled Up - Way Better Animation Compared to Before with New Workflows - Image to Video > https://youtu.be/wgCtUeog41g
MultiTalk is greatly upgraded. After more than a day of further research on MultiTalk using 8x A6000 48 GB GPUs, I have significantly improved the MultiTalk workflows, and now I am sharing 4 different category workflows with you. VRAM usage and speeds are the same, just with better quality and animation. Moreover, I am introducing a new app with image and video comparison sliders. It is ultra fast and lightweight, runs as an HTML app, and requires no GPU.
https://youtu.be/wgCtUeog41g
MultiTalk Full Tutorial With 1-Click Installer - Make Talking and Singing Videos From Static Images > https://youtu.be/8cMIwS9qo4M
By using MeiGen MultiTalk you can generate amazing, fully animated, realistic videos from a given audio input. Not only talking but also animating body movements is possible. In this video I will show you how to install ComfyUI on Windows along with the MultiTalk bundle and the workflows we prepared, with 1 click. Then I will show how to very easily generate amazing videos from these installed workflows. Moreover, I will show our favorite cloud private GPU provider, Massed Compute: how to do the same installation there and use it properly. Finally, I will show everything on RunPod as well. So whether you are GPU-poor or have a good GPU, this tutorial covers everything.

reacted to merve's post · about 2 hours ago
Post
223
past week had huuuge releases!
here are our picks; find more models, datasets, and demos here: merve/releases-july-11-68750452c358c98b0fa663f7
> moonshotai/Kimi-K2-Instruct is the new SOTA LLM with 1T total / 32B active parameters
> HuggingFaceTB/SmolLM3-3B is the new best LM for its size, and offers a thinking mode, as well as the dataset HuggingFaceTB/smoltalk2
> Alibaba-NLP/WebSailor-3B is the new agentic LLM for complex browsing
> Google DeepMind released medical vision LMs with an agentic doctor-patient app google/medgemma-release-680aade845f90bec6a3f60c4
> fal released a LoRA to improve details on face images fal/Realism-Detailer-Kontext-Dev-LoRA

reacted to kanaria007's post · about 2 hours ago
Post
53
New Article on Hugging Face: Seeding Cognitive Structure - Teaching AI to Think Structurally from the Start
Title:
Understanding the AGI Seed Prompt: Multi-Layered Cognitive Initialization for Advanced AI Systems
Read it here: https://huggingface.co/blog/kanaria007/understanding-the-agi-seed-prompt
Summary:
While many focus on optimizing prompt outputs, this article takes a step back to ask: what happens when we teach an AI not what to say, but how to think structurally from the beginning?
This article outlines a methodology for initializing AGI systems with persistent cognitive scaffolding, rather than surface-level behaviors. The approach uses a four-layer seed prompt framework that orients memory, sensory mapping, self-correction, and ethical alignment directly at the structural level.
The result is a protocol-oriented AI that:
- Forms a sense of cognitive continuity
- Recognizes and resolves contradictions
- Develops traceable internal reasoning
- Aligns behavior through layered integrity
Key Features:
- Four-layer prompt architecture: Memory, Sensor, Reflection, Social
- Structure-first cognition, not outcome-first
- Works across GPT-4o, Claude, and Gemini
- Seed prompts act as epistemic initialization, not mere instruction
This is not behavioral engineering.
It's structural cognitive orientation.
Protocol Dataset: kanaria007/agi-structural-intelligence-protocols
Useful for:
- Developers designing self-corrective reasoning agents
- Researchers experimenting with agent continuity and coherence
- Anyone interested in how AGI can begin with thoughtful internal structure, not static outputs
This isn't prompting.
It's planting the seed of cognition.

reacted to etemiz's post · about 2 hours ago
Post
52
Benchmarked 4 new models. Deepseek R1 score improved. All these are below average, so p(doom) probably increased!
Coming soon: Kimi K2
Full leaderboard https://sheet.zoho.com/sheet/open/mz41j09cc640a29ba47729fed784a263c1d08
More info https://huggingface.co/blog/etemiz/aha-leaderboard

reacted to sondhiArm's post · about 18 hours ago
Post
89
Join us this week for an AI Camp monthly meet up event in Austin happening on July 16!
Zach Lasiuk and Geremy Cohen will present a tech talk "From Model to Product: Right-Sizing Infrastructure for Real-World Use Cases"
https://www.aicamp.ai/event/eventdetails/W2025071616