AI & ML interests

We develop infrastructure for the evaluation of generated text.

Recent Activity

Abhaykoul posted an update about 19 hours ago
🚀 Dhanishtha-2.0-preview-0825 Is Here

The Intermediate Thinking Model just leveled up again.

With sharper reasoning, better tool use, and expanded capabilities, Dhanishtha-2.0-preview-0825 is now live and ready to impress.

🧠 What Makes Dhanishtha Special?
Unlike typical CoT models that think only once, Dhanishtha thinks iteratively:

> Think → Answer → Rethink → Improve → Rethink again if needed.

🔗 Try it now: HelpingAI/Dhanishtha-2.0-preview-0825

🔞 Dhanishtha NSFW Preview

For those exploring more expressive and immersive roleplay scenarios, we're also releasing:

HelpingAI/Dhanishtha-nsfw
A specialized version tuned for adult-themed interactions and character-driven roleplay.

🔗 Explore it here: HelpingAI/Dhanishtha-nsfw

💬 You can also try all of these live at chat.helpingai.co
ImranzamanML posted an update 1 day ago
How Transformer model layers work!

I focused on showing the core steps side by side: tokenization, embedding, and the transformer layers, highlighting the self-attention and feedforward parts of each without getting lost in too much technical depth.

It shows how these layers work together to understand context and generate meaningful output!

If you are curious about the architecture behind AI language models or want a clean way to explain it, hit me up; I'd love to share!
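For anyone who wants to poke at these steps in code, here is a minimal PyTorch sketch of the pipeline described above (token IDs → embeddings → one encoder layer combining self-attention and a feedforward block). All sizes are toy values chosen for illustration, not taken from any particular model.

import torch
import torch.nn as nn

vocab_size, d_model, n_heads = 1000, 64, 4

# "Tokenization" output: a batch of 8 token IDs (random here, just for shape)
token_ids = torch.randint(0, vocab_size, (1, 8))

# Embedding layer maps token IDs to d_model-dimensional vectors
embedding = nn.Embedding(vocab_size, d_model)

# One transformer layer = self-attention + feedforward (with norms and residuals)
layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=256, batch_first=True)

hidden = layer(embedding(token_ids))  # contextualized representations
print(hidden.shape)                   # torch.Size([1, 8, 64])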



#AI #MachineLearning #NLP #Transformers #DeepLearning #DataScience #LLM #AIAgents
Parveshiiii posted an update 3 days ago
🚀 Launch Alert: Dev-Stack-Agents
Meet your 50-agent senior AI team: principal-level experts in engineering, AI, DevOps, security, product, and more, all bundled into one modular repo.

+ Code. Optimize. Scale. Secure.
- Full-stack execution, Claude-powered. No human bottlenecks.


🔧 Built for Claude Code
Seamlessly plug into Claude's dev environment:

* 🧠 Each .md file = a fully defined expert persona
* ⚙️ Claude indexes them as agents with roles, skills & strategy
* 🤖 You chat → Claude auto-routes to the right agent(s)
* ✍️ Want precision? Just call @agent-name directly
* 👥 Complex task? Mention multiple agents for team execution

Examples:

"@security-auditor please review auth flow for risks"
"@cloud-architect + @devops-troubleshooter โ†’ design a resilient multi-region setup"
"@ai-engineer + @legal-advisor โ†’ build a privacy-safe RAG pipeline"


🔗 https://github.com/Parveshiiii/Dev-Stack-Agents
MIT License | Claude-Ready | PRs Welcome

ImranzamanML posted an update 6 days ago
Hugging Face just made life easier with the new hf CLI!
huggingface-cli is now simply hf.

Along with the rename, new features have been added, like hf jobs: we can now run any script or Docker image on dedicated Hugging Face infrastructure with a simple command. It's a good addition for running experiments and jobs on the fly.

To get started, just run:
pip install -U huggingface_hub

List of hf CLI Commands

Main Commands
hf auth: Manage authentication (login, logout, etc.).
hf cache: Manage the local cache directory.
hf download: Download files from the Hub.
hf jobs: Run and manage Jobs on the Hub.
hf repo: Manage repos on the Hub.
hf upload: Upload a file or a folder to the Hub.
hf version: Print information about the hf version.
hf env: Print information about the environment.

Authentication Subcommands (hf auth)
login: Log in using a Hugging Face token.
logout: Log out of your account.
whoami: See which account you are logged in as.
switch: Switch between different stored access tokens/profiles.
list: List all stored access tokens.

Jobs Subcommands (hf jobs)
run: Run a Job on Hugging Face infrastructure.
inspect: Display detailed information on one or more Jobs.
logs: Fetch the logs of a Job.
ps: List running Jobs.
cancel: Cancel a Job.
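If you prefer to script these operations, the same functionality is available from Python via huggingface_hub; here is a minimal sketch mirroring a few of the commands above (the upload repo ID and file names are placeholders, not real repositories):

from huggingface_hub import whoami, hf_hub_download, upload_file

print(whoami()["name"])               # hf auth whoami

path = hf_hub_download(               # hf download
    repo_id="HuggingFaceTB/SmolLM3-3B",
    filename="config.json",
)
print(path)

upload_file(                          # hf upload (needs write access to the target repo)
    path_or_fileobj="results.csv",
    path_in_repo="results.csv",
    repo_id="your-username/your-repo",
)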

#HuggingFace #MachineLearning #AI #DeepLearning #MLTools #MLOps #OpenSource #Python #DataScience #DevTools #LLM #hfCLI #GenerativeAI
yjernite posted an update 6 days ago
First GPAI Model with EU Data Transparency Template? 🇪🇺

With the release of the EU data transparency template this week, we finally got to see one of the most meaningful artifacts to come out of the AI Act implementation so far (haven't you heard? AI's all about the data! 📊📚)

The impact of the template will depend on how effectively it establishes a minimum meaningful transparency standard for companies that don't otherwise offer any transparency into their handling of e.g. personal data or (anti?-)competitive practices in commercial licensing - we'll see how those play out as new models are released after August 2nd 👀


In the meantime, I wanted to see how the template works for a fully open-source + commercially viable model, so I filled it out for SmolLM3, which my colleagues at Hugging Face released earlier this month 🤗 ICYMI, it's fully open-source with 3B parameters and performance matching the best similar-size models (I've switched all my local apps from Qwen3 to it, you should too 💡)

Verdict: congrats to the European Commission AI Office for making it so straightforward! Fully open and transparent models remain a cornerstone of informed regulation and governance, but the different organizational needs of their developers aren't always properly accounted for in new regulation. In this case, it took me all of two hours to fill out and publish the template (including reading the guidelines) - so kudos for making it feasible for smaller and distributed organizations 🙌 Definitely a step forward for transparency 🔍

To learn more have a look at:

- The SmolLM3 model: HuggingFaceTB/SmolLM3-3B
- Its filled out Public Summary of Training Content: hfmlsoc/smollm3-eu-data-transparency
- And if you're interested, some previous remarks on regulatory minimum meaningful standards for data disclosure: https://huggingface.co/blog/yjernite/naiac-data-transparency
Abhaykoul posted an update 18 days ago
🎉 Dhanishtha-2.0-preview-0725 is Now Live

The Intermediate Thinking Model just got even better.
With the new update, Dhanishtha is now sharper, smarter, and trained further on tool use.

🧠 What Makes Dhanishtha Different?
Unlike standard CoT models that give one-shot responses, Dhanishtha thinks in layers:

> Think → Answer → Rethink → Improve → Rethink again if needed.

HelpingAI/Dhanishtha-2.0-preview-0725
albertvillanova posted an update 23 days ago
🚀 New in smolagents v1.20.0: Remote Python Execution via WebAssembly (Wasm)

We've just merged a major new capability into the smolagents framework: the CodeAgent can now execute Python code remotely in a secure, sandboxed WebAssembly environment!

🔧 Powered by Pyodide and Deno, this new WasmExecutor lets your agent-generated Python code run safely, without relying on Docker or local execution (a quick sketch of how it might look in agent code follows the list below).

Why this matters:
✅ Isolated execution = no host access
✅ No need for Python on the user's machine
✅ Safer evaluation of arbitrary code
✅ Compatible with serverless / edge agent workloads
✅ Ideal for constrained or untrusted environments
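A hedged sketch of how this might look in agent code, assuming the new executor is selected through CodeAgent's executor_type parameter and that Deno is installed locally; see the PR and release notes below for the exact interface:

from smolagents import CodeAgent, InferenceClientModel

agent = CodeAgent(
    tools=[],
    model=InferenceClientModel(),  # any supported model backend
    executor_type="wasm",          # assumption: routes generated code to the Pyodide/Deno sandbox
)

print(agent.run("Compute the sum of the squares of the first 20 integers."))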

This is just the beginning: a focused initial implementation with known limitations. A solid MVP designed for secure, sandboxed use cases.

💡 We're inviting the open-source community to help evolve this executor:
• Tackle more advanced Python features
• Expand compatibility
• Add test coverage
• Shape the next-gen secure agent runtime

🔗 Check out the PR: https://github.com/huggingface/smolagents/pull/1261

Let's reimagine what agent-driven Python execution can look like: remote-first, wasm-secure, and community-built.

This feature is live in smolagents v1.20.0!
Try it out.
Break things. Extend it. Give us feedback.
Let's build safer, smarter agents, together 🧠⚙️

👉 https://github.com/huggingface/smolagents/releases/tag/v1.20.0

#smolagents #WebAssembly #Python #AIagents #Pyodide #Deno #OpenSource #HuggingFace #AgenticAI
Parveshiiii posted an update 27 days ago
🧠 Glimpses of AGI: A Vision for All Humanity
What if AGI wasn't just a distant dream, but a blueprint already unfolding?

I've just published a deep dive called Glimpses of AGI, exploring how scalable intelligence, synthetic reasoning, and alignment strategies are paving a new path forward. This isn't your average tech commentary; it's a bold vision for conscious AI systems that reason, align, and adapt beyond narrow tasks.

🔍 Read it, upvote it if it sparks something, and let's ignite a collective conversation about the future of AGI.

https://huggingface.co/blog/Parveshiiii/glimpses-of-agi


Parveshiiii posted an update 30 days ago
🧠 MathX-5M by XenArcAI: Scalable Math Reasoning for Smarter LLMs

Introducing MathX-5M, a high-quality, instruction-tuned dataset built to supercharge mathematical reasoning in large language models. With 5 million rigorously filtered examples, it spans everything from basic arithmetic to advanced calculus, curated from public sources and enhanced with synthetic data.

🔍 Key Highlights:
- Step-by-step reasoning with verified answers
- Covers algebra, geometry, calculus, logic, and more
- RL-validated correctness and multi-stage filtering
- Ideal for fine-tuning, benchmarking, and educational AI

📂 XenArcAI/MathX-5M
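A quick way to inspect it, assuming the dataset loads with the standard datasets API (the split name is an assumption, and streaming avoids pulling all 5M rows up front):

from datasets import load_dataset

ds = load_dataset("XenArcAI/MathX-5M", split="train", streaming=True)
for example in ds.take(3):
    print(example)  # check the actual column names on the dataset card before fine-tuning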


Abhaykoul posted an update about 1 month ago
🎉 Dhanishtha 2.0 Preview is Now Open Source!

The world's first Intermediate Thinking Model is now available to everyone!

Dhanishtha 2.0 Preview brings revolutionary intermediate thinking capabilities to the open-source community. Unlike traditional reasoning models that think once, Dhanishtha can think, answer, rethink, answer again, and continue rethinking as needed, using multiple <think> blocks between responses.

🚀 Key Features
- Intermediate thinking: Think → Answer → Rethink → Answer → Rethink if needed...
- Token efficient: Uses up to 79% fewer tokens than DeepSeek R1 on similar queries
- Transparent thinking: See the model's reasoning process in real-time
- Open source: Freely available for research and development


HelpingAI/Dhanishtha-2.0-preview
https://helpingai.co/chat
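For a quick local test, here is a minimal sketch, assuming the open checkpoint works with the standard transformers chat-template interface (the prompt and generation settings are illustrative):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HelpingAI/Dhanishtha-2.0-preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Plan a three-day study schedule, then refine it."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))  # keep the <think> blocks visible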
Nymbo posted an update about 1 month ago
Anyone know how to reset Claude web's MCP config? I connected mine when the HF MCP first released with just the default example spaces added. I added lots of other MCP spaces but Claude.ai doesn't update the available tools... "Disconnecting" the HF integration does nothing, deleting it and adding it again does nothing.

Refreshing tools works fine in VS Code because I can manually restart it in mcp.json, but claude.ai has no such option. Anyone got any ideas?
Abhaykoul posted an update about 1 month ago
Introducing Dhanishtha 2.0: World's first Intermediate Thinking Model

Dhanishtha 2.0 is the world's first LLM designed to think between responses, unlike other reasoning LLMs, which think just once.

Dhanishtha can think, rethink, self-evaluate, and refine in between responses using multiple <think> blocks.
This technique makes it highly token-efficient: it uses up to 79% fewer tokens than DeepSeek R1.
---

You can try our model at: https://helpingai.co/chat
Also, we're going to open-source Dhanishtha on July 1st.

---
For Devs:
🔑 Get your API key at https://helpingai.co/dashboard
from HelpingAI import HAI  # pip install HelpingAI==1.1.1
from rich import print

hai = HAI(api_key="hl-***********************")

response = hai.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=[{"role": "user", "content": "What is the value of ∫₀^∞ x³/(x−1) dx ?"}],
    stream=True,
    hide_think=False  # Hide or show the model's thinking
)

for chunk in response:
    print(chunk.choices[0].delta.content, end="", flush=True)
albertvillanova posted an update about 1 month ago
🚀 SmolAgents v1.19.0 is live!
This release brings major improvements to agent flexibility, UI usability, streaming architecture, and developer experience, making it easier than ever to build smart, interactive AI agents. Here's what's new:

🔧 Agent Upgrades
- Support for managed agents in ToolCallingAgent (see the sketch after this list)
- Context manager support for cleaner agent lifecycle handling
- Output formatting now uses XML tags for consistency
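A hedged sketch of the managed-agents support, assuming ToolCallingAgent follows the same managed_agents / name / description pattern used elsewhere in smolagents; see the release notes linked below for the exact API:

from smolagents import ToolCallingAgent, DuckDuckGoSearchTool, InferenceClientModel

model = InferenceClientModel()

researcher = ToolCallingAgent(
    tools=[DuckDuckGoSearchTool()],
    model=model,
    name="researcher",                                         # managed agents need a name...
    description="Searches the web and summarizes findings.",   # ...and a description
)

manager = ToolCallingAgent(
    tools=[],
    model=model,
    managed_agents=[researcher],  # assumption: the manager delegates sub-tasks to named agents
)

print(manager.run("When was smolagents v1.19.0 released?"))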

🖥️ UI Enhancements
- GradioUI now supports reset_agent_memory: perfect for fresh starts in dev & demos.

🔄 Streaming Refactor
- Streaming event aggregation moved off the Model class
- ➡️ Better architecture & maintainability

📦 Output Tracking
- CodeAgent outputs are now stored in ActionStep
- ✅ More visibility and structure to agent decisions

🐛 Bug Fixes
- Smarter planning logic
- Cleaner Docker logs
- Better prompt formatting for additional_args
- Safer internal functions and final answer matching

📚 Docs Improvements
- Added quickstart examples with tool usage
- One-click Colab launch buttons
- Expanded reference docs (AgentMemory, GradioUI docstrings)
- Fixed broken links and migrated to .md format

🔗 Full release notes:
https://github.com/huggingface/smolagents/releases/tag/v1.19.0

💬 Try it out, explore the new features, and let us know what you build!

#smolagents #opensource #AIagents #LLM #HuggingFace
yjernite posted an update about 2 months ago
Threatthriver posted an update about 2 months ago
New Dataset Released
albertvillanova posted an update 2 months ago