[MODELS] Discussion

#372
by victor - opened
Hugging Chat org
edited Sep 23, 2024

Here we can discuss the models available on HuggingChat.


victor pinned discussion

What are the limits of using these? How many API calls can I send per month?
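(Not an official answer, since the exact quota depends on your account and plan, but as an illustrative sketch: the same models can also be called programmatically through the huggingface_hub InferenceClient, and when you exceed the current rate limit the request fails with an HTTP 429 you can catch. The model id and token below are placeholders.)

```python
# Hedged sketch: call a hosted model via the Hugging Face Inference API and
# detect rate limiting. Model id and token are placeholders, not a recommendation.
from huggingface_hub import InferenceClient
from huggingface_hub.utils import HfHubHTTPError

client = InferenceClient(model="meta-llama/Llama-3.3-70B-Instruct", token="hf_xxx")

try:
    reply = client.chat_completion(
        messages=[{"role": "user", "content": "Hello!"}],
        max_tokens=64,
    )
    print(reply.choices[0].message.content)
except HfHubHTTPError as err:
    # An HTTP 429 response here means you've hit the current rate limit for your plan.
    print("Request rejected:", err)
```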

How can I know which model I am using?

How can I know which model I am using?

It's shown at the bottom of your screen.

Out of all these models, Gemma, which was recently released, has the newest information about .NET. However, I don't know which one gives the most accurate answers for coding.

Gemma seems really biased. With web search on, it claims it doesn't have access to recent information for almost any question about recent events, yet when I look up the same events on Google, the information is readily available.

apparently gemma cannot code?

Gemma is just like Google's Gemini series models: it has very strong moral limits in place, so any operation that might relate to file access, or anything that probes too deep, gets censored and it refuses to reply.
So even if there are solutions for such things in its training data, they just get filtered out and ignored.
I still haven't tested its coding accuracy on tasks that don't touch these kinds of "dangerous" operations, though.

The new NousResearch/Hermes-3-Llama-3.1-8B is really bad. I mean, I like that it adds a lot more detail when writing a story, which helps with worldbuilding and character building, but about halfway through, the model gets really weird: it either adds things that aren't in the prompt at all, sometimes genuinely messed-up things, or it switches to a completely new story.

Honestly, it starts writing in a way that reads like a fever dream. The writing starts off great, even amazing sometimes, but every single time it goes off the rails halfway through and becomes almost unreadable.

@MadderHatterMax I've found the smaller models to have more difficulty with complex pronouns and logic. The older models also had a more limited context length, and to deal with that, some of them used a scrolling context window: the longer the chat, the more context they would lose, going out of scope, forgetting as it were, and then hallucinating to make up for it. If you reference something the model no longer remembers, it will go "oh yeah, I remember that" and make stuff up on the spot.

But there were also issues where the model would get caught in a loop, and sometimes problems where it would spit out gibberish. Reducing a model through quantization can make it even worse; I don't know exactly how it throws the weights or parameters off, but they suffer significantly. In time, maybe it will be possible to get a specialized GPU or CPU in a box that attaches through a USB port or something to provide the memory and power for a larger local model. Solutions will come if there is enough interest and buying power. Currently there are a number of interesting projects, such as artificial neurons and spintronics, that could reduce power consumption and increase processing speed, and every once in a while someone comes up with another technique to handle context.

As for the detail you mentioned: how you phrase your system prompt can make a big difference in the responses you get. If you want immersive sensory descriptions, ask for them. Tell the model you want descriptions of anything you might interact with, or anything introduced for the first time. If you want worldbuilding, tell the model to focus on worldbuilding. Sometimes it isn't a matter of what the model is capable of; it's more about what and how we ask of it. Try to cram a lot of data into a smaller model and you'll probably end up with a lot of fragmentation that the model is trying to make sense of. Bad analogy, but it's kind of like trying to see the world clearly through a pane of frosted glass. And then again, the model might just be borked. Sorry, I didn't sleep last night, which makes me ramble.
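To make that "scrolling context window" idea concrete, here's a rough Python sketch of how a chat client could trim history to a token budget, dropping the oldest turns first. The budget and the characters-per-token estimate are invented for illustration (a real setup would use the model's actual tokenizer and context size), and the system prompt shows the kind of worldbuilding instruction mentioned above.

```python
# Rough sketch of a "scrolling" context window: keep the system prompt and
# as many of the most recent turns as fit within a token budget.
# MAX_TOKENS and the 4-chars-per-token estimate are made up for illustration.

MAX_TOKENS = 4096

def estimate_tokens(text: str) -> int:
    # Crude approximation; a real client would use the model's tokenizer.
    return max(1, len(text) // 4)

def trim_history(messages: list[dict], budget: int = MAX_TOKENS) -> list[dict]:
    """Drop the oldest non-system turns once the token budget is exceeded."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    used = sum(estimate_tokens(m["content"]) for m in system)
    kept = []
    for msg in reversed(turns):          # walk newest to oldest
        cost = estimate_tokens(msg["content"])
        if used + cost > budget:
            break                        # older turns fall out of scope
        kept.append(msg)
        used += cost
    return system + list(reversed(kept))

history = [
    {"role": "system", "content": "Focus on worldbuilding. Describe anything "
                                  "the characters can see, hear, or touch."},
    {"role": "user", "content": "Continue the story in the harbor district."},
]
print(trim_history(history))
```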

Hi guys, the current search feature encounters an error while fetching from the internet (it's a bug!). @nsarrazin, please fix it.
We submitted this issue on GitHub before and it was fixed, but the bug has now reappeared.

Here is the full GitHub issue: https://github.com/huggingface/chat-ui/issues/1812
Any help?

Dear Hugging Face Team,

First of all, thank you for building and continuously improving HuggingChat — it's an excellent platform that provides an accessible and user-friendly interface to powerful AI models. Your efforts have truly made Hugging Face a go-to destination for developers, researchers, and AI enthusiasts around the world.

As a regular user and fan of HuggingChat, I’d like to propose an enhancement that could significantly enrich the overall user experience:

Suggestion: Integrate More Up-to-Date Open-Source LLMs into HuggingChat (not just in the Playground)

While HuggingChat currently offers great performance, expanding the selection of available models—especially directly within HuggingChat and not only in the Hugging Playground—would provide users with more flexibility, comparison opportunities, and better alignment with the rapidly evolving open-source LLM landscape.

Recommended Models for Inclusion:
1) DeepSeek-V3
2) LLaMA-4 Maverick 03-26 (Experimental)
3) LLaMA-4 Maverick-17B-128E Instruct
4) LLaMA 4 Scout
plus more top-tier open-source models as of May 2025 (newer versions of Gemma, Falcon, Qwen, Mistral Large, Dolphin, etc.). Right now only one model in HuggingChat is really strong (Llama 3.3 70B); the others are older, slower, and outdated, and could be replaced with newer, highly capable open-source models (see the attached screenshots).


These models represent the latest advancements in open-source LLMs, and their integration into HuggingChat would enable users to experiment with, compare, and build upon a wider variety of model architectures and training paradigms.

✅ Why This Matters:

  • Encourages innovation and diversity in AI usage
  • Helps showcase Hugging Face as a leader in open-source AI access
  • Provides a better and more flexible user experience
  • Supports researchers and developers interested in the cutting edge

Once again, thank you for the amazing tools you’re creating and maintaining. I hope this suggestion adds value to the future roadmap of HuggingChat. Keep up the incredible work! 💙

Since DeepSeek R1 is already there, V3 is not needed. It would be fine to replace Llama 3 with Llama 4 Maverick rather than Scout, since Maverick performs better.

But DeepSeek-V3 is better than R1 at coding. The R1 on HuggingChat is a Qwen fine-tune, by the way, not the original R1; the HuggingChat version outperforms o1-mini, while the original R1 outperforms OpenAI's o1.

Wait, no need for V3, because R2 is about to be released.
