I agree; hosting a custom model is currently not viable: both the up-front hardware and the ongoing serving costs are expensive. Whoever can host open-source models the cheapest will win the public's attention. As a startup or an individual, you can invest in fine-tuning and training on high-end hardware to build a specialised model and get the weights and pipeline perfect, but hosting it yourself and scaling to demand while keeping the service available is difficult and not sustainable.
JT PRO
telcom
AI & ML interests
text-to-image, image-to-image, unlearning, training models.
Recent Activity
updated a model about 3 hours ago
telcom/deewaiREALCN
replied to their post about 6 hours ago
NVIDIA’s Groq deal ... I think, inference efficiency is becoming the main driver of profitability, and NVIDIA’s Groq deal is evidence the market is moving from “who can train biggest” to “who can serve cheapest and fastest at scale.” That points to a maturing phase of AI, not necessarily the end of a bubble, but definitely a correction in what “wins” long-term.
What do you think?
posted an update about 10 hours ago
NVIDIA’s Groq deal ... I think, inference efficiency is becoming the main driver of profitability, and NVIDIA’s Groq deal is evidence the market is moving from “who can train biggest” to “who can serve cheapest and fastest at scale.” That points to a maturing phase of AI, not necessarily the end of a bubble, but definitely a correction in what “wins” long-term.
What do you think?
posted
an
update
3 days ago
Post
CIFAR-10: your handy image dataset ...
CIFAR-10 is a small, standard computer-vision dataset used to quickly test and compare ideas.
- 60,000 color images, each 32×32 pixels, labeled into 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck.
- Label mapping (important):
- 0 airplane
- 1 automobile
- 2 bird
- 3 cat
- 4 deer
- 5 dog
- 6 frog
- 7 horse
- 8 ship
- 9 truck
- Split: 50,000 train and 10,000 test.
- Why people use it: fast benchmarking for image classifiers (small CNNs, ResNet, ViT), and quick experiments for training pipelines, augmentation, regularization, pruning, distillation, and demos.
- Sizes (downloads): Python version about 163 MB, binary about 162 MB. Hugging Face shows about 144 MB for the dataset files.
- Where to get it: the official CIFAR page (University of Toronto) and the Hugging Face CIFAR-10 dataset page.
uoft-cs/cifar10
If you want something more, check the table below:

| Dataset | Resolution | Classes | Best For |
|---|---|---|---|
| ImageNet-1K | 224–256×256 | 1000 | Real-world large-scale classification |
| ImageNet-256 | 256×256 | 1000 | Direct high-res training |
| TinyImageNet | 64×64 | 200 | Mid-range benchmark |
| UC Merced Land Use | 256×256 | 21 | Higher-resolution small classification |
| MS COCO | >256×256 | ~80 object classes | Detection / segmentation |
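As a quick sanity aid, the label mapping above fits in a few lines of plain Python. This is only a sketch: `CIFAR10_CLASSES` and `label_name` are names I made up here, not part of any dataset API.

```python
# CIFAR-10 label mapping, exactly as listed above: index -> class name.
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]

def label_name(label: int) -> str:
    """Map a CIFAR-10 integer label (0-9) to its class name."""
    if not 0 <= label <= 9:
        raise ValueError(f"CIFAR-10 labels are 0-9, got {label}")
    return CIFAR10_CLASSES[label]

# The split sizes quoted above: 50,000 train + 10,000 test = 60,000 total.
NUM_TRAIN, NUM_TEST = 50_000, 10_000
```

To actually fetch the images, the usual routes are `torchvision.datasets.CIFAR10(root=".", train=True, download=True)` or `datasets.load_dataset("uoft-cs/cifar10")`.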
reacted to John6666's post with ❤️👍 3 days ago
Post
If your Space stops working after restarting, mainly within the last 5 days (https://discuss.huggingface.co/t/my-space-suddenly-went-offline-the-cpu-cannot-restart/151121/22), try some of the following.
1. Add pydantic==2.10.6 to requirements.txt, or upgrade Gradio to the latest version.
2. Upgrade PyTorch to 2.2.0 or later (torch>=2.2.0 for Zero GPU Spaces).
3. Pin Transformers to 4.49.0 or earlier (transformers<=4.49.0 for Spaces using Transformers or Diffusers).
4. Pin huggingface_hub to an older version (huggingface_hub==0.25.2, for when an error like "cached_download is not available" occurs or inference does not work properly).
5. Note that specifying WORKDIR in a Dockerfile may cause the application to fail to start with error 137 (Docker Spaces, https://discuss.huggingface.co/t/error-code-137-cache-error/152177).
About pydantic==2.10.6:
https://discuss.huggingface.co/t/error-no-api-found/146226
https://discuss.huggingface.co/t/internal-server-error-bool-not-iterable/149494
Edit:
Zero GPU Spaces have been upgraded from A100 to H200. This is likely the reason why older versions of PyTorch are no longer supported; in fact, an error message to that effect was displayed.
zero-gpu-explorers/README#163
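Collected into one place, the version pins from the list above might look like this in requirements.txt. Apply only the lines your Space actually needs rather than all of them at once, since mixing every pin blindly can create its own conflicts:

```text
# Pins suggested in the post above (pick the ones relevant to your Space)
pydantic==2.10.6
torch>=2.2.0
transformers<=4.49.0
huggingface_hub==0.25.2
```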
reacted to davidmezzetti's post with 🚀 3 days ago
Post
🧬⚕️🔬 Encoding the World's Medical Knowledge into 970K! We're excited to release this new series of vector embeddings models for medical literature based on our recent BERT Hash work.
And you read it right, we're talking 970,000 parameters for a surprisingly strong performing model. Enjoy!
https://huggingface.co/blog/neuml/biomedbert-hash-nano
Post
For model endpoints, do you think a low-commitment, pay-as-you-go model with no subscription works better for the general public?
🚀 Just launched
👉 [https://deegitals.com](https://deegitals.com)
Currently focused on text-to-image or image-to-image.
Sign up for free and share your feedback.
Post
Recently I was playing with my model. What is your idea about "unlearning"? Since I need it 😀
telcom/deewaiREALCN: I have the original on the main branch, and trained versions "cp550" and "n_680" on another branch.
Both were trained on telcom/deewaiREALCN-training.
I got three results with the prompt:
"Athlete portrait, 26-year-old woman, post-training sweat, gym ambient light, chalk dust particles, intense gaze, crisp detail."
Apparently, the model is sensitive to the word "old".
You can see that training on more faces improved over main; however, it is still not ideal...
I am now working on unlearning. I would like to hear your opinion.
#unlearning
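On the unlearning question above: one common, simple baseline is gradient ascent on the loss over the forget set, which pushes the model away from fitting those examples. Below is a toy sketch of that idea using a logistic-regression stand-in; this is entirely my own construction for illustration, not the actual diffusion-model pipeline from the post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary-classification data standing in for "examples we want to forget".
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

w = np.zeros(5)

def loss_and_grad(w, X, y):
    """Mean logistic loss and its gradient with respect to w."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    eps = 1e-12
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    grad = X.T @ (p - y) / len(y)
    return loss, grad

# 1) Ordinary training: gradient DESCENT on the full dataset.
for _ in range(200):
    _, g = loss_and_grad(w, X, y)
    w -= 0.5 * g

# 2) "Unlearning" sketch: gradient ASCENT on the forget subset only.
X_forget, y_forget = X[:20], y[:20]
before, _ = loss_and_grad(w, X_forget, y_forget)
for _ in range(5):
    _, g = loss_and_grad(w, X_forget, y_forget)
    w += 0.1 * g  # ascend: push the model away from fitting these examples
after, _ = loss_and_grad(w, X_forget, y_forget)
```

In real unlearning work the ascent is usually regularized, for example by interleaving continued descent on retained data, so the model does not degrade globally.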
replied to their post 5 days ago
Sign up using Gmail or email
posted an update 5 days ago
Post
arXiv CS endorsement
This is Javad; here is my Google Scholar profile:
https://scholar.google.com/citations?user=bja6GwoAAAAJ&hl=en
I would like to share my articles with you on Hugging Face, and I am asking for an endorsement* in the Computer Science section of arxiv.org.
If you would like to endorse me, please visit the following URL:
https://arxiv.org/auth/endorse?x=NVUAPL
If that URL does not work for you, please visit
http://arxiv.org/auth/endorse.php
and enter the following six-digit alphanumeric string:
Endorsement Code: NVUAPL
Thank you in advance.
Javad Taghia
* Who is qualified to endorse?
To endorse another user to submit to the cs.AI (Artificial Intelligence) subject class, an arXiv submitter must have submitted 3 papers to any of cs.AI, cs.AR, cs.CC, cs.CE, cs.CG, cs.CL, cs.CR, cs.CV, cs.CY, cs.DB, cs.DC, cs.DL, cs.DM, cs.DS, cs.ET, cs.FL, cs.GL, cs.GR, cs.GT, cs.HC, cs.IR, cs.IT, cs.LG, cs.LO, cs.MA, cs.MM, cs.MS, cs.NA, cs.NE, cs.NI, cs.OH, cs.OS, cs.PF, cs.PL, cs.RO, cs.SC, cs.SD, cs.SE, cs.SI or cs.SY earlier than three months ago and less than five years ago.
posted an update 9 days ago
Post
For model endpoints, do you think a low-commitment, pay-as-you-go model with no subscription works better for the general public?
🚀 Just launched
👉 [https://deegitals.com](https://deegitals.com)
Currently focused on text-to-image or image-to-image.
Sign up for free and share your feedback.
replied to StJohnDeakins's post 16 days ago
Huggingface
replied to StJohnDeakins's post 17 days ago
"DNA"
reacted to StJohnDeakins's post with 🚀 17 days ago
Post
Hey all 👋
A quick one for any founders building with Small Language Models in mobile apps: we're opening 10 Innovation Partner spots this month for our Device Native AI (DNA) platform.
What you get:
- Device Native AI SDK (AI processes data on-device, not cloud 📲)
- 99% off for 3 months, then 90% off for the rest of the year (no lock-in)
- Direct engineering access + feature releases
- It's an Innovation community, so at least some participation is required
Perfect if you're building consumer apps and want:
✓ Hyper-personalization without privacy risks
✓ Zero cloud AI token costs
✓ Early access to next-gen mobile AI
Limited spots, and on a first-come basis, so DM me "DNA" for more info and an access code. Cheers Singe 🐵
reacted to dhruv3006's post with 🚀 17 days ago
Post
Switching between API Client, browser, and API documentation tools to test and document APIs can harm your flow and leave your docs outdated.
This is what usually happens: While debugging an API in the middle of a sprint, the API Client says that everything's fine, but the docs still show an old version.
So you jump back to the code, find the updated response schema, then go back to the API Client, which gets stuck, forcing you to rerun the tests.
Hours can go by just trying to sync all this up (and that’s if you catch the inconsistencies at all).
The reason? Using disconnected tools for specs, tests, and docs. Doing manual updates, stale docs, and a lot of context switching.
Voiden takes a different approach: it puts specs, tests, and docs in one Markdown file, stored right in the repo.
Everything stays in sync, versioned with Git, and updated in one place, inside your editor.
Download Voiden here: https://voiden.md/download
replied to melvindave's post 17 days ago
Great step forward! Try doing some training afterwards to fine-tune it.
reacted to melvindave's post with 🚀 17 days ago
Post
Currently having a blast learning the transformers library.
I noticed that model cards usually have Transformers code as usage examples.
So I tried to figure out how to load a model just using the transformers library without using ollama, lmstudio, or llamacpp.
Learned how to install the dependencies required to make it work, like PyTorch and CUDA. I also used Conda for Python environment dependencies.
Once I got the model loaded and sample inference working, I made an API to serve it.
I know it's very basic stuff for machine learning experts here in HF but I'm completely new to this so I'm happy to get it working!
Model used: Qwen/Qwen3-VL-8B-Instruct
GPU: NVIDIA GeForce RTX 3090
Here's the result of my experimentation
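For anyone following the same route: the Transformers usage examples on multimodal model cards usually revolve around a chat-style messages payload that `processor.apply_chat_template` consumes. Below is a minimal sketch of building that payload. The helper name is mine, and the exact content keys (e.g. `"image"`/`"url"` vs. `"image_url"`) can vary between Transformers versions, so check the model card for Qwen/Qwen3-VL-8B-Instruct.

```python
def build_vision_messages(image_url: str, question: str) -> list:
    """Build the chat-style messages payload that multimodal processors
    typically consume via processor.apply_chat_template(...)."""
    return [
        {
            "role": "user",
            "content": [
                # Key names for the image entry may differ by Transformers version.
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]

msgs = build_vision_messages("https://example.com/photo.jpg", "What is in this image?")
```

From here the model-card flow is usually `processor.apply_chat_template(msgs, ...)` followed by `model.generate(...)` on the resulting tensors.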
posted an update 17 days ago
Post
Recently I was playing with my model. What is your idea about "unlearning"? Since I need it 😀
telcom/deewaiREALCN: I have the original on the main branch, and trained versions "cp550" and "n_680" on another branch.
Both were trained on telcom/deewaiREALCN-training.
I got three results with the prompt:
"Athlete portrait, 26-year-old woman, post-training sweat, gym ambient light, chalk dust particles, intense gaze, crisp detail."
Apparently, the model is sensitive to the word "old".
You can see that training on more faces improved over main; however, it is still not ideal...
I am now working on unlearning. I would like to hear your opinion.
#unlearning