Yatharth Gupta

Warlord-K

AI & ML interests

Generative AI, Natural Language Processing, Reinforcement Learning

Recent Activity

updated a Space about 2 months ago
Warlord-K/AI-Teller
published a Space about 2 months ago
Warlord-K/AI-Teller
updated a model 3 months ago
Alpha-VLLM/Lumina-Image-2.0

Organizations

Segmind; fal; Cynaptics Club, IIT Indore; Social Post Explorers; AuraFlow; Kanha AI

Warlord-K's activity

replied to gokaygokay's post 10 months ago

This is great! Florence 2 is an amazing model for its size, but it hallucinates a lot in terms of positioning and color. That's reduced with this model, but I think it still persists.

replied to gokaygokay's post 10 months ago

What are your thoughts on Florence 2? Do you think fine-tuning it on these datasets will help with the captioning task?

reacted to gokaygokay's post with ❤️ 10 months ago
I've fine-tuned three types of PaliGemma image captioner models for generating prompts for Text2Image models. They generate captions similar to prompts we give to the image generation models. I used google/docci and google/imageinwords datasets for fine-tuning.

This one gives you longer captions.

gokaygokay/SD3-Long-Captioner

This one gives you medium-length captions.

https://huggingface.co/spaces/gokaygokay/SD3-Long-Captioner-V2

And this one gives you shorter captions.

https://huggingface.co/spaces/gokaygokay/SDXL-Captioner

replied to their post 11 months ago

Do you mean the consistency between generations? Could you elaborate a little?

replied to their post 11 months ago
posted an update 12 months ago
What are some areas that image generation models are currently lacking in?