Vision Support?
#2 by sidkandan
"This current checkpoint only supports text input. We are actively working to roll out full multimodal features and are collaborating with open-source partners to bring Gemma 3n to the open-source community in the coming weeks."
Any estimate on when image input / vision capabilities will be available?
Thanks again for all the hard work to optimize this! :)
Hi @sidkandan,
This is the preview repo for the Gemma 3n models. The Hugging Face repos for these 3n models are available in both 2B and 4B variants with these capabilities. To learn more about the Gemma 3n models, kindly refer to this link.
Also, thanks so much for your enthusiasm for the full multimodal capabilities of the Gemma 3n LiteRT preview models! We have noted this request and will route it to the relevant team for consideration. Thank you.