huggingPartyParis

community

https://partiful.com/e/oWOMGoPxB5D37qw5F8yN


AI & ML interests

None defined yet.

Recent Activity

ananthu-aniraj authored a paper 26 days ago

Inherently Faithful Attention Maps for Vision Transformers

Borchmann authored a paper 3 months ago

Unchecked and Overlooked: Addressing the Checkbox Blind Spot in Large Language Models with CheckboxQA

altndrr authored a paper 3 months ago

On Large Multimodal Models as Open-World Image Classifiers


Jofthomas

posted an update about 2 months ago

Post

3457

Meet our new agentic model : 𝗗𝗲𝘃𝘀𝘁𝗿𝗮𝗹

Devstral is an open-source LLM built for software engineering tasks, developed in a collaboration between Mistral AI and All Hands AI 🙌.

𝗞𝗲𝘆 𝗳𝗲𝗮𝘁𝘂𝗿𝗲𝘀 :
• 🤖 𝗔𝗴𝗲𝗻𝘁𝘀 : perfect for Agentic coding
• 🍃 𝗹𝗶𝗴𝗵𝘁𝘄𝗲𝗶𝗴𝗵𝘁: Devstral is a 𝟮𝟰𝗕-parameter model based on Mistral Small.
• ©️ 𝗔𝗽𝗮𝗰𝗵𝗲 𝟮.𝟬, meaning fully open-source !
• 📄 A 𝟭𝟮𝟴𝗸 context window.

📚Blog : https://mistral.ai/news/devstral
⚡API : The model is also available on our API under the name 𝗱𝗲𝘃𝘀𝘁𝗿𝗮𝗹-𝘀𝗺𝗮𝗹𝗹-𝟮𝟱𝟬𝟱
🤗 repo : mistralai/Devstral-Small-2505

Can't wait to see what you will build with it !

1 reply

josh-r-meyer

authored a paper about 2 months ago

BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus

Paper • 2207.03546 • Published Jul 7, 2022 • 2

leonsick

authored a paper 3 months ago

Masked Scene Modeling: Narrowing the Gap Between Supervised and Self-Supervised Learning in 3D Scene Understanding

Paper • 2504.06719 • Published Apr 9 • 9

dovpie

authored a paper 3 months ago

Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation

Paper • 2503.21780 • Published Mar 27 • 9

not-lain

posted an update 4 months ago

Post

3866

🚀AraClip is now fully integrated with Hugging Face 🤗

AraClip is a specialized CLIP model that was created by @pain and optimized for Arabic text-image retrieval tasks🔥

🔗 Try it out 🔗
🤖 model: Arabic-Clip/araclip
🧩 Gradio demo: Arabic-Clip/Araclip-Simplified
🌐 website: https://arabic-clip.github.io/Arabic-CLIP/

2 replies

ivas-tri

authored a paper 4 months ago

Should VLMs be Pre-trained with Image Data?

Paper • 2503.07603 • Published Mar 10 • 3

not-lain

posted an update 5 months ago

Post

4514

I have just released a new blogpost about kv caching and its role in inference speedup 🚀
🔗 https://huggingface.co/blog/not-lain/kv-caching/
some takeaways :

4 replies

thethomasboyer

authored a paper 6 months ago

PhenDiff: Revealing Invisible Phenotypes with Conditional Diffusion Models

Paper • 2312.08290 • Published Dec 13, 2023 • 3

not-lain

posted an update 6 months ago

Post

1784

we now have more than 2000 public AI models using ModelHubMixin🤗

not-lain

posted an update 6 months ago

Post

4131

Published a new blogpost 📖
In this blogpost I go through the transformer architecture, emphasizing how shapes propagate through each layer.
🔗 https://huggingface.co/blog/not-lain/tensor-dims
some interesting takeaways :

leonsick

authored 3 papers 8 months ago

Leveraging Self-Supervised Vision Transformers for Neural Transfer Function Design

Paper • 2309.01408 • Published Sep 4, 2023

Evaluating Text to Image Synthesis: Survey and Taxonomy of Image Quality Metrics

Paper • 2403.11821 • Published Mar 18, 2024 • 3

CutS3D: Cutting Semantics in 3D for 2D Unsupervised Instance Segmentation

Paper • 2411.16319 • Published Nov 25, 2024

not-lain

posted an update 8 months ago

Post

2437

ever wondered how you can make an API call to a visual-question-answering model without sending an image url 👀

you can do that by converting your local image to base64 and sending it to the API.

recently I made some changes to my library "loadimg" that make converting images to base64 a breeze.
🔗 https://github.com/not-lain/loadimg

API request example 🛠️:

from loadimg import load_img
from huggingface_hub import InferenceClient

# imgPath_url_pillow_or_numpy is a placeholder: pass a local path, a URL,
# a Pillow image, or a NumPy array
my_b64_img = load_img(imgPath_url_pillow_or_numpy, output_type="base64")

# replace the placeholder with your own Hugging Face token
client = InferenceClient(api_key="hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Describe this image in one sentence."
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": my_b64_img  # base64 allows using images without uploading them to the web
                }
            }
        ]
    }
]

stream = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=messages,
    max_tokens=500,
    stream=True
)

# print the answer as it streams in; a chunk's content can be empty
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
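
for the curious, here is a rough standard-library sketch of what "converting an image to base64" can look like. This is not loadimg's actual implementation: the helper name to_data_url is made up for illustration, and it assumes the receiving API accepts data URLs of the form data:<mime>;base64,<payload>.

```python
import base64
import mimetypes


def to_data_url(image_bytes: bytes, filename: str = "image.png") -> str:
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    # Guess the MIME type from the file name; fall back to PNG if unknown.
    mime, _ = mimetypes.guess_type(filename)
    mime = mime or "image/png"
    # base64 output is pure ASCII, so decoding to str is safe.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# Placeholder bytes stand in for a real file, e.g. open(path, "rb").read()
data_url = to_data_url(b"\x89PNG\r\n\x1a\n", "example.png")
print(data_url)
```

the resulting string could then go in the "url" field of the message, in place of a web link.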

devendrachaplot

authored a paper 9 months ago

Pixtral 12B

Paper • 2410.07073 • Published Oct 9, 2024 • 67

Simontwice

authored a paper 9 months ago

Pixtral 12B

Paper • 2410.07073 • Published Oct 9, 2024 • 67

imthanhlv

authored a paper 11 months ago

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Paper • 2408.14176 • Published Aug 26, 2024 • 63

Jofthomas

posted an update 11 months ago

Post

7641

Everchanging Quest is out !

It is an LLM-controlled roguelike in which the LLM gets a markdown representation of the map and generates a JSON containing the objective to fulfill on the map, as well as the necessary objects and their placements.

Come test it on the space :
Jofthomas/Everchanging-Quest