Gautier Evennou

Gevennou

AI & ML interests

PhD in ML on Multimodal

Gevennou's activity

upvoted a paper about 1 month ago

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27 • 90

upvoted a paper about 2 months ago

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25 • 100

upvoted 2 papers 5 months ago

StableSemantics: A Synthetic Language-Vision Dataset of Semantic Representations in Naturalistic Images

Paper • 2406.13735 • Published Jun 19 • 5

The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing

Paper • 2406.10601 • Published Jun 15 • 65

upvoted 2 papers 8 months ago

Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm

Paper • 2403.11781 • Published Mar 18 • 17

Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation

Paper • 2403.12015 • Published Mar 18 • 64

upvoted 2 papers 10 months ago

TOFU: A Task of Fictitious Unlearning for LLMs

Paper • 2401.06121 • Published Jan 11 • 15

Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions

Paper • 2401.01827 • Published Jan 3 • 15

upvoted 3 papers 11 months ago

MobileVLM: A Fast, Reproducible and Strong Vision Language Assistant for Mobile Devices

Paper • 2312.16886 • Published Dec 28, 2023 • 19

SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing

Paper • 2312.11392 • Published Dec 18, 2023 • 19

VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models

Paper • 2312.00845 • Published Dec 1, 2023 • 36

upvoted 4 papers over 1 year ago

CopyRNeRF: Protecting the CopyRight of Neural Radiance Fields

Paper • 2307.11526 • Published Jul 21, 2023 • 11

Towards Language Models That Can See: Computer Vision Through the LENS of Natural Language

Paper • 2306.16410 • Published Jun 28, 2023 • 27

Retrieval-Enhanced Contrastive Vision-Text Models

Paper • 2306.07196 • Published Jun 12, 2023 • 7

Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation

Paper • 2306.07954 • Published Jun 13, 2023 • 113