
Mohamed Sadek Saadi

yohoji
AI & ML interests

None yet

Recent Activity


Organizations

None yet

yohoji's activity

reacted to jasoncorkill's post with 🔥 15 days ago
🔥 Yesterday was a fire day!
We dropped two brand-new datasets capturing Human Preferences for text-to-video and text-to-image generations powered by our own crowdsourcing tool!

Whether you're working on model evaluation, alignment, or fine-tuning, this is for you.

1. Text-to-Video Dataset (Pika 2.2 model):
Rapidata/text-2-video-human-preferences-pika2.2

2. Text-to-Image Dataset (Reve-AI Halfmoon):
Rapidata/Reve-AI-Halfmoon_t2i_human_preference

Let's train AI on AI-generated content with humans in the loop.
Let's make generative models that actually get us.
reacted to jasoncorkill's post with ❤️🚀🔥 21 days ago
🚀 Rapidata: Setting the Standard for Model Evaluation

Rapidata is proud to announce our first independent appearance in academic research, featured in the Lumina-Image 2.0 paper. This marks the beginning of our journey to become the standard for testing text-to-image and generative models. Our expertise in large-scale human annotations allows researchers to refine their models with accurate, real-world feedback.

As we continue to establish ourselves as a key player in model evaluation, we're here to support researchers with high-quality annotations at scale. Reach out to [email protected] to see how we can help.

Lumina-Image 2.0: A Unified and Efficient Image Generative Framework (2503.21758)
reacted to jasoncorkill's post with 🔥🔥 27 days ago
🔥 It's out! We published the dataset for our evaluation of @OpenAI's new 4o image generation model.

Rapidata/OpenAI-4o_t2i_human_preference

Yesterday we published the first large evaluation of the new model, showing that it absolutely leaves the competition in the dust. We have now made the results and data available here! Please check it out and ❤️!
reacted to jasoncorkill's post with 🔥 28 days ago
🚀 First Benchmark of @OpenAI's 4o Image Generation Model!

We've just completed the first-ever (to our knowledge) benchmarking of the new OpenAI 4o image generation model, and the results are impressive!

In our tests, OpenAI 4o image generation absolutely crushed leading competitors, including @black-forest-labs, @google, @xai-org, Ideogram, Recraft, and @deepseek-ai, in prompt alignment and coherence! It holds a gap of more than 20% over the nearest competitor in Bradley-Terry score, the biggest we have seen since the benchmark began!
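For context, a Bradley-Terry score ranks models from pairwise human votes: each model gets a strength, and the probability that model i beats model j is modeled as p[i] / (p[i] + p[j]). A minimal sketch of fitting these strengths with a simple MM-style update in plain Python (the function and its update schedule are illustrative, not Rapidata's actual pipeline):

```python
def bradley_terry(wins, n_items, iters=200):
    """Fit Bradley-Terry strengths from a pairwise win matrix.

    wins[i][j] = number of times item i beat item j in head-to-head
    votes. Returns strengths normalized to sum to n_items, so
    p[i] / (p[i] + p[j]) estimates P(item i beats item j).
    """
    p = [1.0] * n_items
    for _ in range(iters):
        new_p = []
        for i in range(n_items):
            # MM update: total wins over expected comparisons at current strengths
            num = sum(wins[i][j] for j in range(n_items) if j != i)
            den = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                      for j in range(n_items) if j != i)
            new_p.append(num / den if den > 0 else p[i])
        # Renormalize so strengths stay on a fixed scale
        scale = n_items / sum(new_p)
        p = [x * scale for x in new_p]
    return p

# Two models, model 0 wins 8 of 10 head-to-head comparisons:
strengths = bradley_terry([[0, 8], [2, 0]], 2)
# strengths[0] / (strengths[0] + strengths[1]) ≈ 0.8
```

A ">20% gap" in this score means the top model's strength exceeds the runner-up's by over 20% on the normalized scale.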

The benchmarks are based on 200k human responses collected through our API. However, the most challenging part wasn't the benchmarking itself, but generating and downloading the images:

- 5 hours to generate 1000 images (no API available yet)
- Just 10 minutes to set up and launch the benchmark
- Over 200,000 responses rapidly collected

While generating the images, we hit some hurdles that forced us to leave out certain parts of our prompt set. In particular, we observed that the OpenAI 4o model proactively refused to generate certain images:

🚫 Styles of living artists: completely blocked
🚫 Copyrighted characters (e.g., Darth Vader, Pokémon): initially generated but subsequently blocked

Overall, OpenAI 4o stands out significantly in alignment and coherence, especially excelling in certain unusual prompts that have historically caused issues, such as 'A chair on a cat.' See the images for more examples!
reacted to jasoncorkill's post with 🔥👍 about 1 month ago
At Rapidata, we compared DeepL with LLMs like DeepSeek-R1, Llama, and Mixtral for translation quality using feedback from over 51,000 native speakers. Despite its cost, DeepL's performance makes it a valuable investment, especially in critical applications where translation quality is paramount. Now we can say that Europe does more than impose regulations.

Our dataset, based on these comparisons, is now available on Hugging Face. This might be useful for anyone working on AI translation or language model evaluation.

Rapidata/Translation-deepseek-llama-mixtral-v-deepl
reacted to jasoncorkill's post with 🚀 about 2 months ago
Has OpenGVLab Lumina Outperformed OpenAI's Model?

We've just released the results from a large-scale human evaluation (400k annotations) of OpenGVLab's newest text-to-image model, Lumina. Surprisingly, Lumina outperforms OpenAI's DALL-E 3 in terms of alignment, although it ranks #6 in our overall human preference benchmark.

To support further development in text-to-image models, we're making our entire human-annotated dataset publicly available. If you're working on model improvements and need high-quality data, feel free to explore.

We welcome your feedback and look forward to any insights you might share!

Rapidata/OpenGVLab_Lumina_t2i_human_preference
reacted to jasoncorkill's post with 🚀🚀 2 months ago
This dataset was collected in roughly 4 hours using the Rapidata Python API, showcasing how quickly large-scale annotations can be performed with the right tooling!

All that at less than the cost of a single hour of a typical ML engineer in Zurich!

The new dataset contains ~22,000 human annotations evaluating AI-generated videos along dimensions such as Prompt-Video Alignment, Word-for-Word Prompt Alignment, Style, Speed of Time Flow, and Quality of Physics.

Rapidata/text-2-video-Rich-Human-Feedback
reacted to jasoncorkill's post with 👀 2 months ago
Integrating human feedback is vital for evolving AI models. Boost quality, scalability, and cost-effectiveness with our crowdsourcing tool!

Or run A/B tests and gather thousands of responses in minutes. Upload two images, ask a question, and watch the insights roll in!

Check it out here and let us know your feedback: https://app.rapidata.ai/compare