C4AI Community

community

https://cohere.com/research

CohereForAI

Activity Feed Request to join this org

AI & ML interests

None defined yet.

Recent Activity

beyzaermis authored a paper about 21 hours ago

Command A: An Enterprise-Ready Large Language Model

johndang-cohere authored a paper about 21 hours ago

Command A: An Enterprise-Ready Large Language Model

viraat authored a paper about 21 hours ago

Command A: An Enterprise-Ready Large Language Model

View all activity

C4AI-Community's activity

sanderland

authored a paper about 21 hours ago

Command A: An Enterprise-Ready Large Language Model

Paper • 2504.00698 • Published 2 days ago • 17

MarziehFadaee

authored a paper about 21 hours ago

Command A: An Enterprise-Ready Large Language Model

Paper • 2504.00698 • Published 2 days ago • 17

beyzaermis

authored a paper about 21 hours ago

Command A: An Enterprise-Ready Large Language Model

Paper • 2504.00698 • Published 2 days ago • 17

johndang-cohere

authored a paper about 21 hours ago

Command A: An Enterprise-Ready Large Language Model

Paper • 2504.00698 • Published 2 days ago • 17

hesamation

posted an update 1 day ago

Post

1800

What, How, Where, and How Well? This paper reviews test-time scaling methods and all you need to know about them:
> parallel, sequential, hybrid, internal scaling
> how to scale (SFT, RL, search, verification)
> metrics and evals of test-time scaling

🔗paper: What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models (2503.24235)

If you want to learn what inference-time compute scaling is @rasbt has a great blog post on that:
https://magazine.sebastianraschka.com/p/state-of-llm-reasoning-and-inference-scaling

hesamation

posted an update 3 days ago

Post

1667

this paper lists ways to make reasoning LLMs more efficient:
> enforce token limits per reasoning step
> route tasks to different models (small/large)
> compress reasoning chains during SFT
> reward based on reasoning length
> parallel search at test-time
and more...
@Xiaoye08 @yaful @Warrieryes
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond (2503.21614)

mmhamdy

posted an update 4 days ago

Post

1484

What inspired the Transformer architecture in the "Attention Is All You Need" paper? And how were various ideas combined to create this groundbreaking model?

In this lengthy article, I explore the story and the origins of some of the ideas introduced in the paper. We'll explore everything from the fundamental attention mechanism that lies at its heart to the surprisingly simple explanation for its name, Transformer.

💡 Examples of ideas explored in the article:

✅ What was the inspiration for the attention mechanism?
✅ How did we go from attention to self-attention?
✅ Did the team have any other names in mind for the model?

and more...

I aim to tell the story of Transformers as I would have wanted to read it, and hopefully, one that appeals to others interested in the details of this fascinating idea. This narrative draws from video interviews, lectures, articles, tweets/Xs, and some digging into the literature. I have done my best to be accurate, but errors are possible. If you find inaccuracies or have any additions, please do reach out, and I will gladly make the necessary updates.

Read the article: https://huggingface.co/blog/mmhamdy/pandemonium-the-transformers-story

Aurelien-Morgan

posted an update 5 days ago

Post

1910

Almost there !
https://test.pypi.org/project/test-010-retrain-pipelines/

takarajordan

posted an update 9 days ago

Post

1838

Takara takes 3rd place in the {tech:munich} AI hackathon with Fudeno!

A little over 2 weeks ago @aldigobbler and I set out to create the largest MultiModal SVG dataset ever created, we succeeded in this and when I was in Munich, Germany I took it one step further and made an entire app with it!

We fine-tuned Mistral Small, made a Next.JS application and blew some minds, taking 3rd place out of over 100 hackers. So cool!

If you want to see the dataset, please see below.

takara-ai/fudeno-instruct-4M

louisbrulenaudet

posted an update 11 days ago

Post

826

I’ve just released logfire-callback on PyPI, designed to facilitate monitoring of Hugging Face Transformer training loops using Pydantic Logfire 🤗

The callback will automatically log training start with configuration parameters, periodic metrics and training completion ⏱️

Install the package using pip:

pip install logfire-callback

First, ensure you have a Logfire API token and set it as an environment variable:

export LOGFIRE_TOKEN=your_logfire_token

Then use the callback in your training code:

from transformers import Trainer, TrainingArguments
from logfire_callback import LogfireCallback

# Initialize your model, dataset, etc.

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    # ... other training arguments
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    callbacks=[LogfireCallback()]  # Add the Logfire callback here
)

trainer.train()

If you have any feedback, please reach out at @louisbrulenaudet

tellarin

authored a paper 16 days ago

Being-0: A Humanoid Robotic Agent with Vision-Language Models and Modular Skills

Paper • 2503.12533 • Published 18 days ago • 61

AhmadMustafa

authored a paper 18 days ago

On the Limitations of Vision-Language Models in Understanding Image Transforms

Paper • 2503.09837 • Published 22 days ago • 10

ljvmiranda921

authored 2 papers 22 days ago

MMTEB: Massive Multilingual Text Embedding Benchmark

Paper • 2502.13595 • Published Feb 19 • 33

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia

Paper • 2503.07920 • Published 23 days ago • 95

not-lain

posted an update 22 days ago

Post

1737

🚀AraClip is now fully integrated with Hugging Face 🤗

AraClip is a specialized CLIP model that was created by @pain and optimized for Arabic text-image retrieval tasks🔥

🔗 Try it out 🔗
🤖 model: Arabic-Clip/araclip
🧩 Gradio demo: Arabic-Clip/Araclip-Simplified
🌐 website: https://arabic-clip.github.io/Arabic-CLIP/