Huu Nguyen

huu-ontocord

AI & ML interests

None yet

Recent Activity

updated a dataset 4 days ago
ontocord/people_list
new activity 6 days ago
HuggingFaceFV/finevideo:Cleanup TTS
updated a model 8 days ago
ontocord/Felix-8B-v2

Organizations

LAION eV, OpenAssistant, Ontocord's M*DEL, Blog-explorers, PIISA, BEEspoke Data, Vietnamese Mistral, SafeLMM, Aurora-M, Ontocord.AI

Posts 1

We would like to announce our Aurora-M multilingual models, which are based on StarCoderPlus.
Twitter: https://twitter.com/ontocord/status/1772778544051155029
LinkedIn: https://www.linkedin.com/feed/update/urn:li:activity:7178521998845759488/
Blog post: https://huggingface.co/blog/mayank-mishra/aurora
arXiv: Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order (2404.00399)

Current LLMs are very susceptible to generating toxic, harmful, and even dangerous content. They can also generate outputs with gender or racial biases. The Biden-Harris Executive Order (https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence) sets forth guidelines on what is considered a safe AI system.
Following up on these guidelines, we present the world's first open-source multilingual language model red-teamed according to the Biden-Harris Executive Order: Aurora-M. Inspired by BigScience, the model is trained on five languages: English, Hindi, Japanese, Vietnamese, and Finnish.

* Red-teamed model: aurora-m/aurora-m-biden-harris-redteamed (safety tuned according to the order mentioned above)
* Base model: aurora-m/aurora-m-base (not safety tuned)
* Instruct model: aurora-m/aurora-m-instruct (not safety tuned)

@mayank-mishra @cabbage972 @sted97 @Xa9aX @Taishi-N324 @Muennighoff @vumichien @prateeky2806 @felfri @spyysalo and many many others!

models

None public yet

datasets

None public yet