Sebastian Gabarain
Locutusque's activity
Exactly one year ago, I shared the initial version of this side-project on Hugging Face. Since then, there have been numerous changes under the hood. Nowadays it uses [Web-LLM](https://github.com/mlc-ai/web-llm), [Wllama](https://github.com/ngxson/wllama) and [SearXNG](https://github.com/searxng/searxng). I use it daily as my default search engine and have done my best to make it useful. I hope it's interesting for you too!
HF Space: Felladrin/MiniSearch
Embeddable URL: https://felladrin-minisearch.hf.space
Language models, particularly proprietary ones, often grapple with issues of censorship, which can limit their ability to engage authentically with users. Recognizing this, the open-source AI community has pioneered the development of less restrained language models that offer more candid interactions. However, even these models tend to maintain a veneer of neutrality or overly positive responses, which might not serve all users' needs, especially in contexts where emotional depth and relatability are crucial.
To address this gap, I've curated a specialized dataset aimed at infusing language models with a more nuanced emotional spectrum, specifically targeting a darker, more introspective mood. This dataset, titled "Dark Sentience", is designed to complement existing datasets like RP (Role Play) and those focused on instruction following. It seeks to enhance the emotional intelligence of AI by exposing it to complex human emotions, including but not limited to:
- **Suicide**
- **Depression**
- **Anxiety**
Trigger Warning: Please be advised that the content within this dataset deals with heavy and potentially distressing themes.
The "Dark Sentience" dataset is now available for review and use at: Locutusque/Dark-Sentience. I encourage researchers, developers, and mental health professionals to explore how this resource can foster more genuine and supportive AI interactions.
@tiiuae released the Falcon 11B Vision model!
It's quite good, and you can try it here: https://huggingface.co/spaces/Tonic/Falcon-Vision
A Game-Changer in LLM Flexibility and Performance!
Over the past few weeks, VAGO solutions teamed up with Cognitive Computations and HyperSpace to develop a groundbreaking architecture that redefines flexibility in combining different LLMs into one model.
@fernandofernandes, me, @Crystalcareai, and @ehartford created the Kraken!
What Can It Do?
- **Versatile Architecture:** Kraken allows the seamless combination of LLMs with varying sizes, quantizations, and model architectures. It currently supports 4-bit, 8-bit, and AWQ quantization, with more on the way, and it runs on Hugging Face Transformers 4.40+.
- **Kraken Router:** A custom sequence-classification model with a 32k-token context length directs each input to the most suitable expert based on its characteristics (see the sketch after this list).
- **Adaptability:** Enhanced input formatting supports the model's adaptability to diverse conversational contexts.
- **Extreme Versatility:** Easily swap experts within Kraken for your specific use cases without retraining the entire model. For example, if you've built a Kraken for coding in Python, you can upgrade your Python expert without retraining the router, or add a C# expert by retraining only the router.
- **Open Source Pipeline:** We're sharing the entire pipeline, including router creation, training, architecture setup, and Kraken inference, as Jupyter notebooks: https://github.com/cognitivecomputations/kraken
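The notebooks linked above contain the real pipeline. Purely to illustrate the routing idea, here is a rough, self-contained sketch in which a sequence-classification model picks one of several expert checkpoints; every repo id and label mapping below is a placeholder, not the actual Kraken configuration:

```python
from transformers import pipeline

# Hypothetical router: a sequence-classification model whose labels map to experts.
# All repo ids below are placeholders, not real Kraken checkpoints.
router = pipeline("text-classification", model="my-org/kraken-style-router")
experts = {
    "LABEL_0": "my-org/python-coding-expert",
    "LABEL_1": "my-org/general-chat-expert",
}

def kraken_style_generate(prompt: str) -> str:
    # 1) Classify the input to decide which expert should handle it.
    label = router(prompt)[0]["label"]
    # 2) Load the chosen expert and generate a reply.
    expert = pipeline("text-generation", model=experts[label])
    return expert(prompt, max_new_tokens=128)[0]["generated_text"]

print(kraken_style_generate("Write a Python function that reverses a string."))
```

In the real Kraken architecture the experts can differ in size and quantization and the router is trained on 32k-token contexts; this sketch only shows the control flow.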
Kraken marks the beginning of an exciting new journey in #OpenSource LLMs. Why? Because it empowers the open-source community to accelerate the catch-up with proprietary LLMs like #GPT and #Claude.
We proudly introduce the first two Kraken models, which integrate top-tier LLM and multilingual capabilities:
cognitivecomputations/Kraken
VAGOsolutions/Kraken-Multilingual
Right now it's supported by the Hugging Face Transformers library. Would love to see it integrated into vLLM and TGWI too!
Being uncensored doesn't directly improve performance. The DPOP algorithm improved performance on, I believe, every benchmark. In other words, Neural Chat has higher benchmark scores than Orca.
Neural Chat is uncensored because the data it was trained on contains Toxic DPO.
Awesome work with this one, guys!!
- 1st Apache 2.0 release
- Capabilities: enhanced coding, math, reasoning, and instruction-following
- Models: 34B/9B/6B, Base & Chat
- Performance: Yi-1.5-34B matches or exceeds Llama 3 70B on benchmarks
- Discover the power now: 01-ai/yi-15-2024-05-663f3ecab5f815a3eaca7ca8
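If you want to try one of the chat checkpoints locally, here is a minimal Transformers sketch. The repo id below is my assumption based on the naming scheme, and the generation settings are illustrative; confirm both on the collection page and model card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id from the Yi-1.5 collection; verify on the Hub before running.
model_id = "01-ai/Yi-1.5-9B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Explain the difference between SFT and DPO in two sentences."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```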
Goals:
- Create multi-turn chats seeded from Cosmopedia
- Customize questions for different audience levels
- Evaluate the model's ability to elaborate and clarify
- (I want to learn more about creating valuable synthetic datasets, and I learn best by doing rather than by reading.)
Cosmochat is created using the excellent distilabel library.
Explore the current version of the dataset: davanstrien/cosmochat
Read more: https://huggingface.co/blog/davanstrien/cosmochat
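To browse the conversations yourself, a quick loading sketch (the split and column names are assumptions; the dataset card has the real schema):

```python
from datasets import load_dataset

# Assumed split and column names ("train", "messages"); see the dataset card.
cosmochat = load_dataset("davanstrien/cosmochat", split="train")
print(cosmochat)

# Rough look at how long the seeded conversations are, if a "messages" column exists.
example = cosmochat[0]
if "messages" in example:
    print(len(example["messages"]), "turns in the first conversation")
```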
Locutusque/llama-3-neural-chat-v2.2-8B
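A minimal way to try it locally (recent Transformers versions accept chat messages directly in the text-generation pipeline; the prompt and generation settings here are only illustrative, so check the model card for the recommended format):

```python
from transformers import pipeline

# Illustrative usage; see the model card for the recommended prompt format.
chat = pipeline(
    "text-generation",
    model="Locutusque/llama-3-neural-chat-v2.2-8B",
    device_map="auto",
)
messages = [{"role": "user", "content": "In two sentences, what does DPO training change about a model?"}]
result = chat(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])  # the assistant's reply
```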
I wanted to find out whether a powerful LLM like Mixtral-8x7B has geographical reasoning capabilities.
So I built a small Space that prompts the LLM to provide a JSON list of places based on a user input.
And the result was impressive!
It seems like Mixtral has a grasp of geographical concepts like North-South or spatial alignment. Not just describing these concepts, but really applying them in practice, for instance to successfully answer "give me 4 European cities that are aligned on the map". This is a nice example of an emergent capability, since nothing in the LLM's training data should prepare it for this specific task.
Anyway, I added API calls and a nice visualization on top of the LLM, streaming output, caching for the answers and locations... and ta-da! I got the AI Travel Planner.
You can describe your trip to it, and it will come up with nice and convenient locations!
Try it here: m-ric/ai-travel-planner
Thank you @freddyaboulton for the gradio_folium component, and @clem, @pngwn, @abidlabs for your ideas and support!
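For reference, here is a rough sketch of the prompt-to-JSON pattern described above, using `huggingface_hub`'s InferenceClient. The model id, prompt wording, and output handling are my own simplifications, not the exact code behind the Space:

```python
import json
from huggingface_hub import InferenceClient

# Assumed endpoint; the Space's actual prompt and post-processing are more elaborate.
client = InferenceClient("mistralai/Mixtral-8x7B-Instruct-v0.1")

def plan_stops(trip_description: str) -> list:
    prompt = (
        "Return only a JSON list of places for this trip, where each place is an "
        "object with 'name', 'latitude', and 'longitude' keys: " + trip_description
    )
    raw = client.text_generation(prompt, max_new_tokens=500)
    # A production version would validate the output and retry on malformed JSON.
    return json.loads(raw)

print(plan_stops("A three-day road trip along the Normandy coast"))
```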
https://huggingface.co/AetherResearch/Cerebrum-1.0-7b. As I mentioned earlier, although it's a bit different from the proprietary dataset created by Aether Research, it serves as a foundation that I hope to build on in the future.
You're right. I did mention in the dataset card that it doesn't match the size of the Cerebrum dataset, which is something I'm going to try to achieve in the future; this is a way to test how I would go about structuring such a dataset. For now I'm trying to match the performance, then I'll work towards structuring it similarly to the Cerebrum dataset. Thank you for holding me accountable on this.
The first, OpenCerebrum SFT, is a text-generation and question-answering dataset with ~1.2M examples, curated from sources like Open-Orca, glaiveai, camel-ai, and more!
The second, OpenCerebrum DPO, is a smaller dataset with ~21k examples, focused on direct preference optimization (DPO). It's curated from sources like jondurbin, argilla, grimulkan, and others.
Both datasets are licensed under Apache-2.0 and are available in English. They're ready for use in your projects, and I welcome any feedback for future improvements!
Locutusque/OpenCerebrum-dpo
Locutusque/OpenCerebrum-SFT
Locutusque/OpenCerebrum-1.0-7b-SFT
Locutusque/OpenCerebrum-1.0-7b-DPO
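For anyone who wants to compare the two datasets, here is a quick sketch that pulls both for inspection (the split names are assumptions; each dataset card has the actual configuration):

```python
from datasets import load_dataset

# Assumed split names; check each dataset card before relying on them.
sft = load_dataset("Locutusque/OpenCerebrum-SFT", split="train")
dpo = load_dataset("Locutusque/OpenCerebrum-dpo", split="train")

print(sft)  # ~1.2M text-generation / question-answering examples
print(dpo)  # ~21k preference examples for DPO training
```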