
Sebastian Gabarain

Locutusque

AI & ML interests

Pushing performance in small language models

Recent Activity

reacted to nroggendorff's post with 😔 2 days ago
im so tired
liked a model 2 days ago
answerdotai/ModernBERT-large
liked a model 3 days ago
microsoft/BiomedParse

Organizations

BigScience Biomedical Datasets, ZeroGPU Explorers, The Hydra Project, Social Post Explorers, Cognitive Computations, fne, M4-ai, Quasar Research, Hugging Face Discord Community, Data Is Better Together Contributor

Locutusque's activity

reacted to nroggendorff's post with 😔 2 days ago
im so tired
reacted to Felladrin's post with 👍 2 months ago
MiniSearch is celebrating its 1st birthday! 🎉

Exactly one year ago, I shared the initial version of this side-project on Hugging Face. Since then, there have been numerous changes under the hood. Nowadays it uses [Web-LLM](https://github.com/mlc-ai/web-llm), [Wllama](https://github.com/ngxson/wllama) and [SearXNG](https://github.com/searxng/searxng). I use it daily as my default search engine and have done my best to make it useful. I hope it's interesting for you too!

HF Space: Felladrin/MiniSearch
Embeddable URL: https://felladrin-minisearch.hf.space
reacted to hfposts's post with 🤯 2 months ago
1+2=3
reacted to nroggendorff's post with 👀 4 months ago
posted an update 4 months ago
**Exploring Realistic Emotional Depth in AI Language Models**

Language models, particularly proprietary ones, often grapple with censorship, which can limit their ability to engage authentically with users. Recognizing this, the open-source AI community has pioneered less restrained language models that offer more candid interactions. However, even these models tend to maintain a veneer of neutrality or overly positive responses, which might not serve all users' needs, especially in contexts where emotional depth and relatability are crucial.

To address this gap, I've curated a specialized dataset aimed at infusing language models with a more nuanced emotional spectrum, specifically targeting a darker, more introspective mood. This dataset, titled "Dark Sentience", is designed to complement existing datasets like RP (Role Play) and those focused on instruction following. It seeks to enhance the emotional intelligence of AI by exposing it to complex human emotions, including but not limited to:

- **Suicide**
- **Depression**
- **Anxiety**

Trigger Warning: Please be advised that the content within this dataset deals with heavy and potentially distressing themes.

The "Dark Sentience" dataset is now available for review and use at: Locutusque/Dark-Sentience. I encourage researchers, developers, and mental health professionals to explore how this resource can foster more genuine and supportive AI interactions.

reacted to Tar9897's post with 👍 7 months ago
Octave-X releases their proprietary model Tenzin. For now, access will be given to a select few and will gradually open up. Our model differs from other models in the way it learns: it is not fed heaps of information, but starts learning exactly like a human, first studying grammar patterns, then the number system, then learning to synthesize words and then sentences, and so on. Patience is key with Tenzin. It keeps learning 24/7, with or without user input. We have decided to keep our model closed-source given the novel algorithms and ideas integrated into it. Please expect our data card soon, followed by our research paper. You can check us out at https://octave-x.com/
reacted to lunarflu's post with 🔥 7 months ago
cooking up something... anyone interested in a daily activity tracker for HF?
reacted to Tonic's post with 🔥 7 months ago
reacted to DavidGF's post with 🔥 7 months ago
The kraken has awakened!
A Game-Changer in LLM Flexibility and Performance!

Over the past few weeks, VAGO solutions teamed up with Cognitive Computations and HyperSpace to develop a groundbreaking architecture that redefines flexibility in combining different LLMs into one model.

@fernandofernandes , me, @Crystalcareai , @ehartford created the Kraken!

What Can It Do? 🐙
✅ Versatile Architecture: Kraken allows the seamless combination of LLMs with varying sizes, quantizations, and model architectures. It currently supports 4-bit, 8-bit, and AWQ quantization, with more on the way, and it runs on Hugging Face Transformers 4.40+.

✅ Kraken Router: Utilizing a custom sequence-classification model with a context length of 32k tokens, the Kraken Router directs inputs to the most suitable expert based on their characteristics (see the sketch after this post).

✅ Adaptability: Enhanced input formatting supports the model's adaptability to diverse conversational contexts.

✅ Extreme Versatility: Easily swap experts within Kraken for your specific use cases without retraining the entire model. For example, if you've built a Kraken for coding in Python, you can upgrade your Python expert without retraining the router, or add a C# expert by retraining only the router.

✅ Open-Source Pipeline: We're sharing the entire pipeline, including router creation, training, architecture setup, and Kraken inference, as Jupyter notebooks: https://github.com/cognitivecomputations/kraken

Kraken marks the beginning of an exciting new journey in #OpenSource LLMs. Why? Because it empowers the open-source community to accelerate the catch-up process with proprietary LLMs like #GPT and #Claude 🤩

We proudly introduce the first two Kraken models, which integrate top-tier LLM and multilingual capabilities:
cognitivecomputations/Kraken
VAGOsolutions/Kraken-Multilingual
Right now it's supported by the Hugging Face Transformers library. Would love to see integration into VLM and TGWI!
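
To make the routing idea concrete, here is a minimal sketch of the pattern, assuming a generic off-the-shelf text classifier; the router and expert model IDs below are placeholders, not the released Kraken components:

```python
# Sketch of the routing idea behind Kraken, not the released implementation:
# a sequence-classification model labels the prompt, and the label selects
# which expert LLM should answer. Router and expert IDs are placeholders.
from transformers import pipeline

# Placeholder router: any text classifier whose labels map onto experts.
router = pipeline("text-classification",
                  model="distilbert-base-uncased-finetuned-sst-2-english")

experts = {  # label -> placeholder expert model IDs
    "POSITIVE": "expert-python-coder",
    "NEGATIVE": "expert-csharp-coder",
}

prompt = "Write a Python function that reverses a string."
label = router(prompt)[0]["label"]
print("Routing to expert:", experts[label])
# The chosen expert would then generate the reply, e.g.:
# generator = pipeline("text-generation", model=experts[label])
# print(generator(prompt, max_new_tokens=128)[0]["generated_text"])
```

In the real Kraken, the router is a custom sequence classifier with a 32k-token context, so routing decisions can take far longer prompts into account than this toy classifier.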
replied to their post 7 months ago

Being uncensored doesn't directly improve performance. The DPOP algorithm improved performance on, I believe, every benchmark. In other words, neural chat has higher benchmark scores than orca.

replied to their post 7 months ago

Neural chat is uncensored because the data it was trained on contains Toxic-DPO.

replied to lorinma's post 8 months ago
reacted to lorinma's post with 🔥 8 months ago
🎉 Big reveal: 01.AI Yi-1.5 models are in town!

📜 1st Apache 2.0 release
💡 Capabilities: Enhanced coding, math, reasoning, & instruction-following
🤖 Models: 34B/9B/6B, Base & Chat
🏆 Performance: Yi-1.5-34B matches or exceeds Llama 3 70B in benchmarks
🔥 Discover the power now! 01-ai/yi-15-2024-05-663f3ecab5f815a3eaca7ca8
reacted to davanstrien's post with 🔥 8 months ago
Introducing CosmoChat, a multi-turn chat dataset based on Cosmopedia that I'm working on in the open on the Hub.

🎯 Goals:
💬 Create multi-turn chats seeded from Cosmopedia
🎓 Customize questions for different audience levels
🔍 Evaluate the model's ability to elaborate and clarify
🤓 (I want to learn more about creating valuable synthetic datasets, and I learn best by doing stuff rather than reading stuff.)

CosmoChat is created using the excellent distilabel library.

🔗 Explore the current version of the dataset: davanstrien/cosmochat
📝 Read more: https://huggingface.co/blog/davanstrien/cosmochat
posted an update 8 months ago
Introducing llama-3-neural-chat-v2.2-8b! This powerful conversational AI model builds on Meta's Llama 3 and was fine-tuned by Locutusque for enhanced performance in coding, math & writing.

Locutusque/llama-3-neural-chat-v2.2-8B
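
A minimal usage sketch with the Transformers library; the dtype and generation settings are assumptions, not recommendations from the post:

```python
# Minimal sketch: chat with llama-3-neural-chat-v2.2-8B via Transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Locutusque/llama-3-neural-chat-v2.2-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content": "Explain the quadratic formula."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```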
posted an update 8 months ago
I created a Twitter account a while back. I finally decided to make it public: SebastianG74019. For those of you following @Locutusque on Twitter, that is not me! 😂
reacted to m-ric's post with 🔥 9 months ago
๐—ก๐—ฒ๐˜„ ๐—ฆ๐—ฝ๐—ฎ๐—ฐ๐—ฒ: ๐˜ผ๐™„ ๐™๐™ง๐™–๐™ซ๐™š๐™ก ๐™ฅ๐™ก๐™–๐™ฃ๐™ฃ๐™š๐™ง ๐Ÿ—บ๏ธ๐Ÿ•๏ธ Plan your next vacation in a few minutes!

I wanted to find out whether a powerful LLM like Mixtral-8x7B has geographical reasoning capabilities.
So I built a small Space that prompts the LLM to provide a JSON list of places based on a user input.
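
Roughly, the core pattern looks like this minimal sketch; the prompt wording and model endpoint are assumptions, not the Space's actual code:

```python
# Sketch of the core idea: ask Mixtral for a strict JSON list of places.
import json
from huggingface_hub import InferenceClient

client = InferenceClient("mistralai/Mixtral-8x7B-Instruct-v0.1")
prompt = (
    "[INST] Suggest stops for this trip: 'one week of hiking in the Alps'. "
    'Reply ONLY with a JSON list of objects with keys "name", "lat", "lon". '
    "[/INST]"
)
raw = client.text_generation(prompt, max_new_tokens=400)
places = json.loads(raw)  # may need cleanup if the model adds extra text
for place in places:
    print(place["name"], place["lat"], place["lon"])
```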

And the result was impressive! 🤯

⇒ It seems like Mixtral has a grasp of geographical concepts like North-South, or spatial alignment. 🧭 Not just describing these concepts, but really applying them in practice, for instance to successfully answer "give me 4 European cities that are aligned on the map". This is a nice example of an emergent capability, since nothing in the LLM's training data should prepare it for this specific task.

Anyway, I added API calls and a nice visualization on top of the LLM, streaming output, caching for the answers and locations... and ta-da! ✨ I got the AI Travel Planner.

๐™”๐™ค๐™ช ๐™˜๐™–๐™ฃ ๐™™๐™š๐™จ๐™˜๐™ง๐™ž๐™—๐™š ๐™ž๐™ฉ ๐™ฎ๐™ค๐™ช๐™ง ๐™ฉ๐™ง๐™ž๐™ฅ, ๐™–๐™ฃ๐™™ ๐™ž๐™ฉ ๐™ฌ๐™ž๐™ก๐™ก ๐™˜๐™ค๐™ข๐™š ๐™ช๐™ฅ ๐™ฌ๐™ž๐™ฉ๐™ ๐™ฃ๐™ž๐™˜๐™š ๐™–๐™ฃ๐™™ ๐™˜๐™ค๐™ฃ๐™ซ๐™š๐™ฃ๐™ž๐™š๐™ฃ๐™ฉ ๐™ก๐™ค๐™˜๐™–๐™ฉ๐™ž๐™ค๐™ฃ๐™จ!

๐™๐™ง๐™ฎ ๐™ž๐™ฉ ๐™๐™š๐™ง๐™š ๐Ÿ‘‰ m-ric/ai-travel-planner

Thank you @freddyaboulton for the gradio_folium component, and @clem, @pngwn, @abidlabs for your ideas and support!
replied to their post 9 months ago
replied to their post 9 months ago

You're right. I did mention in the dataset card that it does not match the size of the Cerebrum dataset; that is something I'm going to try to achieve in the future, and this dataset serves as a way to test how I would go about structuring one. For now I'm trying to achieve the same performance; then I'll work toward structuring it similarly to the Cerebrum dataset. Thank you for holding me accountable on this.

posted an update 9 months ago
Exciting news! 🎉 I've created the OpenCerebrum datasets, open-source alternatives to Aether Research's proprietary Cerebrum dataset.

The first, OpenCerebrum SFT, is a text-generation and question-answering dataset with ~1.2M examples, curated from sources like Open-Orca, glaiveai, camel-ai, and more! 📚

The second, OpenCerebrum DPO, is a smaller dataset with ~21k examples, focused on direct preference optimization (DPO). It's curated from sources like jondurbin, argilla, grimulkan, and others. 📊

Both datasets are licensed under Apache-2.0 and are available in English. They're ready for use in your projects, and I welcome any feedback for future improvements! 🚀

Locutusque/OpenCerebrum-dpo
Locutusque/OpenCerebrum-SFT
Locutusque/OpenCerebrum-1.0-7b-SFT
Locutusque/OpenCerebrum-1.0-7b-DPO
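
A minimal sketch for pulling both datasets; the "train" split name is an assumption, so check each dataset card:

```python
# Minimal sketch: load the OpenCerebrum SFT and DPO datasets.
from datasets import load_dataset

sft = load_dataset("Locutusque/OpenCerebrum-SFT", split="train")  # ~1.2M rows
dpo = load_dataset("Locutusque/OpenCerebrum-dpo", split="train")  # ~21k rows
print(sft)
print(dpo)
```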