AI & ML interests

None defined yet.

NevskyCollective's activity

ZennyKenny
posted an update 2 days ago
A few new Russian-language synthetic datasets. The labelling is good, but some of the syntax and grammar are not great.

Great for Russian-language classification models, probably not great for fine-tuning Russian-language text generation.

- Virtual Assistant Query / Responses: ZennyKenny/ru_virtual_assistant_chatgpt_distill
- LLM Query / Responses: ZennyKenny/russian_llm_response_chatgpt_distill

Crazy how much language drift is still an issue, especially given that Russian constitutes nearly 5% of the content on the internet.
ZennyKenny
posted an update 8 days ago
Besides being the coolest-named benchmark in the game, HellaSwag is an important measurement of здравый смысл (or common sense) in LLMs.

- More on HellaSwag: https://github.com/rowanz/hellaswag

I spent the afternoon benchmarking YandexGPT Pro 4th Gen, one of the Russian tech giant's premier models.

- Yandex HF Org: yandex
- More on Yandex models: https://yandex.cloud/ru/docs/foundation-models/concepts/yandexgpt/models

The eval notebook is available on GitHub and the resulting dataset is already on the HF Hub!

- Eval Notebook: https://github.com/kghamilton89/ai-explorer/blob/main/yandex-hellaswag/hellaswag-assess.ipynb
- Eval Dataset: ZennyKenny/yandexgptpro_4th_gen-hellaswag

And of course, everyone wants to see the results, so have a look at them in the context of the other zero-shot experiments that I was able to find!
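
For readers who want a feel for how a zero-shot HellaSwag run can be scored, here is a minimal sketch. It is not the code from the eval notebook: the prompt wording, the `build_prompt`, `parse_choice`, and `accuracy` helpers, the toy items, and the `always_b` stand-in model are all assumptions made for illustration. The item fields (`ctx`, `endings`, `label`) are assumed to follow the public HellaSwag release.

```python
import re
from typing import Callable, Optional, Sequence

LETTERS = "ABCD"


def build_prompt(ctx: str, endings: Sequence[str]) -> str:
    """Format one HellaSwag-style item as a zero-shot multiple-choice question."""
    options = "\n".join(f"{LETTERS[i]}. {e}" for i, e in enumerate(endings))
    return (
        "Choose the most plausible continuation.\n\n"
        f"Context: {ctx}\n{options}\n\n"
        "Answer with a single letter (A, B, C or D)."
    )


def parse_choice(reply: str) -> Optional[int]:
    """Return the index of the first standalone capital A-D in the reply, or None."""
    match = re.search(r"\b([ABCD])\b", reply)
    return LETTERS.index(match.group(1)) if match else None


def accuracy(items: Sequence[dict], ask_model: Callable[[str], str]) -> float:
    """Fraction of items answered correctly; unparseable replies count as wrong."""
    correct = 0
    for item in items:
        reply = ask_model(build_prompt(item["ctx"], item["endings"]))
        if parse_choice(reply) == int(item["label"]):
            correct += 1
    return correct / len(items)


# Two made-up items in the HellaSwag shape (not taken from the real dataset).
toy_items = [
    {
        "ctx": "A man pours water into a kettle. He",
        "endings": ["eats the kettle.", "switches the kettle on.",
                    "paints the wall.", "rides a horse."],
        "label": "1",
    },
    {
        "ctx": "A girl ties her shoelaces. She",
        "endings": ["stands up and starts to run.", "melts into the floor.",
                    "bakes the shoe.", "flies out of the window."],
        "label": "0",
    },
]


def always_b(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; always answers B."""
    return "Answer: B"


score = accuracy(toy_items, always_b)
```

Swapping `always_b` for a function that sends the prompt to a real model and returns its text reply is the only change needed to score an actual run.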
ZennyKenny
posted an update 29 days ago
It took me a while, but I've finally got it working: ZennyKenny/note-to-text

Using a Meta Llama checkpoint from Unsloth and some help from the HF community, you can capture handwritten notes and convert them into digital format in just a few seconds.

Really exciting times for AI builders on Hugging Face.
ZennyKenny
posted an update about 1 month ago
I've spent most of my time working with AI on user-facing apps like chatbots and text generation, but today I decided to work on something that I think has a lot of applications for Data Science teams: ZennyKenny/comment_classification

This Space supports uploading a user CSV and categorizing the fields based on user-defined categories. The applications of AI in production are truly endless. 🚀
ZennyKenny
posted an update about 1 month ago
Really excited to start contributing to the SWE Arena project: https://swe-arena.com/

Led by IBM PhD fellow @terryyz, our goal is to advance research in code generation and app development by frontier LLMs.

ZennyKenny
posted an update about 2 months ago
Okay, this is pretty crazy. Snowflake has CortexAI and Uber is already teasing QueryGPT, both of which prominently feature plain-text-to-SQL querying of your database.

I decided to see how hard it would be to put together something similar using 🤗 smolagents. Turns out, it was pretty straightforward. I managed to get it done in London Luton airport this afternoon.

ZennyKenny/sqlAgent
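
To make the text-to-SQL idea concrete, here is a minimal sketch of the "tool" half of such an agent, using only Python's standard `sqlite3` module. It is not the code behind the Space and does not use smolagents: the `receipts` table, the `run_sql` helper, and the `fake_llm` stand-in (which returns a hard-coded query where a real agent would ask a model) are all assumptions for illustration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a small in-memory database with made-up data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE receipts (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    conn.executemany(
        "INSERT INTO receipts (customer, total) VALUES (?, ?)",
        [("Ana", 12.5), ("Boris", 40.0), ("Chen", 7.25)],
    )
    conn.commit()
    return conn


def run_sql(conn: sqlite3.Connection, query: str) -> str:
    """Tool an agent could call: run one read-only query and return rows as text."""
    # Simple guard so generated SQL cannot modify the database.
    if not query.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    rows = conn.execute(query).fetchall()
    return "\n".join(str(row) for row in rows)


def fake_llm(question: str) -> str:
    """Hypothetical stand-in for the model that turns a question into SQL."""
    return "SELECT customer FROM receipts ORDER BY total DESC LIMIT 1"


conn = make_demo_db()
answer = run_sql(conn, fake_llm("Which customer has the biggest receipt?"))
```

In an agent framework, `run_sql` would be registered as a tool with a description of the table schema, and the model would decide when to call it and with which query.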
ZennyKenny
posted an update about 2 months ago
I've completed the first unit of the just-launched Hugging Face Agents Course. I would highly recommend it, even for experienced builders, because it is a great walkthrough of the smolagents library and toolkit.
ZennyKenny
posted an update 2 months ago
GradientBoostingClassifier is an estimator in Python's scikit-learn library, and now you can quickly train an ML model using this powerful technique on any (viable) dataset on the Hugging Face Hub without writing a line of code.

Love finishing a project right when the late night starts to turn into the early morning: sklearn-docs/GradientBoostingClassifier

Long-time listener, first-time caller, but always pleased to contribute, even if only adjacently, to the power of scikit-learn.
ZennyKenny
posted an update 2 months ago
Really pleased with the Bring Your Own Model (BYOM) feature in Brave Browser: https://brave.com/blog/byom-nightly/

Takes about 5 minutes to configure your own locally running LLM as an in-browser assistant. Totally local, totally private, totally yours.
ZennyKenny
posted an update 3 months ago
On-demand audio transcription is an often-requested service without many good options on the market.

Using Hugging Face Spaces with Gradio SDK and the OpenAI Whisper model, I've put together a simple interface that supports the transcription and summarisation of audio files up to five minutes in length, completely open source and running on CPU upgrade. The cool thing is that it's built without a dedicated inference endpoint, completely on public infrastructure.

Check it out: ZennyKenny/AudioTranscribe

I wrote a short article about the backend mechanics for those who are interested: https://huggingface.co/blog/ZennyKenny/on-demand-public-transcription
ZennyKenny
posted an update 4 months ago
I've joined the Bluesky community. Interested to see what decentralized social media looks like in action: https://bsky.app/profile/kghamilton.bsky.social

Looking forward to following other AI builders, tech enthusiasts, goth doomscrollers, and ironic meme creators.
ZennyKenny
posted an update 4 months ago
Using AI to teach English as a Foreign Language? EFL teachers often have busy schedules, variable class sizes, and unexpected cancellations. Introducing VocabSova: ZennyKenny/VocabSova

VocabSova is a simple chatbot interface that helps teachers create topical vocabulary lists, custom worksheets using that vocabulary, and group activities on a defined theme for a specific English-speaking level (according to CEFR international standards).

There is a great use case for AI in nearly every field, and language learning is a particularly apt domain in my opinion. VocabSova is in active development during its Alpha release, all feedback welcome.
ZennyKenny
posted an update 7 months ago
Very excited to have made the list and been invited to the London event of OpenAI DevDay 2024 on 30 October! Looking forward to seeing what the future of AI dev holds, connecting with other professionals in the field, and advocating for open source AI!

https://openai.com/devday/
ZennyKenny
updated a Space 7 months ago
ZennyKenny
posted an update 11 months ago
Thanks to the incredible collaboration of 14 community annotators, @davanstrien of HF and @dvilasuero et al. of Argilla, Data is Better Together (DIBT) is pleased to make available a Russian-language dataset of 500 of the best curated LLM prompts, translated into Russian: https://huggingface.co/datasets/DIBT/MPEP_RUSSIAN.

More to come from the MPEP initiative! Interested in annotating or leading a language team? https://github.com/huggingface/data-is-better-together/tree/main/prompt_translation
ZennyKenny
posted an update about 1 year ago
Are you interested in contributing to open source multilingual AI with Hugging Face and Argilla?

The MPEP initiative (https://github.com/huggingface/data-is-better-together/tree/main/prompt_translation) of the Data is Better Together project offers the opportunity to do just that by helping to create multilingual model checkpoints.

If you're interested in contributing to the Russian-language dataset, please get in touch as I am the Russian-language lead. If you're interested in contributing to another language, the MPEP link above has all the information you need to do so. 🤗