Aaron C Wacker PRO

awacke1

AI & ML interests

AGI and ML Pipelines, Ambient IoT AI, Behavior Cognitive and Memory AI, Clinical Medical and Nursing AI, Genomics AI, GAN Gaming GAIL AR VR XR and Simulation AI, Graph Ontology KR KE AI, Languages and NLP AI, Quantum Compute GPU TPU NPU AI, Vision Image Document and Audio/Video AI

awacke1's activity

replied to ajibawa-2023's post 12 days ago

Most excellent Dataset sir. This is quite useful. Thanks! Aaron

reacted to ajibawa-2023's post with ❤️ 12 days ago
New Dataset: Software-Architecture
Link: ajibawa-2023/Software-Architecture

I am releasing a Large Dataset covering topics related to Software-Architecture. This dataset consists of around 450,000 lines of data in jsonl.

I have included the following topics:

Architectural Frameworks

Architectural Patterns for Reliability

Architectural Patterns for Scalability

Architectural Patterns

Architectural Quality Attributes

Architectural Testing

Architectural Views

Architectural Decision-Making

Advanced Research

Cloud-Based Architectures

Component-Based Architecture

Data Architecture

Emerging Trends

Event-Driven Architecture

Evolvability and Maintainability

Microservices and Monolithic

Microservices Architecture

Security Architecture

Service-Oriented Architecture

Software Design Principles

and Many More!

This dataset is useful for LLM development, particularly for anyone building LLMs focused on software development.

This dataset is very useful to Researchers as well.
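
Since the dataset is distributed as JSON Lines, a file of this size (around 450,000 lines) can be read one record at a time instead of being loaded whole. The following is a minimal sketch using only the Python standard library; the field names `topic` and `text` in the sample are hypothetical and may not match the real dataset's schema.

```python
import io
import json
from typing import Iterator, TextIO


def iter_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSON Lines stream."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as err:
            raise ValueError(f"line {line_number} is not valid JSON: {err}") from err


# Hypothetical two-record sample; the real dataset's field names may differ.
sample = io.StringIO(
    '{"topic": "Event-Driven Architecture", "text": "example"}\n'
    '\n'
    '{"topic": "Data Architecture", "text": "example"}\n'
)
records = list(iter_jsonl(sample))
print(len(records), records[0]["topic"])
```

For a real file, pass an open file handle (for example `open(path, encoding="utf-8")`) in place of the in-memory sample.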
reacted to albertvillanova's post with ❤️ 12 days ago
🚀 Exciting update! You can now compare multiple models side-by-side with the Hugging Face Open LLM Comparator! 📊

open-llm-leaderboard/comparator

Dive into multi-model evaluations, pinpoint the best model for your needs, and explore insights across top open LLMs all in one place. Ready to level up your model comparison game?
replied to their post 14 days ago

Thank you for the tips and insight on Gradio 5. Much appreciated.

posted an update 17 days ago
Since 2022 I have been trying to understand how to support the advancement of the two best Python patterns for AI development:
1. Streamlit
2. Gradio

I chose them in this order because Streamlit had the timing advantage over Gradio: it was available in near-perfect form about a year or two before GPT's training data was collected.

Nowadays, if you want generated code to be right on the first try, the model needs consistent method names so that no manual intervention is required with each attempt.

With GPT and Claude as my top two AI pair programming models, I gravitate towards Streamlit. Aside from common repeat errors on cache and experimental functions, which were not yet solidified circa 2022, its consistency means little human correction is needed. Errors from old training data are minimal.

Now I want to make the Gradio side just as consistent. Why? Gradio lapped Streamlit with its Blocks paradigm and its free API, which I feel are amazing features that change software engineering forever.

For a few months I thought BigCode would become the new best model due to its training corpus datasets, yet I never felt it got to market as the next best AI coder model.

I am curious about Gradio's future. If the two main models (GPT and Claude) pick up the changes of the last few years, I could then code with AI without manual intervention. As it stands today, Gradio would be the better choice if you could get the best coding models to stop confusing old syntax with current syntax, yet we do live in an imperfect world!

Is anyone using an AI pair programming model that rocks with Gradio's latest syntax? I would like to code with a model that knows how to not miss the advancements and syntax changes that gradio has had in the past few years. Trying grok2 as well.

My IDE coding love is HF. It's hands down faster (100x) than other cloud paradigms. Any tips on the best models for Gradio coding?

--Aaron
reacted to as-cle-bert's post with 🧠 24 days ago
Hi there HuggingFacers!

Have you ever dreamt of an improbable book crossover, like Frodo from 𝘓𝘰𝘳𝘥 𝘰𝘧 𝘵𝘩𝘦 𝘙𝘪𝘯𝘨𝘴 becoming the main character of the 𝘖𝘥𝘺𝘴𝘴𝘦𝘺 or Emma Bovary from 𝘔𝘢𝘥𝘢𝘮𝘦 𝘉𝘰𝘷𝘢𝘳𝘺 acting as a modern-day Shakespearean Juliet?

Well, all of this is now possible! I'm thrilled to introduce my latest open-source product for storytelling: 𝐛𝐨𝐨𝐤𝐬-𝐦𝐢𝐱𝐞𝐫-𝐚𝐢 𝐯𝟎.𝟎.𝟎 !

Built with ReactJS and shipped directly to you on Spaces thanks to Docker, this webapp combines the power of two AI tools:

- gpt-4o-mini by OpenAI, which takes care of cooking new and intriguing plots starting from the user's instructions, the titles and the summaries of the two books to mix (summaries are scraped through Wikipedia)
- text2img realtime API by ModelsLab, which provides a stable diffusion pipeline to create a thumbnail for your newly-generated story

Everything is provided under a simple and intuitive UI, which uses chatscope's React template kit.
Curious to try it? The app is already live at:

as-cle-bert/books-mixer-ai

And you can also have a tour of the GitHub repo (and leave a little ⭐ while you're there):

https://github.com/AstraBert/books-mixer-ai

The documentation is still under construction, but will become available soon😊

Have fun!📚📚
reacted to bartowski's post with 🔥 24 days ago
In regards to the latest mistral model and GGUFs for it:

Yes, they may be subpar and may require changes to llama.cpp to support the interleaved sliding window

Yes, I got excited when a conversion worked and released them ASAP

That said, generation seems to work right now and seems to mimic the output from spaces that are running the original model

I have appended -TEST to the model names in an attempt to indicate that they are not final or perfect, but if people still feel misled and that it's not the right thing to do, please post your thoughts (civilly) below. I will strongly consider pulling the conversions if that's what people think is best. After all, that's what I'm here for, in service to you all!
posted an update 24 days ago
Today I worked through a very difficult coding session with GPT-4o, which ended up solving integrations on a very large scale. So I decided to look a bit more into how its reasoners work. Below is a fun markdown emoji outline of what I learned today and what I'm pursuing.

Hope you enjoy! Cheers, Aaron.

Also here are my favorite last 4 spaces I am working on:
1. GPT4O: awacke1/GPT-4o-omni-text-audio-image-video
2. Claude: awacke1/AnthropicClaude3.5Sonnet-ACW
3. MSGraph M365: awacke1/MSGraphAPI
4. Azure Cosmos DB: Now with Research AI! awacke1/AzureCosmosDBUI

# 🚀 OpenAI's O1 Models: A Quantum Leap in AI

## 1. 🤔 From 🦜 to 🧠: O1's Evolution

- **Thinking AI**: O1 ponders before replying; GPT models just predict. 💡

## 2. 📚 AI Memory: 💾 + 🧩 = 🧠

- **Embeddings & Tokens**: Words ➡️ vectors, building knowledge. 📖

## 3. 🔍 Swift Knowledge Retrieval

- **Vector Search & Indexing**: O1 finds info fast, citing reliable sources. 🔎📖
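
As a generic illustration of the "words to vectors" and vector search ideas above (not a description of how O1 works internally), here is a minimal brute-force cosine-similarity search. The three-dimensional embeddings and document ids are made up for the example; real embeddings have hundreds of dimensions and real systems use an approximate index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, index, top_k=1):
    """Return the top_k document ids ranked by cosine similarity to the query."""
    ranked = sorted(
        index,
        key=lambda doc_id: cosine_similarity(query, index[doc_id]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical 3-dimensional embeddings, chosen only for illustration.
index = {
    "streamlit": [0.9, 0.1, 0.0],
    "gradio": [0.8, 0.2, 0.1],
    "cosmos-db": [0.0, 0.1, 0.9],
}
print(search([0.1, 0.0, 1.0], index))
```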

## 4. 🌳 Logic Trees with Mermaid Models

- **Flowchart Reasoning**: O1 structures thoughts like diagrams. 🎨🌐

## 5. 💻 Coding Mastery

- **Multilingual & Current**: Speaks many code languages, always up-to-date. 💻🔄

## 6. 🏆 Breaking Records

- **92.3% MMLU Score**: O1 outperforms humans, setting new AI standards. 🏅

## 7. 💡 Versatile Applications

- **Ultimate Assistant**: From fixing code to advancing research. 🛠️🔬

## 8. 🏁 Racing Toward AGI

- **OpenAI Leads**: O1 brings us closer to true AI intelligence. 🚀

## 9. 🤖 O1's Reasoning Pillars

- **🧠 Chain of Thought**: Step-by-step logic.
- **🎲 MCTS**: Simulates options, picks best path.
- **🔍 Reflection**: Self-improves autonomously.
- **🏋️‍♂️ Reinforcement Learning**: Gets smarter over time.

---

*Stay curious, keep coding!* 🚀
posted an update 26 days ago
I have finally completed a working full Azure and Microsoft MS Graph API implementation which can use all the interesting MS AI features in M365 products to manage CRUD patterns for the graph features across products.

This app shows an initial implementation of security, authentication, scopes, and access to Outlook, Calendar, Tasks, OneDrive and other apps, exposing the CRUD pattern as AI agent service skills to integrate with your AI workflow.
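
To make the CRUD pattern concrete, here is a minimal sketch of how each operation could map onto a Microsoft Graph REST request. It is not the code from the app: `build_graph_request` is a hypothetical helper, the token is a placeholder, and the request is only built, never sent. A real token would come from an OAuth flow with an appropriate scope (for calendar writes, something like `Calendars.ReadWrite`), and a real event body needs more fields than the one shown.

```python
import json
import urllib.request

GRAPH_BASE = "https://graph.microsoft.com/v1.0"

# CRUD operation -> HTTP method, following the usual REST convention.
CRUD_METHODS = {"create": "POST", "read": "GET", "update": "PATCH", "delete": "DELETE"}


def build_graph_request(operation, resource, access_token, body=None):
    """Build (but do not send) an authenticated Microsoft Graph request."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        url=f"{GRAPH_BASE}/{resource.lstrip('/')}",
        data=data,
        method=CRUD_METHODS[operation],
    )
    request.add_header("Authorization", f"Bearer {access_token}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


# Placeholder token and a deliberately minimal, illustrative body.
req = build_graph_request("create", "me/events", "PLACEHOLDER_TOKEN", {"subject": "Demo"})
print(req.get_method(), req.full_url)
```

Keeping the request construction separate from sending makes each "skill" easy to inspect and test before an agent is allowed to execute it.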


Below are initial screens showing integration:

URL: awacke1/MSGraphAPI
Discussion: awacke1/MSGraphAPI#5

Best of AI on @Azure and @Microsoft on @HuggingFace :
https://huggingface.co/microsoft
https://www.microsoft.com/en-us/research/
---
Aaron
posted an update about 1 month ago
Updated my 📺RTV🖼️ - Real Time Video AI app this morning.
URL: awacke1/stable-video-diffusion

It uses Stable Diffusion to dynamically create videos from images that are either in an input directory or uploaded, running on an A10 GPU on Hugging Face.


Samples below.

I may transition this to Zero GPU if I can. When I revised this during Christmas, I had my highest billing from HF yet due to GPU usage. It is still the best turnkey GPU offering out there, and Image2Video is a killer app. Thanks HF for the possibilities!
posted an update 2 months ago
I am integrating Azure Cosmos DB, the database system that backs GPT conversations, into my workflow, and experimenting with new patterns to accelerate dataset evolution for evaluation and training of AI.

I initially used it to store research prompts and outputs from my GPT-4o client, which can interface with and search ArXiv. Now I am excited to try out some new features built specifically for AI at scale. Research on memory augmentation is shown. awacke1/GPT-4o-omni-text-audio-image-video

awacke1/AzureCosmosDBUI
reacted to lhoestq's post with 🚀 2 months ago
✨ Easy Synthetic Dataset File Generation using LLM DataGen ! Link: https://huggingface.co/spaces/lhoestq/LLM_DataGen

features + how it works:

✍️ Generate the dataset content you want just by entering a file name
💡 Optionally specify the column names you need
💨 The dataset is streamed and generated on-the-fly in JSON Lines format
✅ Generation is constrained to always output valid JSON

How does this work ?
1/ Enter a file name
2/ The model generates column names for such a file. Using structured generation, it can generate 2 to 5 column names using lower case characters and underscores. I use a prompt that asks to generate column names for a realistic dataset and low temperature.
3/ The columns are used to update the Finite State Machine for the dataset content structured generation, so that it is used to generate JSON objects using those columns
4/ The model generates JSON objects using structured generation again, using the updated Finite State Machine. I use a prompt that asks for realistic data and a temperature of 1.

> Why update a Finite State Machine instead of re-creating one ?

Creating one can take up to 30 seconds, while updating one takes 0.1 seconds (though it requires manipulating a graph, which is not easy to implement).

> Batched generation is faster, why not use it ?

Generating in batches is faster but tends to produce duplicates for this demo.
Further work could provide different prompts (one per sequence in the batch) to end up with a different distribution of sequences in each batch, or implement a custom sampler that forbids generating the same data in sequences of the same batch.
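
The custom-sampler idea can be sketched in a heavily simplified form as sampling without replacement within a batch: once one sequence has taken an item, the others may not. This is an assumption about what such a sampler might look like, not code from the demo; a real version would operate on token sequences rather than whole items, and `sample_batch_without_duplicates` is a hypothetical name.

```python
import random


def sample_batch_without_duplicates(candidates, weights, batch_size, seed=0):
    """Draw one item per sequence, forbidding items already taken in this batch.

    Assumes the candidates are distinct and the weights are positive.
    """
    if batch_size > len(candidates):
        raise ValueError("batch_size cannot exceed the number of distinct candidates")
    rng = random.Random(seed)
    remaining = list(zip(candidates, weights))
    batch = []
    for _ in range(batch_size):
        items, item_weights = zip(*remaining)
        choice = rng.choices(items, weights=item_weights, k=1)[0]
        batch.append(choice)
        # Remove the chosen item so later sequences in the batch cannot repeat it.
        remaining = [(c, w) for c, w in remaining if c != choice]
    return batch


batch = sample_batch_without_duplicates(
    ["alice", "bob", "carol", "dave"], [0.7, 0.1, 0.1, 0.1], batch_size=3
)
print(batch)
```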

> How does structured generation work ?

I used the outlines library with transformers to define a JSON schema that the generation has to follow. It uses a Finite State Machine with token_id as transitions.
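
To illustrate the principle (this is a toy sketch, not the outlines API and not the demo's code), the example below drives generation with a finite state machine whose transitions are keyed by token id. At each step only tokens with a transition out of the current state are allowed to compete, so the output is valid JSON regardless of the scores. The vocabulary, the FSM, and the random "model" are all invented for the example.

```python
import json
import random

# Toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ['{', '}', '"name"', ':', '"Ada"', '"Bob"', ',', 'hello']

# Finite state machine: state -> {token_id: next_state}.
# It accepts exactly the strings {"name":"Ada"} and {"name":"Bob"}.
FSM = {
    0: {0: 1},        # expect '{'
    1: {2: 2},        # expect '"name"'
    2: {3: 3},        # expect ':'
    3: {4: 4, 5: 4},  # expect one of the allowed values
    4: {1: 5},        # expect '}'
}
FINAL_STATE = 5


def fake_logits(rng):
    """Stand-in for a language model: one random score per vocabulary token."""
    return [rng.random() for _ in VOCAB]


def constrained_generate(seed=0):
    """Greedy decoding restricted to the tokens the FSM allows in each state."""
    rng = random.Random(seed)
    state, output = 0, []
    while state != FINAL_STATE:
        logits = fake_logits(rng)
        allowed = FSM[state]
        # Mask: only tokens with a transition out of the current state compete.
        token_id = max(allowed, key=lambda t: logits[t])
        output.append(VOCAB[token_id])
        state = allowed[token_id]
    return "".join(output)


text = constrained_generate(seed=42)
parsed = json.loads(text)  # parses for any seed, because the FSM forbids invalid tokens
print(text)
```

Tokens such as `hello` or `,` can receive the highest raw score at any step, but they are never emitted because no state has a transition for them.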

Let me know what you think ! And feel free to duplicate/modify it to try other models/prompts or sampling methods :)
reacted to lhoestq's post with 🔥 2 months ago
Hey ! I'm working on a 100% synthetic Dataset Hub here (you can search for any kind of dataset and the app invents it). The link is here: infinite-dataset-hub/infinite-dataset-hub

Question for the Community:

Which models should I use to generate images and audio samples for those datasets ? 🤗
posted an update 4 months ago
I just launched an exciting new multiplayer app powered by GPT-4o, enabling collaborative AI-driven queries in a single shared session!

### 🔗 Try It Out! 👉 Check out the GPT-4o Multiplayer App
Experience the future of collaborative AI by visiting our space on Hugging Face: awacke1/ChatStreamlitMultiplayer

🎉 This innovative tool lets you and your team reason over:

### 📝 Text
### 🖼️ Image
### 🎵 Audio
### 🎥 Video

## 🔍 Key Features

### Shared Contributions
Collaborate in real-time, seeing each other's inputs and contributions.
Enhances teamwork and fosters a collective approach to problem-solving.

### Diverse Media Integration
Seamlessly analyze and reason with text, images, audio, and video.
Breakthrough capabilities in handling complex media types, including air traffic control images and audio.

## 🛠️ Real-World Testing
This morning, we tested the app using images and audio from air traffic control—a challenge that was nearly impossible to handle with ease just a few years ago. 🚁💬

## 🌱 The Future of AI Collaboration
We believe AI Pair Programming is evolving into a new era of intelligence through shared contributions and teamwork. As we continue to develop, this app will enable groups to:

- Generate detailed text responses 📝
- Collaborate on code responses 💻
- Develop new AI programs together 🤖
replied to Wauplin's post 4 months ago

Such good news, thanks! With this we can now create AI pipelines much more simply and treat models as interchangeable service parts. I think for cutting-edge techniques like MoE gating networks, self-reward and comparison across models, memory across AI pipelines, etc., this becomes the differentiator that makes it all much easier. I hope that operating key models like GPT-4o, Claude 3.5 Sonnet, Gemma, Llama, and other front runners in this open pattern unlocks better, more powerful AI coding patterns.

posted an update 4 months ago
✨🚀 Claude Sonnet 3.5 API. It's already weaving digital magic!
🧠💻 Try it at my space: 🔗 awacke1/AnthropicClaude3.5Sonnet-ACW

Kudos to @AnthropicAI for this elegant API! 👏 #AI #CodeMagic #AnthropicAI Thanks Huggingface for hosting the best hub in the world for AI development!

replied to their post 4 months ago

It uses my openai key and org id and is hard to run in an open fashion due to usage. It uses the billed model.

replied to their post 4 months ago

You can use whisper-1 for now, and that pattern works great. The speech WAV stream recorder is not in the OpenAI code yet. I use a Streamlit recorder to get speech in, which works, but I am looking for a better speech in/out technique. Audio-to-text is used as well; it is how the video modality supplies its transcript as additional input alongside the image slices from the video. One thing I also have not seen yet is an image generator inside the client API. That would be nice to add, along with speech synthesis.

reacted to VictorSanh's post with 🤗 6 months ago
Glad to see Idefics2 making its way into the awesome OpenVLM Leaderboard which ranks VLMs. 🏆
2nd in its category (<10B parameters and open weights)!

While InternLM-XComposer2 uses proprietary data, Idefics2 is built solely using openly available data.

Leaderboard: opencompass/open_vlm_leaderboard
Model: HuggingFaceM4/idefics2-8b