All HF Hub posts

nroggendorff posted an update about 13 hours ago

im so tired
singhsidhukuldeep posted an update 2 days ago

Exciting breakthrough in AI: @Meta's new Byte Latent Transformer (BLT) revolutionizes language models by eliminating tokenization!

The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:

>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.

Three-Component Architecture:
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes

>> Technical Advantages
• Matches performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models

>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.
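
A minimal sketch of the entropy-based patching idea in Python (the function names and the stubbed entropy model are illustrative assumptions, not BLT's actual code):

```python
import math

# Placeholder for BLT's small byte-level LM: it should return a
# next-byte probability distribution given the bytes seen so far.
def next_byte_probs(prefix: bytes) -> list[float]:
    return [1.0 / 256] * 256  # uniform stub for illustration

def entropy(probs: list[float]) -> float:
    return -sum(p * math.log2(p) for p in probs if p > 0)

def dynamic_patches(data: bytes, threshold: float = 4.0) -> list[bytes]:
    """Group bytes into variable-sized patches: start a new patch
    whenever the next byte is hard to predict (high entropy)."""
    patches, current = [], bytearray()
    for i, b in enumerate(data):
        if current and entropy(next_byte_probs(data[:i])) > threshold:
            patches.append(bytes(current))  # complex region -> new patch
            current = bytearray()
        current.append(b)
    if current:
        patches.append(bytes(current))
    return patches

print(dynamic_patches(b"Hello, world!"))
```

With a real entropy model, predictable stretches collapse into long patches, so compute concentrates on the high-entropy regions.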

This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!
Kseniase posted an update about 10 hours ago

**15 Agentic Systems and Frameworks of 2024**

This year, we started our "AI Agents and Agentic Workflows" series (https://www.turingpost.com/t/AI-Agents) to explore everything about AI agents step by step: all the vocabulary, how they work, and how to build them.
The huge interest in this series and the large number of studies conducted on agents showed that it was one of the most popular and important themes of the year. In 2025, agents will most likely reach new highs – we will be covering that for you. Now, let's review the agentic systems that have emerged this year.

Here is a list of 15 agentic systems and frameworks of 2024:

1. GUI Agents: A Survey (2412.13501)

2. Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level (2411.03562)

3. The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (2408.06292)

4. MALT: Improving Reasoning with Multi-Agent LLM Training (2412.01928)

5. Agent S: An Open Agentic Framework that Uses Computers Like a Human (2410.08164)

6. Automated Design of Agentic Systems (2408.08435)

7. AgentInstruct: Toward Generative Teaching with Agentic Flows (2407.03502)

8. AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant (2410.18603)

9. WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents (2410.07484)

10. Generative Agent Simulations of 1,000 People (2411.10109)

11. DynaSaur: Large Language Agents Beyond Predefined Actions (2411.01747)

12. PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking (2410.12375)

13. Generative World Explorer (2411.11844)

14. Bel Esprit: Multi-Agent Framework for Building AI Model Pipelines (2412.14684)

15. AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions (2410.20424)

Thanks for reading Turing Post!
Subscribe to receive new posts straight into your inbox -> https://www.turingpost.com/subscribe
aaditya posted an update 2 days ago

Last Week in Medical AI: Top Research Papers/Models 🔥
🏅 (December 15 – December 21, 2024)

Medical LLM & Other Models
- MedMax: Mixed-Modal Biomedical Assistant
  - Advanced multimodal instruction tuning
  - Enhanced biomedical knowledge integration
  - Comprehensive assistant capabilities
- MGH Radiology Llama 70B
  - Specialized radiology focus
  - State-of-the-art performance
  - Enhanced report generation capabilities
- HC-LLM: Historical Radiology Reports
  - Context-aware report generation
  - Historical data integration
  - Improved accuracy in diagnostics

Frameworks & Methods
- ReflecTool: Reflection-Aware Clinical Agents
- Process-Supervised Clinical Notes
- Federated Learning with RAG
- Query Pipeline Optimization

Benchmarks & Evaluations
- Multi-OphthaLingua
  - Multilingual ophthalmology benchmark
  - Focus on LMICs healthcare
  - Bias assessment framework
- ACE-M3 Evaluation Framework
  - Multimodal medical model testing
  - Comprehensive capability assessment
  - Standardized evaluation metrics

LLM Applications
- Patient-Friendly Video Reports
- Medical Video QA Systems
- Gene Ontology Annotation
- Healthcare Recommendations

Special Focus: Medical Ethics & AI
- Clinical Trust Impact Study
- Mental Health AI Challenges
- Hospital Monitoring Ethics
- Radiology AI Integration

Now you can watch and listen to the latest Medical AI papers daily on our YouTube and Spotify channels as well!

- Full thread in detail:
https://x.com/OpenlifesciAI/status/1870504774162063760
- YouTube: youtu.be/SbFp4fnuxjo
- Spotify: https://t.co/QPmdrXuWP9
luigi12345 posted an update 1 day ago

PERFECT FINAL PROMPT for Coding and Debugging.
Step 1: Generate the prompt that, if sent to you, will make you adjust the script so it meets each and every criterion it needs to meet to be 100% bug-free and perfect.

Step 2: Adjust the script following the steps and instructions in the prompt created in Step 1.
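
One way to wire this two-step chain up in Python, as a hedged sketch (call_llm is a hypothetical stand-in for whatever chat-completion client you use):

```python
# Hypothetical client: replace with your actual chat-completion call.
def call_llm(prompt: str) -> str:
    return f"[LLM response to {len(prompt)} chars of prompt]"  # stub

def two_step_debug(script: str) -> str:
    # Step 1: ask the model to write the criteria-checking prompt.
    meta_prompt = call_llm(
        "Generate the prompt that, if sent to you, will make you adjust "
        "the script below so it meets every criterion it needs to meet "
        "to be 100% bug-free and perfect:\n\n" + script
    )
    # Step 2: send the generated prompt back, with the script attached.
    return call_llm(meta_prompt + "\n\nScript to adjust:\n" + script)
```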

nicolay-r posted an update 1 day ago

📢 So far I have noticed that 🧠 reasoning with LLMs 🤖 in English tends to be more accurate than in other languages.
However, besides GoogleTrans and other open, transparent translators, I could not find an easy-to-use solution that avoids:
1. 🔴 Third-party framework installation
2. 🔴 Text chunking
3. 🔴 Lack of support for meta-annotations like spans / objects / etc.

💎 To cope with the problem of IR from non-English texts, I am happy to share bulk-translate 0.25.0. 🎊

⭐ https://github.com/nicolay-r/bulk-translate

bulk-translate is a tiny Python 🐍 no-string framework that lets you translate series of texts with pre-annotated fixed spans that are invariant for the translator.

It supports a 👨‍💻 API for quick data translation with (optionally) annotated objects in texts, in Python 🐍.
I have made it as accessible as possible for RAG and / or LLM-powered app downstreams:
πŸ“˜ https://github.com/nicolay-r/bulk-translate/wiki

All you have to do is provide an iterator of texts, where each text is:
1. ✅ A string object, or
2. ✅ A list of strings and nested lists that represent spans (value + any ID data); see the sketch after this list.
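
As an illustrative sketch of that input shape (the values below are made up; only the structure follows the description above, with no calls into the actual bulk-translate API):

```python
# Each item is either a plain string or a list mixing raw strings with
# [span_value, span_id] pairs the translator must leave intact.
texts = [
    "A plain sentence with no annotations.",
    [
        "The framework by ",
        ["nicolay-r", "AUTHOR-1"],  # fixed span: value + ID data
        " keeps spans like ",
        ["bulk-translate", "TOOL-1"],
        " unchanged.",
    ],
]

for text in texts:
    print(text)
```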

🤖 By default I provide a wrapper over googletrans which you can override with your own 🔥
https://github.com/nicolay-r/bulk-translate/blob/master/models/googletrans_310a.py
ehristoforu posted an update 1 day ago

βœ’οΈ Ultraset - all-in-one dataset for SFT training in Alpaca format.
fluently-sets/ultraset

❓ Ultraset is a comprehensive dataset for training Large Language Models (LLMs) with supervised fine-tuning (SFT). It consists of over 785 thousand entries in eight languages: English, Russian, French, Italian, Spanish, German, Chinese, and Korean.

🤯 Ultraset solves the problem users face when selecting an appropriate dataset for LLM training. It combines the various types of data required to enhance a model's skills in areas such as text writing and editing, mathematics, coding, biology, medicine, finance, and multilingualism.

🤗 For effective use of the dataset, it is recommended to use only the "instruction", "input", and "output" columns and to train the model for 1-3 epochs. The dataset does not include DPO or Instruct data, making it suitable for training various types of LLMs.
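
A minimal loading sketch with the 🤗 datasets library (the dataset ID and column names come from this post; the "train" split name is an assumption):

```python
from datasets import load_dataset

# Assumes the default "train" split; adjust if the repo differs.
ds = load_dataset("fluently-sets/ultraset", split="train")

# Keep only the three recommended columns.
ds = ds.select_columns(["instruction", "input", "output"])
print(ds[0])
```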

❇️ Ultraset is an excellent tool to improve your language model's skills in diverse knowledge areas.
fuzzy-mittenz posted an update 1 day ago

So a cool thing happened:
Nomic/GPT4All released a "Reasoning/Thinking" (QwQ/o1/o3-type) model that uses JavaScript functions to calculate things like the haversine distance between two places, and so on. It's VERY cool to see such complex calculative/recursive AI in such a small package.
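
For reference, the haversine calculation those tool functions perform looks roughly like this (GPT4All runs it in JavaScript; this is a Python equivalent of the standard formula):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

print(haversine_km(48.8566, 2.3522, 51.5074, -0.1278))  # Paris -> London
```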

I was able to adapt their methods to one of my small models, "Replicant" (2 GB), and created a new model with importance-matrix quantization using the "THE_KEY" dataset for better inference in the coding model I pulled from WhiteRabbitNeo's Qwen2.5 model... I give you Reasoning Rabbit... enjoy

IntelligentEstate/o3-ReasoningRabbit_Q2.5-Cd-7B-IQ4_XS-GGUF

IntelligentEstate/Replicant_Warder-o3-Q2.5_3B-iQ5_K_S-GGUF

WhiteRabbitNeo/WhiteRabbitNeo-2.5-Qwen-2.5-Coder-7B