โ Hosting our own inference was not enough: now the Hub 4 new inference providers: fal, Replicate, SambaNova Systems, & Together AI.
Check model cards on the Hub: you can now, in 1 click, use inference from various providers (cf video demo)
Their inference can also be used through our Inference API client. There, you can use either your custom provider key, or your HF token, then billing will be handled directly on your HF account, as a way to centralize all expenses.
๐ธ Also, PRO users get 2$ inference credits per month!
Exciting breakthrough in Retrieval-Augmented Generation (RAG): Introducing MiniRAG - a revolutionary approach that makes RAG systems accessible for edge devices and resource-constrained environments.
Key innovations that set MiniRAG apart:
Semantic-aware Heterogeneous Graph Indexing - Combines text chunks and named entities in a unified structure - Reduces reliance on complex semantic understanding - Creates rich semantic networks for precise information retrieval
Lightweight Topology-Enhanced Retrieval - Leverages graph structures for efficient knowledge discovery - Uses pattern matching and localized text processing - Implements query-guided reasoning path discovery
Impressive Performance Metrics - Achieves comparable results to LLM-based methods while using Small Language Models (SLMs) - Requires only 25% of storage space compared to existing solutions - Maintains robust performance with accuracy reduction ranging from just 0.8% to 20%
The researchers from Hong Kong University have also contributed a comprehensive benchmark dataset specifically designed for evaluating lightweight RAG systems under realistic on-device scenarios.
This breakthrough opens new possibilities for: - Edge device AI applications - Privacy-sensitive implementations - Real-time processing systems - Resource-constrained environments
The full implementation and datasets are available on GitHub: HKUDS/MiniRAG
Reminder: Donโt. Use. ChatGPT. As. A. Calculator. Seriously. ๐ค
Loved listening to @sasha on Hard Forkโit really made me think.
A few takeaways that hit home: - Individual culpability only gets you so far. The real priority: demanding accountability and transparency from companies. - Evaluate if generative AI is the right tool for certain tasks (like search) before using it.
๐ซ...And we're live!๐ซ Seasonal newsletter from ethicsy folks at Hugging Face, exploring the ethics of "AI Agents" https://huggingface.co/blog/ethics-soc-7 Our analyses found: - There's a spectrum of "agent"-ness - *Safety* is a key issue, leading to many other value-based concerns Read for details & what to do next! With @evijit , @giadap , and @sasha
๐ค๐ค ๐ป Speaking of AI agents ... ...Is easier with the right words ;)
My colleagues @meg@evijit@sasha and @giadap just published a wonderful blog post outlining some of the main relevant notions with their signature blend of value-informed and risk-benefits contrasting approach. Go have a read!