
Mariusz Kurman PRO

mkurman

AI & ML interests

AI Tech Lead | MD

Recent Activity

reacted to singhsidhukuldeep's post with 🚀 about 4 hours ago
liked a model about 7 hours ago
deepseek-ai/Janus-Pro-7B

Organizations

MedIT Solutions · BigScience Biomedical Datasets · SOWA Project

mkurman's activity

reacted to singhsidhukuldeep's post with 🚀 about 4 hours ago
While everyone is buzzing about DeepSeek AI R1's groundbreaking open-source release, ByteDance has quietly launched something remarkable: Trae, an adaptive AI IDE that's redefining the development experience. Unlike competitors such as Cursor, it's completely FREE!

Trae is a sophisticated development environment built on Microsoft's VSCode foundation (with a nice skin on top), offering unlimited free access to both OpenAI's GPT-4o and Anthropic's Claude-3.5-Sonnet models.

Technical Highlights:
- Real-time AI pair programming with comprehensive codebase understanding
- Natural language commands for code generation and project-level development
- Intelligent task decomposition for automated planning and execution
- Seamless VS Code and Cursor configuration compatibility
- Multi-language support with specialized optimization for English and Chinese interfaces

Currently available for macOS (Windows version in development), Trae is distributed through ByteDance's Singapore subsidiary, Spring (SG) Pte. What sets it apart is its ability to handle mixed-language workflows and enhanced localization features that address common pain points in existing IDEs.

The AI assistant can generate code snippets, optimize logic, and even create entire projects from scratch through natural language prompts. It also features an innovative AI Chat system accessible via keyboard shortcuts for real-time coding assistance.

For developers looking to enhance their productivity without breaking the bank, Trae offers enterprise-grade AI capabilities completely free during its initial release. This move by ByteDance signals a significant shift in the AI IDE landscape, challenging established players with a robust, accessible alternative.

Try it at trae.ai
reacted to sagar007's post with ❤️ about 12 hours ago
🚀 Just built a Perplexity-inspired AI search assistant using Gradio, DeepSeek, and DuckDuckGo!
Ask it anything, and it'll:

Scour the web for answers 📚

Cite sources like a pro 🔗

Even talk back with TTS (thanks, Kokoro!) 🎙️

Check it out → sagar007/DeepSeekR1_Search
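
The core retrieve-then-answer loop behind this kind of assistant is compact. Here is a minimal sketch assuming the `duckduckgo_search` package; `ask_llm` is a hypothetical stand-in for whatever chat-completion client you use, not part of the Space:

```python
# Sketch of a search assistant's retrieve-then-answer loop.
# Assumes the `duckduckgo_search` package; `ask_llm` is a hypothetical
# stand-in for a chat-completion client.
from duckduckgo_search import DDGS

def search_and_answer(question: str, ask_llm, max_results: int = 5) -> str:
    # 1. Scour the web for candidate sources.
    with DDGS() as ddgs:
        hits = list(ddgs.text(question, max_results=max_results))

    # 2. Build a numbered context block so the model can cite sources as [n].
    context = "\n".join(
        f"[{i + 1}] {h['title']}: {h['body']} ({h['href']})"
        for i, h in enumerate(hits)
    )

    # 3. Ask the model to answer using only the retrieved context.
    prompt = (
        "Answer the question using the sources below and cite them as [n].\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    return ask_llm(prompt)
```
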
posted an update 1 day ago
I've simplified things for the AI OS community!

Check out Qwen-2.5-14B-DeepSeek-R1-1M! This one's a cool blend of the latest Qwen 2.5, with 14 billion parameters and a massive 1-million-token context window. It also comes with the DeepSeek R1 version of the Qwen 2.5 14B base model.

Enjoy! 🚀

mkurman/Qwen2.5-14B-DeepSeek-R1-1M
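
A minimal way to try the model with the standard transformers API (the prompt and generation settings below are just illustrative defaults, not the author's recommendation):

```python
# Load and chat with the merged model via transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mkurman/Qwen2.5-14B-DeepSeek-R1-1M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

messages = [{"role": "user", "content": "Explain KV caching in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
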
reacted to kadirnar's post with 🔥 8 days ago
I created my own AI image and video from scratch using the fal.ai platform 💫

Workflow: Flux LoRA Training + Upscale + Kling AI (1.6)
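
A loose sketch of chaining fal.ai endpoints into this kind of workflow, using the `fal-client` package. The endpoint IDs, argument names, and result shapes below are illustrative assumptions, not the exact ones used in the post:

```python
# Illustrative fal.ai pipeline: Flux LoRA image -> upscale -> image-to-video.
# Endpoint IDs and payload shapes are assumptions; check fal.ai docs.
import fal_client

# 1. Generate an image with a Flux LoRA (trained separately on fal.ai).
image = fal_client.subscribe(
    "fal-ai/flux-lora",
    arguments={
        "prompt": "portrait photo, studio lighting",
        "loras": [{"path": "<your-lora-url>"}],  # placeholder
    },
)

# 2. Upscale the result.
upscaled = fal_client.subscribe(
    "fal-ai/esrgan",
    arguments={"image_url": image["images"][0]["url"]},
)

# 3. Animate it with Kling 1.6 image-to-video.
video = fal_client.subscribe(
    "fal-ai/kling-video/v1.6/standard/image-to-video",
    arguments={"image_url": upscaled["image"]["url"], "prompt": "subtle camera pan"},
)
```
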
replied to their post 10 days ago
posted an update 10 days ago
reacted to Jaward's post with 🚀🔥 17 days ago
reacted to prithivMLmods's post with 🚀🔥 22 days ago
Reasoning SmolLM2 🚀

🎯 Fine-tuning SmolLM2 on a lightweight synthetic reasoning dataset for reasoning-specific tasks. Future updates will focus on lightweight, blazing-fast reasoning models. Until then, check out the blog for fine-tuning details.

🔥 Blog : https://huggingface.co/blog/prithivMLmods/smollm2-ft

🔼 Models :
+ SmolLM2-CoT-360M : prithivMLmods/SmolLM2-CoT-360M
+ Reasoning-SmolLM2-135M : prithivMLmods/Reasoning-SmolLM2-135M
+ SmolLM2-CoT-360M-GGUF : prithivMLmods/SmolLM2-CoT-360M-GGUF

🤠 Other Details :
+ Demo : prithivMLmods/SmolLM2-CoT-360M
+ Fine-tuning notebook : prithivMLmods/SmolLM2-CoT-360M
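
A bare-bones sketch of the kind of SFT run described above, using TRL's SFTTrainer. The dataset name and hyperparameters here are placeholders, not the author's recipe; see the linked blog for the actual details:

```python
# Minimal supervised fine-tuning sketch with TRL.
# Dataset name and hyperparameters are placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("your-org/synthetic-reasoning-dataset", split="train")  # placeholder

trainer = SFTTrainer(
    model="HuggingFaceTB/SmolLM2-360M-Instruct",
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="smollm2-cot-360m",
        max_seq_length=2048,
        per_device_train_batch_size=4,
        num_train_epochs=1,
    ),
)
trainer.train()
```
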




reacted to openfree's post with 🔥 22 days ago
# 🧬 Protein Genesis AI: Design Proteins with Just a Prompt

## 🤔 Current Challenges in Protein Design

Traditional protein design faces critical barriers:
- 💰 High costs ($1M - $10M+) & long development cycles (2-3 years)
- 🔬 Complex equipment and expert knowledge required
- 📉 Low success rates (<10%)
- ⏰ Time-consuming experimental validation

## ✨ Our Solution: Protein Genesis AI

Transform protein design through simple natural language input:
"Design a protein that targets cancer cells"
"Create an enzyme that breaks down plastic"


### Key Features
- 🤖 AI-powered automated design
- 📊 Real-time analysis & optimization
- 🔬 Instant 3D visualization
- 💾 Immediate PDB file generation

## 🎯 Applications

### Medical & Industrial
- 🏥 Drug development
- 💉 Antibody design
- 🏭 Industrial enzymes
- ♻️ Environmental solutions

### Research & Education
- 🔬 Basic research
- 📚 Educational tools
- 🧫 Experimental design
- 📈 Data analysis

## 💫 Key Advantages

- 👨‍💻 No coding or technical expertise needed
- ⚡ Results in minutes (vs. years)
- 💰 90% cost reduction
- 🌍 Accessible anywhere

## 🎓 Who Needs This?
- 🏢 Biotech companies
- 🏥 Pharmaceutical research
- 🎓 Academic institutions
- 🧪 Research laboratories

## 🌟 Why It Matters
Protein Genesis AI democratizes protein design by transforming complex processes into simple text prompts. This breakthrough accelerates scientific discovery, potentially leading to faster drug development and innovative biotechnology solutions. The future of protein design starts with a simple prompt! 🚀

openfree/ProteinGenesis
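
If you'd rather script the Space than use its web UI, it can usually be driven through gradio_client. The `api_name` and argument below are hypothetical; inspect the Space's real signature first:

```python
# Calling a Hugging Face Space programmatically via gradio_client.
# The api_name and argument here are hypothetical; run view_api() to check.
from gradio_client import Client

client = Client("openfree/ProteinGenesis")
print(client.view_api())  # discover the real endpoint names and parameters

result = client.predict(
    "Design a protein that targets cancer cells",
    api_name="/predict",  # hypothetical endpoint name
)
print(result)  # e.g. analysis text and/or a generated PDB file path
```
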
reacted to singhsidhukuldeep's post with 👀 22 days ago
Exciting breakthrough in e-commerce recommendation systems!
Walmart Global Tech researchers have developed a novel Triple Modality Fusion (TMF) framework that revolutionizes how we make product recommendations.

>> Key Innovation
The framework ingeniously combines three distinct data types:
- Visual data to capture product aesthetics and context
- Textual information for detailed product features
- Graph data to understand complex user-item relationships

>> Technical Architecture
The system leverages a Large Language Model (Llama2-7B) as its backbone and introduces several sophisticated components:

Modality Fusion Module
- All-Modality Self-Attention (AMSA) for unified representation
- Cross-Modality Attention (CMA) mechanism for deep feature integration
- Custom FFN adapters to align different modality embeddings

Advanced Training Strategy
- Curriculum learning approach with three complexity levels
- Parameter-Efficient Fine-Tuning using LoRA
- Special token system for behavior and item representation

>> Real-World Impact
The results are remarkable:
- 38.25% improvement in Electronics recommendations
- 43.09% boost in Sports category accuracy
- Significantly higher human evaluation scores compared to traditional methods

Currently deployed in Walmart's production environment, this research demonstrates how combining multiple data modalities with advanced LLM architectures can dramatically improve recommendation accuracy and user satisfaction.
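
The cross-modality attention (CMA) idea is easy to picture in code: one modality's tokens attend over another's, then an FFN adapter projects the fused features back. Below is a toy PyTorch sketch with illustrative dimensions and wiring; the paper has the real architecture:

```python
# Toy cross-modality attention block: query modality attends to another
# modality, with a residual FFN adapter. Dimensions are illustrative.
import torch
import torch.nn as nn

class CrossModalityAttention(nn.Module):
    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.adapter = nn.Sequential(
            nn.Linear(dim, dim * 2), nn.GELU(), nn.Linear(dim * 2, dim)
        )

    def forward(self, query_mod: torch.Tensor, other_mod: torch.Tensor) -> torch.Tensor:
        # query_mod attends to other_mod (e.g. text tokens over image patches).
        fused, _ = self.attn(query_mod, other_mod, other_mod)
        return query_mod + self.adapter(fused)  # residual + FFN adapter

text = torch.randn(2, 16, 512)   # (batch, tokens, dim)
image = torch.randn(2, 49, 512)  # (batch, patches, dim)
fused_text = CrossModalityAttention()(text, image)
print(fused_text.shape)  # torch.Size([2, 16, 512])
```
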
reacted to Sri-Vigneshwar-DJ's post with 🔥 23 days ago
Combining smolagents with Anthropic's best practices simplifies building powerful AI agents:

1. Code-Based Agents: Write actions as Python code, reducing steps by 30%.
2. Prompt Chaining: Break tasks into sequential subtasks with validation gates.
3. Routing: Classify inputs and direct them to specialized handlers.
4. Fallback: Handle tasks even if classification fails.

https://huggingface.co/blog/Sri-Vigneshwar-DJ/building-effective-agents-with-anthropics-best-pra
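
A minimal starting point for the "code-based agents" pattern in item 1, using the standard smolagents API (the task string is just an example):

```python
# A CodeAgent writes its actions as Python code rather than JSON tool calls.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
agent.run("What are the most recent open LLM releases on Hugging Face?")
```
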
reacted to ezgikorkmaz's post with 🔥 23 days ago
posted an update 23 days ago
I kindly invite you to try my experimental Llama 3.2 3B with o1-like thinking.

It utilizes Thoughts only when needed, so don't be surprised when it doesn't. It also has a minor bug that requires further fine-tuning (sometimes it starts with <|python_tag|> instead of <Thought>).

Enjoy!

Give some likes and whatever to make me feel better and motivated to keep going 😂

mkurman/llama-3.2-MEDIT-3B-o1
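
A quick way to try it with the transformers pipeline API. Since the post notes the model occasionally opens with <|python_tag|> instead of <Thought>, a simple post-hoc cleanup is sketched as a workaround (my workaround, not the author's):

```python
# Chat with the model and patch the stray opening tag the post mentions.
from transformers import pipeline

pipe = pipeline("text-generation", model="mkurman/llama-3.2-MEDIT-3B-o1", device_map="auto")
messages = [{"role": "user", "content": "If I have 3 apples and eat one, how many are left?"}]
out = pipe(messages, max_new_tokens=512)[0]["generated_text"][-1]["content"]

# Workaround for the known bug: swap a leading <|python_tag|> for <Thought>.
if out.startswith("<|python_tag|>"):
    out = out.replace("<|python_tag|>", "<Thought>", 1)
print(out)
```
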
reacted to reddgr's post with 👀 about 2 months ago
Thought it would only make sense to share this here. Lately, one of my favorite activities has been annotating prompts and putting them into datasets (reddgr/tl-test-learn-prompts, reddgr/rq-request-question-prompts, reddgr/nli-chatbot-prompt-categorization), which I then use to classify and select chatbot conversations for my website. It's quite fun to use this widget on lmsys/lmsys-chat-1m, but I also use it on my 2 years of talking to chatbots (soon to be a dataset, but still a lot of web scraping and ETL work left)... This one in the picture was probably one of the first prompts I wrote to an LLM:
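
Pulling one of these prompt-annotation datasets down to experiment with takes two lines with the standard datasets API (the split name "train" is an assumption; check the dataset card):

```python
# Load an annotated-prompts dataset and inspect one row.
from datasets import load_dataset

ds = load_dataset("reddgr/nli-chatbot-prompt-categorization", split="train")  # split assumed
print(ds[0])
```
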
posted an update about 2 months ago
How Do I Contribute (HDIC)

Exciting times to come? We are working on a layer self-esteem technique that scores each layer's contribution to the final prediction. For now, it unlocks a lot of knowledge already stored in the weights that we couldn't force the model to extract by further fine-tuning!
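
The post doesn't spell the technique out, but as a rough illustration of what "scoring a layer's contribution to the final prediction" can look like, here is a logit-lens-style probe: project each layer's hidden state through the LM head and compare against the final distribution. This is purely an illustrative sketch, not the HDIC method itself:

```python
# Crude per-layer contribution probe (logit-lens style), NOT the HDIC method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

final = out.logits[0, -1].softmax(-1)
for i, h in enumerate(out.hidden_states):
    # Project each layer's last-token state through the final norm + LM head.
    layer_probs = model.lm_head(model.transformer.ln_f(h[0, -1])).softmax(-1)
    agreement = (layer_probs * final).sum()  # crude "contribution" score
    print(f"layer {i:2d}: agreement with final prediction = {agreement:.4f}")
```
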
reacted to AdinaY's post with 🔥 about 2 months ago
HunyuanVideo 📹 The new open video generation model by Tencent!
👉 tencent/HunyuanVideo
zh-ai-community/video-models-666afd86cfa4e4dd1473b64c
✨ 13B parameters: probably the largest open video model to date
✨ Unified architecture for image & video generation
✨ Powered by advanced features: MLLM Text Encoder, 3D VAE, and Prompt Rewrite
✨ Delivers stunning visuals, diverse motion, and unparalleled stability
🔓 Fully open with code & weights
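
A sketch of running it through the diffusers integration, assuming the community diffusers-format weights; at 13B parameters this needs serious VRAM even with offloading:

```python
# Generate a short clip with HunyuanVideo via diffusers (community weights assumed).
import torch
from diffusers import HunyuanVideoPipeline
from diffusers.utils import export_to_video

pipe = HunyuanVideoPipeline.from_pretrained(
    "hunyuanvideo-community/HunyuanVideo", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # helps fit on a single GPU

video = pipe(prompt="a cat walks on the grass, realistic style", num_frames=61).frames[0]
export_to_video(video, "output.mp4", fps=15)
```
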
reacted to singhsidhukuldeep's post with 🤗 about 2 months ago
Exciting breakthrough in Document AI! Researchers from UNC Chapel Hill and Bloomberg have developed M3DocRAG, a revolutionary framework for multi-modal document understanding.

The innovation lies in its ability to handle complex document scenarios that traditional systems struggle with:
- Process 40,000+ pages across 3,000+ documents
- Answer questions requiring information from multiple pages
- Understand visual elements like charts, tables, and figures
- Support both closed-domain (single document) and open-domain (multiple documents) queries

Under the hood, M3DocRAG operates through three sophisticated stages:

>> Document Embedding:
- Converts PDF pages to RGB images
- Uses ColPali to project both text queries and page images into a shared embedding space
- Creates dense visual embeddings for each page while maintaining visual information integrity

>> Page Retrieval:
- Employs MaxSim scoring to compute relevance between queries and pages
- Implements inverted file indexing (IVFFlat) for efficient search
- Reduces retrieval latency from 20s to under 2s when searching 40K+ pages
- Supports approximate nearest neighbor search via Faiss

>> Question Answering:
- Leverages Qwen2-VL 7B as the multi-modal language model
- Processes retrieved pages through a visual encoder
- Generates answers considering both textual and visual context

The results are impressive:
- State-of-the-art performance on MP-DocVQA benchmark
- Superior handling of non-text evidence compared to text-only systems
- Significantly better performance on multi-hop reasoning tasks

This is a game-changer for industries dealing with large document volumes: finance, healthcare, and legal sectors can now process documents more efficiently while preserving crucial visual context.
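
The MaxSim late-interaction score at the heart of the retrieval stage fits in a few lines of PyTorch: each query token takes its best match over a page's patch embeddings, and the per-token maxima are summed. The shapes below loosely follow ColPali's roughly 1,030 patch vectors of dimension 128 per page; the Faiss IVFFlat index is omitted:

```python
# MaxSim late-interaction scoring over multi-vector page embeddings.
import torch
import torch.nn.functional as F

def maxsim(query_emb: torch.Tensor, page_emb: torch.Tensor) -> torch.Tensor:
    """query_emb: (q_tokens, dim); page_emb: (n_pages, patches, dim); all L2-normalized."""
    sim = torch.einsum("qd,npd->nqp", query_emb, page_emb)  # cosine similarities
    return sim.max(dim=-1).values.sum(dim=-1)               # max over patches, sum over tokens

query = F.normalize(torch.randn(12, 128), dim=-1)
pages = F.normalize(torch.randn(100, 1030, 128), dim=-1)
scores = maxsim(query, pages)       # (100,) relevance scores
top_pages = scores.topk(4).indices  # retrieve the best pages for the VLM stage
```
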
reacted to cfahlgren1's post with 🔥 about 2 months ago
You can just ask things 🗣️

"show me messages in the coding category that are in the top 10% of reward model scores"

Download really high quality instructions from the Llama3.1 405B synthetic dataset 🔥

argilla/magpie-ultra-v1.0
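
One way to run that kind of query yourself: DuckDB can read Hugging Face datasets straight from hf:// paths. The column names below (category, score) are guesses for illustration; check the dataset viewer for the real schema:

```python
# Query a Hugging Face dataset with DuckDB; column names are assumptions.
import duckdb

df = duckdb.sql("""
    WITH t AS (
        SELECT * FROM 'hf://datasets/argilla/magpie-ultra-v1.0/**/*.parquet'
    )
    SELECT instruction, score
    FROM t
    WHERE category = 'coding'
      AND score >= (SELECT quantile_cont(score, 0.9) FROM t)  -- top 10%
""").df()
print(df.head())
```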