TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices • Paper • arXiv:2410.00531 • Published Oct 1, 2024 • 28 upvotes
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation • Paper • arXiv:2408.12528 • Published Aug 22, 2024 • 50 upvotes
Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs • Paper • arXiv:2408.12060 • Published Aug 22, 2024 • 4 upvotes
The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community • Paper • arXiv:2408.08291 • Published Aug 15, 2024 • 9 upvotes
LLM Circuit Analyses Are Consistent Across Training and Scale • Paper • arXiv:2407.10827 • Published Jul 15, 2024 • 4 upvotes
Qwen2 Collection • Qwen2 language models, including pretrained and instruction-tuned variants in five sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B • 39 items • Updated Sep 18 • 346 upvotes
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs • Paper • arXiv:2406.15319 • Published Jun 21, 2024 • 61 upvotes
DataComp-LM: In search of the next generation of training sets for language models • Paper • arXiv:2406.11794 • Published Jun 17, 2024 • 48 upvotes
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization • Paper • arXiv:2406.11431 • Published Jun 17, 2024 • 4 upvotes
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling • Paper • arXiv:2406.07522 • Published Jun 11, 2024 • 37 upvotes
Open-Endedness is Essential for Artificial Superhuman Intelligence • Paper • arXiv:2406.04268 • Published Jun 6, 2024 • 11 upvotes
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration • Paper • arXiv:2406.01014 • Published Jun 3, 2024 • 30 upvotes
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality • Paper • arXiv:2405.21060 • Published May 31, 2024 • 63 upvotes