Interés - a JuanRafap Collection

Interés

updated about 14 hours ago

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

Paper • 2411.02337 • Published Nov 4, 2024 • 34
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Paper • 2411.04996 • Published Nov 7, 2024 • 50
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 65
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

Paper • 2410.08815 • Published Oct 11, 2024 • 44
Game-theoretic LLM: Agent Workflow for Negotiation Games

Paper • 2411.05990 • Published Nov 8, 2024 • 7
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices

Paper • 2411.10640 • Published Nov 16, 2024 • 45
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs

Paper • 2411.19146 • Published Nov 28, 2024 • 16
Snowflake/snowflake-arctic-embed-m-v2.0

Sentence Similarity • Updated Dec 14, 2024 • 10.6k • 53
Snowflake/snowflake-arctic-embed-l-v2.0

Sentence Similarity • Updated Dec 14, 2024 • 76k • 99
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases

Paper • 2412.04862 • Published Dec 6, 2024 • 50
ruliad/deepthought-8b-llama-v0.01-alpha

Text Generation • Updated Dec 7, 2024 • 1.87k • 143
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability

Paper • 2411.19943 • Published Nov 29, 2024 • 57
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation

Paper • 2412.02592 • Published Dec 3, 2024 • 21
RL Zero: Zero-Shot Language to Behaviors without any Supervision

Paper • 2412.05718 • Published Dec 7, 2024 • 4
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation

Paper • 2412.10704 • Published Dec 14, 2024 • 15
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment

Paper • 2412.13746 • Published Dec 18, 2024 • 9
Wonderful Matrices: Combining for a More Efficient and Effective Foundation Model Architecture

Paper • 2412.11834 • Published Dec 16, 2024 • 6
Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation Model Internet Agents

Paper • 2412.13194 • Published Dec 17, 2024 • 12
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing

Paper • 2412.14711 • Published Dec 19, 2024 • 16
Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning

Paper • 2412.15797 • Published Dec 20, 2024 • 17
Progressive Multimodal Reasoning via Active Retrieval

Paper • 2412.14835 • Published Dec 19, 2024 • 73
MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design

Paper • 2412.14590 • Published Dec 19, 2024 • 14
Learned Compression for Compressed Learning

Paper • 2412.09405 • Published Dec 12, 2024 • 13
Token-Budget-Aware LLM Reasoning

Paper • 2412.18547 • Published Dec 24, 2024 • 45
ericsonwillians/distilbert-base-uncased-steam-sentiment

Text Classification • Updated Dec 12, 2024 • 118
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

Paper • 2412.18319 • Published Dec 24, 2024 • 37
Personalized Graph-Based Retrieval for Large Language Models

Paper • 2501.02157 • Published 24 days ago • 28
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published Dec 25, 2024 • 97
Multi-task retriever fine-tuning for domain-specific and efficient RAG

Paper • 2501.04652 • Published 19 days ago • 10
Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published 18 days ago • 80
DepthMaster: Taming Diffusion Models for Monocular Depth Estimation

Paper • 2501.02576 • Published 23 days ago • 15
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published 24 days ago • 87
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning

Paper • 2501.03226 • Published 21 days ago • 37
Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published 11 days ago • 98
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published 11 days ago • 35
RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation

Paper • 2501.08617 • Published 13 days ago • 10
The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published 15 days ago • 88
Multimodal LLMs Can Reason about Aesthetics in Zero-Shot

Paper • 2501.09012 • Published 12 days ago • 10
ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning

Paper • 2501.06590 • Published 16 days ago • 8
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published 26 days ago • 48
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback

Paper • 2501.10799 • Published 10 days ago • 13
Control LLM: Controlled Evolution for Intelligence Retention in LLM

Paper • 2501.10979 • Published 9 days ago • 3
Autonomy-of-Experts Models

Paper • 2501.13074 • Published 5 days ago • 37
Chain-of-Reasoning: Towards Unified Mathematical Reasoning in Large Language Models via a Multi-Paradigm Perspective

Paper • 2501.11110 • Published 8 days ago • 2
Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning

Paper • 2412.09078 • Published Dec 12, 2024
LLM2: Let Large Language Models Harness System 2 Reasoning

Paper • 2412.20372 • Published 30 days ago
TinyThinker: Distilling Reasoning through Coarse-to-Fine Knowledge Internalization with Self-Reflection

Paper • 2412.08024 • Published Dec 11, 2024
Table as Thought: Exploring Structured Thoughts in LLM Reasoning

Paper • 2501.02152 • Published 24 days ago
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 6 days ago • 226