PDFTriage: Question Answering over Long, Structured Documents Paper • 2309.08872 • Published Sep 16, 2023 • 54
Adapting Large Language Models via Reading Comprehension Paper • 2309.09530 • Published Sep 18, 2023 • 77
TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise Paper • 2310.19019 • Published Oct 29, 2023 • 9
Orca 2: Teaching Small Language Models How to Reason Paper • 2311.11045 • Published Nov 18, 2023 • 73
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling Paper • 2312.15166 • Published Dec 23, 2023 • 57
Improving Text Embeddings with Large Language Models Paper • 2401.00368 • Published Dec 31, 2023 • 80
DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 180
Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding Paper • 2401.04398 • Published Jan 9, 2024 • 24
SliceGPT: Compress Large Language Models by Deleting Rows and Columns Paper • 2401.15024 • Published Jan 26, 2024 • 72
Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains Paper • 2402.05140 • Published Feb 6, 2024 • 22
AutoMathText: Autonomous Data Selection with Language Models for Mathematical Texts Paper • 2402.07625 • Published Feb 12, 2024 • 15
FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models Paper • 2402.10986 • Published Feb 16, 2024 • 78
RAFT: Adapting Language Model to Domain Specific RAG Paper • 2403.10131 • Published Mar 15, 2024 • 69
TnT-LLM: Text Mining at Scale with Large Language Models Paper • 2403.12173 • Published Mar 18, 2024 • 20
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20, 2024 • 69
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models Paper • 2405.20541 • Published May 30, 2024 • 22
Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning Paper • 2406.09170 • Published Jun 13, 2024 • 27
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20, 2024 • 90
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 93
SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation Paper • 2406.19215 • Published Jun 27, 2024 • 30
Show Less, Instruct More: Enriching Prompts with Definitions and Guidelines for Zero-Shot NER Paper • 2407.01272 • Published Jul 1, 2024 • 8
LETS-C: Leveraging Language Embedding for Time Series Classification Paper • 2407.06533 • Published Jul 9, 2024 • 2
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers Paper • 2407.09413 • Published Jul 12, 2024 • 10
Scaling Retrieval-Based Language Models with a Trillion-Token Datastore Paper • 2407.12854 • Published Jul 9, 2024 • 31
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains Paper • 2407.18961 • Published Jul 18, 2024 • 40
SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain Paper • 2407.19584 • Published Jul 28, 2024 • 63
Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models Paper • 2407.19474 • Published Jul 28, 2024 • 23
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning Paper • 2407.18248 • Published Jul 25, 2024 • 32
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher Paper • 2407.20183 • Published Jul 29, 2024 • 42
Knowledge Mechanisms in Large Language Models: A Survey and Perspective Paper • 2407.15017 • Published Jul 22, 2024 • 34
AgentInstruct: Toward Generative Teaching with Agentic Flows Paper • 2407.03502 • Published Jul 3, 2024 • 50
Text2SQL is Not Enough: Unifying AI and Databases with TAG Paper • 2408.14717 • Published Aug 27, 2024 • 26
Efficient Detection of Toxic Prompts in Large Language Models Paper • 2408.11727 • Published Aug 21, 2024 • 13
Controllable Text Generation for Large Language Models: A Survey Paper • 2408.12599 • Published Aug 22, 2024 • 64
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering Paper • 2408.09174 • Published Aug 17, 2024 • 52
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct Paper • 2409.05840 • Published Sep 9, 2024 • 48
Towards a Unified View of Preference Learning for Large Language Models: A Survey Paper • 2409.02795 • Published Sep 4, 2024 • 72
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models Paper • 2409.11136 • Published Sep 17, 2024 • 23
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19, 2024 • 137
A Controlled Study on Long Context Extension and Generalization in LLMs Paper • 2409.12181 • Published Sep 18, 2024 • 44
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Paper • 2409.12183 • Published Sep 18, 2024 • 37
A Comprehensive Evaluation of Quantized Instruction-Tuned Large Language Models: An Experimental Analysis up to 405B Paper • 2409.11055 • Published Sep 17, 2024 • 17
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval Paper • 2409.10516 • Published Sep 16, 2024 • 41
Guiding Vision-Language Model Selection for Visual Question-Answering Across Tasks, Domains, and Knowledge Types Paper • 2409.09269 • Published Sep 14, 2024 • 9
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? Paper • 2409.15277 • Published Sep 23, 2024 • 36
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models Paper • 2409.16191 • Published Sep 24, 2024 • 42
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models Paper • 2409.17481 • Published Sep 26, 2024 • 47
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning Paper • 2409.20566 • Published Sep 30, 2024 • 56
TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices Paper • 2410.00531 • Published Oct 1, 2024 • 31
Law of the Weakest Link: Cross Capabilities of Large Language Models Paper • 2409.19951 • Published Sep 30, 2024 • 54
Embodied-RAG: General non-parametric Embodied Memory for Retrieval and Generation Paper • 2409.18313 • Published Sep 26, 2024 • 3
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models Paper • 2410.02740 • Published Oct 3, 2024 • 52
Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise Paper • 2410.03017 • Published Oct 3, 2024 • 27
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations Paper • 2410.02707 • Published Oct 3, 2024 • 48
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery Paper • 2410.05080 • Published Oct 7, 2024 • 21
LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints Paper • 2410.06458 • Published Oct 9, 2024 • 8
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering Paper • 2410.07095 • Published Oct 9, 2024 • 6
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code Paper • 2410.08196 • Published Oct 10, 2024 • 46
MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents Paper • 2410.03450 • Published Oct 4, 2024 • 36
Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning Paper • 2410.06508 • Published Oct 9, 2024 • 10
Vector-ICL: In-context Learning with Continuous Vector Representations Paper • 2410.05629 • Published Oct 8, 2024 • 3
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization Paper • 2410.08815 • Published Oct 11, 2024 • 48
From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning Paper • 2410.06456 • Published Oct 9, 2024 • 36
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks Paper • 2410.10563 • Published Oct 14, 2024 • 39
Thinking LLMs: General Instruction Following with Thought Generation Paper • 2410.10630 • Published Oct 14, 2024 • 19
SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI Paper • 2410.11096 • Published Oct 14, 2024 • 13
MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models Paper • 2410.13085 • Published Oct 16, 2024 • 22
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines Paper • 2410.12705 • Published Oct 16, 2024 • 32
TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration Paper • 2410.12183 • Published Oct 16, 2024 • 3
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models Paper • 2410.14059 • Published Oct 17, 2024 • 59
NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples Paper • 2410.14669 • Published Oct 18, 2024 • 37
Improve Vision Language Model Chain-of-thought Reasoning Paper • 2410.16198 • Published Oct 21, 2024 • 26
MiniPLM: Knowledge Distillation for Pre-Training Language Models Paper • 2410.17215 • Published Oct 22, 2024 • 15
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark Paper • 2410.19168 • Published Oct 24, 2024 • 19
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning Paper • 2411.02337 • Published Nov 4, 2024 • 35
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems Paper • 2411.02959 • Published Nov 5, 2024 • 68
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge Paper • 2411.16594 • Published Nov 25, 2024 • 39
GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI Paper • 2411.14522 • Published Nov 21, 2024 • 34
OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs Paper • 2411.14199 • Published Nov 21, 2024 • 30
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 51
Stronger Models are NOT Stronger Teachers for Instruction Tuning Paper • 2411.07133 • Published Nov 11, 2024 • 36
Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS Paper • 2411.18478 • Published Nov 27, 2024 • 36
Efficiently Serving LLM Reasoning Programs with Certaindex Paper • 2412.20993 • Published Dec 30, 2024 • 36
OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System Paper • 2412.20005 • Published Dec 28, 2024 • 18
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models Paper • 2501.00316 • Published Dec 31, 2024 • 22
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published Jan 8 • 91
A Multi-Modal AI Copilot for Single-Cell Analysis with Instruction Following Paper • 2501.08187 • Published Jan 14 • 24
A Graph-Based Synthetic Data Pipeline for Scaling High-Quality Reasoning Instructions Paper • 2412.08864 • Published Dec 12, 2024 • 1
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models Paper • 2501.09686 • Published Jan 16 • 37
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding Paper • 2501.18362 • Published 27 days ago • 21
Large Language Models Think Too Fast To Explore Effectively Paper • 2501.18009 • Published 28 days ago • 23
Teaching Language Models to Critique via Reinforcement Learning Paper • 2502.03492 • Published 22 days ago • 23
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates Paper • 2502.06772 • Published 16 days ago • 19
Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models Paper • 2502.04404 • Published 21 days ago • 22
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking Paper • 2502.02339 • Published 22 days ago • 22
Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance Paper • 2502.08127 • Published 15 days ago • 49
Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models Paper • 2502.08130 • Published 15 days ago • 9
InductionBench: LLMs Fail in the Simplest Complexity Class Paper • 2502.15823 • Published 7 days ago • 6