David Quispe
daqc
AI & ML interests
Education
Recent Activity
upvoted an article about 18 hours ago
Mastering Tensor Dimensions in Transformers
upvoted a paper 4 days ago
Why Language Models Hallucinate
upvoted a paper 7 days ago
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model
Organizations
Agents
Low-Resource Data
Multilingual
- Facilitating large language model Russian adaptation with Learned Embedding Propagation
  Paper • 2412.21140 • Published • 18
- Extending LLMs to New Languages: A Case Study of Llama and Persian Adaptation
  Paper • 2412.13375 • Published
- LinguaLIFT: An Effective Two-stage Instruction Tuning Framework for Low-Resource Language Tasks
  Paper • 2412.12499 • Published • 1
- Development of Pre-Trained Transformer-based Models for the Nepali Language
  Paper • 2411.15734 • Published
SLMs
Reinforcement
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
  Paper • 2412.16849 • Published • 9
- Offline Reinforcement Learning for LLM Multi-Step Reasoning
  Paper • 2412.16145 • Published • 38
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
  Paper • 2501.12948 • Published • 417
Frameworks
LRMs
Thinkers
Reasoning LLMs
- HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
  Paper • 2412.18925 • Published • 104
- Search-o1: Agentic Search-Enhanced Large Reasoning Models
  Paper • 2501.05366 • Published • 101
- rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
  Paper • 2501.04519 • Published • 283
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
  Paper • 2501.12948 • Published • 417
Read later
Safety
on-Device (phone)
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
  Paper • 2404.14219 • Published • 256
- MiniCPM-V: A GPT-4V Level MLLM on Your Phone
  Paper • 2408.01800 • Published • 85
- SlimLM: An Efficient Small Language Model for On-Device Document Assistance
  Paper • 2411.09944 • Published • 12
- MobileQuant: Mobile-friendly Quantization for On-device Language Models
  Paper • 2408.13933 • Published • 16
Domain-specific
Dataset Best Practices