Jialiang Cheng
Julius-L
AI & ML interests
None yet
Recent Activity
liked a dataset 11 days ago: Salesforce/wikitext
liked a dataset 20 days ago: allenai/tulu-3-sft-mixture
liked a dataset about 1 month ago: HuggingFaceFW/fineweb
Organizations
None yet
Generation
Finetuning
Pretraining
Model Merging
Quantization
Unseen Papers
- MiniPLM: Knowledge Distillation for Pre-Training Language Models
  Paper • 2410.17215 • Published • 17
- LOGO -- Long cOntext aliGnment via efficient preference Optimization
  Paper • 2410.18533 • Published • 44
- Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss
  Paper • 2410.17243 • Published • 95
- LongReward: Improving Long-context Large Language Models with AI Feedback
  Paper • 2410.21252 • Published • 18
multimodal dataset
- BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks
  Paper • 2412.04626 • Published • 14
- GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI
  Paper • 2411.14522 • Published • 39
- Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination
  Paper • 2411.03823 • Published • 50
- Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data
  Paper • 2410.18558 • Published • 19
Long Context
Memory Efficient Training
Model Architecture
Sparsification
LLM Technical Reports
- The Llama 3 Herd of Models
  Paper • 2407.21783 • Published • 117
- Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
  Paper • 2409.12191 • Published • 78
- Baichuan Alignment Technical Report
  Paper • 2410.14940 • Published • 52
- A Survey of Small Language Models
  Paper • 2410.20011 • Published • 45
inference acceleration