Qian Liu

SivilTaram

http://siviltaram.github.io/

AI & ML interests

Cooking cool things

Recent Activity

authored a paper about 18 hours ago

ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention

upvoted a paper 1 day ago

ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention

commented on a paper 1 day ago

ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention

commented on a paper 1 day ago

ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention

Paper • 2507.01004 • Published 4 days ago • 6

New activity in sail/Sailor2-20B 2 months ago

Improve language tag

#4 opened 2 months ago by

commented on a paper 4 months ago

SkyLadder: Better and Faster Pretraining via Context Window Scheduling

Paper • 2503.15450 • Published Mar 19 • 11

New activity in OpenCoder-LLM/opc-sft-stage1 8 months ago

License

#5 opened 8 months ago by

New activity in OpenCoder-LLM/opc-annealing-corpus 8 months ago

License

#3 opened 8 months ago by

New activity in OpenCoder-LLM/opc-fineweb-code-corpus 8 months ago

Code elements inside web page are badly processed for FineWeb

#2 opened 8 months ago by

commented on a paper 9 months ago

Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates

Paper • 2410.07137 • Published Oct 9, 2024 • 8

New activity in SivilTaram/starcoder2-documentation 9 months ago

release plan for the rest of the-stack-v2-train-extras

#2 opened 9 months ago by

New activity in microsoft/tapex-large-finetuned-wtq 10 months ago

is it possible to support multiple languages, like Chinese?

#5 opened 12 months ago by

New activity in bigcode/the-stack-v2 11 months ago

"Documentation" data?

#8 opened over 1 year ago by

Where is the-stack-v2-train-extras?

#17 opened over 1 year ago by

question about starcoder 2 jupyter notebook conversion

#29 opened 11 months ago by

commented on a paper 11 months ago

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

Paper • 2407.13623 • Published Jul 18, 2024 • 57

commented on 2 papers 12 months ago

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

Paper • 2407.13623 • Published Jul 18, 2024 • 57

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

Paper • 2407.13623 • Published Jul 18, 2024 • 57

New activity in sail/regmix-data 12 months ago

[bot] Conversion to Parquet

#1 opened about 1 year ago by

parquet-converter

New activity in sail/regmix-data-sample about 1 year ago

[bot] Conversion to Parquet

#1 opened about 1 year ago by

parquet-converter

commented on 3 papers about 1 year ago

RegMix: Data Mixture as Regression for Language Model Pre-training

Paper • 2407.01492 • Published Jul 1, 2024 • 39

RegMix: Data Mixture as Regression for Language Model Pre-training

Paper • 2407.01492 • Published Jul 1, 2024 • 39

Bootstrapping Language Models with DPO Implicit Rewards

Paper • 2406.09760 • Published Jun 14, 2024 • 41