Rummy

yang31210999

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 1 month ago

Shadow-FT: Tuning Instruct via Base

Paper • 2505.12716 • Published May 19 • 2

upvoted a collection 3 months ago

Llama 4

Collection

Llama 4 release • 13 items • Updated Apr 29 • 564

New activity in mistralai/Mistral-Small-3.1-24B-Instruct-2503 3 months ago

Mistral3ForConditionalGeneration has no vLLM implementation and the Transformers implementation is not compatible with vLLM. Try setting VLLM_USE_V1=0.

👍 4

#16 opened 4 months ago by pedrojfb99

New activity in google/gemma-3-27b-it 3 months ago

evals (PT vs IT)

👍 1

#30 opened 4 months ago by erichartford

New activity in yang31210999/Llama3.1-1B-Neo-BAAI-1000k 4 months ago

Add library name, pipeline tag, paper link, and Github link

#1 opened 4 months ago by nielsr

New activity in yang31210999/Llama-3.1-Minitron-4B-Depth-Neo-BAAI-100k 4 months ago

Enhance model card with metadata, paper link, and basic usage

#1 opened 4 months ago by nielsr

New activity in yang31210999/Llama-3.2-1B-Instruct-Neo-BAAI-10k 4 months ago

Add pipeline tag, library name and link to Github repository

#1 opened 4 months ago by nielsr

upvoted a paper 6 months ago

Improving Video Generation with Human Feedback

Paper • 2501.13918 • Published Jan 23 • 50

updated a collection 8 months ago

LLM-Neo

Collection

Model hub for LLM-Neo, including Llama3.1-Neo-1B-100w and Minitron-4B-Depth-Neo-10w. • 3 items • Updated Nov 20, 2024 • 6

updated a model 8 months ago

yang31210999/Llama-3.2-1B-Instruct-Neo-BAAI-10k

Text Generation • 1B • Updated Feb 28 • 41

upvoted a paper 8 months ago

Large Language Models Can Self-Improve in Long-context Reasoning

Paper • 2411.08147 • Published Nov 12, 2024 • 67

updated a model 9 months ago

yang31210999/H200-pile-0.01-15-10-5-neo-rank64-lr2e-4

Updated Oct 23, 2024

upvoted an article 10 months ago

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

and 5 others • Sep 18, 2024 • 258

updated 2 models 10 months ago

yang31210999/Llama3.1-1B-Neo-BAAI-1000k

Text Generation • 2B • Updated Feb 28 • 44 • 2

yang31210999/Llama-3.1-Minitron-4B-Depth-Neo-BAAI-100k

Text Generation • 5B • Updated Feb 28 • 12 • 1

updated a collection 10 months ago

LLM-Neo

Collection

Model hub for LLM-Neo, including Llama3.1-Neo-1B-100w and Minitron-4B-Depth-Neo-10w. • 3 items • Updated Nov 20, 2024 • 6
