Metis: Training Large Language Models with Advanced Low-Bit Quantization • Paper 2509.00404 • Published Sep 2025
Jamba 1.7 Collection • The AI21 Jamba family of models is a set of hybrid SSM-Transformer foundation models, blending speed, efficient long-context processing, and accuracy. • 4 items • Updated Jul 2
BitVLA Collection • 1-bit Vision-Language-Action Models for Robotics Manipulation • 9 items • Updated Jun 30
BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs • Paper 2504.18415 • Published Apr 25
BitNet Collection • 🔥 BitNet family of large language models (1-bit LLMs). • 7 items • Updated May 1
TransMamba: Flexibly Switching between Transformer and Mamba • Paper 2503.24067 • Published Mar 31
MutaGReP: Execution-Free Repository-Grounded Plan Search for Code-Use • Paper 2502.15872 • Published Feb 21
Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam • Paper 2502.17055 • Published Feb 24