Robert Dahlke PRO

rbrt

https://www.tngtech.com

AI & ML interests

MoE Architecture, building Chimera Models, Finetuning

Recent Activity

new activity 13 days ago

tngtech/DeepSeek-TNG-R1T2-Chimera:Where is the modeling_deepseek.py file

updated a model about 1 month ago

tngtech/DeepSeek-TNG-R1T2-Chimera

new activity about 1 month ago

tngtech/DeepSeek-TNG-R1T2-Chimera:R1Tx?

New activity in tngtech/DeepSeek-TNG-R1T2-Chimera 13 days ago

Where is the modeling_deepseek.py file

#15 opened 17 days ago by

ccocks-deca

updated a model about 1 month ago

tngtech/DeepSeek-TNG-R1T2-Chimera

Text Generation • 685B • Updated 7 days ago • 2.27k • 242

New activity in tngtech/DeepSeek-TNG-R1T2-Chimera about 1 month ago

R1Tx?

#14 opened about 1 month ago by

ccocks-deca

How is the style?

#3 opened about 2 months ago by

ChuckMcSneed

Where to access other than Chutes.

#8 opened about 2 months ago by

cinnybun02

Test

#13 opened about 1 month ago by

leyra2025

Request: DOI

#10 opened about 2 months ago by

samwisejones

New activity in tngtech/DeepSeek-R1T-Chimera about 2 months ago

Chimera not separating reasoning from response

#3 opened 3 months ago by

SanityInMotion

New activity in tngtech/DeepSeek-TNG-R1T2-Chimera about 2 months ago

AIME24, AIME25 and GPQA-Diamond results

#5 opened about 2 months ago by

ID0M

updated a Space about 2 months ago

README

🚀

TNG on huggingface

liked 2 models about 2 months ago

unsloth/DeepSeek-TNG-R1T2-Chimera-BF16

Text Generation • 684B • Updated Jul 4 • 13 • 3

unsloth/DeepSeek-TNG-R1T2-Chimera

Text Generation • 685B • Updated Jul 3 • 10 • 6

New activity in tngtech/DeepSeek-TNG-R1T2-Chimera about 2 months ago

Missing `model.safetensors.index.json`

#2 opened about 2 months ago by

danielhanchen

liked a model about 2 months ago

tngtech/DeepSeek-TNG-R1T2-Chimera

Text Generation • 685B • Updated 7 days ago • 2.27k • 242

published a model about 2 months ago

tngtech/DeepSeek-TNG-R1T2-Chimera

Text Generation • 685B • Updated 7 days ago • 2.27k • 242

New activity in tngtech/DeepSeek-R1T-Chimera about 2 months ago

Any plans to release an updated version based on DeepSeek-V3-0526 + R1, or how to create the merge myself?

#4 opened 3 months ago by

Lissanro

authored a paper 2 months ago

Assembly of Experts: Linear-time construction of the Chimera LLM variants with emergent and adaptable behaviors

Paper • 2506.14794 • Published May 31 • 1

New activity in tngtech/DeepSeek-R1T-Chimera 3 months ago

Paid version?

#2 opened 4 months ago by

ccocks-deca

updated a model 4 months ago

tngtech/DeepSeek-R1T-Chimera

Text Generation • 685B • Updated Jul 6 • 1.33k • 262

New activity in tngtech/DeepSeek-R1T-Chimera 4 months ago

Questions on how routed experts are merged

👍 👀 17

#1 opened 4 months ago by

chuhac
