Shantanu Agarwal

shantanuagarwal

AI & ML interests

None yet

Recent Activity

commented on an article 21 days ago

Efficient LLM Pretraining: Packed Sequences and Masked Attention

upvoted an article 22 days ago

Efficient LLM Pretraining: Packed Sequences and Masked Attention

liked a Space 3 months ago

nanotron/ultrascale-playbook

Organizations

shantanuagarwal's activity

New activity in Qwen/Qwen2.5-14B 8 months ago

lora support

#3 opened 8 months ago by shantanuagarwal

New activity in mistralai/Mistral-Small-Instruct-2409 8 months ago

Base model please

#6 opened 8 months ago by

New activity in nvidia/NV-Embed-v1 11 months ago

Why do we need to hardcode self._attn_implementation = "eager"

#35 opened 11 months ago by shantanuagarwal

MLP intermediate dimension

#3 opened 12 months ago by shantanuagarwal