Shantanu Agarwal
shantanuagarwal
AI & ML interests
None yet
Recent Activity
commented on an article 21 days ago
Efficient LLM Pretraining: Packed Sequences and Masked Attention
upvoted an article 22 days ago
Efficient LLM Pretraining: Packed Sequences and Masked Attention
liked a Space 3 months ago
nanotron/ultrascale-playbook
Organizations
shantanuagarwal's activity
lora support
1
#3 opened 8 months ago by shantanuagarwal
Base model please
23
3
#6 opened 8 months ago by rombodawg

Why do we need to hardcode self._attn_implementation = "eager"
1
#35 opened 11 months ago by shantanuagarwal
MLP intermediate dimension
2
#3 opened 12 months ago by shantanuagarwal