# Activation Memory: Part 2
Code accompanying the deep-dive blog post on activation memory.
- The main utility code is in `act_mem.py`.
- Basic transformer layers are implemented in `layers.py`.
- The scripts `{block,mlp}_script.py` demonstrate how replacing `GELU` with `ReLU` affects activation memory, and `attn_script.py` shows the cost of activation memory in the attention layer (a standalone sketch of this kind of measurement follows this list).
- Tests of the code are in `test.py`.
- See `requirements.txt` for the versions the code was built against.
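
To get a self-contained feel for what those scripts measure, the snippet below tallies the tensors PyTorch stashes for backward during one forward pass of a small MLP, once with `GELU` and once with `ReLU`. It is a minimal sketch rather than the repo's `act_mem.py` API: `saved_activation_bytes` and the tensor shapes are made up for illustration, and it assumes PyTorch 1.10+ for `torch.autograd.graph.saved_tensors_hooks`.

```python
# Minimal sketch: count bytes saved for backward under GELU vs. ReLU.
# Assumes PyTorch >= 1.10; the helper name and shapes are illustrative only.
import torch
import torch.nn as nn


def saved_activation_bytes(model: nn.Module, x: torch.Tensor) -> int:
    """Sum the sizes of distinct tensors autograd saves during one forward pass."""
    seen, total = set(), 0

    def pack(t: torch.Tensor):
        nonlocal total
        if t.data_ptr() not in seen:  # dedupe tensors shared between ops
            seen.add(t.data_ptr())
            total += t.numel() * t.element_size()
        return t

    with torch.autograd.graph.saved_tensors_hooks(pack, lambda t: t):
        model(x)
    return total


if __name__ == "__main__":
    d_model, batch, seq = 512, 8, 128
    x = torch.randn(batch, seq, d_model, requires_grad=True)
    for act in (nn.GELU(), nn.ReLU()):
        mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), act, nn.Linear(4 * d_model, d_model)
        )
        mib = saved_activation_bytes(mlp, x) / 2**20
        print(f"{act.__class__.__name__}: {mib:.1f} MiB saved for backward")
```

Roughly speaking, the gap shows up because in stock PyTorch `GELU`'s backward saves its input, while `ReLU`'s backward only needs its output, which the following `Linear` saves anyway and can therefore be shared; the scripts in this repo examine the same effect in more detail.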