Activation Memory: Part 2
Code accompanying the deep-dive blog post on activation memory.
- The main utility code is in act_mem.py.
- Basic transformer layers are implemented in layers.py.
- The scripts {block,mlp}_script.py demonstrate how replacing GELU with ReLU affects activation memory.
- attn_script.py shows the cost of activation memory in the attention layer.
- Tests of the code are in test.py.
- See requirements.txt for versions the code was built against.
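
For intuition about why the choice of activation function matters, here is a rough back-of-the-envelope sketch. It is not the code in act_mem.py, and the function name `mlp_saved_activation_bytes` is hypothetical. It assumes an MLP of the form Linear -> activation -> Linear with a 4x hidden expansion, no dropout, 2-byte elements, and an autograd implementation that computes the ReLU backward pass from the ReLU output, so that one tensor can serve both ReLU and the second linear layer.

```python
def mlp_saved_activation_bytes(
    batch, seq_len, d_model, activation, expansion=4, bytes_per_elem=2
):
    """Estimate bytes saved for backward in Linear -> activation -> Linear.

    Hypothetical helper for illustration only; counts activations, not weights.
    """
    tokens = batch * seq_len
    first_linear_input = tokens * d_model      # saved by the first linear layer
    hidden = tokens * expansion * d_model      # size of each hidden-width tensor

    if activation == "gelu":
        # GELU's derivative depends on its input, so the activation input is
        # saved, and the distinct activation output is saved by the second linear.
        saved_elems = first_linear_input + 2 * hidden
    elif activation == "relu":
        # Assumption: ReLU's backward only needs its output, which is the same
        # tensor the second linear saves, so it is counted once.
        saved_elems = first_linear_input + hidden
    else:
        raise ValueError(f"unsupported activation: {activation!r}")

    return saved_elems * bytes_per_elem


gelu_bytes = mlp_saved_activation_bytes(2, 4096, 1024, "gelu")
relu_bytes = mlp_saved_activation_bytes(2, 4096, 1024, "relu")
print(gelu_bytes / 2**20, relu_bytes / 2**20)  # 144.0 80.0 (MiB)
```

Under these assumptions the saved activations shrink by a factor of 9/5 when GELU is swapped for ReLU. The scripts above measure the actual effect rather than estimating it.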

