The gradient_checkpointing option is always False in the BigBirdPegasusEncoder and BigBirdPegasusDecoder classes

by hseom - opened Dec 5, 2022

Thanks for this great work!

I found that in the BigBirdPegasusEncoder and BigBirdPegasusDecoder classes, the gradient_checkpointing option is always set to False, so GPU memory usage keeps accumulating.
Please make it optional again :)

# modeling_bigbird_pegasus.py, line 1768~
# BigBirdPegasusEncoder class
...
...
self.layers = nn.ModuleList([BigBirdPegasusEncoderLayer(config, seed=i) for i in range(config.encoder_layers)])
self.layernorm_embedding = nn.LayerNorm(embed_dim)
self.gradient_checkpointing = False
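
To make the request concrete, here is a minimal sketch of what I mean by "optional": a flag that defaults to False but is actually consulted in the forward pass. TinyEncoder and its layers are hypothetical stand-ins written in plain PyTorch, not the BigBirdPegasus code, and I am assuming a PyTorch version where torch.utils.checkpoint.checkpoint accepts use_reentrant.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for an encoder; not the BigBirdPegasus code."""

    def __init__(self, dim: int = 8, num_layers: int = 3):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_layers)])
        # Off by default, like the line quoted above, but it can be flipped later.
        self.gradient_checkpointing = False

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            if self.gradient_checkpointing and self.training:
                # Do not keep this layer's activations; recompute them in backward.
                x = checkpoint(layer, x, use_reentrant=False)
            else:
                x = layer(x)
        return x


encoder = TinyEncoder()
encoder.train()
encoder.gradient_checkpointing = True  # opt in after construction
loss = encoder(torch.randn(2, 8)).sum()
loss.backward()
```

If the real classes follow the usual transformers convention, the flag would be switched on after construction (for example with gradient_checkpointing_enable() on the model) rather than in __init__. I have not verified that this works for BigBirdPegasus.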
