Jaward posted an update Aug 8
PyTorch implementation of the Self-Compression & Differentiable Quantization algorithm introduced in the “Self-Compressing Neural Networks” paper.

The algorithm performs dynamic neural network compression during training, reducing the size of the weight and activation tensors as well as the number of bits needed to represent the weights.

In essence, the network shrinks (both weights and activations) as it trains, without compromising performance, which reduces compute and inference cost. A rough sketch of the core idea is below.
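For context, here is a minimal sketch (not the repo’s actual code) assuming the paper’s quantizer Q(x, b, e) = 2^e · round(clamp(2^−e · x, −2^(b−1), 2^(b−1) − 1)) with a learnable per-output bit depth b and exponent e, a straight-through estimator for rounding, and a size penalty added to the task loss; the class name `SelfCompressingLinear` and the weight `gamma` are illustrative:

```python
import torch
import torch.nn as nn

class SelfCompressingLinear(nn.Module):
    """Sketch of a linear layer with learnable per-output bit depth `b` and
    exponent `e`, quantizing weights as
    Q(x, b, e) = 2^e * round(clamp(2^-e * x, -2^(b-1), 2^(b-1) - 1)).
    Rounding uses a straight-through estimator so gradients reach the
    weights, b, and e. Names here are illustrative, not the repo's API."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.b = nn.Parameter(torch.full((out_features, 1), 8.0))   # learnable bit depth
        self.e = nn.Parameter(torch.full((out_features, 1), -4.0))  # learnable exponent

    def quantize(self, w: torch.Tensor) -> torch.Tensor:
        b = torch.relu(self.b)                    # keep bit depth non-negative
        lo = -(2.0 ** (b - 1.0))                  # representable range at b bits
        hi = 2.0 ** (b - 1.0) - 1.0
        scaled = torch.clamp(w * 2.0 ** -self.e, lo, hi)
        # Straight-through estimator: round in the forward pass, identity gradient.
        rounded = scaled + (torch.round(scaled) - scaled).detach()
        return 2.0 ** self.e * rounded

    def size_in_bits(self) -> torch.Tensor:
        # Each output row stores in_features weights at b bits apiece; rows whose
        # b is driven to 0 cost nothing and can be pruned after training.
        return (torch.relu(self.b) * self.weight.shape[1]).sum()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ self.quantize(self.weight).t()

# Usage: penalize average bits per weight so training trades accuracy for size.
layer = SelfCompressingLinear(16, 4)
x = torch.randn(2, 16)
task_loss = layer(x).pow(2).mean()          # stand-in for the real task loss
gamma = 0.01                                # compression/accuracy trade-off (assumed)
loss = task_loss + gamma * layer.size_in_bits() / layer.weight.numel()
loss.backward()                             # gradients flow to weight, b, and e
```

The compression falls out of the loss itself: the size penalty pushes each bit depth toward zero while the task loss pushes back, so only the channels that earn their bits keep them. See the linked repo for the full implementation.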

Code: https://github.com/Jaykef/ai-algorithms
Paper: https://arxiv.org/pdf/2301.13142