In recent years, neural image compression methods have achieved impressive performance, most of them built on a variational auto-encoder with a hyper-prior and an autoregressive Gaussian entropy model. We first demonstrate that the way these end-to-end approaches handle quantization during training causes a mismatch between the gradient direction of the entropy model parameters (i.e., mean and standard deviation) and the direction in which they should be optimized for inference, making it difficult for the neural network to learn accurate estimates of these parameters. To address this issue, we propose a two-step improvement: in the first step, we use a straight-through estimator to align the forward pass during training with inference, thereby correcting the gradients of the standard deviation parameters; in the second step, we apply our proposed gradient transfer together with MSE-guided gradients to manually compensate for the gradients of the mean parameters that are lost under the straight-through estimator. Finally, we propose freezing the auto-encoder and hyper auto-encoder of pre-trained models provided by existing works and fine-tuning only the modules that predict the entropy model parameters, enabling efficient validation of the proposed improvements. Experimental results show that our improvements bring appreciable performance gains to recent state-of-the-art neural image compression models. Moreover, our improvements require no modification to the structure of pre-trained models and need only lightweight fine-tuning, demonstrating strong plug-and-play capability and practical utility.
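The quantization handling discussed in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation; the function names and the toy backward interface are illustrative. It contrasts the conventional additive-noise training surrogate (whose forward pass differs from inference) with a straight-through estimator whose forward pass uses the same hard rounding as inference while the backward pass treats rounding as the identity:

```python
import math

def quantize_train_noise(y, u):
    """Conventional training surrogate: add uniform noise u ~ U(-0.5, 0.5).
    Note the forward pass differs from inference, which rounds."""
    return y + u

def quantize_inference(y):
    """Inference-time quantization: hard rounding to the nearest integer."""
    return math.floor(y + 0.5)

def ste_quantize(y):
    """Straight-through estimator (illustrating step one of the fix):
    the forward pass rounds, exactly matching inference; the returned
    backward function passes the incoming gradient through unchanged."""
    q = math.floor(y + 0.5)
    def backward(grad_out):
        # d(round(y))/dy is zero almost everywhere; the STE replaces it
        # with 1 so upstream parameters still receive a learning signal.
        return grad_out
    return q, backward
```

Because the STE's rounding has zero true derivative with respect to the mean shift, the mean parameters lose their gradient signal, which is what the paper's second step (gradient transfer plus MSE-guided gradients) compensates for manually.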