Abstract:
Diffusion models (DMs) have recently gained acclaim for their superior imaging capabilities. However, their extensive computational and memory demands often limit their practical application on portable devices. Post-training quantization (PTQ) offers a solution that enables model compression and reduces runtime without retraining. Nonetheless, traditional PTQ methods struggle to handle the unique time-variant distribution in DMs. Accordingly, we propose a novel timestep-grouping PTQ approach to address the multiple-timestep issue. We also identify that non-uniform post-SiLU activations may lead to significant quantization loss. We tackle this issue with a region-specific quantization strategy that better represents extreme values after quantization. Combining these methods, we achieve a fully quantized diffusion model that is feasible for hardware implementation. Our experimental results show that the proposed method maintains the FID score after 8-bit quantization.
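
To illustrate why post-SiLU activations are awkward for a single uniform quantizer, here is a minimal NumPy sketch. It is not the method proposed in the paper, whose details the abstract does not give. The function names, the split at zero, and the choice of 32 codes for the negative region are all hypothetical assumptions made for illustration. The idea is that SiLU outputs occupy a narrow bounded negative range and a wide positive range, so giving each region its own scale may preserve the small negative values better.

```python
import numpy as np

def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def quantize_uniform(x, lo, hi, levels):
    """Snap x onto `levels` evenly spaced points in [lo, hi] and dequantize."""
    if hi <= lo:  # degenerate range: every value maps to lo
        return np.full_like(x, lo)
    scale = (hi - lo) / (levels - 1)
    q = np.clip(np.round((x - lo) / scale), 0, levels - 1)
    return q * scale + lo

def quantize_two_region(x, levels=256, neg_levels=32):
    """Hypothetical region-specific quantizer (an assumption, not the paper's).

    Negative values get `neg_levels` codes over [min(x), 0]; the remaining
    codes cover [0, max(x)]. The code budget stays at `levels` in total.
    """
    neg = x < 0
    out = np.empty_like(x)
    if neg.any():
        out[neg] = quantize_uniform(x[neg], x[neg].min(), 0.0, neg_levels)
    if (~neg).any():
        out[~neg] = quantize_uniform(x[~neg], 0.0, x[~neg].max(),
                                     levels - neg_levels)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = silu(rng.normal(0.0, 3.0, 10_000))  # synthetic activations
    single = quantize_uniform(acts, acts.min(), acts.max(), 256)
    split = quantize_two_region(acts)
    neg = acts < 0
    print("negative-region MSE, single scale:",
          np.mean((single[neg] - acts[neg]) ** 2))
    print("negative-region MSE, two regions: ",
          np.mean((split[neg] - acts[neg]) ** 2))
```

The trade-off in this sketch is that the positive region is left with slightly fewer codes, so its step size grows a little while the negative region becomes much finer. How the regions and code budgets are actually chosen in the paper is not stated in the abstract.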
Published in: 2024 IEEE 34th International Workshop on Machine Learning for Signal Processing (MLSP)
Date of Conference: 22-25 September 2024
Date Added to IEEE Xplore: 04 November 2024