CIAR: Interval-based Collaborative Decoding for Image Generation Acceleration
Keming Ye, Zhou Zhao, Fan Wu, Shengyu Zhang
Abstract
Auto-regressive (AR) models have recently made notable progress in image generation, achieving performance comparable to diffusion-based approaches. However, their computational intensity and sequential nature impede on-device deployment and cause disruptive latency. We address this with CIAR, a cloud-device collaboration framework that uses on-device self-verification to handle two key properties of visual synthesis: the vast token vocabulary required for high-fidelity images, and inherent spatial redundancy, which makes tokens in homogeneous regions extremely predictable while tokens at object boundaries remain highly uncertain. Uniform verification wastes resources on such redundant tokens. Our solution centers on an on-device token uncertainty quantifier that uses continuous probability intervals, rather than conventional discrete solution sets, to accelerate processing and remain feasible for large visual vocabularies. Additionally, we incorporate an interval-enhanced decoding module to further speed up decoding, while maintaining visual fidelity and semantic consistency via a distribution alignment training strategy. Extensive experiments demonstrate that CIAR achieves a 2.18x speed-up and reduces cloud requests by 70%, while preserving image quality compared to existing methods.
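
The idea of verifying only uncertain tokens can be illustrated with a toy sketch. This is not the CIAR quantifier: the abstract does not specify how the probability intervals are built, so a simple top-1 probability threshold stands in for it here. The function names, the threshold value, and the toy distributions are all hypothetical.

```python
from typing import List, Sequence


def top1_confidence(probs: Sequence[float]) -> float:
    """Return the largest probability in one token's predicted distribution."""
    return max(probs)


def select_for_cloud(token_probs: List[Sequence[float]],
                     threshold: float = 0.9) -> List[int]:
    """Return indices of draft tokens whose top-1 probability is below threshold.

    Assumption: a plain confidence threshold replaces the paper's
    interval-based uncertainty quantifier, which is not described in detail.
    """
    return [i for i, p in enumerate(token_probs)
            if top1_confidence(p) < threshold]


# Toy on-device predictions over a 3-token vocabulary (made-up numbers).
draft = [
    [0.97, 0.02, 0.01],    # homogeneous region: highly predictable
    [0.95, 0.03, 0.02],    # homogeneous region: highly predictable
    [0.40, 0.35, 0.25],    # object boundary: uncertain
    [0.99, 0.005, 0.005],  # homogeneous region: highly predictable
]

uncertain = select_for_cloud(draft, threshold=0.9)
cloud_fraction = len(uncertain) / len(draft)
print(uncertain, cloud_fraction)
```

In this toy setting only the boundary token would be sent to the cloud model, while the confident tokens are accepted on the device; the fraction of tokens escalated depends entirely on the chosen threshold and the data.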