Optimal Scalar Quantization for Matrix Multiplication: Closed-Form Density and Phase Transition

2026-03-20

Calvin Ang, Sungyoon Kim, Mert Pilanci

Abstract

We study entrywise scalar quantization of two matrices prior to multiplication. Given $A \in \mathbb{R}^{m \times k}$ and $B \in \mathbb{R}^{k \times n}$, we quantize entries of $A$ and $B$ independently using scalar quantizers with $K_X$ and $K_Y$ levels per entry, and form $\widehat{C} = \widehat{A}\,\widehat{B}$. The objective is to minimize the matrix multiplication mean-squared error (MSE) $\mathbb{E}[\|AB - \widehat{A}\widehat{B}\|_F^2]$ under a pair-i.i.d. inner-product model. In the high-resolution regime $K_X, K_Y \to \infty$, we derive a sharp $K^{-2}$ asymptotic expansion for the MSE, identify the exact optimal leading constants, and characterize asymptotically optimal quantization center densities in terms of conditional second moments. We then specialize to correlated Gaussian multiplicative pairs, obtaining a closed-form optimal point density
\[ \lambda^{\star}(u) \ \propto\ \exp\!\left(-\frac{u^2}{6}\right) \left((1-\rho^2) + \rho^2 u^2\right)^{1/3}, \qquad u = \frac{x}{\sigma_X}, \]
with the same form for $y/\sigma_Y$, and prove a correlation-driven phase transition: the density is unimodal at the origin for $|\rho| \le 1/\sqrt{3}$ and becomes bimodal for $|\rho| > 1/\sqrt{3}$, with peaks at $u_{\mathrm{peak}} = \pm\sqrt{3 - 1/\rho^2}$. We show our method's applicability in synthetic experiments such as matrix multiplication quantization and least squares optimization, as well as quantization of large language model key and query activations.
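The phase transition stated in the abstract can be checked numerically. The sketch below is not taken from the paper: the function names `point_density`, `analytic_peak`, and `grid_peak` are hypothetical, and the density is left unnormalized, since only the location of its maximum matters here. It evaluates the closed-form density on a grid over $u \ge 0$ (the density is even in $u$) and compares the grid maximizer with $\sqrt{3 - 1/\rho^2}$.

```python
import math


def point_density(u, rho):
    """Unnormalized density exp(-u^2/6) * ((1 - rho^2) + rho^2 * u^2)^(1/3)."""
    return math.exp(-u * u / 6.0) * ((1.0 - rho * rho) + rho * rho * u * u) ** (1.0 / 3.0)


def analytic_peak(rho):
    """Positive peak location sqrt(3 - 1/rho^2); 0.0 when rho^2 <= 1/3 (unimodal)."""
    if rho * rho <= 1.0 / 3.0:
        return 0.0
    return math.sqrt(3.0 - 1.0 / (rho * rho))


def grid_peak(rho, u_max=4.0, n=40001):
    """Maximizer of the density on a uniform grid over [0, u_max]."""
    best_u, best_val = 0.0, point_density(0.0, rho)
    for i in range(1, n):
        u = u_max * i / (n - 1)
        val = point_density(u, rho)
        if val > best_val:  # strict comparison keeps u = 0 when it is the maximum
            best_u, best_val = u, val
    return best_u


if __name__ == "__main__":
    # Two correlations below the threshold 1/sqrt(3) and two above it.
    for rho in (0.0, 0.5, 0.9, 0.99):
        print(rho, analytic_peak(rho), grid_peak(rho))
```

The threshold follows from setting the derivative of $\log \lambda^{\star}(u)$ to zero: besides $u = 0$, the only stationary points satisfy $(1-\rho^2) + \rho^2 u^2 = 2\rho^2$, which has real solutions only when $\rho^2 > 1/3$.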
