Consistent model selection in the spiked Wigner model via AIC-type criteria
Soumendu Sundar Mukherjee
Abstract
Consider the spiked Wigner model \[ X = \sum_{i = 1}^k \lambda_i u_i u_i^\top + \sigma G, \] where $G$ is an $N \times N$ GOE random matrix, and the eigenvalues $\lambda_i$ are all spiked, i.e. above the Baik-Ben Arous-Péché (BBP) threshold $\sigma$. We consider AIC-type model selection criteria of the form \[ -2 \, (\text{maximised log-likelihood}) + \gamma \, (\text{number of parameters}) \] for estimating the number $k$ of spikes. For $\gamma > 2$, the above criterion is strongly consistent provided $\lambda_k > \lambda_\gamma$, where $\lambda_\gamma$ is a threshold strictly above the BBP threshold, whereas for $\gamma < 2$, it almost surely overestimates $k$. Although AIC (which corresponds to $\gamma = 2$) is not strongly consistent, we show that taking $\gamma = 2 + \delta_N$, where $\delta_N \to 0$ and $\delta_N \gg N^{-2/3}$, results in a weakly consistent estimator of $k$. We further show that a soft minimiser of AIC, where one chooses the least complex model whose AIC score is close to the minimum AIC score, is strongly consistent. Based on a spiked (generalised) Wigner representation, we also develop similar model selection criteria for consistently estimating the number of communities in a balanced stochastic block model under some sparsity restrictions.
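To make the selection rules above concrete, the following minimal Python sketch computes a penalised score of the form "-2 (maximised log-likelihood) + penalty weight (number of parameters)" for a list of candidate models, then compares the plain minimiser with a soft minimiser that returns the least complex model whose score lies within a tolerance of the minimum. The function names, the tolerance `tol`, the log-likelihood values, and the parameter counts are all hypothetical and chosen only for illustration; the abstract does not specify the likelihood, the parameter count, or how "close to the minimum" is quantified.

```python
def aic_type_scores(max_loglik, n_params, gamma):
    """Return -2 * loglik + gamma * n_params for each candidate model."""
    return [-2.0 * ll + gamma * p for ll, p in zip(max_loglik, n_params)]


def hard_minimiser(scores):
    """Index of the smallest score (ties go to the least complex model)."""
    return min(range(len(scores)), key=lambda j: scores[j])


def soft_minimiser(scores, tol):
    """Smallest index whose score is within `tol` of the minimum score."""
    best = min(scores)
    return next(j for j, s in enumerate(scores) if s <= best + tol)


# Hypothetical maximised log-likelihoods for candidate models k' = 0, 1, 2, 3, 4.
max_loglik = [-120.0, -95.0, -80.0, -71.75, -71.0]
# Hypothetical parameter counts, increasing with model complexity.
n_params = [0, 10, 19, 27, 34]

scores = aic_type_scores(max_loglik, n_params, gamma=2.0)  # AIC weight
k_hard = hard_minimiser(scores)             # plain minimiser of the score
k_soft = soft_minimiser(scores, tol=1.0)    # least complex near-minimiser

print(scores, k_hard, k_soft)
```

In this toy input the plain minimiser with weight 2 picks candidate 3 by a margin of 0.5, while the soft minimiser with tolerance 1 falls back to the simpler candidate 2; raising the weight to 3 also makes the plain minimiser pick candidate 2. This only illustrates the mechanics of the two rules and says nothing about the consistency results themselves.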