
Bayesian Modeling of Collatz Stopping Times: A Probabilistic Machine Learning Perspective

2026-03-04

Nicolò Bonacorsi, Matteo Bordoni

Abstract

We study the Collatz total stopping time τ(n) over n ≤ 10^7 from a probabilistic machine learning viewpoint. Empirically, τ(n) is a skewed and heavily overdispersed count with pronounced arithmetic heterogeneity. We develop two complementary models. First, a Bayesian hierarchical Negative Binomial regression (NB2-GLM) predicts τ(n) from simple covariates (log n and residue class n mod 8), quantifying uncertainty via posterior and posterior predictive distributions. Second, we propose a mechanistic generative approximation based on the odd-block decomposition: for odd m, write 3m+1 = 2^{K(m)} m' with m' odd and K(m) = v_2(3m+1) ≥ 1; randomizing these block lengths yields a stochastic approximation calibrated via a Dirichlet-multinomial update. On held-out data, the NB2-GLM achieves substantially higher predictive likelihood than the odd-block generators. Conditioning the block-length distribution on m mod 8 markedly improves the generator's distributional fit, indicating that low-order modular structure is a key driver of heterogeneity in τ(n).
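The odd-block decomposition described in the abstract can be illustrated with a minimal Python sketch. It computes τ(n) directly, recovers the same value by summing odd blocks (each block costs one 3m+1 step plus K(m) halvings), and tabulates block lengths by m mod 8. All function names, the truncation `k_max`, and the symmetric Dirichlet prior `alpha` are assumptions made for illustration; the abstract does not specify the authors' implementation or prior.

```python
from collections import Counter


def v2(x):
    """2-adic valuation: exponent of the largest power of 2 dividing x > 0."""
    return (x & -x).bit_length() - 1


def total_stopping_time(n):
    """Number of Collatz steps (n/2 if even, 3n+1 if odd) needed to reach 1."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def odd_block(m):
    """For odd m, return (K, m') with 3m+1 = 2**K * m' and m' odd."""
    x = 3 * m + 1
    k = v2(x)
    return k, x >> k


def stopping_time_via_blocks(n):
    """Recover tau(n) as v2(n) plus the sum of (1 + K) over odd blocks."""
    k0 = v2(n)          # initial halvings down to the odd part of n
    m = n >> k0
    steps = k0
    while m != 1:
        k, m = odd_block(m)
        steps += 1 + k  # one 3m+1 step followed by k halvings
    return steps


def block_length_counts(limit, k_max=6):
    """Count block lengths K(m) for odd m < limit, grouped by m mod 8.

    Lengths above k_max are pooled into the k_max bin (an illustrative choice).
    """
    counts = {r: Counter() for r in (1, 3, 5, 7)}
    for m in range(1, limit, 2):
        k, _ = odd_block(m)
        counts[m % 8][min(k, k_max)] += 1
    return counts


def dirichlet_posterior_mean(counter, k_max=6, alpha=1.0):
    """Posterior mean of block-length probabilities under a symmetric
    Dirichlet(alpha) prior (hypothetical prior, not taken from the paper)."""
    total = sum(counter.values()) + alpha * k_max
    return [(counter[k] + alpha) / total for k in range(1, k_max + 1)]


if __name__ == "__main__":
    print(total_stopping_time(27), stopping_time_via_blocks(27))
    counts = block_length_counts(10_000)
    for r in (1, 3, 5, 7):
        print(r, dict(sorted(counts[r].items())))
```

In this sketch the residue class already pins down much of the block length: m ≡ 3 or 7 (mod 8) always gives K = 1, m ≡ 1 (mod 8) always gives K = 2, and only m ≡ 5 (mod 8) produces K ≥ 3. That is consistent with the abstract's observation that conditioning on m mod 8 improves the generator's fit.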
