Comparison of different Unique hard attention transformer models by the formal languages they can recognize
2025-06-03
Leonid Ryvkin
Abstract
This note is a survey of various results on the capabilities of unique hard attention transformer encoders (UHATs) to recognize formal languages. We distinguish between masked and non-masked attention, finite and infinite image, and general versus bilinear attention score functions. We recall some relations between these models, as well as a lower bound in terms of first-order logic and an upper bound in terms of circuit complexity.
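To make the terminology concrete, here is a minimal sketch of one unique-hard-attention head with a bilinear score function and optional causal masking. This is an illustration of the general mechanism, not the paper's exact formalization: the function and parameter names are hypothetical, and the leftmost tie-breaking (via `np.argmax`) is one common convention in this literature.

```python
import numpy as np

def uhat_head(X, W_q, W_k, W_v, B=None, masked=False):
    """One unique-hard-attention head (illustrative sketch).

    X: (n, d) sequence of token embeddings.
    Scores are bilinear, s_ij = q_i^T B k_j; each position i attends to
    exactly one position j, the argmax of its scores (ties broken to
    the leftmost j here). With masked=True, position i only sees j <= i.
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    if B is None:
        B = np.eye(Q.shape[1])                 # general vs. bilinear: B = I
    scores = Q @ B @ K.T                       # (n, n) bilinear scores
    if masked:
        n = X.shape[0]
        allow = np.tril(np.ones((n, n), dtype=bool))
        scores = np.where(allow, scores, -np.inf)  # hide future positions
    winners = scores.argmax(axis=1)            # unique hard attention:
                                               # argmax, leftmost on ties
    return V[winners]                          # each row copies one value vector
```

Replacing the argmax with a softmax over `scores` recovers the usual soft-attention head; the "unique hard" variant is exactly this degenerate case where all attention weight sits on a single position.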