| Specialized Transformers: Faster, Smaller and more Accurate NLP Models | Sep 29, 2021 | Hard Attention, Quantization | —Unverified | 0 | 0 |
| Text as Environment: A Deep Reinforcement Learning Text Readability Assessment Model | Dec 12, 2019 | Deep Reinforcement Learning, Hard Attention | —Unverified | 0 | 0 |
| Theoretical Limitations of Self-Attention in Neural Sequence Models | Jun 16, 2019 | Hard Attention | —Unverified | 0 | 0 |
| Transformers as Transducers | Apr 2, 2024 | Hard Attention, POS | —Unverified | 0 | 0 |
| Transformers in Uniform TC^0 | Sep 20, 2024 | Hard Attention | —Unverified | 0 | 0 |
| Unique Hard Attention: A Tale of Two Sides | Mar 18, 2025 | Hard Attention | —Unverified | 0 | 0 |
| Upper, Middle and Lower Region Learning for Facial Action Unit Detection | Feb 10, 2020 | Action Unit Detection, Facial Action Unit Detection | —Unverified | 0 | 0 |
| Video Violence Recognition and Localization Using a Semi-Supervised Hard Attention Model | Feb 4, 2022 | Activity Recognition, Hard Attention | —Unverified | 0 | 0 |
| Word Representation Models for Morphologically Rich Languages in Neural Machine Translation | Jun 14, 2016 | Hard Attention, Machine Translation | —Unverified | 0 | 0 |
| You Only Need One Model for Open-domain Question Answering | Dec 14, 2021 | Hard Attention, Natural Questions | —Unverified | 0 | 0 |
| NoPE: The Counting Power of Transformers with No Positional Encodings | May 16, 2025 | Hard Attention | —Unverified | 0 | 0 |
| Achieving Explainability in a Visual Hard Attention Model through Content Prediction | Jan 1, 2021 | Hard Attention, image-classification | —Unverified | 0 | 0 |
| A Differentiable Self-disambiguated Sense Embedding Model via Scaled Gumbel Softmax | Sep 27, 2018 | Hard Attention, Sentence | —Unverified | 0 | 0 |
| AMR Parsing with Action-Pointer Transformer | Nov 24, 2020 | Abstract Meaning Representation, AMR Parsing | —Unverified | 0 | 0 |
| An Exploration of Neural Sequence-to-Sequence Architectures for Automatic Post-Editing | Jun 13, 2017 | Automatic Post-Editing, Hard Attention | —Unverified | 0 | 0 |
| A study of latent monotonic attention variants | Mar 30, 2021 | Hard Attention, speech-recognition | —Unverified | 0 | 0 |
| AttentionDrop: A Novel Regularization Method for Transformer Models | Apr 16, 2025 | Hard Attention | —Unverified | 0 | 0 |
| Average-Hard Attention Transformers are Constant-Depth Uniform Threshold Circuits | Aug 6, 2023 | Hard Attention | —Unverified | 0 | 0 |
| Characterizing the Expressivity of Transformer Language Models | May 29, 2025 | Hard Attention | —Unverified | 0 | 0 |
| CLAWS: Contrastive Learning with hard Attention and Weak Supervision | Dec 1, 2021 | Anomaly Detection, Contrastive Learning | —Unverified | 0 | 0 |
| Comparison of different Unique hard attention transformer models by the formal languages they can recognize | Jun 3, 2025 | Hard Attention, Survey | —Unverified | 0 | 0 |
| Continual Diffusion with STAMINA: STack-And-Mask INcremental Adapters | Nov 30, 2023 | Continual Learning, Hard Attention | —Unverified | 0 | 0 |
| DanHAR: Dual Attention Network For Multimodal Human Activity Recognition Using Wearable Sensors | Jun 25, 2020 | Activity Recognition, Hard Attention | —Unverified | 0 | 0 |
| Deep Pneumonia: Attention-Based Contrastive Learning for Class-Imbalanced Pneumonia Lesion Recognition in Chest X-rays | Jul 23, 2022 | Contrastive Learning, Hard Attention | —Unverified | 0 | 0 |
| Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment | Oct 28, 2019 | Hard Attention, Speech Synthesis | —Unverified | 0 | 0 |