SOTAVerified

Hard Attention

Papers

Showing 1–50 of 100 papers

| Title | Status | Hype |
|---|---|---|
| Comparison of different Unique hard attention transformer models by the formal languages they can recognize | | 0 |
| Characterizing the Expressivity of Transformer Language Models | | 0 |
| Exact Expressive Power of Transformers with Padding | | 0 |
| Emergence of Fixational and Saccadic Movements in a Multi-Level Recurrent Attention Model for Vision | | 0 |
| NoPE: The Counting Power of Transformers with No Positional Encodings | | 0 |
| Neuroevolution of Self-Attention Over Proto-Objects | | 0 |
| AttentionDrop: A Novel Regularization Method for Transformer Models | | 0 |
| Center-guided Classifier for Semantic Segmentation of Remote Sensing Images | Code | 0 |
| Unique Hard Attention: A Tale of Two Sides | | 0 |
| Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers | | 0 |
| Ehrenfeucht-Haussler Rank and Chain of Thought | | 0 |
| Simulating Hard Attention Using Soft Attention | | 0 |
| Transformers in Uniform TC^0 | | 0 |
| Soft-Hard Attention U-Net Model and Benchmark Dataset for Multiscale Image Shadow Removal | | 0 |
| Hard-Attention Gates with Gradient Routing for Endoscopic Image Computing | Code | 1 |
| TRIP: Trainable Region-of-Interest Prediction for Hardware-Efficient Neuromorphic Processing on Event-based Vision | Code | 0 |
| Transformers as Transducers | | 0 |
| Recurrent Alignment with Hard Attention for Hierarchical Text Rating | Code | 0 |
| GQHAN: A Grover-inspired Quantum Hard Attention Network | | 0 |
| Mutual Distillation Learning For Person Re-Identification | Code | 1 |
| Multi-View Unsupervised Image Generation with Cross Attention Guidance | | 0 |
| Continual Diffusion with STAMINA: STack-And-Mask INcremental Adapters | | 0 |
| Vamos: Versatile Action Models for Video Understanding | Code | 0 |
| Masked Hard-Attention Transformers Recognize Exactly the Star-Free Languages | | 0 |
| Language-Guided Reinforcement Learning for Hard Attention in Few-Shot Learning | | 0 |
| Logical Languages Accepted by Transformer Encoders with Hard Attention | | 0 |
| Investigation of Architectures and Receptive Fields for Appearance-based Gaze Estimation | Code | 1 |
| Average-Hard Attention Transformers are Constant-Depth Uniform Threshold Circuits | | 0 |
| On the Learning Dynamics of Attention Networks | Code | 0 |
| HAT-CL: A Hard-Attention-to-the-Task PyTorch Library for Continual Learning | Code | 0 |
| Coherent Concept-based Explanations in Medical Image and Its Application to Skin Lesion Diagnosis | Code | 1 |
| MESAHA-Net: Multi-Encoders based Self-Adaptive Hard Attention Network with Maximum Intensity Projections for Lung Nodule Segmentation in CT Scan | | 0 |
| Dual Attention Model with Reinforcement Learning for Classification of Histology Whole-Slide Images | | 0 |
| Learning to Perceive in Deep Model-Free Reinforcement Learning | Code | 0 |
| Deep Pneumonia: Attention-Based Contrastive Learning for Class-Imbalanced Pneumonia Lesion Recognition in Chest X-rays | | 0 |
| Dual Attention Networks for Few-Shot Fine-Grained Recognition | Code | 0 |
| Table Retrieval May Not Necessitate Table-specific Model Design | Code | 1 |
| Binding Actions to Objects in World Models | Code | 0 |
| Formal Language Recognition by Hard Attention Transformers: Perspectives from Circuit Complexity | | 0 |
| Consistency driven Sequential Transformers Attention Model for Partially Observable Scenes | Code | 0 |
| Video Violence Recognition and Localization Using a Semi-Supervised Hard Attention Model | | 0 |
| You Only Need One Model for Open-domain Question Answering | | 0 |
| CLAWS: Contrastive Learning with hard Attention and Weak Supervision | | 0 |
| A Probabilistic Hard Attention Model For Sequentially Observed Scenes | Code | 0 |
| Understanding Interlocking Dynamics of Cooperative Rationalization | Code | 0 |
| Sharp Attention for Sequence to Sequence Learning | | 0 |
| Specialized Transformers: Faster, Smaller and more Accurate NLP Models | | 0 |
| Saturated Transformers are Constant-Depth Threshold Circuits | | 0 |
| Multimodal Emergent Fake News Detection via Meta Neural Process Networks | | 0 |
| Self-Attention Networks Can Process Bounded Hierarchical Languages | Code | 1 |
Page 1 of 2

No leaderboard results yet.