SOTAVerified

ListOps

Papers

Showing 1–10 of 22 papers

Title | Status | Hype
Mega: Moving Average Equipped Gated Attention | Code | 2
Simplified State Space Layers for Sequence Modeling | Code | 2
Cached Transformers: Improving Transformers with Differentiable Memory Cache | Code | 1
Sequence Modeling with Multiresolution Convolutional Memory | Code | 1
Training Discrete Deep Generative Models via Gapped Straight-Through Estimator | Code | 1
Dynamic Token Normalization Improves Vision Transformers | Code | 1
Efficiently Modeling Long Sequences with Structured State Spaces | Code | 1
The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization | Code | 1
Going Beyond Linear Transformers with Recurrent Fast Weight Programmers | Code | 1
Modeling Hierarchical Structures with Continuous Recursive Neural Networks | Code | 1

No leaderboard results yet.