| GateLoop: Fully Data-Controlled Linear Recurrence for Sequence Modeling | Nov 3, 2023 | Language Modeling | Code Available | 1 | 5 |
| CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling | Oct 14, 2022 | Benchmarking, Language Modeling | Code Available | 1 | 5 |
| IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization | Sep 10, 2021 | Language Modeling | Code Available | 1 | 5 |
| Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning | May 24, 2023 | Language Modeling | Code Available | 1 | 5 |
| Cached Transformers: Improving Transformers with Differentiable Memory Cache | Dec 20, 2023 | Image Classification | Code Available | 1 | 5 |
| Atla Selene Mini: A General Purpose Evaluation Model | Jan 27, 2025 | Language Modeling | Code Available | 1 | 5 |
| Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring | May 14, 2024 | Language Modeling | Code Available | 1 | 5 |
| CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation | Sep 13, 2021 | Decoder, Denoising | Code Available | 1 | 5 |
| CPT: Efficient Deep Neural Network Training via Cyclic Precision | Jan 25, 2021 | Language Modeling | Code Available | 1 | 5 |
| Incorporating External POS Tagger for Punctuation Restoration | Jun 12, 2021 | Automatic Speech Recognition (ASR) | Code Available | 1 | 5 |