| Title | Date | Tasks | Code |
|---|---|---|---|
| Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression | May 26, 2025 | Language Modeling | Code Available |
| SparQ Attention: Bandwidth-Efficient LLM Inference | Dec 8, 2023 | Language Modeling | Code Available |
| Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change | Oct 31, 2022 | Language Modeling | Code Available |
| Improving Transformer Optimization Through Better Initialization | Jan 1, 2020 | Decoder, Language Modeling | Code Available |
| Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation | Mar 18, 2021 | Bilingual Lexicon Induction, Language Modeling | Code Available |
| UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling | Nov 23, 2021 | Image Captioning, Image Description | Code Available |
| DANIEL: A fast Document Attention Network for Information Extraction and Labelling of handwritten documents | Jul 12, 2024 | Document Layout Analysis, Document Understanding | Code Available |
| DAM: Dynamic Attention Mask for Long-Context Large Language Model Inference Acceleration | Jun 6, 2025 | Computational Efficiency, Language Modeling | Code Available |
| DART-Eval: A Comprehensive DNA Language Model Evaluation Benchmark on Regulatory DNA | Dec 6, 2024 | Counterfactual, Language Model Evaluation | Code Available |