| BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model | Sep 20, 2023 | 8k, Language Modeling | Code Available | 3 | 5 |
| CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up | Dec 20, 2024 | 8k, GPU | Code Available | 3 | 5 |
| MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly | May 15, 2025 | 8k, Benchmarking | Code Available | 2 | 5 |
| AbdomenAtlas-8K: Annotating 8,000 CT Volumes for Multi-Organ Segmentation in Three Weeks | May 16, 2023 | 8k, Active Learning | Code Available | 2 | 5 |
| LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models | Aug 31, 2024 | 8k, GPU | Code Available | 2 | 5 |
| LongEmbed: Extending Embedding Models for Long Context Retrieval | Apr 18, 2024 | 4k, 8k | Code Available | 2 | 5 |
| CLUECorpus2020: A Large-scale Chinese Corpus for Pre-training Language Model | Mar 3, 2020 | 8k, Language Modeling | Code Available | 2 | 5 |
| Hungry Hungry Hippos: Towards Language Modeling with State Space Models | Dec 28, 2022 | 8k, Coreference Resolution | Code Available | 2 | 5 |
| GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity? | Feb 7, 2025 | 8k, Information Retrieval | Code Available | 2 | 5 |
| Hyena Hierarchy: Towards Larger Convolutional Language Models | Feb 21, 2023 | 2k, 8k | Code Available | 2 | 5 |