| Injecting Numerical Reasoning Skills into Language Models | Apr 9, 2020 | Data Augmentation, Decoder | Code Available | 1 | 5 |
| RealFormer: Transformer Likes Residual Attention | Dec 21, 2020 | Language Modeling | Code Available | 1 | 5 |
| InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training | Jul 15, 2020 | Contrastive Learning, Cross-Lingual Transfer | Code Available | 1 | 5 |
| CREAM: Consistency Regularized Self-Rewarding Language Models | Oct 16, 2024 | Language Modeling | Code Available | 1 | 5 |
| ∞-former: Infinite Memory Transformer | Sep 1, 2021 | Dialogue Generation, Language Modeling | Code Available | 1 | 5 |
| InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-shot NER | Mar 8, 2022 | Entity Typing, Few-Shot Learning | Code Available | 1 | 5 |
| A Systematic Assessment of Syntactic Generalization in Neural Language Models | May 7, 2020 | Language Modeling | Code Available | 1 | 5 |
| CPT: Efficient Deep Neural Network Training via Cyclic Precision | Jan 25, 2021 | Language Modeling | Code Available | 1 | 5 |
| InfoCSE: Information-aggregated Contrastive Learning of Sentence Embeddings | Oct 8, 2022 | Contrastive Learning, Language Modeling | Code Available | 1 | 5 |
| MiLe Loss: a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models | Oct 30, 2023 | Language Modeling | Code Available | 1 | 5 |