| BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models | Jun 18, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Distributed Deep Learning in Open Collaborations | Jun 18, 2021 | Deep Learning, Language Modeling | Code Available | 1 |
| Golos: Russian Dataset for Speech Research | Jun 18, 2021 | Automatic Speech Recognition (ASR), Language Modeling | Code Available | 1 |
| Scene Transformer: A unified architecture for predicting multiple agent trajectories | Jun 15, 2021 | Autonomous Driving, Language Modeling | Code Available | 1 |
| Direction is what you need: Improving Word Embedding Compression in Large Language Models | Jun 15, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Incorporating External POS Tagger for Punctuation Restoration | Jun 12, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| BioELECTRA:Pretrained Biomedical text Encoder using Discriminators | Jun 11, 2021 | Articles, Language Modeling | Code Available | 1 |
| Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment | Jun 11, 2021 | Denoising, Language Modeling | Code Available | 1 |
| Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models | Jun 10, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Staircase Attention for Recurrent Processing of Sequences | Jun 8, 2021 | Language Modeling, Language Modelling | Code Available | 1 |