| CVSS Corpus and Massively Multilingual Speech-to-Speech Translation | Jan 11, 2022 | Sentence, Speech-to-Speech Translation | Code Available | 2 |
| Deduplicating Training Data Makes Language Models Better | Jul 14, 2021 | Language Modeling, Language Modelling | Code Available | 2 |
| SimCSE: Simple Contrastive Learning of Sentence Embeddings | Apr 18, 2021 | Contrastive Learning, Data Augmentation | Code Available | 2 |
| Pretrained Transformers for Text Ranking: BERT and Beyond | Oct 13, 2020 | Information Retrieval, Reranking | Code Available | 2 |
| Abstractive Summarization of Spoken and Written Instructions with BERT | Aug 21, 2020 | Abstractive Text Summarization, Articles | Code Available | 2 |
| Reevaluating Adversarial Examples in Natural Language | Apr 25, 2020 | Sentence | Code Available | 2 |
| MPNet: Masked and Permuted Pre-training for Language Understanding | Apr 20, 2020 | Language Modeling, Language Modelling | Code Available | 2 |
| CLUE: A Chinese Language Understanding Evaluation Benchmark | Apr 13, 2020 | General Classification, Machine Reading Comprehension | Code Available | 2 |
| ALBERT: A Lite BERT for Self-supervised Learning of Language Representations | Sep 26, 2019 | Common Sense Reasoning, GPU | Code Available | 2 |
| PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification | Aug 30, 2019 | Paraphrase Identification, Sentence | Code Available | 2 |