| Balancing Average and Worst-case Accuracy in Multitask Learning | Oct 12, 2021 | Image Classification | Unverified | 0 |
| Learning Compact Metrics for MT | Oct 12, 2021 | Cross-Lingual Transfer, Language Modeling | Code Available | 1 |
| SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition | Oct 11, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Multi-Task Learning for Situated Multi-Domain End-to-End Dialogue Systems | Oct 11, 2021 | Causal Language Modeling, Diversity | Unverified | 0 |
| On a Benefit of Mask Language Modeling: Robustness to Simplicity Bias | Oct 11, 2021 | Hate Speech Detection, Language Modeling | Unverified | 0 |
| Unsupervised Neural Machine Translation with Generative Language Models Only | Oct 11, 2021 | Language Modeling, Language Modelling | Unverified | 0 |
| Evaluating User Perception of Speech Recognition System Quality with Semantic Distance Metric | Oct 11, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Breaking the Softmax Bottleneck for Sequential Recommender Systems with Dropout and Decoupling | Oct 11, 2021 | Language Modeling, Language Modelling | Unverified | 0 |
| Frequency-aware SGD for Efficient Embedding Learning with Provable Benefits | Oct 10, 2021 | Language Modeling, Language Modelling | Unverified | 0 |
| Automatic Text Extractive Summarization Based on Graph and Pre-trained Language Model Attention | Oct 10, 2021 | Extractive Summarization, Language Modeling | Unverified | 0 |
| Long Expressive Memory for Sequence Modeling | Oct 10, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning | Oct 10, 2021 | Articles, Few-Shot Learning | Code Available | 1 |
| Improving Multi-Party Dialogue Discourse Parsing via Domain Integration | Oct 9, 2021 | Discourse Parsing, Domain Adaptation | Code Available | 1 |
| Global Explainability of BERT-Based Evaluation Metrics by Disentangling along Linguistic Factors | Oct 8, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| KaraSinger: Score-Free Singing Voice Synthesis with VQ-VAE using Mel-spectrograms | Oct 8, 2021 | Language Modeling, Language Modelling | Unverified | 0 |
| Layer-wise Pruning of Transformer Attention Heads for Efficient Language Modeling | Oct 7, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition | Oct 7, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Back from the future: bidirectional CTC decoding using future information in speech recognition | Oct 7, 2021 | Language Modeling, Language Modelling | Unverified | 0 |
| Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings | Oct 7, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Beam Search with Bidirectional Strategies for Neural Response Generation | Oct 7, 2021 | Decoder, Language Modeling | Unverified | 0 |
| Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition | Oct 6, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Cut the CARP: Fishing for zero-shot story evaluation | Oct 6, 2021 | Contrastive Learning, Language Modeling | Unverified | 0 |
| ABC: Attention with Bounded-memory Control | Oct 6, 2021 | Language Modeling, Language Modelling | Unverified | 0 |
| 8-bit Optimizers via Block-wise Quantization | Oct 6, 2021 | Language Modeling, Language Modelling | Code Available | 3 |
| Language Modeling using LMUs: 10x Better Data Efficiency or Improved Scaling Compared to Transformers | Oct 5, 2021 | Language Modeling, Language Modelling | Unverified | 0 |