| XLNet: Generalized Autoregressive Pretraining for Language Understanding | Jun 19, 2019 | Audio Question Answering, Chinese Reading Comprehension | Code Available | 1 |
| How multilingual is Multilingual BERT? | Jun 4, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| Does It Make Sense? And Why? A Pilot Study for Sense Making and Explanation | Jun 2, 2019 | Common Sense Reasoning, Language Modeling | Code Available | 1 |
| Adapting Text Embeddings for Causal Inference | May 29, 2019 | Causal Identification, Causal Inference | Code Available | 1 |
| Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks | May 27, 2019 | General Classification, image-classification | Code Available | 1 |
| Discrete Flows: Invertible Generative Models of Discrete Data | May 24, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| Adaptive Attention Span in Transformers | May 19, 2019 | 8k, Language Modeling | Code Available | 1 |
| A Surprisingly Robust Trick for Winograd Schema Challenge | May 15, 2019 | Common Sense Reasoning, Coreference Resolution | Code Available | 1 |
| How to Fine-Tune BERT for Text Classification? | May 14, 2019 | General Classification, Language Modeling | Code Available | 1 |
| RWTH ASR Systems for LibriSpeech: Hybrid vs Attention -- w/o Data Augmentation | May 8, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |