| Long-Short Transformer: Efficient Transformers for Language and Vision | Jul 5, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Robust End-to-End Offline Chinese Handwriting Text Page Spotter with Text Kernel | Jul 4, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling | Jul 2, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| XLM-E: Cross-lingual Language Model Pre-training via ELECTRA | Jun 30, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information | Jun 30, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Stabilizing Equilibrium Models by Jacobian Regularization | Jun 28, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| R-Drop: Regularized Dropout for Neural Networks | Jun 28, 2021 | Abstractive Text Summarization, image-classification | Code Available | 1 |
| SymbolicGPT: A Generative Transformer Model for Symbolic Regression | Jun 27, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| CLIP2Video: Mastering Video-Text Retrieval via Image CLIP | Jun 21, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Distributed Deep Learning in Open Collaborations | Jun 18, 2021 | Deep Learning, Language Modeling | Code Available | 1 |
| BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models | Jun 18, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| SPBERT: An Efficient Pre-training BERT on SPARQL Queries for Question Answering over Knowledge Graphs | Jun 18, 2021 | Decoder, Knowledge Graphs | Code Available | 1 |
| Golos: Russian Dataset for Speech Research | Jun 18, 2021 | Automatic Speech Recognition (ASR), Language Modeling | Code Available | 1 |
| Scene Transformer: A unified architecture for predicting multiple agent trajectories | Jun 15, 2021 | Autonomous Driving, Language Modeling | Code Available | 1 |
| Direction is what you need: Improving Word Embedding Compression in Large Language Models | Jun 15, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Incorporating External POS Tagger for Punctuation Restoration | Jun 12, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| BioELECTRA: Pretrained Biomedical text Encoder using Discriminators | Jun 11, 2021 | Articles, Language Modeling | Code Available | 1 |
| Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment | Jun 11, 2021 | Denoising, Language Modeling | Code Available | 1 |
| Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models | Jun 10, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks | Jun 8, 2021 | Domain Generalization, Language Modeling | Code Available | 1 |
| Staircase Attention for Recurrent Processing of Sequences | Jun 8, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Ultra-Fine Entity Typing with Weak Supervision from a Masked Language Model | Jun 8, 2021 | Entity Typing, Language Modeling | Code Available | 1 |
| Top-KAST: Top-K Always Sparse Training | Jun 7, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators | Jun 4, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| MPC-BERT: A Pre-Trained Language Model for Multi-Party Conversation Understanding | Jun 3, 2021 | Conversational Response Selection, Language Modeling | Code Available | 1 |
| Dissecting Generation Modes for Abstractive Summarization Models via Ablation and Attribution | Jun 3, 2021 | Abstractive Text Summarization, Decoder | Code Available | 1 |
| Provably Secure Generative Linguistic Steganography | Jun 3, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Luna: Linear Unified Nested Attention | Jun 3, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Template-Based Named Entity Recognition Using BART | Jun 3, 2021 | Few-shot NER, Language Modeling | Code Available | 1 |
| A Generalizable Approach to Learning Optimizers | Jun 2, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Decision Transformer: Reinforcement Learning via Sequence Modeling | Jun 2, 2021 | Atari Games, D4RL | Code Available | 1 |
| MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education | Jun 2, 2021 | Knowledge Tracing, Language Modeling | Code Available | 1 |
| Attention-based Contextual Language Model Adaptation for Speech Recognition | Jun 2, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Differential Privacy for Text Analytics via Natural Text Sanitization | Jun 2, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Counterfactual Data Augmentation for Neural Machine Translation | Jun 1, 2021 | counterfactual, Data Augmentation | Code Available | 1 |
| Dialogue-oriented Pre-training | Jun 1, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups | Jun 1, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Effect of Pre-Training Scale on Intra- and Inter-Domain Full and Few-Shot Transfer Learning for Natural and Medical X-Ray Chest Images | May 31, 2021 | Few-Shot Learning, Image Classification | Code Available | 1 |
| Cascaded Head-colliding Attention | May 31, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Effective Batching for Recurrent Neural Network Grammars | May 31, 2021 | GPU, Language Modeling | Code Available | 1 |
| NeuralLog: Natural Language Inference with Joint Neural and Logical Reasoning | May 29, 2021 | Deep Learning, Language Modeling | Code Available | 1 |
| CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model | May 29, 2021 | Decoder, Language Modeling | Code Available | 1 |
| ProtAugment: Unsupervised diverse short-texts paraphrasing for intent detection meta-learning | May 27, 2021 | Diversity, Intent Detection | Code Available | 1 |
| TreeBERT: A Tree-Based Pre-Trained Model for Programming Language | May 26, 2021 | Code Summarization, Language Modeling | Code Available | 1 |
| Language Model as an Annotator: Exploring DialoGPT for Dialogue Summarization | May 26, 2021 | Conversational Response Generation, Language Modeling | Code Available | 1 |
| Knowledge Enhanced Masked Language Model for Stance Detection | May 26, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Personalized Transformer for Explainable Recommendation | May 25, 2021 | Explainable Recommendation, Language Modeling | Code Available | 1 |
| Prevent the Language Model from being Overconfident in Neural Machine Translation | May 24, 2021 | Hallucination, Language Modeling | Code Available | 1 |
| Neural Language Models for Nineteenth-Century English | May 24, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| FedScale: Benchmarking Model and System Performance of Federated Learning at Scale | May 24, 2021 | Benchmarking, Federated Learning | Code Available | 1 |