| Title | Date | Tasks | Code | |
| --- | --- | --- | --- | --- |
| EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate | Dec 29, 2021 | Language Modeling | Code Available | 1 |
| Does Pre-training Induce Systematic Inference? How Masked Language Models Acquire Commonsense Knowledge | Dec 16, 2021 | Language Modeling | — | 0 |
| CoCo-BERT: Improving Video-Language Pre-training with Contrastive Cross-modal Matching and Denoising | Dec 14, 2021 | Cross-Modal Retrieval, Decoder | — | 0 |
| Unified Multimodal Pre-training and Prompt-based Tuning for Vision-Language Understanding and Generation | Dec 10, 2021 | Image-Text Matching, Image-Text Retrieval | — | 0 |
| DIBERT: Dependency Injected Bidirectional Encoder Representations from Transformers | Dec 5, 2021 | Language Modeling | Code Available | 0 |
| Causal Distillation for Language Models | Dec 5, 2021 | Language Modeling | Code Available | 1 |
| UFO: A UniFied TransfOrmer for Vision-Language Representation Learning | Nov 19, 2021 | Image Captioning, Image-Text Matching | — | 0 |
| LAnoBERT: System Log Anomaly Detection based on BERT Masked Language Model | Nov 18, 2021 | Anomaly Detection, Language Modeling | — | 0 |
| Prompt-Learning for Fine-Grained Entity Typing | Nov 16, 2021 | Entity Typing, Knowledge Probing | — | 0 |
| TACO: Pre-training of Deep Transformers with Attention Convolution using Disentangled Positional Representation | Nov 16, 2021 | Language Modeling | — | 0 |