| Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos | Jan 1, 2021 | Audio-visual Question Answering, Question Answering | Code Available | 1 |
| TransView: Inside, Outside, and Across the Cropping View Boundaries | Jan 1, 2021 | Relation | Unverified | 0 |
| Multi-Modal Multi-Action Video Recognition | Jan 1, 2021 | Relation, Video Recognition | Code Available | 0 |
| RDI-Net: Relational Dynamic Inference Networks | Jan 1, 2021 | Computational Efficiency, Relation | Code Available | 0 |
| 3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds | Jan 1, 2021 | Object, Object Proposal Generation | Unverified | 0 |
| Discourse-level Relation Extraction via Graph Pooling | Jan 1, 2021 | Natural Language Understanding, Relation | Unverified | 0 |
| Contextual Knowledge Distillation for Transformer Compression | Jan 1, 2021 | Knowledge Distillation, Language Modeling | Unverified | 0 |
| LayoutTransformer: Relation-Aware Scene Layout Generation | Jan 1, 2021 | Image Generation, Layout Generation | Unverified | 0 |
| Dynamics of Deep Equilibrium Linear Models | Jan 1, 2021 | Relation | Unverified | 0 |
| Contextual Graph Reasoning Networks | Jan 1, 2021 | 2D Human Pose Estimation, Instance Segmentation | Unverified | 0 |