| Multiview Transformers for Video Recognition | Jan 12, 2022 | Action Classification, Action Recognition | —Unverified | 0 |
| Natural Language Descriptions of Human Activities Scenes: Corpus Generation and Analysis | Aug 1, 2016 | Action Classification, Object Recognition | —Unverified | 0 |
| Representation Learning on Visual-Symbolic Graphs for Video Understanding | May 17, 2019 | Action Classification, Action Detection | —Unverified | 0 |
| No More Shortcuts: Realizing the Potential of Temporal Self-Supervision | Dec 20, 2023 | Action Classification, Attribute | —Unverified | 0 |
| OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning | Jan 1, 2024 | 3D Point Cloud Classification, Action Classification | —Unverified | 0 |
| OmniVec: Learning robust representations with cross modal sharing | Nov 7, 2023 | 3D Point Cloud Classification, Action Classification | —Unverified | 0 |
| OmniVL:One Foundation Model for Image-Language and Video-Language Tasks | Sep 15, 2022 | Action Classification, Action Recognition | —Unverified | 0 |
| Open Vocabulary Multi-Label Video Classification | Jul 12, 2024 | Action Classification, Classification | —Unverified | 0 |
| Optimizing Average Precision using Weakly Supervised Data | Jun 1, 2014 | Action Classification, Binary Classification | —Unverified | 0 |
| OwlSight: A Robust Illumination Adaptation Framework for Dark Video Human Action Recognition | Mar 30, 2025 | Action Classification, Action Recognition | —Unverified | 0 |