| CoCa: Contrastive Captioners are Image-Text Foundation Models | May 4, 2022 | Action ClassificationDecoder | CodeCode Available | 1 | 5 |
| An Evaluation of Action Recognition Models on EPIC-Kitchens | Aug 2, 2019 | Action ClassificationAction Recognition | CodeCode Available | 1 | 5 |
| An Image is Worth 16x16 Words, What is a Video Worth? | Mar 25, 2021 | Action ClassificationAction Recognition | CodeCode Available | 1 | 5 |
| Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers | Jun 9, 2021 | Action ClassificationAction Recognition | CodeCode Available | 1 | 5 |
| EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition | Aug 10, 2024 | Action ClassificationAction Recognition | CodeCode Available | 1 | 5 |
| Continual 3D Convolutional Neural Networks for Real-time Processing of Videos | May 31, 2021 | Action ClassificationVideo Recognition | CodeCode Available | 1 | 5 |
| Boundary-sensitive Pre-training for Temporal Localization in Videos | Nov 21, 2020 | Action ClassificationClassification | CodeCode Available | 1 | 5 |
| ConvNet Architecture Search for Spatiotemporal Feature Learning | Aug 16, 2017 | Action ClassificationAction Recognition | CodeCode Available | 1 | 5 |
| Co-segmentation Inspired Attention Module for Video-based Computer Vision Tasks | Nov 14, 2021 | Action ClassificationObject | CodeCode Available | 1 | 5 |
| KNN-MMD: Cross Domain Wireless Sensing via Local Distribution Alignment | Dec 6, 2024 | Action ClassificationAction Classification (1-shot) | CodeCode Available | 1 | 5 |