| AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation | Apr 24, 2023 | 3D Hand Pose Estimation, Action Classification | Code Available | 1 |
| Implicit Temporal Modeling with Learnable Alignment for Video Recognition | Apr 20, 2023 | Action Classification, Action Recognition | Code Available | 1 |
| The effectiveness of MAE pre-pretraining for billion-scale pretraining | Mar 23, 2023 | Action Classification, Action Recognition | Code Available | 1 |
| Dual-path Adaptation from Image to Video Transformers | Mar 17, 2023 | Action Classification, Action Recognition | Code Available | 1 |
| HierVL: Learning Hierarchical Video-Language Embeddings | Jan 5, 2023 | Action Classification, Action Recognition | Code Available | 1 |
| Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning | Dec 8, 2022 | Action Classification, Action Recognition | Code Available | 1 |
| Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning | Dec 6, 2022 | Action Classification, Action Recognition | Code Available | 1 |
| Post-Processing Temporal Action Detection | Nov 27, 2022 | Action Classification, Action Detection | Code Available | 1 |
| XKD: Cross-modal Knowledge Distillation with Domain Alignment for Video Representation Learning | Nov 25, 2022 | Action Classification, Classification | Code Available | 1 |
| AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders | Nov 16, 2022 | Action Classification, Representation Learning | Code Available | 1 |