| View-Invariant, Occlusion-Robust Probabilistic Embedding for Human Pose | Oct 23, 2020 | 3D Pose Estimation, Action Recognition | Code Available | 0 |
| Listen Then See: Video Alignment with Speaker Attention | Apr 21, 2024 | cross-modal alignment, Question Answering | Code Available | 0 |
| Dynamic Temporal Alignment of Speech to Lips | Aug 19, 2018 | Constrained Lip-synchronization, Video Alignment | Code Available | 0 |
| Self-Supervised Contrastive Learning for Videos using Differentiable Local Alignment | Sep 6, 2024 | Action Recognition, Contrastive Learning | Code Available | 0 |
| View-Invariant Probabilistic Embedding for Human Pose | Dec 2, 2019 | Action Recognition, Pose Retrieval | Code Available | 0 |
| A Solution to CVPR'2023 AQTC Challenge: Video Alignment for Multi-Step Inference | Jun 26, 2023 | Video Alignment | Code Available | 0 |
| Learning from Video and Text via Large-Scale Discriminative Clustering | Jul 27, 2017 | Action Recognition, Clustering | Code Available | 0 |
| Deep Understanding of Sign Language for Sign to Subtitle Alignment | Mar 5, 2025 | Translation, Video Alignment | Code Available | 0 |
| Adversarial Skill Networks: Unsupervised Robot Skill Learning from Video | Oct 21, 2019 | Continuous Control | Code Available | 0 |
| Sound Bridge: Associating Egocentric and Exocentric Videos via Audio Cues | Jan 1, 2025 | Action Recognition, Scene Recognition | Code Available | 0 |