| Mamba-Enhanced Text-Audio-Video Alignment Network for Emotion Recognition in Conversations | Sep 8, 2024 | Emotion Recognition, Mamba | Code Available | 1 |
| Self-Supervised Contrastive Learning for Videos using Differentiable Local Alignment | Sep 6, 2024 | Action Recognition, Contrastive Learning | Code Available | 0 |
| Sync from the Sea: Retrieving Alignable Videos from Large-Scale Datasets | Sep 2, 2024 | Video Alignment, Video Editing | Unverified | 0 |
| VE-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment | Aug 21, 2024 | Video Alignment, Video Editing | Code Available | 2 |
| CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer | Aug 12, 2024 | Text-to-Video Generation, Video Alignment | Code Available | 11 |
| Benchmarking Multi-dimensional AIGC Video Quality Assessment: A Dataset and Unified Model | Jul 31, 2024 | Benchmarking, Large Language Model | Code Available | 0 |
| A Comprehensive Review of Few-shot Action Recognition | Jul 20, 2024 | Action Recognition, Few-Shot action recognition | Unverified | 0 |
| Semantic GUI Scene Learning and Video Alignment for Detecting Duplicate Video-based Bug Reports | Jul 11, 2024 | Video Alignment | Unverified | 0 |
| MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions | Jul 8, 2024 | Video Alignment, Video Generation | Code Available | 4 |
| Align and Aggregate: Compositional Reasoning with Video Alignment and Answer Aggregation for Video Question-Answering | Jul 3, 2024 | Contrastive Learning, Language Modelling | Unverified | 0 |