| ShotVL: Human-Centric Highlight Frame Retrieval via Language Queries | Dec 17, 2024 | Human Detection, image-classification | Unverified | 0 |
| TimeRefine: Temporal Grounding with Time Refining Video LLM | Dec 12, 2024 | Temporal Localization | Code Available | 0 |
| Unsupervised detection and classification of heartbeats using the dissimilarity matrix in PCG signals | Nov 5, 2024 | Heart Segmentation, Sound Classification | Unverified | 0 |
| Detection of Sleep Apnea-Hypopnea Events Using Millimeter-wave Radar and Pulse Oximeter | Sep 28, 2024 | Temporal Localization | Unverified | 0 |
| Impact of Noisy Labels on Sound Event Detection: Deletion Errors Are More Detrimental Than Insertion Errors | Aug 27, 2024 | Event Detection, Sound Event Detection | Unverified | 0 |
| Described Spatial-Temporal Video Detection | Jul 8, 2024 | Multi-class Classification, Temporal Localization | Unverified | 0 |
| MLLM as Video Narrator: Mitigating Modality Imbalance in Video Moment Retrieval | Jun 25, 2024 | cross-modal alignment, Moment Retrieval | Unverified | 0 |
| Empowering LLMs with Pseudo-Untrimmed Videos for Audio-Visual Temporal Understanding | Mar 24, 2024 | Dense Video Captioning, Temporal Localization | Unverified | 0 |
| Skeleton-Based Human Action Recognition with Noisy Labels | Mar 15, 2024 | Action Recognition, Denoising | Code Available | 0 |
| Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition | Mar 11, 2024 | 2D Human Pose Estimation, Action Recognition | Unverified | 0 |