| Fusion of Millimeter-wave Radar and Pulse Oximeter Data for Low-burden Diagnosis of Obstructive Sleep Apnea-Hypopnea Syndrome | Jan 25, 2025 | Diagnostic, Sleep Staging | Unverified | 0 |
| LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding | Jan 14, 2025 | Feature Compression, Language Modeling | Code Available | 2 |
| Pseudo Strong Labels from Frame-Level Predictions for Weakly Supervised Sound Event Detection | Jan 7, 2025 | Event Detection, Sound Event Detection | Unverified | 0 |
| Do Current Video LLMs Have Strong OCR Abilities? A Preliminary Study | Dec 29, 2024 | Motion Detection, Optical Character Recognition | Code Available | 0 |
| ShotVL: Human-Centric Highlight Frame Retrieval via Language Queries | Dec 17, 2024 | Human Detection, image-classification | Unverified | 0 |
| TimeRefine: Temporal Grounding with Time Refining Video LLM | Dec 12, 2024 | Temporal Localization | Code Available | 0 |
| TimeMarker: A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability | Nov 27, 2024 | Temporal Localization, Video Understanding | Code Available | 2 |
| Number it: Temporal Grounding Videos like Flipping Manga | Nov 15, 2024 | Highlight Detection, Moment Retrieval | Code Available | 2 |
| Unsupervised detection and classification of heartbeats using the dissimilarity matrix in PCG signals | Nov 5, 2024 | Heart Segmentation, Sound Classification | Unverified | 0 |
| Detection of Sleep Apnea-Hypopnea Events Using Millimeter-wave Radar and Pulse Oximeter | Sep 28, 2024 | Temporal Localization | Unverified | 0 |