| VLM-HOI: Vision Language Models for Interpretable Human-Object Interaction Analysis | Nov 27, 2024 | Human-Object Interaction Detection, Image-text matching | —Unverified | 0 | 0 |
| Voice-Face Cross-modal Matching and Retrieval: A Benchmark | Nov 21, 2019 | Retrieval, Triplet | —Unverified | 0 | 0 |
| Watch Where You Head: A View-biased Domain Gap in Gait Recognition and Unsupervised Adaptation | Jul 13, 2023 | Domain Adaptation, Gait Recognition | —Unverified | 0 | 0 |
| Waveform Driven Plasticity in BiFeO3 Memristive Devices: Model and Implementation | Dec 1, 2012 | Triplet | —Unverified | 0 | 0 |
| WCE Polyp Detection with Triplet based Embeddings | Dec 10, 2019 | Medical Procedure, Metric Learning | —Unverified | 0 | 0 |
| Weakly-supervised learning of visual relations | Jul 29, 2017 | Clustering, Relation | —Unverified | 0 | 0 |
| Weakly Supervised Phrase Localization With Multi-Scale Anchored Transformer Network | Jun 1, 2018 | Region Proposal, Triplet | —Unverified | 0 | 0 |
| What can Off-the-Shelves Large Multi-Modal Models do for Dynamic Scene Graph Generation? | Mar 20, 2025 | Decoder, Graph Generation | —Unverified | 0 | 0 |
| What I See Is What You See: Joint Attention Learning for First and Third Person Video Co-analysis | Apr 16, 2019 | Self-Supervised Learning, Triplet | —Unverified | 0 | 0 |
| When faster rotation is harmful: the competition of alliances with inner blocking mechanism | Mar 21, 2024 | Blocking, Triplet | —Unverified | 0 | 0 |