| Read, Watch and Scream! Sound Generation from Text and Video | Jul 8, 2024 | Audio Generation, Triplet | Code Available | 1 |
| Unified Dual-Intent Translation for Joint Modeling of Search and Recommendation | Jul 1, 2024 | Recommendation Systems, Triplet | Code Available | 1 |
| Leveraging Predicate and Triplet Learning for Scene Graph Generation | Jun 4, 2024 | Graph Generation, Relation | Code Available | 1 |
| CaLa: Complementary Association Learning for Augmenting Composed Image Retrieval | May 29, 2024 | Cross-Modal Retrieval, Image Retrieval | Code Available | 1 |
| Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation | May 16, 2024 | AudioCaps, Event Detection | Code Available | 1 |
| PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning | May 10, 2024 | Decoder, Generalization Bounds | Code Available | 1 |
| DACAD: Domain Adaptation Contrastive Learning for Anomaly Detection in Multivariate Time Series | Apr 17, 2024 | Anomaly Detection, Contrastive Learning | Code Available | 1 |
| Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives | Apr 17, 2024 | Contrastive Learning, Image Retrieval | Code Available | 1 |
| EndoViT: pretraining vision transformers on a large collection of endoscopic images | Apr 3, 2024 | Action Triplet Recognition, Segmentation | Code Available | 1 |
| Knowledge-Enhanced Dual-stream Zero-shot Composed Image Retrieval | Mar 24, 2024 | Attribute, Image Retrieval | Code Available | 1 |