| Read, Watch and Scream! Sound Generation from Text and Video | Jul 8, 2024 | Audio Generation, Triplet | Code Available | 1 |
| Unified Dual-Intent Translation for Joint Modeling of Search and Recommendation | Jul 1, 2024 | Recommendation Systems, Triplet | Code Available | 1 |
| Leveraging Predicate and Triplet Learning for Scene Graph Generation | Jun 4, 2024 | Graph Generation, Relation | Code Available | 1 |
| CaLa: Complementary Association Learning for Augmenting Composed Image Retrieval | May 29, 2024 | Cross-Modal Retrieval, Image Retrieval | Code Available | 1 |
| Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation | May 16, 2024 | AudioCaps, Event Detection | Code Available | 1 |
| PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning | May 10, 2024 | Decoder, Generalization Bounds | Code Available | 1 |
| Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives | Apr 17, 2024 | Contrastive Learning, Image Retrieval | Code Available | 1 |
| DACAD: Domain Adaptation Contrastive Learning for Anomaly Detection in Multivariate Time Series | Apr 17, 2024 | Anomaly Detection, Contrastive Learning | Code Available | 1 |
| EndoViT: pretraining vision transformers on a large collection of endoscopic images | Apr 3, 2024 | Action Triplet Recognition, Segmentation | Code Available | 1 |
| Knowledge-Enhanced Dual-stream Zero-shot Composed Image Retrieval | Mar 24, 2024 | Attribute, Image Retrieval | Code Available | 1 |
| GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning | Feb 26, 2024 | Retrieval-augmented Generation, Triplet | Code Available | 1 |
| PCR-99: A Practical Method for Point Cloud Registration with 99 Percent Outliers | Feb 26, 2024 | Point Cloud Registration, Triplet | Code Available | 1 |
| Event-level Knowledge Editing | Feb 20, 2024 | knowledge editing, Triplet | Code Available | 1 |
| Learning to Extract Structured Entities Using Language Models | Feb 6, 2024 | Triplet | Code Available | 1 |
| Consistency Guided Knowledge Retrieval and Denoising in LLMs for Zero-shot Document-level Relation Triplet Extraction | Jan 24, 2024 | Denoising, Relation | Code Available | 1 |
| Video Harmonization with Triplet Spatio-Temporal Variation Patterns | Jan 1, 2024 | Triplet, Video Enhancement | Code Available | 1 |
| Knowledge Graph Error Detection with Contrastive Confidence Adaption | Dec 19, 2023 | Contrastive Learning, Knowledge Graphs | Code Available | 1 |
| Collapse-Aware Triplet Decoupling for Adversarially Robust Image Retrieval | Dec 12, 2023 | Adversarial Defense, Image Retrieval | Code Available | 1 |
| InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models | Dec 10, 2023 | Human-Object Interaction Generation, Object | Code Available | 1 |
| Differentiable Registration of Images and LiDAR Point Clouds with VoxelPoint-to-Pixel Matching | Dec 7, 2023 | Triplet | Code Available | 1 |
| Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions | Nov 28, 2023 | Disentanglement, Referring Expression | Code Available | 1 |
| DUnE: Dataset for Unified Editing | Nov 27, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| MSCMNet: Multi-scale Semantic Correlation Mining for Visible-Infrared Person Re-Identification | Nov 24, 2023 | Person Re-Identification, Triplet | Code Available | 1 |
| Neural-Logic Human-Object Interaction Detection | Nov 16, 2023 | Decoder, Human-Object Interaction Detection | Code Available | 1 |
| Mirror: A Universal Framework for Various Information Extraction Tasks | Nov 9, 2023 | Machine Reading Comprehension, Reading Comprehension | Code Available | 1 |