| Read, Watch and Scream! Sound Generation from Text and Video | Jul 8, 2024 | Audio Generation, Triplet | Code Available | 1 |
| Unified Dual-Intent Translation for Joint Modeling of Search and Recommendation | Jul 1, 2024 | Recommendation Systems, Triplet | Code Available | 1 |
| Leveraging Predicate and Triplet Learning for Scene Graph Generation | Jun 4, 2024 | Graph Generation, Relation | Code Available | 1 |
| CaLa: Complementary Association Learning for Augmenting Composed Image Retrieval | May 29, 2024 | Cross-Modal Retrieval, Image Retrieval | Code Available | 1 |
| Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation | May 16, 2024 | AudioCaps, Event Detection | Code Available | 1 |
| PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning | May 10, 2024 | Decoder, Generalization Bounds | Code Available | 1 |
| DACAD: Domain Adaptation Contrastive Learning for Anomaly Detection in Multivariate Time Series | Apr 17, 2024 | Anomaly Detection, Contrastive Learning | Code Available | 1 |
| Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives | Apr 17, 2024 | Contrastive Learning, Image Retrieval | Code Available | 1 |
| EndoViT: pretraining vision transformers on a large collection of endoscopic images | Apr 3, 2024 | Action Triplet Recognition, Segmentation | Code Available | 1 |
| Knowledge-Enhanced Dual-stream Zero-shot Composed Image Retrieval | Mar 24, 2024 | Attribute, Image Retrieval | Code Available | 1 |
| GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning | Feb 26, 2024 | Retrieval-augmented Generation, Triplet | Code Available | 1 |
| PCR-99: A Practical Method for Point Cloud Registration with 99 Percent Outliers | Feb 26, 2024 | Point Cloud Registration, Triplet | Code Available | 1 |
| Event-level Knowledge Editing | Feb 20, 2024 | knowledge editing, Triplet | Code Available | 1 |
| Learning to Extract Structured Entities Using Language Models | Feb 6, 2024 | Triplet | Code Available | 1 |
| Consistency Guided Knowledge Retrieval and Denoising in LLMs for Zero-shot Document-level Relation Triplet Extraction | Jan 24, 2024 | Denoising, Relation | Code Available | 1 |
| Video Harmonization with Triplet Spatio-Temporal Variation Patterns | Jan 1, 2024 | Triplet, Video Enhancement | Code Available | 1 |
| Knowledge Graph Error Detection with Contrastive Confidence Adaption | Dec 19, 2023 | Contrastive Learning, Knowledge Graphs | Code Available | 1 |
| Collapse-Aware Triplet Decoupling for Adversarially Robust Image Retrieval | Dec 12, 2023 | Adversarial Defense, Image Retrieval | Code Available | 1 |
| InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models | Dec 10, 2023 | Human-Object Interaction Generation, Object | Code Available | 1 |
| Differentiable Registration of Images and LiDAR Point Clouds with VoxelPoint-to-Pixel Matching | Dec 7, 2023 | Triplet | Code Available | 1 |
| Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions | Nov 28, 2023 | Disentanglement, Referring Expression | Code Available | 1 |
| DUnE: Dataset for Unified Editing | Nov 27, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| MSCMNet: Multi-scale Semantic Correlation Mining for Visible-Infrared Person Re-Identification | Nov 24, 2023 | Person Re-Identification, Triplet | Code Available | 1 |
| Neural-Logic Human-Object Interaction Detection | Nov 16, 2023 | Decoder, Human-Object Interaction Detection | Code Available | 1 |
| Mirror: A Universal Framework for Various Information Extraction Tasks | Nov 9, 2023 | Machine Reading Comprehension, Reading Comprehension | Code Available | 1 |
| InstructPix2NeRF: Instructed 3D Portrait Editing from a Single Image | Nov 6, 2023 | NeRF, Triplet | Code Available | 1 |
| Large Language Models are Temporal and Causal Reasoners for Video Question Answering | Oct 24, 2023 | Natural Language Understanding, Question Answering | Code Available | 1 |
| CONTRASTE: Supervised Contrastive Pre-training With Aspect-based Prompts For Aspect Sentiment Triplet Extraction | Oct 24, 2023 | Aspect Sentiment Triplet Extraction, Aspect Term Extraction and Sentiment Classification | Code Available | 1 |
| LLM4SGG: Large Language Models for Weakly Supervised Scene Graph Generation | Oct 16, 2023 | Few-Shot Learning, Large Language Model | Code Available | 1 |
| ESA: External Space Attention Aggregation for Image-Text Retrieval | Oct 10, 2023 | Image-text Retrieval, Retrieval | Code Available | 1 |
| AANet: Aggregation and Alignment Network with Semi-hard Positive Sample Mining for Hierarchical Place Recognition | Oct 8, 2023 | Re-Ranking, Triplet | Code Available | 1 |
| SCALE: Synergized Collaboration of Asymmetric Language Translation Engines | Sep 29, 2023 | Continual Learning, Translation | Code Available | 1 |
| A Novel Geo-Localization Method for UAV and Satellite Images Using Cross-View Consistent Attention | Sep 23, 2023 | Blocking, Data Augmentation | Code Available | 1 |
| Generative Retrieval with Semantic Tree-Structured Item Identifiers via Contrastive Learning | Sep 23, 2023 | Contrastive Learning, Recommendation Systems | Code Available | 1 |
| Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning | Sep 20, 2023 | Contrastive Learning, Retrieval | Code Available | 1 |
| Realistic Website Fingerprinting By Augmenting Network Trace | Sep 18, 2023 | Self-Supervised Learning, Triplet | Code Available | 1 |
| Zero-Shot Scene Graph Generation via Triplet Calibration and Reduction | Sep 7, 2023 | Graph Generation, Scene Graph Generation | Code Available | 1 |
| Patent image retrieval using transformer-based deep metric learning | Sep 1, 2023 | Image Retrieval, Metric Learning | Code Available | 1 |
| CoVR-2: Automatic Data Construction for Composed Video Retrieval | Aug 28, 2023 | Composed Image Retrieval (CoIR), Composed Video Retrieval (CoVR) | Code Available | 1 |
| TeD-SPAD: Temporal Distinctiveness for Self-supervised Privacy-preservation for video Anomaly Detection | Aug 21, 2023 | Anomaly Detection, Attribute | Code Available | 1 |
| Noisy-Correspondence Learning for Text-to-Image Person Re-identification | Aug 19, 2023 | Person Re-Identification, Text based Person Retrieval | Code Available | 1 |
| Compositional Feature Augmentation for Unbiased Scene Graph Generation | Aug 13, 2023 | Diversity, Graph Generation | Code Available | 1 |
| Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination | Aug 8, 2023 | Image-text matching, Representation Learning | Code Available | 1 |
| T-UNet: Triplet UNet for Change Detection in High-Resolution Remote Sensing Images | Aug 4, 2023 | Change Detection, Decoder | Code Available | 1 |
| Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures | Jul 27, 2023 | Automatic Speech Recognition, Contrastive Learning | Code Available | 1 |
| Text-guided Image Restoration and Semantic Enhancement for Text-to-Image Person Retrieval | Jul 18, 2023 | cross-modal alignment, Data Augmentation | Code Available | 1 |
| PrimeNet: Pre-Training for Irregular Multivariate Time Series | Jun 26, 2023 | Contrastive Learning, Irregular Time Series | Code Available | 1 |
| A semantically enhanced dual encoder for aspect sentiment triplet extraction | Jun 14, 2023 | Aspect-Based Sentiment Analysis, Aspect-Based Sentiment Analysis (ABSA) | Code Available | 1 |
| Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions | Jun 13, 2023 | Person Re-Identification, Triplet | Code Available | 1 |
| Few-Shot Open-Set Learning for On-Device Customization of KeyWord Spotting Systems | Jun 3, 2023 | Few-Shot Learning, Keyword Spotting | Code Available | 1 |