| METOR: A Unified Framework for Mutual Enhancement of Objects and Relationships in Open-vocabulary Video Visual Relationship Detection | May 10, 2025 | Object, object-detection | Code Available | 0 |
| End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting | Sep 19, 2024 | Decoder, Object | Unverified | 0 |
| A Review of Human-Object Interaction Detection | Aug 20, 2024 | Human-Object Interaction Detection, Object | Unverified | 0 |
| Hire: Hybrid-modal Interaction with Multiple Relational Enhancements for Image-Text Matching | Jun 5, 2024 | cross-modal alignment, Image-text matching | Unverified | 0 |
| AUG: A New Dataset and An Efficient Model for Aerial Image Urban Scene Graph Generation | Apr 11, 2024 | Graph Generation, Relationship Detection | Unverified | 0 |
| Groupwise Query Specialization and Quality-Aware Multi-Assignment for Transformer-based Visual Relationship Detection | Mar 26, 2024 | Relation, Relationship Detection | Code Available | 1 |
| Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection | Mar 21, 2024 | Decoder, Object | Unverified | 0 |
| Video Relationship Detection Using Mixture of Experts | Mar 6, 2024 | Action Recognition, Mixture-of-Experts | Code Available | 0 |
| RelVAE: Generative Pretraining for few-shot Visual Relationship Detection | Nov 27, 2023 | Predicate Classification, Relationship Detection | Unverified | 0 |
| Self-Supervised Learning for Visual Relationship Detection through Masked Bounding Box Reconstruction | Nov 8, 2023 | Predicate Detection, Relationship Detection | Code Available | 0 |