| VLM-HOI: Vision Language Models for Interpretable Human-Object Interaction Analysis | Nov 27, 2024 | Human-Object Interaction Detection, Image-text matching | Unverified | 0 |
| Cross-Modal Pre-Aligned Method with Global and Local Information for Remote-Sensing Image and Text Retrieval | Nov 22, 2024 | Image Retrieval, Reranking | Unverified | 0 |
| Globally Correlation-Aware Hard Negative Generation | Nov 20, 2024 | Image Retrieval, Metric Learning | Code Available | 1 |
| TDSM: Triplet Diffusion for Skeleton-Text Matching in Zero-Shot Action Recognition | Nov 16, 2024 | Action Recognition, Skeleton Based Action Recognition | Code Available | 1 |
| Leveraging large language models for efficient representation learning for entity resolution | Nov 15, 2024 | Blocking, Contrastive Learning | Unverified | 0 |
| Marker-free Human Gait Analysis using a Smart Edge Sensor System | Nov 14, 2024 | Triplet | Unverified | 0 |
| Energy Score-based Pseudo-Label Filtering and Adaptive Loss for Imbalanced Semi-supervised SAR target recognition | Nov 6, 2024 | Pseudo Label, Pseudo Label Filtering | Unverified | 0 |
| Graph-DPEP: Decomposed Plug and Ensemble Play for Few-Shot Document Relation Extraction with Graph-of-Thoughts Reasoning | Nov 5, 2024 | Document-level Relation Extraction, Few-Shot Learning | Unverified | 0 |
| TriG-NER: Triplet-Grid Framework for Discontinuous Named Entity Recognition | Nov 4, 2024 | Boundary Detection, named-entity-recognition | Code Available | 0 |
| Deep Learning for Leopard Individual Identification: An Adaptive Angular Margin Approach | Nov 4, 2024 | Deep Learning, Edge Detection | Code Available | 0 |
| Polar R-CNN: End-to-End Lane Detection with Fewer Anchors | Nov 3, 2024 | Autonomous Driving, Lane Detection | Code Available | 1 |
| Confidence Aware Learning for Reliable Face Anti-spoofing | Nov 2, 2024 | Face Anti-Spoofing, Prediction | Unverified | 0 |
| MoTaDual: Modality-Task Dual Alignment for Enhanced Zero-shot Composed Image Retrieval | Oct 31, 2024 | Image Retrieval, Prompt Learning | Unverified | 0 |
| Unified Triplet-Level Hallucination Evaluation for Large Vision-Language Models | Oct 30, 2024 | Hallucination, Hallucination Evaluation | Code Available | 0 |
| Prototypical Extreme Multi-label Classification with a Dynamic Margin Loss | Oct 27, 2024 | Contrastive Learning, Extreme Multi-Label Classification | Unverified | 0 |
| Graphusion: A RAG Framework for Knowledge Graph Construction with a Global Perspective | Oct 23, 2024 | graph construction, Knowledge Graphs | Code Available | 1 |
| Denoise-I2W: Mapping Images to Denoising Words for Accurate Zero-Shot Composed Image Retrieval | Oct 22, 2024 | Attribute, Denoising | Code Available | 0 |
| GE2E-KWS: Generalized End-to-End Training and Evaluation for Zero-shot Keyword Spotting | Oct 22, 2024 | Keyword Spotting, Triplet | Unverified | 0 |
| Triplet: Triangle Patchlet for Mesh-Based Inverse Rendering and Scene Parameters Approximation | Oct 16, 2024 | Camera Calibration, Inverse Rendering | Code Available | 1 |
| Diversified and Adaptive Negative Sampling on Knowledge Graphs | Oct 10, 2024 | Graph Embedding, Informativeness | Unverified | 0 |
| TANet: Triplet Attention Network for All-In-One Adverse Weather Image Restoration | Oct 10, 2024 | All, Image Restoration | Code Available | 1 |
| Enhancing SPARQL Generation by Triplet-order-sensitive Pre-training | Oct 8, 2024 | Graph Question Answering, Language Modeling | Code Available | 0 |
| LoGra-Med: Long Context Multi-Graph Alignment for Medical Vision-Language Model | Oct 3, 2024 | image-classification, Image Classification | Unverified | 0 |
| NL-Eye: Abductive NLI for Images | Oct 3, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| EUFCC-CIR: a Composed Image Retrieval Dataset for GLAM Collections | Oct 2, 2024 | Attribute, Image Retrieval | Code Available | 0 |