Human-Object Interaction Detection

Human-Object Interaction (HOI) detection is the task of identifying "a set of interactions" in an image. It involves (i) localizing the subject (the human) and the target (the object) of each interaction, and (ii) classifying the interaction label.
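As a rough illustration of the two parts of the task, the sketch below represents one detected interaction as a human box, an object box, and an interaction label. The class name `HOIInstance`, the `(x1, y1, x2, y2)` box convention, the `score` field, the helper `filter_by_score`, and all sample values are hypothetical and chosen for illustration only; they are not taken from any specific dataset or paper listed here.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box convention: (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class HOIInstance:
    """One predicted interaction (hypothetical structure for illustration)."""
    human_box: Box      # localization of the subject (the human)
    object_box: Box     # localization of the target (the object)
    object_label: str   # category of the target object
    interaction: str    # classified interaction label
    score: float = 1.0  # assumed confidence in [0, 1]


def filter_by_score(preds: List[HOIInstance], threshold: float) -> List[HOIInstance]:
    """Keep only predictions whose confidence reaches the threshold."""
    return [p for p in preds if p.score >= threshold]


# An image's output is a set of such instances (made-up values).
predictions = [
    HOIInstance((10, 20, 110, 300), (90, 150, 220, 320), "bicycle", "ride", 0.92),
    HOIInstance((10, 20, 110, 300), (300, 40, 360, 90), "cup", "hold", 0.15),
]
kept = filter_by_score(predictions, 0.5)
```

The same human box can appear in several instances, since one person may interact with more than one object.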

Papers


Showing 101–150 of 449 papers

Title	Date	Tasks	Status	Hype
Distance-Aware Occlusion Detection with Focused Attention	Aug 23, 2022	Decoder, Human-Object Interaction Detection	Code Available	1
Towards Hard-Positive Query Mining for DETR-based Human-Object Interaction Detection	Jul 12, 2022	Human-Object Interaction Detection, Object	Code Available	1
Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects	Mar 26, 2020	Human-Object Interaction Detection	Code Available	1
ViPLO: Vision Transformer based Pose-Conditioned Self-Loop Graph for Human-Object Interaction Detection	Apr 17, 2023	Human-Object Interaction Detection, Quantization	Code Available	1
HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction	Mar 3, 2022	Action Segmentation, Benchmarking	Code Available	1
Visual-Semantic Graph Attention Networks for Human-Object Interaction Detection	Jan 7, 2020	Graph Attention, Human-Object Interaction Detection	Code Available	1
HOTR: End-to-End Human-Object Interaction Detection with Transformers	Apr 28, 2021	Decoder, Human-Object Interaction Detection	Code Available	1
DRG: Dual Relation Graph for Human-Object Interaction Detection	Aug 26, 2020	Human-Object Interaction Detection, Object	Code Available	1
Weakly-Supervised Affordance Grounding Guided by Part-Level Semantic Priors	May 30, 2025	Human-Object Interaction Detection, Semantic Segmentation	Code Available	1
Weakly Supervised Human-Object Interaction Detection in Video via Contrastive Spatiotemporal Regions	Oct 7, 2021	Human-Object Interaction Detection, Object	Code Available	1
GTNet: Guided Transformer Network for Detecting Human-Object Interactions	Aug 2, 2021	Human-Object Interaction Detection, Object	Code Available	1
Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection	Aug 5, 2024	Human-Object Interaction Detection, Prompt Learning	Code Available	1
Cascaded Human-Object Interaction Recognition	Mar 9, 2020	Human-Object Interaction Detection, Object	Code Available	1
Efficient Adaptive Human-Object Interaction Detection with Concept-guided Memory	Sep 7, 2023	Human-Object Interaction Detection, Retrieval	Code Available	1
ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection	Aug 14, 2020	Human-Object Interaction Detection, Object	Code Available	1
Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos	Jul 19, 2022	Human-Object Interaction Detection	Code Available	1
Efficient Two-Stage Detection of Human-Object Interactions with a Novel Unary-Pairwise Transformer	Dec 3, 2021	GPU, Human-Object Interaction Detection	Code Available	1
DECO: Dense Estimation of 3D Human-Scene Contact In The Wild	Sep 26, 2023	Contact Detection, Dense contact estimation	Code Available	1
Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions	May 27, 2022	Benchmarking, Few-Shot Image Classification	Code Available	1
Grounded Affordance from Exocentric View	Aug 28, 2022	Diversity, Human-Object Interaction Detection	Code Available	1
HAKE: Human Activity Knowledge Engine	Apr 13, 2019	Action Detection, Human-Object Interaction Detection	Code Available	1
EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning	Dec 11, 2023	Benchmarking, Human-Object Interaction Detection	Code Available	1
PaStaNet: Toward Human Activity Knowledge Engine	Apr 2, 2020	Action Detection, Human-Object Interaction Detection	Code Available	1
Explanation-based Weakly-supervised Learning of Visual Relations with Graph Networks	Jun 16, 2020	Graph Neural Network, Human-Object Interaction Detection	Code Available	1
StackFLOW: Monocular Human-Object Reconstruction by Stacked Normalizing Flow with Offset	Jul 30, 2024	Human-Object Interaction Detection, Object	Code Available	1
Controllable Human-Object Interaction Synthesis	Dec 6, 2023	Human-Object Interaction Detection, Object	Unverified	0
End-to-End HOI Reconstruction Transformer with Graph-based Encoding	Mar 8, 2025	Human-Object Interaction Detection	Unverified	0
EigenActor: Variant Body-Object Interaction Generation Evolved from Invariant Action Basis Reasoning	Mar 1, 2025	Human-Object Interaction Detection, Object	Unverified	0
Contextual Heterogeneous Graph Network for Human-Object Interaction Detection	Oct 20, 2020	Graph Attention, Human-Object Interaction Detection	Unverified	0
AnchorCrafter: Animate CyberAnchors Saling Your Products via Human-Object Interacting Video Generation	Nov 26, 2024	Human-Object Interaction Detection, Object	Unverified	0
EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting	Jun 28, 2024	Human-Object Interaction Detection, Object	Unverified	0
Contextual Guided Segmentation Framework for Semi-supervised Video Instance Segmentation	Jun 7, 2021	Human-Object Interaction Detection, Instance Segmentation	Unverified	0
EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views	May 22, 2024	Human-Object Interaction Detection, Object	Unverified	0
Egocentric Human-Object Interaction Detection: A New Benchmark and Method	Jun 17, 2025	Benchmarking, Human-Object Interaction Detection	Unverified	0
Bilateral Adaptation for Human-Object Interaction Detection with Occlusion-Robustness	Jan 1, 2024	Human-Object Interaction Detection, object-detection	Unverified	0
A Deep Learning Approach to Object Affordance Segmentation	Apr 18, 2020	Deep Learning, Human-Object Interaction Detection	Unverified	0
GID-Net: Detecting Human-Object Interaction with Global and Instance Dependency	Mar 11, 2020	Human-Object Interaction Detection, Object	Unverified	0
ContextHOI: Spatial Context Learning for Human-Object Interaction Detection	Dec 12, 2024	Human-Object Interaction Detection, Object	Unverified	0
3DArticCyclists: Generating Synthetic Articulated 8D Pose-Controllable Cyclist Data for Computer Vision Applications	Oct 14, 2024	3DGS, 3D Reconstruction	Unverified	0
Efficient Human-Object-Interaction (EHOI) Detection via Interaction Label Coding and Conditional Decision	Aug 13, 2024	Decision Making, Human-Object Interaction Detection	Unverified	0
Bi-Causal: Group Activity Recognition via Bidirectional Causality	Jan 1, 2024	Activity Recognition, Group Activity Recognition	Unverified	0
Effective Actor-centric Human-object Interaction Detection	Feb 24, 2022	Human-Object Interaction Detection, Object	Unverified	0
Beyond Holistic Object Recognition: Enriching Image Understanding with Part States	Dec 15, 2016	Human-Object Interaction Detection, Image Captioning	Unverified	0
Dynamic Scene Understanding from Vision-Language Representations	Jan 20, 2025	Grounded Situation Recognition, Human-Human Interaction Recognition	Unverified	0
Compositional Learning in Transformer-Based Human-Object Interaction Detection	Aug 11, 2023	Human-Object Interaction Detection, Object	Unverified	0
An analysis of HOI: using a training-free method with multimodal visual foundation models when only the test set is available, without the training set	Aug 11, 2024	Human-Object Interaction Detection	Unverified	0
Geometric Visual Fusion Graph Neural Networks for Multi-Person Human-Object Interaction Recognition in Videos	Jun 3, 2025	Graph Learning, Graph Neural Network	Unverified	0
DropKey for Vision Transformer	Jan 1, 2023	Human-Object Interaction Detection, image-classification	Unverified	0
DropKey	Aug 4, 2022	Human-Object Interaction Detection, image-classification	Unverified	0
Compositional Learning for Human Object Interaction	Sep 1, 2018	Human-Object Interaction Detection, Object	Unverified	0


Datasets (the benchmark tables below appear in this order): HICO-DET, V-COCO, HICO, VidHOI, Ambiguous-HOI, MECCANO

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Ours (PViC+)	mAP	46.49	—	Unverified
2	RLIPv2 (Swin-L)	mAP	45.09	—	Unverified
3	PViC-SwinL	mAP	44.32	—	Unverified
4	SOV-STG (Swin-L)	mAP	43.35	—	Unverified
5	DiffHOI	mAP	41.5	—	Unverified
6	ViPLO	mAP	37.22	—	Unverified
7	FGAHOI	mAP	37.18	—	Unverified
8	ERNet	mAP	36.89	—	Unverified
9	CQL+GEN-VLKT-L	mAP	36.03	—	Unverified
10	QAHOI (Swin-L)	mAP	35.78	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RLIPv2	AP(S1)	72.1	—	Unverified
2	MUREN	AP(S1)	68.8	—	Unverified
3	STIP	AP(S1)	66	—	Unverified
4	DiffHOI	AP(S1)	65.7	—	Unverified
5	OCN (ResNet101)	AP(S1)	65.3	—	Unverified
6	OCN (ResNet50)	AP(S1)	64.2	—	Unverified
7	CDN (ResNet101)	AP(S1)	63.91	—	Unverified
8	HOICLIP	AP(S1)	63.5	—	Unverified
9	QPIC + CPC	mAP	63.1	—	Unverified
10	Body Part Interactiveness	AP(S1)	63	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DEFR	mAP	65.6	—	Unverified
2	HAKE	mAP	47.1	—	Unverified
3	PaStaNet	mAP	46.3	—	Unverified
4	RelViT	mAP	43.98	—	Unverified
5	Pairwise-Part	mAP	39.9	—	Unverified
6	Mallya & Lazebnik	mAP	36.1	—	Unverified
7	Girdhar & Ramanan	mAP	34.6	—	Unverified
8	R*CNN	mAP	28.5	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HOI4ABOT	Detection: Full (mAP@0.5)	11.12	—	Unverified
2	ST-GAZE	Detection: Full (mAP@0.5)	10.4	—	Unverified
3	STTRAN	Detection: Full (mAP@0.5)	7.61	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DJ-RN	mAP	10.37	—	Unverified
2	iCAN	mAP	8.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SlowFast + FasterRCNN	mAP@0.5 role	25.93	—	Unverified