Human-Object Interaction Detection

Human-Object Interaction (HOI) detection is the task of identifying the set of interactions in an image. It involves (i) localizing the subject (a human) and the target (an object) of each interaction, and (ii) classifying the interaction label.
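
As a rough illustration of the definition above, each detected interaction can be thought of as a triplet of a human box, an object box, and an interaction label. The sketch below is only one way to represent this. The names `HOIInstance`, `iou`, and `matches` are hypothetical, and the 0.5 IoU threshold on both boxes is an assumed matching rule, not something stated on this page.

```python
from dataclasses import dataclass
from typing import Tuple

# Box format (x1, y1, x2, y2) in pixels is an assumption for this sketch.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class HOIInstance:
    """One interaction: a localized human, a localized object, and a label."""
    human_box: Box
    object_box: Box
    object_label: str
    interaction: str


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches(pred: HOIInstance, gt: HOIInstance, thresh: float = 0.5) -> bool:
    """Assumed rule: labels agree and both boxes overlap by at least `thresh`."""
    return (
        pred.interaction == gt.interaction
        and pred.object_label == gt.object_label
        and iou(pred.human_box, gt.human_box) >= thresh
        and iou(pred.object_box, gt.object_box) >= thresh
    )


# Made-up example: a person riding a bicycle.
gt = HOIInstance((10, 10, 110, 210), (90, 120, 190, 200), "bicycle", "ride")
pred = HOIInstance((12, 8, 108, 205), (95, 118, 188, 202), "bicycle", "ride")
print(matches(pred, gt))
```

A prediction with well-localized boxes but the wrong interaction label would not count as a match under this rule, which reflects that both parts of the task (localization and classification) have to succeed together.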

Papers

Showing 226–250 of 449 papers

Title	Date	Tasks	Status
InteractEdit: Zero-Shot Editing of Human-Object Interactions in Images	Mar 12, 2025	Attribute, Human-Object Interaction Detection	Unverified
Towards a Unified Transformer-based Framework for Scene Graph Generation and Human-object Interaction Detection	Nov 3, 2023	Graph Generation, Human-Object Interaction Detection	Unverified
InterActHuman: Multi-Concept Human Animation with Layout-Aligned Audio Conditions	Jun 11, 2025	Human Animation, Human-Object Interaction Detection	Unverified
Towards Flexible Visual Relationship Segmentation	Aug 15, 2024	Graph Generation, Human-Object Interaction Detection	Unverified
Interaction Part Mining: A Mid-Level Approach for Fine-Grained Action Recognition	Jun 1, 2015	Action Recognition, Fine-grained Action Recognition	Unverified
A Graph-based Interactive Reasoning for Human-Object Interaction Detection	Jul 14, 2020	Human-Object Interaction Detection	Unverified
InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction	Mar 28, 2024	Human-Object Interaction Detection, Language Modelling	Unverified
Agglomerative Transformer for Human-Object Interaction Detection	Aug 16, 2023	Clustering, Decoder	Unverified
AffordanceLLM: Grounding Affordance from Vision Language Models	Jan 12, 2024	Human-Object Interaction Detection, Object	Unverified
InterTrack: Tracking Human Object Interaction without Object Templates	Aug 25, 2024	Human-Object Interaction Detection, Object	Unverified
Is First Person Vision Challenging for Object Tracking?	Nov 24, 2020	Human-Object Interaction Detection, Object	Unverified
Is First Person Vision Challenging for Object Tracking?	Aug 31, 2021	Human-Object Interaction Detection, Object	Unverified
Is Object Detection Necessary for Human-Object Interaction Recognition?	Jul 27, 2021	Human-Object Interaction Detection, Object	Unverified
Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows	Mar 20, 2022	Human-Object Interaction Detection, Object	Unverified
Joint Hand-object 3D Reconstruction from a Single Image with Cross-branch Feature Fusion	Jun 28, 2020	3D Reconstruction, Depth Estimation	Unverified
Jointly learning heterogeneous features for rgb-d activity recognition	Dec 15, 2016	Activity Recognition, Benchmarking	Unverified
Kinematics-based 3D Human-Object Interaction Reconstruction from Single View	Jul 19, 2024	Human-Object Interaction Detection	Unverified
Kinematics-Guided Reinforcement Learning for Object-Aware 3D Ego-Pose Estimation	Nov 10, 2020	Human-Object Interaction Detection, Object	Unverified
Knowledge Guided Bidirectional Attention Network for Human-Object Interaction Detection	Jul 16, 2022	Decoder, Human-Object Interaction Detection	Unverified
Learning 2D Invariant Affordance Knowledge for 3D Affordance Grounding	Aug 23, 2024	Human-Object Interaction Detection, Object	Unverified
Learning Action Recognition Model From Depth and Skeleton Videos	Oct 1, 2017	Action Recognition, Human-Object Interaction Detection	Unverified
Generating Videos of Zero-Shot Compositions of Actions and Objects	Dec 5, 2019	Human-Object Interaction Detection, Object	Unverified
Learning Asynchronous and Sparse Human-Object Interaction in Videos	Mar 3, 2021	Human-Object Interaction Detection, Object	Unverified
Towards Overcoming False Positives in Visual Relationship Detection	Dec 23, 2020	Decoder, Graph Attention	Unverified
Learning event representation: As sparse as possible, but not sparser	Oct 2, 2017	Classification, General Classification	Unverified

Datasets: HICO-DET, V-COCO, HICO, VidHOI, Ambiguous-HOI, MECCANO

Benchmark Results

HICO-DET

#	Model	Metric	Claimed	Verified	Status
1	Ours (PViC+)	mAP	46.49	—	Unverified
2	RLIPv2 (Swin-L)	mAP	45.09	—	Unverified
3	PViC-SwinL	mAP	44.32	—	Unverified
4	SOV-STG (Swin-L)	mAP	43.35	—	Unverified
5	DiffHOI	mAP	41.5	—	Unverified
6	ViPLO	mAP	37.22	—	Unverified
7	FGAHOI	mAP	37.18	—	Unverified
8	ERNet	mAP	36.89	—	Unverified
9	CQL+GEN-VLKT-L	mAP	36.03	—	Unverified
10	QAHOI (Swin-L)	mAP	35.78	—	Unverified

V-COCO

#	Model	Metric	Claimed	Verified	Status
1	RLIPv2	AP(S1)	72.1	—	Unverified
2	MUREN	AP(S1)	68.8	—	Unverified
3	STIP	AP(S1)	66	—	Unverified
4	DiffHOI	AP(S1)	65.7	—	Unverified
5	OCN (ResNet101)	AP(S1)	65.3	—	Unverified
6	OCN (ResNet50)	AP(S1)	64.2	—	Unverified
7	CDN (ResNet101)	AP(S1)	63.91	—	Unverified
8	HOICLIP	AP(S1)	63.5	—	Unverified
9	QPIC + CPC	mAP	63.1	—	Unverified
10	Body Part Interactiveness	AP(S1)	63	—	Unverified

HICO

#	Model	Metric	Claimed	Verified	Status
1	DEFR	mAP	65.6	—	Unverified
2	HAKE	mAP	47.1	—	Unverified
3	PaStaNet	mAP	46.3	—	Unverified
4	RelViT	mAP	43.98	—	Unverified
5	Pairwise-Part	mAP	39.9	—	Unverified
6	Mallya & Lazebnik	mAP	36.1	—	Unverified
7	Girdhar & Ramanan	mAP	34.6	—	Unverified
8	R*CNN	mAP	28.5	—	Unverified

VidHOI

#	Model	Metric	Claimed	Verified	Status
1	HOI4ABOT	Detection: Full (mAP@0.5)	11.12	—	Unverified
2	ST-GAZE	Detection: Full (mAP@0.5)	10.4	—	Unverified
3	STTRAN	Detection: Full (mAP@0.5)	7.61	—	Unverified

Ambiguous-HOI

#	Model	Metric	Claimed	Verified	Status
1	DJ-RN	mAP	10.37	—	Unverified
2	iCAN	mAP	8.14	—	Unverified

MECCANO

#	Model	Metric	Claimed	Verified	Status
1	SlowFast + FasterRCNN	mAP@0.5 role	25.93	—	Unverified