Human-Object Interaction Detection

Human-Object Interaction (HOI) detection is the task of identifying the set of interactions in an image. It involves (i) localizing the subject (a human) and the target (an object) of each interaction, and (ii) classifying the interaction label.
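The two parts of the definition above (localization and classification) can be sketched as a small data structure plus a matching check. This is a minimal illustration, not any benchmark's official evaluation code: the `HOIInstance` class, the `iou` and `is_match` helpers, and the example boxes are all hypothetical names and values. The matching rule assumes the commonly used convention that a prediction counts as correct when the labels agree and both the human box and the object box overlap the ground truth with an IoU of at least 0.5; the exact inequality and threshold vary by benchmark.

```python
from dataclasses import dataclass
from typing import Tuple

# Axis-aligned box as (x1, y1, x2, y2) in pixels. Hypothetical convention.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class HOIInstance:
    """One interaction: a human box, an object box, and two labels."""
    human_box: Box
    object_box: Box
    object_label: str
    interaction: str


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: HOIInstance, gt: HOIInstance, thresh: float = 0.5) -> bool:
    """Assumed rule: labels agree and BOTH boxes reach the IoU threshold."""
    return (
        pred.interaction == gt.interaction
        and pred.object_label == gt.object_label
        and iou(pred.human_box, gt.human_box) >= thresh
        and iou(pred.object_box, gt.object_box) >= thresh
    )


# Made-up example: a person riding a bicycle.
gt = HOIInstance((10, 10, 110, 210), (100, 120, 180, 200), "bicycle", "ride")
good = HOIInstance((12, 8, 108, 205), (95, 118, 182, 204), "bicycle", "ride")
wrong_verb = HOIInstance((12, 8, 108, 205), (95, 118, 182, 204), "bicycle", "hold")
bad_object_box = HOIInstance((12, 8, 108, 205), (300, 300, 380, 380), "bicycle", "ride")

print(is_match(good, gt))            # both boxes overlap well, labels agree
print(is_match(wrong_verb, gt))      # boxes fine, interaction label differs
print(is_match(bad_object_box, gt))  # object box does not overlap the target
```

Requiring both boxes to overlap is what distinguishes this from plain object detection: a prediction with a well-placed human box but the wrong object still fails.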

Papers

Showing 151–175 of 449 papers

Title	Date	Tasks	Status
An Image-like Diffusion Method for Human-Object Interaction Detection	Mar 23, 2025	Human-Object Interaction Detection, Image Generation	Unverified
Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model	Mar 21, 2025	Disentanglement, Human-Object Interaction Detection	Unverified
Reconstructing In-the-Wild Open-Vocabulary Human-Object Interactions	Mar 20, 2025	3D Reconstruction, Human-Object Interaction Detection	Unverified
3D Human Interaction Generation: A Survey	Mar 17, 2025	Human-Object Interaction Detection, Motion Generation	Unverified
Hoi2Anomaly: An Explainable Anomaly Detection Approach Guided by Human-Object Interaction	Mar 13, 2025	Anomaly Detection, Human-Object Interaction Detection	Unverified
InteractEdit: Zero-Shot Editing of Human-Object Interactions in Images	Mar 12, 2025	Attribute, Human-Object Interaction Detection	Unverified
GASPACHO: Gaussian Splatting for Controllable Humans and Objects	Mar 12, 2025	Human-Object Interaction Detection, Object	Unverified
End-to-End HOI Reconstruction Transformer with Graph-based Encoding	Mar 8, 2025	Human-Object Interaction Detection	Unverified
From Infants to AI: Incorporating Infant-like Learning in Models Boosts Efficiency and Generalization in Learning Social Prediction Tasks	Mar 5, 2025	Human-Object Interaction Detection	Unverified
EigenActor: Variant Body-Object Interaction Generation Evolved from Invariant Action Basis Reasoning	Mar 1, 2025	Human-Object Interaction Detection, Object	Unverified
3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds	Feb 27, 2025	Affordance Detection, Human-Object Interaction Detection	Unverified
Human Motion Prediction, Reconstruction, and Generation	Feb 21, 2025	Human motion prediction, Human-Object Interaction Detection	Unverified
RHINO: Learning Real-Time Humanoid-Human-Object Interaction from Human Demonstrations	Feb 18, 2025	Human-Object Interaction Detection, Object	Unverified
Hummingbird: High Fidelity Image Generation via Multimodal Context Alignment	Feb 7, 2025	Diversity, Human-Object Interaction Detection	Unverified
Functional 3D Scene Synthesis through Human-Scene Optimization	Feb 5, 2025	Human-Object Interaction Detection, Scene Generation	Unverified
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models	Feb 3, 2025	Human Animation, Human-Object Interaction Detection	Unverified
B-RIGHT: Benchmark Re-evaluation for Integrity in Generalized Human-Object Interaction Testing	Jan 28, 2025	Human-Object Interaction Detection	Code Available
Eye Gaze as a Signal for Conveying User Attention in Contextual AI Systems	Jan 23, 2025	Friction, Human-Object Interaction Detection	Unverified
Dynamic Scene Understanding from Vision-Language Representations	Jan 20, 2025	Grounded Situation Recognition, Human-Human Interaction Recognition	Unverified
DAViD: Modeling Dynamic Affordance of 3D Objects using Pre-trained Video Diffusion Models	Jan 14, 2025	Human-Object Interaction Detection, Object	Unverified
PersonaHOI: Effortlessly Improving Personalized Face with Human-Object Interaction Generation	Jan 10, 2025	Human-Object Interaction Detection, Human-Object Interaction Generation	Code Available
Reasoning Mamba: Hypergraph-Guided Region Relation Calculating for Weakly Supervised Affordance Grounding	Jan 1, 2025	Human-Object Interaction Detection, Mamba	Unverified
Vision-Guided Action: Enhancing 3D Human Motion Prediction with Gaze-informed Affordance in 3D Scenes	Jan 1, 2025	Human motion prediction, Human-Object Interaction Detection	Unverified
ChatHuman: Chatting about 3D Humans with Tools	Jan 1, 2025	Human-Object Interaction Detection, In-Context Learning	Unverified
PICO: Reconstructing 3D People In Contact with Objects	Jan 1, 2025	Human-Object Interaction Detection, Object	Unverified

Datasets: HICO-DET, V-COCO, HICO, VidHOI, Ambiguous-HOI, MECCANO

Benchmark Results

HICO-DET
#	Model	Metric	Claimed	Verified	Status
1	Ours (PViC+)	mAP	46.49	—	Unverified
2	RLIPv2 (Swin-L)	mAP	45.09	—	Unverified
3	PViC-SwinL	mAP	44.32	—	Unverified
4	SOV-STG (Swin-L)	mAP	43.35	—	Unverified
5	DiffHOI	mAP	41.5	—	Unverified
6	ViPLO	mAP	37.22	—	Unverified
7	FGAHOI	mAP	37.18	—	Unverified
8	ERNet	mAP	36.89	—	Unverified
9	CQL+GEN-VLKT-L	mAP	36.03	—	Unverified
10	QAHOI (Swin-L)	mAP	35.78	—	Unverified

V-COCO
#	Model	Metric	Claimed	Verified	Status
1	RLIPv2	AP(S1)	72.1	—	Unverified
2	MUREN	AP(S1)	68.8	—	Unverified
3	STIP	AP(S1)	66	—	Unverified
4	DiffHOI	AP(S1)	65.7	—	Unverified
5	OCN (ResNet101)	AP(S1)	65.3	—	Unverified
6	OCN (ResNet50)	AP(S1)	64.2	—	Unverified
7	CDN (ResNet101)	AP(S1)	63.91	—	Unverified
8	HOICLIP	AP(S1)	63.5	—	Unverified
9	QPIC + CPC	mAP	63.1	—	Unverified
10	Body Part Interactiveness	AP(S1)	63	—	Unverified

HICO
#	Model	Metric	Claimed	Verified	Status
1	DEFR	mAP	65.6	—	Unverified
2	HAKE	mAP	47.1	—	Unverified
3	PaStaNet	mAP	46.3	—	Unverified
4	RelViT	mAP	43.98	—	Unverified
5	Pairwise-Part	mAP	39.9	—	Unverified
6	Mallya & Lazebnik	mAP	36.1	—	Unverified
7	Girdhar & Ramanan	mAP	34.6	—	Unverified
8	R*CNN	mAP	28.5	—	Unverified

VidHOI
#	Model	Metric	Claimed	Verified	Status
1	HOI4ABOT	Detection: Full (mAP@0.5)	11.12	—	Unverified
2	ST-GAZE	Detection: Full (mAP@0.5)	10.4	—	Unverified
3	STTRAN	Detection: Full (mAP@0.5)	7.61	—	Unverified

Ambiguous-HOI
#	Model	Metric	Claimed	Verified	Status
1	DJ-RN	mAP	10.37	—	Unverified
2	iCAN	mAP	8.14	—	Unverified

MECCANO
#	Model	Metric	Claimed	Verified	Status
1	SlowFast + FasterRCNN	mAP@0.5 role	25.93	—	Unverified