EFM3D: A Benchmark for Measuring Progress Towards 3D Egocentric Foundation Models

2024-06-14 · Code Available

Julian Straub, Daniel DeTone, Tianwei Shen, Nan Yang, Chris Sweeney, Richard Newcombe

Abstract

The advent of wearable computers enables a new source of context for AI that is embedded in egocentric sensor data. This new egocentric data comes equipped with fine-grained 3D location information and thus presents the opportunity for a novel class of spatial foundation models that are rooted in 3D space. To measure progress on what we term Egocentric Foundation Models (EFMs), we establish EFM3D, a benchmark with two core 3D egocentric perception tasks. EFM3D is the first benchmark for 3D object detection and surface regression on high-quality annotated egocentric data from Project Aria. We propose Egocentric Voxel Lifting (EVL), a baseline for 3D EFMs. EVL leverages all available egocentric modalities and inherits foundational capabilities from 2D foundation models. This model, trained on a large simulated dataset, outperforms existing methods on the EFM3D benchmark.
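The core idea behind voxel lifting is to unproject 2D image features along camera rays into a shared 3D voxel grid, where a 3D head can then operate. The following is a minimal sketch of that general lifting step, assuming a simple pinhole camera and hypothetical function and parameter names; it is an illustration of the technique, not the paper's actual EVL architecture.

```python
import numpy as np

def lift_features_to_voxels(feat_2d, K, T_world_cam, grid_origin, voxel_size, grid_shape):
    """Fill a 3D voxel grid by projecting each voxel center into the image
    and sampling the 2D feature map there (nearest neighbor, single view).
    Illustrative sketch only -- not the EVL implementation."""
    C, H, W = feat_2d.shape
    # Voxel center coordinates in the world frame
    zi, yi, xi = np.meshgrid(*[np.arange(n) for n in grid_shape], indexing="ij")
    centers = np.stack([xi, yi, zi], axis=-1).reshape(-1, 3) * voxel_size + grid_origin
    # Transform world points into the camera frame: p_cam = R^T (p_world - t)
    R, t = T_world_cam[:3, :3], T_world_cam[:3, 3]
    pts_cam = (centers - t) @ R
    z = pts_cam[:, 2]
    valid = z > 1e-3  # keep only points in front of the camera
    z_safe = np.where(valid, z, 1.0)
    # Pinhole projection with intrinsics K
    uv = (pts_cam[:, :2] / z_safe[:, None]) * [K[0, 0], K[1, 1]] + [K[0, 2], K[1, 2]]
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    valid &= (u >= 0) & (u < W) & (v >= 0) & (v < H)
    # Scatter sampled 2D features into the voxel grid
    vox = np.zeros((C, *grid_shape), dtype=feat_2d.dtype)
    idx = np.flatnonzero(valid)
    vox.reshape(C, -1)[:, idx] = feat_2d[:, v[idx], u[idx]]
    return vox
```

In practice, multi-view methods accumulate features from many frames into the same grid (e.g., by averaging), which is what lets an egocentric stream with known poses build up a persistent 3D representation.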

Tasks

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Aria Everyday Objects | ImVoxelNet | mAP | 15 | | Unverified |
| Aria Everyday Objects | Cube R-CNN | mAP | 8 | | Unverified |
| Aria Everyday Objects | EVL | mAP | 22 | | Unverified |
| Aria Everyday Objects | 3DETR | mAP | 16 | | Unverified |
| Aria Synthetic Environments | 3DETR | mAP | 33 | | Unverified |
| Aria Synthetic Environments | ImVoxelNet | mAP | 64 | | Unverified |
| Aria Synthetic Environments | Cube R-CNN | mAP | 36 | | Unverified |
| Aria Synthetic Environments | EVL | mAP | 75 | | Unverified |
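The mAP figures above average detection precision over 3D IoU thresholds. As a hedged illustration of the underlying overlap measure (the benchmark's exact protocol is not reproduced here, and EFM3D uses oriented boxes, which require a more involved intersection computation), axis-aligned 3D IoU can be computed as:

```python
import numpy as np

def iou_3d_axis_aligned(box_a, box_b):
    """IoU of two axis-aligned 3D boxes, each given as (min_xyz, max_xyz).
    Simplified sketch: real 3D detection metrics handle oriented boxes."""
    min_a, max_a = np.asarray(box_a[0], float), np.asarray(box_a[1], float)
    min_b, max_b = np.asarray(box_b[0], float), np.asarray(box_b[1], float)
    # Overlap extent along each axis, clamped at zero for disjoint boxes
    inter = np.prod(np.clip(np.minimum(max_a, max_b) - np.maximum(min_a, min_b), 0, None))
    vol_a = np.prod(max_a - min_a)
    vol_b = np.prod(max_b - min_b)
    return inter / (vol_a + vol_b - inter)
```

A detection counts as a true positive when its IoU with a ground-truth box exceeds the threshold; precision is then averaged over recall levels and thresholds to give mAP.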
