Generating Context-Aware Natural Answers for Questions in 3D Scenes

2023-10-30 · Code available

Mohammed Munzer Dwedari, Matthias Niessner, Dave Zhenyu Chen

Abstract

3D question answering is a young field in 3D vision-language that remains largely unexplored. Previous methods are limited to a pre-defined answer space and cannot generate answers naturally. In this work, we recast question answering as a sequence generation task in order to generate free-form natural answers for questions in 3D scenes (Gen3DQA). To this end, we optimize our model directly on language rewards to preserve the global sentence semantics. We also adapt a pragmatic language understanding reward to further improve sentence quality. Our method sets a new SOTA on the ScanQA benchmark (CIDEr scores of 72.22/66.57 on the test sets).
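To make "optimizing directly on language rewards" concrete, the following is a minimal sketch of one common way to do it: a REINFORCE-style loss with a baseline. The abstract does not specify the exact training algorithm, so this is an assumption rather than the authors' implementation. The names `overlap_reward` and `policy_gradient_loss` are hypothetical, and the unigram F1 reward is only a toy stand-in for a sentence-level metric such as CIDEr.

```python
from collections import Counter


def overlap_reward(candidate: str, reference: str) -> float:
    """Toy sentence-level reward: unigram F1 (an assumed stand-in for CIDEr)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    matched = sum((cand & ref).values())  # clipped unigram matches
    if matched == 0:
        return 0.0
    precision = matched / sum(cand.values())
    recall = matched / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def policy_gradient_loss(token_log_probs, sampled_reward, baseline_reward):
    """REINFORCE with a baseline: -(r_sample - r_baseline) * sum(log p).

    token_log_probs: log-probabilities of the tokens in the sampled answer.
    baseline_reward: reward of a reference decode (for example, greedy),
    used here as an assumed variance-reduction baseline.
    """
    advantage = sampled_reward - baseline_reward
    return -advantage * sum(token_log_probs)


# Hypothetical example: the sampled answer beats the baseline answer,
# so minimizing the loss raises the sampled answer's log-probability.
reference = "the chair is next to the table"
r_sample = overlap_reward("the chair is next to the table", reference)
r_baseline = overlap_reward("the chair", reference)
loss = policy_gradient_loss([-0.5, -0.7, -0.3], r_sample, r_baseline)
```

Because the reward is computed on the whole generated sentence rather than per token, the gradient signal reflects sentence-level quality, which is the motivation the abstract gives for training on rewards.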
