WorldPrompter: Traversable Text-to-Scene Generation

2025-04-02

Zhaoyang Zhang, Yannick Hold-Geoffroy, Miloš Hašan, Chen Ziwen, Fujun Luan, Julie Dorsey, Yiwei Hu

Abstract

Scene-level 3D generation is a challenging research topic, with most existing methods generating only partial scenes and offering limited navigational freedom. We introduce WorldPrompter, a novel generative pipeline for synthesizing traversable 3D scenes from text prompts. We leverage panoramic videos as an intermediate representation to model the 360° details of a scene. WorldPrompter incorporates a conditional 360° panoramic video generator, capable of producing a 128-frame video that simulates a person walking through and capturing a virtual environment. The resulting video is then reconstructed as Gaussian splats by a fast feedforward 3D reconstructor, enabling a truly walkable experience within the 3D scene. Experiments demonstrate that our panoramic video generation model achieves convincing view consistency across frames, enabling high-quality panoramic Gaussian splat reconstruction and facilitating traversal over an area of the scene. Qualitative and quantitative results also show that it outperforms state-of-the-art 360° video generators and 3D scene generation models.
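The abstract relies on panoramic frames capturing the full 360° surroundings of a camera position. As a minimal sketch of what that means geometrically, the snippet below maps a pixel in an equirectangular panorama to a unit viewing direction and back. The equirectangular layout, the axis convention (y up, z forward at the image center), and the function names `pixel_to_ray` and `ray_to_pixel` are assumptions made for illustration; they are not taken from the paper and may differ from what WorldPrompter uses.

```python
import math


def pixel_to_ray(u, v, width, height):
    """Map continuous pixel coordinates (u, v) in an equirectangular image
    to a unit direction (x, y, z).

    Assumed convention (illustrative only): longitude spans [-pi, pi) left
    to right, latitude spans [pi/2, -pi/2] top to bottom, y is up, and the
    image center looks along +z.
    """
    lon = (u / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (v / height) * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)


def ray_to_pixel(direction, width, height):
    """Inverse of pixel_to_ray: map a (not necessarily unit) direction
    back to continuous equirectangular pixel coordinates."""
    x, y, z = direction
    norm = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)
    lat = math.asin(y / norm)
    u = (lon + math.pi) / (2.0 * math.pi) * width
    v = (math.pi / 2.0 - lat) / math.pi * height
    return (u, v)


if __name__ == "__main__":
    # The center of a 512x256 panorama should look straight ahead.
    print(pixel_to_ray(256, 128, 512, 256))
```

Because every direction on the sphere corresponds to some pixel, a single panoramic frame observes the scene in all directions at once, which is why a sequence of such frames along a walking path is a plausible input for 3D reconstruction.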

Tasks

3D Generation, Scene Generation, Video Generation
