DIScene: Object Decoupling and Interaction Modeling for Complex Scene Generation
Description
This paper reconsiders how to distill knowledge from pretrained 2D diffusion models to guide 3D asset generation, in particular for complex 3D scenes. Such generation should accept varied inputs, i.e., texts or images, to allow flexible expression of requirements; objects in the scene should be style-consistent and decoupled, with clearly modeled interactions, to benefit downstream tasks.
We propose DIScene, a novel method for this task. It represents the entire 3D scene as a learnable structured scene graph: each node explicitly models an object with its appearance, textual description, transformation, and geometry as a mesh with surface-aligned Gaussians attached; the graph's edges model object interactions.
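The structured scene graph described above can be sketched as a simple data structure. This is only an illustrative assumption of how such a representation might look; the class and field names (`SceneNode`, `SceneEdge`, `SceneGraph`, `gaussians`, etc.) are hypothetical and not the authors' actual API:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a structured scene graph: nodes carry an
# object's description, transformation, and geometry (mesh plus
# surface-aligned Gaussians); edges model pairwise interactions.

@dataclass
class SceneNode:
    description: str                                   # textual description
    transform: list                                    # e.g. a 4x4 canonical-to-scene matrix
    mesh_vertices: list = field(default_factory=list)  # mesh geometry (placeholder)
    gaussians: list = field(default_factory=list)      # surface-aligned Gaussians (placeholder)

@dataclass
class SceneEdge:
    src: int        # index of the interacting object
    dst: int        # index of the object it interacts with
    relation: str   # e.g. "on", "next to"

@dataclass
class SceneGraph:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)

    def add_object(self, node: SceneNode) -> int:
        self.nodes.append(node)
        return len(self.nodes) - 1

    def add_interaction(self, src: int, dst: int, relation: str) -> None:
        self.edges.append(SceneEdge(src, dst, relation))

identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
graph = SceneGraph()
cup = graph.add_object(SceneNode("a ceramic cup", transform=identity))
table = graph.add_object(SceneNode("a wooden table", transform=identity))
graph.add_interaction(cup, table, "on")
print(len(graph.nodes), len(graph.edges))  # → 2 1
```

Because each node holds its own geometry and transformation, an individual object can be edited, moved, or regenerated without disturbing the rest of the scene, which is what makes the representation useful for downstream tasks.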
With this representation, objects are optimized in canonical space, and interactions between objects are optimized via object-aware rendering to avoid incorrect gradient back-propagation across objects.
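The idea of object-aware optimization can be illustrated with a toy example. This is not the paper's implementation; it is a minimal sketch, using finite-difference gradients on a made-up one-dimensional "scene", of the principle that each update touches only one object's parameters so gradients cannot leak into the others:

```python
# Toy illustration (assumption, not DIScene's actual optimizer):
# each step computes a gradient with respect to a single selected
# object and updates only that object, holding the rest fixed.

def scene_loss(params):
    # toy scene loss: pulls each object's 1-D "position" toward a target
    targets = [1.0, -2.0]
    return sum((p - t) ** 2 for p, t in zip(params, targets))

def object_aware_step(params, obj_idx, lr=0.1, eps=1e-5):
    # finite-difference gradient w.r.t. the selected object only
    base = scene_loss(params)
    bumped = list(params)
    bumped[obj_idx] += eps
    grad = (scene_loss(bumped) - base) / eps
    new_params = list(params)
    new_params[obj_idx] -= lr * grad  # only this object moves
    return new_params

params = [0.0, 0.0]
for step in range(200):
    params = object_aware_step(params, obj_idx=step % 2)
print(params)  # each object converges to its own target
```

Alternating which object is updated lets both converge independently; in the real method the per-object loss would come from rendering the scene with the other objects held fixed.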
Extensive experiments demonstrate the utility and superiority of our approach and show that DIScene can greatly facilitate 3D content creation tasks.
Event Type
Technical Papers
Time
Thursday, 5 December 2024, 4:41pm - 4:53pm JST
Location
Hall B5 (1), B Block, Level 5