Mosaic-SDF for 3D Generative Models

¹GenAI, Meta  ²FAIR, Meta  ³Weizmann Institute of Science
CVPR 2024

Abstract

Current diffusion or flow-based generative models for 3D shapes fall into two categories: those that distill pre-trained 2D image diffusion models, and those that train directly on 3D shapes. When training a diffusion or flow model on 3D shapes, a crucial design choice is the shape representation. An effective shape representation needs to adhere to three design principles: it should allow an efficient conversion of large 3D datasets to the representation form; it should provide a good tradeoff of approximation power versus number of parameters; and it should have a simple tensorial form that is compatible with existing powerful neural architectures. While standard 3D shape representations such as volumetric grids and point clouds do not adhere to all these principles simultaneously, we advocate in this paper a new representation that does.

We introduce Mosaic-SDF (M-SDF): a simple 3D shape representation that approximates the Signed Distance Function (SDF) of a given shape by using a set of local grids spread near the shape's boundary. The M-SDF representation is fast to compute for each shape individually, making it readily parallelizable; it is parameter efficient, as it only covers the space around the shape's boundary; and it has a simple matrix form, compatible with Transformer-based architectures.

We demonstrate the efficacy of the M-SDF representation by using it to train a 3D generative flow model, including class-conditioned generation with the 3D Warehouse dataset and text-to-3D generation using a dataset of about 600k caption-shape pairs.

M-SDF Representation

The goal of this work is to introduce Mosaic-SDF (M-SDF), a simple, novel 3D shape representation suitable for generative models. In a nutshell, M-SDF approximates an arbitrary Signed Distance Function (SDF) with a set of small volumetric grids with different centers and scales. Namely, the M-SDF of a single shape is represented as a matrix of dimension n × d, where each row corresponds to a single grid, and it can be fitted to a given shape’s SDF in around 1 minute.
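To make the n × d matrix form concrete, here is a minimal sketch of how such a representation could be stored and evaluated. It rests on several assumptions that are not stated on this page: each row is laid out as a center (3 values), a scale (1 value) and a flattened k × k × k grid of SDF values, so d = 4 + k³; values inside a grid are read by trilinear interpolation; and overlapping grids are blended with a weight that is 1 at the grid center and decays linearly to 0 at its boundary. The names trilinear, eval_msdf and make_row are hypothetical, and make_row simply samples a reference SDF at the grid nodes; it is not the fitting procedure from the paper.

```python
import math

def trilinear(grid, k, u):
    """Trilinearly interpolate a flat k*k*k grid at local coordinates u in [-1, 1]^3."""
    # Map each coordinate from [-1, 1] to the continuous index range [0, k - 1].
    idx = [(c + 1.0) * 0.5 * (k - 1) for c in u]
    lo = [min(max(int(math.floor(i)), 0), k - 2) for i in idx]
    frac = [i - l for i, l in zip(idx, lo)]
    value = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((frac[0] if dx else 1.0 - frac[0])
                     * (frac[1] if dy else 1.0 - frac[1])
                     * (frac[2] if dz else 1.0 - frac[2]))
                flat = ((lo[0] + dx) * k + (lo[1] + dy)) * k + (lo[2] + dz)
                value += w * grid[flat]
    return value

def eval_msdf(rows, k, x):
    """Blend all local grids covering x. Each row is [cx, cy, cz, scale, v_0, ...]."""
    total, weight_sum = 0.0, 0.0
    for row in rows:
        center, scale, grid = row[0:3], row[3], row[4:]
        u = [(xi - ci) / scale for xi, ci in zip(x, center)]
        # Assumed blending weight: 1 at the grid center, 0 at and beyond its boundary.
        w = max(0.0, 1.0 - max(abs(c) for c in u))
        if w > 0.0:
            total += w * trilinear(grid, k, u)
            weight_sum += w
    if weight_sum == 0.0:
        raise ValueError("x is not covered by any local grid")
    return total / weight_sum

def make_row(center, scale, k, sdf):
    """Build one matrix row by sampling a reference SDF at the k^3 grid nodes."""
    row = list(center) + [scale]
    for i in range(k):
        for j in range(k):
            for l in range(k):
                offset = [2.0 * t / (k - 1) - 1.0 for t in (i, j, l)]
                row.append(sdf([c + scale * o for c, o in zip(center, offset)]))
    return row

# Usage: two overlapping grids near the surface of a sphere of radius 0.5.
sphere = lambda p: math.sqrt(sum(c * c for c in p)) - 0.5
k = 7  # illustrative grid resolution
rows = [make_row([0.5, 0.0, 0.0], 0.2, k, sphere),
        make_row([0.4, 0.1, 0.0], 0.2, k, sphere)]
print(len(rows), len(rows[0]))             # matrix shape n x d
print(eval_msdf(rows, k, [0.45, 0.05, 0.0]))
```

Because every row has the same length, a shape becomes a plain n × d matrix (a set of n tokens), which is what makes the representation easy to feed to Transformer-style architectures.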



Comparison of the M-SDF representation with some popular existing representations for a fixed parameter budget.

M-SDF Generation with Flow Matching

With M-SDF, we can train a forward-based Flow Matching generative model on a dataset of 3D shapes.

Method overview: Train (left): First, we convert the dataset of shapes to M-SDF representations (Algorithm 1); next, we train a Flow Matching model on the M-SDF representations (Algorithm 2). Sampling (right): We sample a random noise M-SDF and numerically solve the ODE. See the algorithms in the paper.
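The sampling step can be illustrated with a minimal sketch: draw an n × d Gaussian noise matrix and integrate dx/dt = v(x, t) from t = 0 to t = 1. Everything here is an assumption made for illustration. The fixed-step Euler solver is chosen only for simplicity and is not necessarily the solver used in the paper, and toy_velocity is a hypothetical stand-in for the trained network: it returns the velocity (x1 − x) / (1 − t) of the straight-line path x_t = (1 − t) x0 + t x1 toward a single known target x1.

```python
import random

def sample_noise(n, d, rng):
    """Draw an n x d matrix of standard Gaussian noise (the initial noisy M-SDF)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]

def euler_sample(velocity, x0, num_steps):
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with fixed Euler steps."""
    x = [row[:] for row in x0]
    h = 1.0 / num_steps
    for step in range(num_steps):
        t = step * h  # velocity is only queried at t < 1
        v = velocity(x, t)
        x = [[xi + h * vi for xi, vi in zip(xr, vr)] for xr, vr in zip(x, v)]
    return x

def toy_velocity(target):
    """Hypothetical stand-in for a trained model: straight-line flow toward `target`."""
    def velocity(x, t):
        return [[(ti - xi) / (1.0 - t) for xi, ti in zip(xr, tr)]
                for xr, tr in zip(x, target)]
    return velocity

# Usage: a tiny 2 x 3 "M-SDF" target; a real model would replace toy_velocity.
target = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
x0 = sample_noise(2, 3, random.Random(0))
x1 = euler_sample(toy_velocity(target), x0, num_steps=50)
print(x1)
```

In an actual pipeline, the velocity function would be a neural network conditioned on a class label or text prompt, and the returned matrix would be decoded row by row into local grids.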


Diversity

BibTeX

@article{yariv2023mosaicsdf,
  title={Mosaic-SDF for 3D Generative Models},
  author={Lior Yariv and Omri Puny and Natalia Neverova and Oran Gafni and Yaron Lipman},
  journal={arXiv},
  year={2023}
}