Understanding dynamic 3D scenes is crucial for extended reality (XR) and autonomous driving. Incorporating semantic information into 3D reconstruction enables holistic scene representations, unlocking immersive and interactive applications. To this end, we introduce TRASE, a novel tracking-free 4D segmentation method for dynamic scene understanding. TRASE learns a 4D segmentation feature field in a weakly-supervised manner, leveraging a soft-mined contrastive learning objective guided by SAM masks. The resulting feature space is semantically coherent and well-separated, and final object-level segmentation is obtained via unsupervised clustering. This enables fast editing, such as object removal, composition, and style transfer, by directly manipulating the scene's Gaussians. We evaluate TRASE on five dynamic benchmarks, demonstrating state-of-the-art segmentation performance from unseen viewpoints and its effectiveness across various interactive editing tasks.
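To make the training signal concrete, below is a minimal sketch of a SAM-mask-guided contrastive loss on rendered per-pixel features, written as a plain supervised-contrastive (InfoNCE-style) objective. The soft-mining weights of the full objective are omitted, and the function name, arguments, and temperature are illustrative placeholders rather than the released API.

```python
import torch
import torch.nn.functional as F

def mask_guided_contrastive_loss(features, mask_ids, temperature=0.1):
    """Pull rendered per-pixel features belonging to the same SAM mask
    together and push features from different masks apart.

    features : (N, D) per-pixel features sampled from a rendered feature map
    mask_ids : (N,)   SAM mask index of each sampled pixel
    """
    features = F.normalize(features, dim=-1)
    sim = features @ features.T / temperature              # (N, N) similarities
    eye = torch.eye(len(features), dtype=torch.bool, device=features.device)
    pos = (mask_ids[:, None] == mask_ids[None, :]) & ~eye  # same-mask pairs

    logits = sim.masked_fill(eye, float("-inf"))           # drop self-similarity
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(eye, 0.0)              # avoid -inf * 0 = NaN

    # Average log-likelihood over positives; anchors without positives
    # (single-pixel masks) contribute zero loss.
    loss = -(log_prob * pos).sum(1) / pos.sum(1).clamp(min=1)
    return loss.mean()
```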
Given a dynamic reconstruction, we learn per-Gaussian features with our semantically-aware contrastive objective guided by SAM masks. Once the features are learned, clustering (DBSCAN) is performed directly on the Gaussian features, and the corresponding segmentation field can be rendered. We demonstrate the applicability of our representation on various scene-editing applications, including segmenting a target object via click or text prompts in our GUI, object removal, and scene composition.
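As a concrete illustration of the clustering step, the sketch below runs scikit-learn's DBSCAN directly on the learned per-Gaussian features to assign an object ID to every Gaussian. The function name and the eps/min_samples values are hypothetical placeholders to be tuned per scene, not values from our implementation.

```python
from sklearn.cluster import DBSCAN

def cluster_gaussian_features(gaussian_features, eps=0.3, min_samples=50):
    """Density-based clustering of learned per-Gaussian features.

    gaussian_features : (num_gaussians, D) array of learned features
    Returns a (num_gaussians,) array of object IDs; -1 marks noise.
    """
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(gaussian_features)

# Hypothetical usage: pick all Gaussians of one object for removal,
# composition, or style transfer by masking on its cluster ID.
# labels = cluster_gaussian_features(features)   # features: (N, D) array
# object_gaussians = labels == target_cluster_id
```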
Experiencing sync issues with the GIFs? Simply refresh the webpage to synchronize playback.
@article{li2024sadg,
  title={SADG: Segment Any Dynamic Gaussian Without Object Trackers},
  author={Li, Yun-Jin and Gladkova, Mariia and Xia, Yan and Cremers, Daniel},
  journal={arXiv preprint arXiv:2411.19290},
  year={2024}
}