Skip to content

vLAR-group/FoundObj

Repository files navigation

FoundObj: Self-supervised Foundation Models as Rewards for Label-free 3D Object Segmentation

arXiv License: CC-BY-NC-SA

Overview

Official PyTorch implementation of FoundObj, a unsupervised framework for 3D class-agnostic object instance segmentation in indoor scenes.

drawing

Objects are progressively merged by the agent:

drawing

Pretrained checkpoints can be found here:

1. Environment

Installing dependencies

The released environment is tested with Python 3.10, PyTorch 2.6, CUDA 12.6, and spconv-cu126.

conda env create -f environment.yml
conda activate foundobj
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
pip install xformers==0.0.29.post2 --index-url https://download.pytorch.org/whl/cu126
pip install spconv-cu126
pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-2.6.0+cu126.html
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
pip install --no-build-isolation git+https://github.com/facebookresearch/pytorch3d.git@stable
pip install --no-build-isolation git+https://github.com/facebookresearch/detectron2.git
cd pointnet2 && python setup.py install && cd ..

If you compile the ScanNet superpoint extension, build the C++ module:

pip install pybind11
cd preprocessing/cpp_utils
python setup.py build_ext --inplace
cd ../..

2. Data Preparation

Download ScanNet and S3DIS, following GrabS to generate the segmentator binary files. Then organize the raw scans locally.

python preprocessing/scannet_preprocessing.py --data-dir /path/to/scannet --save-dir data/scannet/processed_aligns
python preprocessing/scannet200_preprocessing.py --data-dir /path/to/scannet --save-dir data/scannet/processed_aligns200
python preprocessing/s3dis_preprocessing.py --data-dir /path/to/Stanford3dDataset_v1.2_Aligned_Version --save-dir data/s3dis/processed

Before S3DUS preprocessing, check whether the following two issuses existed and fix them manually if needed.

  • line 180389 in Area_5/hallway_6/Annotations/ceiling_1.txt and Area_5/office_20/Annotations/table_2.txt, line 1: remove the empty line.

FoundObj uses point-level superpoints for proposal search. Generate superpoints from ScanNet meshes:

python preprocessing/get_scannet_superpoints.py --scans-root /path/to/scannet/scans

Then build adjacent-superpoint lookup files:

python preprocessing/get_neighbors.py

DINOv2 Features

Project DINOv2 features onto ScanNet superpoints. The 3D and 2D data are available from OpenScene:

wget https://cvg-data.inf.ethz.ch/openscene/data/scannet_processed/scannet_3d.zip
wget https://cvg-data.inf.ethz.ch/openscene/data/scannet_processed/scannet_2d.zip
python distillation/project_scannet_features.py --scannet-2d-root /path/to/scannet_2d_images --scannet-3d-root /path/to/scannet_3d_data \

Pseudo Masks

Generate NCut initial pseudo masks:

python preprocessing/generate_ncut_masks.py --data-root data/scannet/processed_aligns

The expected ScanNet layout is:

data/scannet/
|-- processed_aligns/
|-- processed_aligns200/
|-- superpoints/
|-- superpoint_neighbors/
|-- dinov2b14_spfeats/
`-- pseudo_mask/

3. CenterField

The shape reward module is trained from synthetic object-centric scenes. We use ABO and 3D-FUTURE object meshes, with required ScanNet crops for background clutter.

Prepare Synthetic Object Scenes

The filtered object lists are provided in preprocessing_obj/object_lists/. They come from TRELLIS-500K metadata and keep only objects with aesthetic_score >= 4.5. The 3D-Future needs to download manually before processing, while ABO can be automatically downloaded by the script. Noe the 3D-Future dataset downlaod links are not available now. Therefore we also release the processed data.

If you want to prepare by your own, please follow the instructions below:

# ABO: downloads abo-3dmodels.tar from the public ABO S3 bucket, then extracts the selected .glb files
python preprocessing_obj/prepare_objects.py --dataset ABO
python preprocessing_obj/prepare_objects.py --dataset 3D-Future --models_root /path/to/3D-FUTURE_raw

The expected files are 3D-FUTURE-model/<instance>/raw_model.obj.

Build synthetic scenes and partial point clouds:

python preprocessing_obj/create_scene_mesh.py
python preprocessing_obj/render_single_view.py
python preprocessing_obj/extract_obj_surf.py
python preprocessing_obj/combine_views.py
python preprocessing_obj/split_samples.py

The expected object data layout is:

data/objects/
|-- ABO/
|   |-- renders/
|   |-- syn_mesh/
|   |-- syn_depth/
|   |-- syn_scene_pc/
|   `-- syn_traindata/
|-- 3D-Future/
|   `-- ...
`-- Others_augobj/

Train CenterField

# Trellis sparse-structure backbone
python train_cf/train_cf.py --data_dir data/objects/ABO,data/objects/3D-Future --save_path ckpt/centerfield

The trained CenterField checkpoint, e.g. ckpt/centerfield/ckpt_step_300000.tar, is used by train_cutcost.py.

4. Segmentation Training on ScanNet

Train FoundObj with the PPO proposal policy and CutCost reward:

python train_cutcost.py --data-root data/scannet/processed_aligns --superpoint-dir data/scannet/superpoints --dino-dir data/scannet/dinov2b14_spfeats --superpoint-neighbor-dir data/scannet/superpoint_neighbors --pre-pseudo data/scannet/pseudo_mask --center-field-ckpt ckpt/centerfield/ckpt_step_300000.tar --save-path ckpt/foundobj

5. Evaluation

ScanNet200

CUDA_VISIBLE_DEVICES=0 python eval_scannet.py --ckpt /path/to/checkpoint.tar --data-root data/scannet/processed_aligns200 -superpoint-dir data/scannet/superpoints --save-path outputs/eval_scannet

S3DIS

Build S3DIS Superpoints

The S3DIS superpoint script requires two C++ extensions. Install dependencies and compile:

sudo apt-get install -y libeigen3-dev #or conda install -c conda-forge eigen if sudo is unavailable
pip install pybind11

cd preprocessing/partition
c++ -O3 -shared -std=c++11 -fPIC -fopenmp $(python3 -m pybind11 --includes) -I./include cutpursuit_pybind.cpp -o ../libcp$(python3-config --extension-suffix)
c++ -O3 -shared -std=c++11 -fPIC -fopenmp $(python3 -m pybind11 --includes) -I/usr/include/eigen3 ply_c_pybind.cpp -o ../libply_c$(python3-config --extension-suffix)
cd ../..

Then generate superpoints:

python preprocessing/prepare_s3dis_superpoints_fast.py --input-path data/s3dis/processed --sp-save-path data/s3dis/superpoints 

And evaluate the S3DIS segmentation results:

CUDA_VISIBLE_DEVICES=0 python eval_s3dis.py --ckpt /path/to/checkpoint.tar --data-dir data/s3dis/processed --sp-dir data/s3dis/superpoints --test-area Area_5 --save-path outputs/eval_s3dis

About

FoundObj in PyTorch (ICML 2026)

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors