Adding support for Multiscan Dataset #2

Open · wants to merge 6 commits into base: `main`
38 changes: 37 additions & 1 deletion DATA.md
@@ -10,6 +10,7 @@ We list the available data used in the current version of CrossOver in the table
| ------------ | ----------------------------- | ----------------------------------- | -------------------------- | -------------------------- |
| ScanNet | `[point, rgb, cad, referral]` | `[point, rgb, floorplan, referral]` | ❌ | ✅ |
| 3RScan | `[point, rgb, referral]` | `[point, rgb, referral]` | ✅ | ✅ |
| MultiScan | `[point, rgb, referral]` | `[point, rgb, referral]` | ❌ | ✅ |


We detail data download and release instructions for preprocessing with scripts for ScanNet + 3RScan.
@@ -110,4 +111,39 @@ Scan3R/
| │ ├── objectsDataMultimodal.pt -> object data combined from data1D.pt + data2D.pt + data3D.pt (for easier loading)
| │ └── sel_cams_on_mesh.png (visualisation of the cameras selected for computing RGB features per scan)
| └── ...
```

### MultiScan

#### Running preprocessing scripts
Adjust the path parameters of `MultiScan` in the config files under `configs/preprocess`. Run the following (after changing the `--config-path` in the bash file):

```bash
$ bash scripts/preprocess/process_multiscan.sh
```

Our script for the MultiScan dataset performs the following additional processing:

- 3D-to-2D projection of instance segmentation, stored as `gt-projection-seg.pt` for each scan.
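As a quick illustration, each projection file can be read back with `torch.load`. The exact schema isn't spelled out in this PR, so the frame-id → per-pixel instance-id layout below is an assumption:

```python
import os
import tempfile

import torch

# Hypothetical layout for gt-projection-seg.pt: one per-pixel instance-id map
# per frame, keyed by frame id (the real schema may differ).
proj = {
    "0": torch.zeros(240, 320, dtype=torch.long),
    "10": torch.ones(240, 320, dtype=torch.long),
}

path = os.path.join(tempfile.gettempdir(), "gt-projection-seg.pt")
torch.save(proj, path)

loaded = torch.load(path)
frame_ids = sorted(loaded.keys(), key=int)
print(frame_ids)  # → ['0', '10']
```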

After preprocessing, the data structure should look like the following:

```
MultiScan/
├── objects_chunked/ (object data chunked into hdf5 format for instance baseline training)
| ├── train_objects.h5
| └── val_objects.h5
├── scans/
| ├── scene_00000_00/
| │ ├── gt-projection-seg.pt -> 3D-to-2D projected data consisting of framewise 2D instance segmentation
| │ ├── data1D.pt -> all 1D data + encoded (object referrals + BLIP features)
| │ ├── data2D.pt -> all 2D data + encoded (RGB + floorplan + DinoV2 features)
| │ ├── data2D_all_images.pt (RGB features of every image of every scan)
| │ ├── data3D.pt -> all 3D data + encoded (Point Cloud + I2PMAE features - object only)
| │ ├── object_id_to_label_id_map.pt -> Instance ID to NYU40 Label mapped
| │ ├── objectsDataMultimodal.pt -> object data combined from data1D.pt + data2D.pt + data3D.pt (for easier loading)
| │ └── sel_cams_on_mesh.png (visualisation of the cameras selected for computing RGB features per scan)
| └── ...
```
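The `objects_chunked` HDF5 files are what the instance-baseline dataloader consumes. A minimal sketch of writing and reading such a chunk — dataset names and shapes here are invented, not the repository's actual schema:

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.gettempdir(), "train_objects.h5")

# Write a toy chunk of 4 objects (names/shapes illustrative only).
with h5py.File(path, "w") as f:
    f.create_dataset("points", data=np.zeros((4, 1024, 3), dtype=np.float32))
    f.create_dataset("labels", data=np.array([3, 5, 5, 9], dtype=np.int64))

# Read it back the way a training loader might.
with h5py.File(path, "r") as f:
    pts = f["points"][:]
    labels = f["labels"][:]

print(pts.shape, labels.tolist())
```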


4 changes: 3 additions & 1 deletion README.md
@@ -117,6 +117,8 @@ See [DATA.MD](DATA.md) for detailed instructions on data download, preparation a
| ------------ | ----------------------------- | ----------------------------------- | -------------------------- | -------------------------- |
| Scannet | `[point, rgb, cad, referral]` | `[point, rgb, floorplan, referral]` | ❌ | ✅ |
| 3RScan | `[point, rgb, referral]` | `[point, rgb, referral]` | ✅ | ✅ |
| MultiScan | `[point, rgb, referral]` | `[point, rgb, referral]` | ❌ | ✅ |


> To run our demo, you only need to download generated embedding data; no need for any data preprocessing.

@@ -133,7 +135,7 @@ Various configurable parameters:
- `--database_path`: Path to the precomputed embeddings of the database scenes downloaded before (eg: `./release_data/embed_scannet.pt`).
- `--query_modality`: Modality of the query scene, Options: `point`, `rgb`, `floorplan`, `referral`
- `--database_modality`: Modality used for retrieval. Same options as above.
-- `--ckpt`: Path to the pre-trained scene crossover model checkpoint (details [here](#checkpoints)), example_path: `./checkpoints/scene_crossover_scannet+scan3r.pth/`).
+- `--ckpt`: Path to the pre-trained scene crossover model checkpoint (details [here](#checkpoints)), example_path: `./checkpoints/scene_crossover_scannet+scan3r.pth/`.

For embedding and pre-trained model download, refer to [generated embedding data](DATA.md#generated-embedding-data) and [checkpoints](#checkpoints) sections.

2 changes: 1 addition & 1 deletion TRAIN.md
@@ -21,7 +21,7 @@ $ bash scripts/train/train_instance_crossover.sh
```

#### Train Scene Retrieval Pipeline
-Adjust path/configuration parameters in `configs/train/train_scene_crossover.yaml`. You can also add your customised dataset or choose to train on Scannet & 3RScan or either. Run the following:
+Adjust path/configuration parameters in `configs/train/train_scene_crossover.yaml`. You can also add your customised dataset, or train on any combination of ScanNet, 3RScan and MultiScan. Run the following:

```bash
$ bash scripts/train/train_scene_crossover.sh
12 changes: 11 additions & 1 deletion configs/evaluation/eval_instance.yaml
@@ -43,13 +43,23 @@ data :
max_object_len : 150
voxel_size : 0.02

MultiScan:
base_dir : /media/sayan/Expansion/data/datasets/MultiScan
process_dir : ${data.process_dir}/MultiScan
processor3D : MultiScan3DProcessor
processor2D : MultiScan2DProcessor
processor1D : MultiScan1DProcessor
avail_modalities : ['point', 'cad', 'rgb', 'referral']
max_object_len : 150
voxel_size : 0.02

task:
name : InferenceObjectRetrieval
InferenceObjectRetrieval:
val : [Scannet]
modalities : ['rgb', 'point', 'cad', 'referral']
scene_modalities : ['rgb', 'point', 'referral', 'floorplan']
-ckpt_path : /drive/dumps/multimodal-spaces/runs/release_runs/instance_crossover_scannet+scan3r.pth
+ckpt_path : /drive/dumps/multimodal-spaces/runs/release_runs/instance_crossover_scannet+scan3r+multiscan.pth


inference_module: ObjectRetrieval
12 changes: 11 additions & 1 deletion configs/evaluation/eval_scene.yaml
@@ -43,13 +43,23 @@ data :
max_object_len : 150
voxel_size : 0.02

MultiScan:
base_dir : /media/sayan/Expansion/data/datasets/MultiScan
process_dir : ${data.process_dir}/MultiScan
processor3D : MultiScan3DProcessor
processor2D : MultiScan2DProcessor
processor1D : MultiScan1DProcessor
avail_modalities : ['point', 'cad', 'rgb', 'referral']
max_object_len : 150
voxel_size : 0.02

task:
name : InferenceSceneRetrieval
InferenceSceneRetrieval:
val : [Scannet]
modalities : ['rgb', 'point', 'cad', 'referral']
scene_modalities : ['rgb', 'point', 'referral', 'floorplan'] #, 'point']
-ckpt_path : /drive/dumps/multimodal-spaces/runs/release_runs/scene_crossover_scannet+scan3r.pth
+ckpt_path : /drive/dumps/multimodal-spaces/runs/release_runs/scene_crossover_scannet+scan3r+multiscan.pth

inference_module: SceneRetrieval
model:
8 changes: 8 additions & 0 deletions configs/preprocess/process_1d.yaml
@@ -25,6 +25,14 @@ data:
label_filename : labels.instances.align.annotated.v2.ply
skip_frames : 1

MultiScan:
base_dir : /media/sayan/Expansion/data/datasets/MultiScan
process_dir : ${data.process_dir}/MultiScan
processor3D : MultiScan3DProcessor
processor2D : MultiScan2DProcessor
processor1D : MultiScan1DProcessor
skip_frames : 1

Shapenet:
base_dir : /drive/datasets/Shapenet/ShapeNetCore.v2/

10 changes: 9 additions & 1 deletion configs/preprocess/process_2d.yaml
@@ -27,6 +27,14 @@ data:
label_filename : labels.instances.align.annotated.v2.ply
skip_frames : 1

MultiScan:
base_dir : /media/sayan/Expansion/data/datasets/MultiScan
process_dir : ${data.process_dir}/MultiScan
processor3D : MultiScan3DProcessor
processor2D : MultiScan2DProcessor
processor1D : MultiScan1DProcessor
skip_frames : 1

modality_info:
1D :
feature_extractor:
@@ -60,4 +68,4 @@ task:
name : Preprocess
Preprocess :
modality : '2D'
-splits : ['val']
+splits : ['train', 'val']
8 changes: 8 additions & 0 deletions configs/preprocess/process_3d.yaml
@@ -24,6 +24,14 @@ data:
processor1D : Scan3R1DProcessor
label_filename : labels.instances.align.annotated.v2.ply

MultiScan:
base_dir : /media/sayan/Expansion/data/datasets/MultiScan
process_dir : ${data.process_dir}/MultiScan
processor3D : MultiScan3DProcessor
processor2D : MultiScan2DProcessor
processor1D : MultiScan1DProcessor
skip_frames : 1

modality_info:
1D :
feature_extractor:
9 changes: 9 additions & 0 deletions configs/preprocess/process_multimodal.yaml
@@ -28,6 +28,15 @@ data:
skip_frames : 1
avail_modalities : ['point', 'rgb', 'referral']

MultiScan:
base_dir : /media/sayan/Expansion/data/datasets/MultiScan
process_dir : ${data.process_dir}/MultiScan/
chunked_dir : ${data.process_dir}/MultiScan/objects_chunked
processor3D : Scan3R3DProcessor
processor2D : Scan3R2DProcessor
processor1D : Scan3R1DProcessor
avail_modalities : ['point', 'rgb', 'referral']

modality_info:
1D :
feature_extractor:
11 changes: 11 additions & 0 deletions configs/train/train_instance_baseline.yaml
@@ -44,6 +44,17 @@ data :
max_object_len : 150
voxel_size : 0.02

MultiScan:
base_dir : /media/sayan/Expansion/data/datasets/MultiScan
process_dir : ${data.process_dir}/MultiScan/
chunked_dir : ${data.process_dir}/MultiScan/objects_chunked
processor3D : MultiScan3DProcessor
processor2D : MultiScan2DProcessor
processor1D : MultiScan1DProcessor
avail_modalities : ['point', 'rgb', 'referral']
max_object_len : 150
voxel_size : 0.02

task:
name : ObjectLevelGrounding
ObjectLevelGrounding :
15 changes: 13 additions & 2 deletions configs/train/train_instance_crossover.yaml
@@ -44,12 +44,23 @@ data :
max_object_len : 150
voxel_size : 0.02

MultiScan:
base_dir : /media/sayan/Expansion/data/datasets/MultiScan
process_dir : ${data.process_dir}/MultiScan/
chunked_dir : ${data.process_dir}/MultiScan/objects_chunked
processor3D : MultiScan3DProcessor
processor2D : MultiScan2DProcessor
processor1D : MultiScan1DProcessor
avail_modalities : ['point', 'cad', 'rgb', 'referral']
max_object_len : 150
voxel_size : 0.02

task:
name : SceneLevelGrounding
SceneLevelGrounding :
modalities : ['rgb', 'point', 'cad', 'referral']
-train : [Scannet, Scan3R]
-val : [Scannet, Scan3R]
+train : [Scannet, Scan3R, MultiScan]
+val : [Scannet, Scan3R, MultiScan]

trainer: GroundingTrainer

11 changes: 11 additions & 0 deletions configs/train/train_scene_crossover.yaml
@@ -44,6 +44,17 @@ data :
max_object_len : 150
voxel_size : 0.02

MultiScan:
base_dir : /media/sayan/Expansion/data/datasets/MultiScan
process_dir : ${data.process_dir}/MultiScan/
chunked_dir : ${data.process_dir}/MultiScan/objects_chunked
processor3D : MultiScan3DProcessor
processor2D : MultiScan2DProcessor
processor1D : MultiScan1DProcessor
avail_modalities : ['point', 'cad', 'rgb', 'referral']
max_object_len : 150
voxel_size : 0.02

task:
name : UnifiedTrain
UnifiedTrain :
3 changes: 2 additions & 1 deletion data/datasets/__init__.py
@@ -1,2 +1,3 @@
from .scannet import *
-from .scan3r import *
+from .scan3r import *
+from .multiscan import *
42 changes: 42 additions & 0 deletions data/datasets/multiscan.py
@@ -0,0 +1,42 @@
import os.path as osp
import numpy as np
from typing import List, Any
from omegaconf import DictConfig

from ..build import DATASET_REGISTRY
from .scanbase import ScanObjectBase, ScanBase

@DATASET_REGISTRY.register()
class MultiScanObject(ScanObjectBase):
    """MultiScan dataset class for instance level baseline"""
    def __init__(self, data_config: DictConfig, split: str) -> None:
        super().__init__(data_config, split)

@DATASET_REGISTRY.register()
class MultiScan(ScanBase):
    """MultiScan dataset class"""
    def __init__(self, data_config: DictConfig, split: str) -> None:
        super().__init__(data_config, split)

        filepath = osp.join(self.files_dir, '{}_scans.txt'.format(self.split))
        self.scan_ids = np.genfromtxt(filepath, dtype=str)

    def get_temporal_scan_pairs(self) -> List[List[Any]]:
        """Gets pairs of temporal scans from the dataset."""
        scene_pairs = []

        # Reference scans are the first capture of a scene, e.g. scene_00000_00.
        ref_scan_ids = [scan_id for scan_id in self.scan_ids if scan_id.endswith('00')]

        for ref_scan_id in ref_scan_ids:
            rescan_list = []

            # Rescans share the scene prefix (e.g. scene_00000) with the reference
            # scan; comparing only split('_')[0] would match every 'scene_*' id.
            ref_prefix = ref_scan_id.rsplit('_', 1)[0]
            for rescan_id in self.scan_ids:
                if rescan_id.rsplit('_', 1)[0] == ref_prefix and rescan_id != ref_scan_id:
                    rescan_list.append({'scan_id': rescan_id})
            if len(rescan_list) == 0:
                continue

            scene_pairs.append([ref_scan_id, rescan_list])
        return scene_pairs
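The pairing logic can be exercised without the dataset files: reference scans are the `_00` capture of a scene, and rescans share the `scene_XXXXX` prefix. A self-contained sketch with invented scan ids:

```python
def temporal_scan_pairs(scan_ids):
    """Group rescans under their reference (_00) scan, MultiScan-style."""
    pairs = []
    for ref in (s for s in scan_ids if s.endswith("_00")):
        prefix = ref.rsplit("_", 1)[0]  # e.g. 'scene_00000'
        rescans = [{"scan_id": s} for s in scan_ids
                   if s.rsplit("_", 1)[0] == prefix and s != ref]
        if rescans:
            pairs.append([ref, rescans])
    return pairs

scan_ids = ["scene_00000_00", "scene_00000_01", "scene_00001_00", "scene_00001_02"]
pairs = temporal_scan_pairs(scan_ids)
print(pairs)
```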
42 changes: 41 additions & 1 deletion prepare_data/README.md
@@ -5,6 +5,7 @@
This document provides instructions for pre-processing different datasets, including
- ScanNet
- 3RScan
- MultiScan

## Prerequisites

@@ -16,10 +17,12 @@ Before you begin, simply activate the `crossover` conda environment.

- **3RScan**: Download 3RScan dataset from the [official website](https://github.com/WaldJohannaU/3RScan).

- **MultiScan**: Download MultiScan dataset from the [official website](https://github.com/smartscenes/multiscan).

- **ShapeNet**: Download Shapenet dataset from the [official website](https://shapenet.org/) and unzip.

### Download Referral and CAD annotations
-We use [SceneVerse](https://scene-verse.github.io/) for instance referrals (ScanNet & 3RScan) and [Scan2CAD](https://github.com/skanti/Scan2CAD) for CAD annotations (ScanNet). Exact instructions for data setup below.
+We use [SceneVerse](https://scene-verse.github.io/) for instance referrals (ScanNet, 3RScan & MultiScan) and [Scan2CAD](https://github.com/skanti/Scan2CAD) for CAD annotations (ScanNet). Exact instructions for data setup below.

#### ScanNet
1. Run the following to extract ScanNet data
@@ -96,4 +99,41 @@ Scan3R/
├── test_scans.txt
└── sceneverse
└── ssg_ref_rel2_template.json
```

#### MultiScan
1. Download the MultiScan data into `MultiScan/scenes` and run the following to extract it:

```bash
cd MultiScan/scenes
unzip '*.zip'
rm -f *.zip
```
2. To generate the sequence of RGB images and corresponding camera poses from the `.mp4` file, run the following:
```bash
cd prepare_data/multiscan
python preprocess_2d_multiscan.py --base_dir PATH_TO_MULTISCAN --frame_interval {frame_interval}
```
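`--frame_interval` presumably keeps every N-th frame of the `.mp4`, and the kept ids are what would land in `frame_ids.txt`. The selection itself is just strided sampling (the actual script's behaviour is assumed here):

```python
def select_frame_ids(num_frames, frame_interval):
    """Keep every `frame_interval`-th frame id, starting at frame 0."""
    return list(range(0, num_frames, frame_interval))

ids = select_frame_ids(num_frames=100, frame_interval=10)
print(ids)  # → [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
```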
Once completed, the data structure should look like the following:
```
MultiScan/
├── scenes/
│ ├── scene_00000_00/
│ │ ├── sequence/ (folder containing rgb images at specified frame interval)
| | ├── frame_ids.txt
│ │ ├── scene_00000_00.annotations.json
│ │ ├── scene_00000_00.jsonl
│ │ ├── scene_00000_00.confidence.zlib
│ │ ├── scene_00000_00.mp4
│ │ ├── poses.jsonl
│ │ ├── scene_00000_00.ply
│ │ ├── scene_00000_00.align.json
│ │ ├── scene_00000_00.json
│ └── ...
└── files
├── scannetv2-labels.combined.tsv
├── train_scans.txt
├── test_scans.txt
└── sceneverse
└── ssg_ref_rel2_template.json
```