
Dark3R: Low-Light 3D Reconstruction

Dark3R extends MASt3R (Matching And Stereo 3D Reconstruction) to work effectively with low-light and noisy images. The method uses LoRA (Low-Rank Adaptation) fine-tuning to adapt the pre-trained MASt3R model to low-light conditions while maintaining performance on clean images.


1. Installation Instructions

Dark3R uses a conda environment for dependency management. The project includes an environment.yml file that specifies all required dependencies.

Prerequisites

  • Conda (Miniconda or Anaconda)
  • CUDA-capable GPU (recommended for training and inference)
  • CUDA 12.1+ (for GPU support)

Installation Steps

  1. Clone the repository (if not already done):

    git clone <repository-url>
    cd Dark3R  
  2. Create the conda environment from the environment.yml file:

    conda env create -f environment.yml
  3. Activate the environment:

    conda activate dark3r
  4. Verify installation:

    python -c "import torch; print(f'PyTorch version: {torch.__version__}')"
    python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"

The environment includes all necessary dependencies, including PyTorch with CUDA support, computer vision libraries, 3D reconstruction tools, and evaluation utilities.

Downloading MASt3R Checkpoints

Follow the instructions in the original MASt3R repository to download the pre-trained MASt3R weights into checkpoints/.

Make sure to download both the main model and the retrieval model. Your checkpoints folder should have the following:

checkpoints/
├── MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth
├── MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_codebook.pkl
└── MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_trainingfree.pth

If those are not available, we have a backup of these weights in this folder.
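Before moving on, it can help to verify that all three files from the tree above landed in the right place. The sketch below is a minimal check (the `checkpoints` path is an assumption; adjust it if your checkpoints live elsewhere):

```python
from pathlib import Path

# The three files listed in the tree above.
REQUIRED = [
    "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth",
    "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_codebook.pkl",
    "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_trainingfree.pth",
]

def missing_checkpoints(ckpt_dir="checkpoints"):
    """Return the subset of REQUIRED files not present in ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in REQUIRED if not (root / name).exists()]

if __name__ == "__main__":
    missing = missing_checkpoints()
    print("All MASt3R checkpoints found." if not missing else f"Missing: {missing}")
```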


2. Downloading the Dataset

Dataset Structure

Dark3R datasets are organized with the following structure:

<DATASET_ROOT>/
├── bc040_chapel/                  # scene 
│   ├── iso102400_s1_00080/        # exposure 
│   │   ├── downsampled_004/       # resolution
│   │   ├── downsampled_016/     
│   │   ├── images_arw/          
│   ├── iso102400_s1_00125/
│   │   ├── downsampled_016/
│   │   └── images_arw/
│   └── ...
├── bc041_traincar/
│   └── ...
└── ...

The dataset is organized into three levels: scene, exposure, and resolution.

Folder Nomenclature

Each scene folder (starting with bc) contains multiple exposures (starting with iso), each containing different resolutions. The naming convention is iso<iso value>_s<1 or 0>_<exposure value>, where s1 indicates a fractional exposure of 1/<value> seconds and s0 indicates a whole-second exposure of <value> seconds. For example, iso102400_s1_00125 means ISO 102400 with an exposure time of 1/125 s, while iso102400_s0_00003 means ISO 102400 with an exposure time of 3 s.
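The naming convention above can be parsed mechanically, which is handy when iterating over exposure folders. A small sketch (the function name is ours, not part of the repo):

```python
import re

def parse_exposure_name(name):
    """Parse an exposure folder name like 'iso102400_s1_00125'.

    Returns (iso, exposure_seconds): s1 means a fractional exposure of
    1/<value> seconds, s0 means a whole-second exposure of <value> seconds.
    """
    m = re.fullmatch(r"iso(\d+)_s([01])_(\d+)", name)
    if m is None:
        raise ValueError(f"unrecognized exposure folder name: {name}")
    iso, fractional, value = int(m.group(1)), m.group(2) == "1", int(m.group(3))
    exposure = 1.0 / value if fractional else float(value)
    return iso, exposure

print(parse_exposure_name("iso102400_s1_00125"))  # (102400, 0.008)
print(parse_exposure_name("iso102400_s0_00003"))  # (102400, 3.0)
```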

Downloading Datasets

The dataset can be downloaded using the script download/download_dataset.py. There are pre-defined configuration files in download/config that control which scenes, exposures, and resolutions are downloaded.

Real Captures (Tripod)

If you do not need the original raw files, which are extremely large, we recommend using the config download/config/full_scenes_downsampled.yaml.

python download/download_dataset.py download/config/full_scenes_downsampled.yaml

Synthetic Captures (Handheld)

To download synthetic captures, we also provide a pre-defined config.

python download/download_dataset.py download/config/synthetic_downsampled.yaml

COLMAP Poses: We also provide COLMAP poses on the long exposure images to be used as reference for computing pose error. They can be found in the colmap_long_exposure folder inside where the dataset was unzipped.

We recommend using the script above to download the dataset; however, the full dataset is also available via this Dropbox link.

Additional Info

For more details on the dataset, please see download/dataset_info.md.

3. Training Instructions

Note: this section is only required if you are training Dark3R yourself. If you are using the pretrained weights, skip ahead to step 4.

Generating Pairs

To train Dark3R, the model requires pairs of images with visual overlap to learn dense matching and 3D consistency. Instead of computing these on the fly, we pre-calculate valid pairs using image retrieval, following the MASt3R-SfM protocol.

Run the following script to generate a JSON file containing these pairs:

python utils/generate_training_pairs.py --train_dataset_root data/dark3r_dataset

Output: This will create a pairs file (e.g., pairs_rf0050_ws0050.json) that you must pass to the --precomputed_pairs argument when running the training script.

Note: This script will only compute pairs for scenes in the training dataset. These are defined in dataset/dataset_config/finetuning_real_datasets_1023.yml.

Running the Training Script

To train the model, run train_dark3r.py. The script uses torch.distributed for multi-GPU support.

Below is the configuration we found to give the best results; it corresponds to the quantitative results in the paper.

python -m torch.distributed.run --nproc_per_node=1 train_dark3r.py \
    --batch_size 2 \
    --num_epochs 20 \
    --save_dir results/train_dark3r \
    --real_dataset_config dataset/dataset_config/finetuning_real_datasets_1023.yml \
    --real_dataset_root data/dark3r_dataset \
    --sim_dataset_root data/synthetic_dataset \
    --precomputed_pairs pairs_rf0050_ws0050.json 

Key Arguments

  • --nproc_per_node: Number of GPUs to use. The paper uses 8 A6000 GPUs.

  • --losses: Specifies the loss components. encoder and desc refer to the consistency losses on noisy inputs.

  • --real_dataset_root: Path to the real-world captured dataset (handheld/tripod) used for "captured data" supervision.

  • --sim_dataset_root: Path to the dataset where noise is synthesized using a Poisson-Gaussian model.

  • --precomputed_pairs: JSON file containing the pairs of images to be used for training (generated via retrieval).
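For intuition on the synthetic supervision mentioned under --sim_dataset_root, a Poisson-Gaussian model combines signal-dependent shot noise with additive read noise. The sketch below is illustrative only; the parameter names and default values are our assumptions, not the repo's actual noise settings:

```python
import numpy as np

def add_poisson_gaussian_noise(img, photons_per_unit=100.0, read_noise_std=0.01, rng=None):
    """Apply a simple Poisson-Gaussian noise model to a clean image.

    img: float array in [0, 1]. photons_per_unit controls shot-noise
    strength (fewer photons -> noisier); read_noise_std is the additive
    Gaussian term. Values here are illustrative, not Dark3R's settings.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Signal-dependent shot noise: sample photon counts, then renormalize.
    shot = rng.poisson(np.clip(img, 0.0, 1.0) * photons_per_unit) / photons_per_unit
    # Signal-independent sensor read noise.
    read = rng.normal(0.0, read_noise_std, size=img.shape)
    return (shot + read).astype(np.float32)

clean = np.full((4, 4), 0.5, dtype=np.float32)
noisy = add_poisson_gaussian_noise(clean, rng=np.random.default_rng(0))
```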

Training Outputs

Checkpoints and logs will be saved to the directory specified by --save_dir. The final model weights should be used for the inference steps in Section 5.


4. Downloading Pretrained Weights

Skip this step and continue to step 5 if you are not using the pretrained weights.

To download the pretrained weights, run the command:

python download/download_dataset.py download/config/pretrained_weights.yaml

5. Inference Instructions

Dark3R inference performs 3D reconstruction from a set of low-light images. The canonical format for running inference is:

python forward_dark3r.py \
    --filelist "data/dark3r_dataset/bc040_chapel/iso102400_s1_02000/downsampled_016/images_npy_undistorted/*.npy" \
    --outdir results/test_dark3r \
    --finetuned_model <FINETUNED_LORA_PATH> \
    --winsize 15 --refid 10

Key Arguments:

  • --filelist: Glob pattern or list of image file paths (typically .npy files)
  • --outdir: Output directory for reconstruction results
  • --finetuned_model: Path to trained LoRA weights file (typically ends with _lora.pth)
  • --winsize: number of key images ($N_a$ in MASt3R-SfM paper)
  • --refid: number of nearest neighbors ($k$ in MASt3R-SfM paper)
  • --niter1, --niter2: Number of iterations for each alignment optimization stage

Additional Optional Arguments:

  • --images_start, --images_end: Range of images to process
  • --subsample: Subsample factor for processing every Nth image

Output Files: The reconstruction produces several output files in the specified output directory:

  • transforms.json: Camera poses in NeRF format
  • scene.pkl: Full scene reconstruction data, needed to render depth maps
  • Additional visualization and analysis files
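The transforms.json output follows the NeRF camera convention, so it can be read with a few lines of Python. This sketch assumes the common "frames" / "transform_matrix" / "file_path" keys of NeRF-style files; verify them against the file Dark3R actually writes:

```python
import json
import numpy as np

def load_poses(path):
    """Map file_path -> 4x4 camera-to-world matrix from a NeRF-style transforms.json."""
    with open(path) as f:
        meta = json.load(f)
    return {
        frame["file_path"]: np.array(frame["transform_matrix"], dtype=np.float64)
        for frame in meta["frames"]
    }

# Example: camera centers are the translation column of each pose matrix.
# centers = {name: T[:3, 3] for name, T in load_poses("results/test_dark3r/transforms.json").items()}
```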

Rendering Sparse Depth

After running inference, you can export and visualize sparse depth maps from the reconstructed scene. The process involves two steps: exporting depth maps from the scene.pkl file, and then visualizing them.

Step 1: Export Depth Maps

Use the export_depth.py script to extract depth maps and confidence maps from the scene.pkl file:

python utils/export_depth.py \
    --pkl_path <OUTPUT_PATH>/scene.pkl 

Arguments:

  • --pkl_path / -p: Path to the scene.pkl file (typically in the inference output directory)
  • --out_dir / -o: Output directory for depth and confidence maps (default: same directory as pkl_path)
  • --subsample: Subsample factor for dense depth computation (default: 8)

Output: The script creates two directories in the output directory:

  • depth/: Contains depth maps as .npy files, named after the original image files
  • conf/: Contains confidence maps as .npy files, named after the original image files

Example:

python utils/export_depth.py \
    -p ./outputs/reconstruction/scene.pkl \
    -o ./outputs/reconstruction
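The exported depth/ and conf/ arrays pair up by filename, so a common pattern is to mask each depth map by its confidence before further processing. A minimal sketch (the 0.5 threshold is an illustrative choice, not a value prescribed by Dark3R):

```python
import numpy as np

def load_masked_depth(depth_path, conf_path, conf_threshold=0.5):
    """Load a depth map and keep only pixels whose confidence clears the threshold."""
    depth = np.load(depth_path)
    conf = np.load(conf_path)
    # Low-confidence pixels become NaN so downstream code can ignore them.
    return np.where(conf >= conf_threshold, depth, np.nan)
```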

Step 2: Visualize Depth Maps

Use the visualize_depth.py script to create visualizations of the exported depth maps:

python utils/visualize_depth.py \
    --input_dir <OUTPUT_PATH>/depth 

Please see the script for specific arguments.

Note: The visualization script requires FFmpeg to create videos. Use the --no_video flag to skip video creation.


6. Evaluation Instructions

Dark3R includes an evaluation script to compare predicted camera poses against ground truth transforms. The evaluation script takes two transform JSON files and computes pose estimation accuracy metrics.

The reference camera poses for each scene can be found in the colmap_poses_only folder. This folder should be downloaded already with the script in the download dataset step above.

The script automatically filters and aligns the ground truth and predicted transforms to match frame indices before evaluation.

Usage:

python evaluate_dark3r.py <ground_truth_transforms.json> <predicted_transforms.json> [options]

Arguments:

  • ground_truth_transforms.json: Path to ground truth transforms file (NeRF format)
  • predicted_transforms.json: Path to predicted transforms file (NeRF format)

Output: The evaluation results are saved to:

  • eval_pose.txt: Human-readable text file with detailed metrics
  • eval_pose.json: JSON file with structured evaluation results

Both files are saved in the same directory as the predicted transforms file.

Evaluation Metrics: The script computes the following metrics:

  • APE (Absolute Pose Error): Translation and rotation errors
  • RPE (Relative Pose Error): Frame-to-frame pose errors
  • RRA (Relative Rotation Accuracy): Percentage of frames with rotation error below a threshold
  • RTA (Relative Translation Accuracy): Percentage of frames with translation error below a threshold
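As a reference for the rotation metrics above, rotation error is typically the geodesic angle between ground-truth and predicted rotations, and RRA is the fraction of frames below a threshold. This is a sketch of the standard formulation, not necessarily the exact implementation in evaluate_dark3r.py; the 5-degree threshold is illustrative:

```python
import numpy as np

def rotation_error_deg(R_gt, R_pred):
    """Geodesic angle (degrees) between two 3x3 rotation matrices."""
    cos = (np.trace(R_gt.T @ R_pred) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def rra(R_gts, R_preds, threshold_deg=5.0):
    """Fraction of frames whose rotation error is below threshold_deg."""
    errors = [rotation_error_deg(g, p) for g, p in zip(R_gts, R_preds)]
    return float(np.mean([e < threshold_deg for e in errors]))
```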

Depth Evaluation

Dark3R also includes a script to evaluate depth maps against ground truth. This script computes scale-aware depth metrics by first determining the scale factor between predicted and ground truth poses using Umeyama alignment, then rescales the predicted depths before computing metrics.

Usage:

python utils/evaluate_depth.py \
    --gt_depth_dir <GT_DEPTH_DIR> \
    --test_depth_dir <TEST_DEPTH_DIR> \
    [--gt_transforms <GT_TRANSFORMS_JSON>] \
    [--test_transforms <TEST_TRANSFORMS_JSON>]

Arguments:

  • --gt_depth_dir / -g: Directory containing ground truth depth .npy files (required)
  • --test_depth_dir / -t: Directory containing predicted/test depth .npy files (required)
  • --gt_transforms / -gt: Path to ground truth transforms.json (default: gt_depth_dir/../transforms.json)
  • --test_transforms / -tt: Path to test transforms.json (default: test_depth_dir/../transforms.json)
  • --max_pairs / -mp: Maximum number of pairs for pairwise scale computation (default: 200000)

Example:

python utils/evaluate_depth.py \
    -g ./data/ground_truth/depth \
    -t ./outputs/reconstruction/depth \
    -gt ./data/ground_truth/transforms.json \
    -tt ./outputs/reconstruction/transforms.json

Output: The evaluation results are saved to eval_depth.json in the parent directory of the test depth directory. The output includes:

  • Scale factors: Multiple scale estimation methods (Umeyama, path length, pairwise). Umeyama is used to rescale the depth maps and the other methods are used to sanity check the Umeyama alignment.
  • Depth metrics: Absolute Relative Error (AbsRel) and δ<1.25 accuracy
  • Per-image metrics: Individual metrics for each depth map
  • Summary statistics: Mean and median metrics across all images

Note: The script automatically handles shape mismatches by resizing predicted depths to match ground truth using bilinear interpolation.
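The two depth metrics reported above have standard definitions: AbsRel averages |pred - gt| / gt over valid pixels, and δ<1.25 is the fraction of pixels where max(pred/gt, gt/pred) < 1.25. This sketch shows those definitions with a global scale factor applied (as from Umeyama alignment); it is a reference formulation, not necessarily utils/evaluate_depth.py verbatim:

```python
import numpy as np

def depth_metrics(gt, pred, scale=1.0):
    """AbsRel and delta<1.25 over valid (gt > 0) pixels.

    `scale` is a global factor (e.g. from Umeyama alignment) applied to
    the predicted depths before comparison.
    """
    mask = gt > 0
    g, p = gt[mask], pred[mask] * scale
    abs_rel = float(np.mean(np.abs(p - g) / g))
    ratio = np.maximum(p / g, g / p)
    delta = float(np.mean(ratio < 1.25))
    return abs_rel, delta
```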

Dark3R-NeRF

For instructions on how to setup and run Dark3R-NeRF, please see dark3rnerf/README.md.

Citation

@article{guo2026dark3r,
  title={Dark3R: Learning Structure from Motion in the Dark},
  author={Andrew Y Guo and Anagh Malik and SaiKiran Tedla and Yutong Dai and Yiqian Qin and Zach Salehe and Benjamin Attal and Sotiris Nousias and Kyros Kutulakos and David B. Lindell},
  year={2026},
  journal={arXiv},
  url={https://arxiv.org/abs/2603.05330},
}

About

[CVPR 2026] Official code release for Dark3R: Learning Structure from Motion in the Dark
