[DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras](https://arxiv.org/abs/2108.10869)
Zachary Teed and Jia Deng

```
@article{teed2021droid,
  title={{DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras}},
  author={Teed, Zachary and Deng, Jia},
  journal={arXiv preprint arXiv:2108.10869},
  year={2021}
}
```

**Initial Code Release:** This repo currently provides a single GPU implementation of our monocular SLAM system. It also contains demos, training, and evaluation scripts. Stereo, RGB-D, and multi-GPU code will be added on **September 7**.


## Requirements

To run the code you will need:
* **Inference:** Running the demos will require a GPU with at least 11G of memory.

* **Training:** Training requires a GPU with at least 24G of memory. We train on 4 x RTX-3090 GPUs.

## Getting Started
1. Clone the repo using the `--recursive` flag
```Bash
git clone --recursive https://github.com/princeton-vl/DROID-SLAM.git
```

2. Create a new anaconda environment using the provided .yaml file. Use `environment_novis.yaml` if you do not want to use the visualization.
```Bash
conda env create -f environment.yml
pip install evo --upgrade --no-binary evo
pip install gdown
```

3. Compile the extensions (takes about 10 minutes)
```Bash
python setup.py install
```


## Demos

1. Download the model from google drive: [droid.pth](https://drive.google.com/file/d/1PpqVt1H4maBa_GbPJp4NwxRsd9jk-elh/view?usp=sharing)

2. Download some sample videos using the provided script.
```Bash
./tools/download_sample_data.sh
```

Run the demo on any of the samples (all demos can be run on a GPU with 11G of memory). While running, press the "s" key to increase the filtering threshold (= more points) and "a" to decrease the filtering threshold (= fewer points).


```Bash
python demo.py --imagedir=data/abandonedfactory --calib=calib/tartan.txt --stride=2
```

```Bash
python demo.py --imagedir=data/sfm_bench/rgb --calib=calib/eth.txt
```

```Bash
python demo.py --imagedir=data/Barn --calib=calib/barn.txt --stride=1 --backend_nms=4
```

```Bash
python demo.py --imagedir=data/mav0/cam0/data --calib=calib/euroc.txt --t0=150
```

```Bash
python demo.py --imagedir=data/rgbd_dataset_freiburg3_cabinet/rgb --calib=calib/tum3.txt
```


**Running on your own data:** All you need is a calibration file. Calibration files are in the form
```
fx fy cx cy [k1 k2 p1 p2 [ k3 [ k4 k5 k6 ]]]
```
with parameters in brackets optional.
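For reference, reading such a file into a 3x3 intrinsics matrix is straightforward; a minimal sketch (the helper name is ours, not part of this repo):

```python
import numpy as np

def load_calib(path):
    """Read 'fx fy cx cy [k1 k2 p1 p2 [k3 [k4 k5 k6]]]' into K and dist."""
    vals = np.loadtxt(path).reshape(-1)
    fx, fy, cx, cy = vals[:4]
    K = np.array([[fx, 0.0, cx],
                  [0.0, fy, cy],
                  [0.0, 0.0, 1.0]])
    dist = vals[4:]  # empty if the optional distortion terms are absent
    return K, dist
```

Only the first four values are required; whatever distortion coefficients are present end up in `dist`.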

## Evaluation (Monocular)
We provide evaluation scripts for TartanAir, EuRoC, and TUM. EuRoC and TUM can be run on a 1080Ti. The TartanAir validation script will require 24G of memory.

### EuRoC
Download the [EuRoC](https://projects.asl.ethz.ch/datasets/doku.php?id=kmavvisualinertialdatasets) sequences (ASL format) and put them in `datasets/EuRoC`
```Bash
./tools/evaluate_euroc.sh
```

### TUM-RGBD
Download the fr1 sequences from [TUM-RGBD](https://vision.in.tum.de/data/datasets/rgbd-dataset/download) and put them in `datasets/TUM-RGBD`
```Bash
./tools/evaluate_tum.sh
```

### TartanAir
Download the [TartanAir](https://theairlab.org/tartanair-dataset/) dataset using the script `thirdparty/tartanair_tools/download_training.py` and put them in `datasets/TartanAir`
```Bash
./tools/validate_tartanair.sh
```

## Training

First download the TartanAir dataset. The download script can be found in `thirdparty/tartanair_tools/download_training.py`. You will only need the `rgb` and `depth` data.

```Bash
python download_training.py --rgb --depth
```

You can then run the training script. We use 4 x RTX-3090 GPUs for training, which takes approximately 1 week. If you use a different number of GPUs, adjust the learning rate accordingly.

**Note:** On the first training run, covisibility is computed between all pairs of frames. This can take several hours, but the results are cached so that future training runs will start immediately.
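The cache-on-first-use pattern described in the note can be sketched as follows (the `covis.npy` file name and the `compute_fn` hook are illustrative assumptions, not the repo's actual implementation):

```python
import os
import numpy as np

def cached_covisibility(datapath, compute_fn, cache_name="covis.npy"):
    """Compute the pairwise covisibility matrix once, then reuse it."""
    cache = os.path.join(datapath, cache_name)
    if os.path.exists(cache):
        return np.load(cache)   # later runs start immediately
    covis = compute_fn()        # expensive: considers all pairs of frames
    np.save(cache, covis)
    return covis
```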


```Bash
python train.py --datapath=<path to tartanair> --gpus=4 --lr=0.00025
```
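One common convention for adjusting the rate with GPU count is the linear scaling rule (an assumption on our part; the text above only says to adjust it):

```python
def scaled_lr(num_gpus, base_lr=0.00025, base_gpus=4):
    # Linear scaling rule (our assumption): keep the learning rate
    # proportional to the number of GPUs, relative to the 4-GPU baseline.
    return base_lr * num_gpus / base_gpus
```

For example, `scaled_lr(8)` would give 0.0005 under this rule.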


## Acknowledgements
Data from [TartanAir](https://theairlab.org/tartanair-dataset/) was used to train our model. We additionally use evaluation tools from [evo](https://github.com/MichaelGrupp/evo) and [tartanair_tools](https://github.com/castacks/tartanair_tools).