# SIN

Structure Inference Net: Object Detection Using Scene-level Context and Instance-level Relationships. In CVPR 2018. (http://vipl.ict.ac.cn/uploadfile/upload/2018041318013480.pdf)

### Requirements: software

1. Requirements for Tensorflow 1.3.0 (see: [Tensorflow](https://www.tensorflow.org/))

2. Python packages you might not have: `cython`, `python-opencv`, `easydict`

### Installation (sufficient for the demo)

1. Clone the SIN repository

    ```Shell
    # Make sure to clone with --recursive
    git clone --recursive https://github.com/choasUp/SIN.git
    ```

2. Build the Cython modules

    ```Shell
    cd $SIN_ROOT/lib
    make
    ```

### Demo

*After successfully completing [basic installation](#installation-sufficient-for-the-demo)*, you'll be ready to run the demo.

Wait ...

### Training Model

1. Download the training, validation, and test data, and the VOCdevkit

    ```Shell
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
    ```

2. Extract all of these tars into one directory named `VOCdevkit`

    ```Shell
    tar xvf VOCtrainval_06-Nov-2007.tar
    tar xvf VOCtest_06-Nov-2007.tar
    tar xvf VOCdevkit_08-Jun-2007.tar
    ```

3. It should have this basic structure

    ```Shell
    $VOCdevkit/                           # development kit
    $VOCdevkit/VOCcode/                   # VOC utility code
    $VOCdevkit/VOC2007                    # image sets, annotations, etc.
    # ... and several other directories ...
    ```

4. Create a symlink for the PASCAL VOC dataset

    ```Shell
    cd $SIN_ROOT/data
    ln -s $VOCdevkit VOCdevkit
    ```

5. Download the pre-trained ImageNet model [[Google Drive]](https://drive.google.com/open?id=0ByuDEGFYmWsbNVF5eExySUtMZmM) [[Dropbox]](https://www.dropbox.com/s/po2kzdhdgl4ix55/VGG_imagenet.npy?dl=0) and move it into place

    ```Shell
    mv VGG_imagenet.npy $SIN_ROOT/data/pretrain_model/VGG_imagenet.npy
    ```

6. [optional] Set the learning rate and the maximum number of iterations

    ```Shell
    vim experiments/scripts/faster_rcnn_end2end.sh  # ITERS
    vim lib/fast/config.py                          # LR
    cd lib                                          # if you edit the code, it is best to rebuild
    make
    ```

7. Set your GPU id, then run the script to train and test the model

    ```Shell
    cd $SIN_ROOT
    export CUDA_VISIBLE_DEVICES=0
    ./train.sh
    ```

8. Test on your dataset

    ```Shell
    ./test_all.sh
    ```

### The result of testing on PASCAL VOC 2007 (VGG net)

```
AP for aeroplane = 0.7853
AP for bicycle = 0.8045
AP for bird = 0.7456
AP for boat = 0.6657
AP for bottle = 0.6144
AP for bus = 0.8424
AP for car = 0.8663
AP for cat = 0.8894
AP for chair = 0.5803
AP for cow = 0.8466
AP for diningtable = 0.7171
AP for dog = 0.8578
AP for horse = 0.8626
AP for motorbike = 0.7802
AP for person = 0.7857
AP for pottedplant = 0.4869
AP for sheep = 0.7599
AP for sofa = 0.7351
AP for train = 0.8199
AP for tvmonitor = 0.7683
Mean AP = 0.7607
```

### References

[Faster R-CNN caffe version](https://github.com/rbgirshick/py-faster-rcnn)

[Faster R-CNN tf version](https://github.com/smallcorgi/Faster-RCNN_TF)

### Citation

Yong Liu, Ruiping Wang, Shiguang Shan, and Xilin Chen. Structure Inference Net: Object Detection Using Scene-level Context and Instance-level Relationships. In CVPR 2018.
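
### Optional: checking the setup before training

The training steps above depend on the dataset symlink and the pre-trained model being in place under `$SIN_ROOT/data`. The following is a minimal sketch of a check you could run before `./train.sh`. The function name `check_sin_setup` is hypothetical and not part of the SIN repository; the three paths are taken from the training steps (the `VOCdevkit` symlink and `pretrain_model/VGG_imagenet.npy`), and the check assumes nothing else about the repository layout.

```Shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the SIN repository.
# Prints the status of each path the training steps rely on and
# returns 0 only if all of them exist under the given root.
check_sin_setup() {
    local root="$1"
    local missing=0
    local path
    for path in \
        "data/VOCdevkit/VOCcode" \
        "data/VOCdevkit/VOC2007" \
        "data/pretrain_model/VGG_imagenet.npy"; do
        # -e follows symlinks, so a data/VOCdevkit symlink is fine
        if [ -e "$root/$path" ]; then
            echo "ok       $path"
        else
            echo "MISSING  $path"
            missing=1
        fi
    done
    return "$missing"
}
```

One possible way to use it is `check_sin_setup "$SIN_ROOT" && ./train.sh`, so that training only starts when nothing is reported as missing.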