In this paper, we present a novel stereo matching network aimed at real-time inference with high accuracy. Current deep architectures form a massive cost volume in order to leverage global context information. However, forming the cost volume is time-consuming and acts as a bottleneck of the network. For a speed gain, we form a smaller cost volume than previously used. However, the down-scaled cost volume degrades accuracy in areas of thin structures and homogeneous regions. To overcome this limitation, we use a focal loss that handles hard negative examples. Moreover, we ease the multi-modal distribution problem by using a top-k argmin operation when regressing disparity. We call our proposed network RTSNet; it runs at over 40 FPS on color stereo images on an NVIDIA Tesla P100. We evaluate RTSNet on the KITTI 2015 dataset, and experimental results show that it outperforms other networks with similar runtime.
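As a rough illustration of the two ideas named in the abstract, the sketch below shows (a) a focal loss that down-weights easy examples via a `(1 - p)^gamma` modulating factor, and (b) a top-k soft-argmin that regresses disparity from only the k lowest-cost candidates, suppressing secondary modes of a multi-modal cost distribution. This is a minimal numpy sketch under assumed conventions; the function names, the choice of k, and the softmax weighting are illustrative and not taken from the paper.

```python
import numpy as np

def focal_loss(p_t, gamma=2.0):
    """Focal loss for a single example, where p_t is the predicted
    probability of the true class. With gamma=0 this reduces to the
    standard cross-entropy -log(p_t); larger gamma down-weights
    well-classified (easy) examples. Illustrative, not the paper's code."""
    return -((1.0 - p_t) ** gamma) * np.log(p_t)

def topk_soft_argmin(cost, k=3):
    """Regress a sub-pixel disparity from a per-pixel cost vector using
    only the k lowest-cost disparity candidates (a sketch of top-k
    argmin regression; k=3 is an assumed value)."""
    cost = np.asarray(cost, dtype=np.float64)
    idx = np.argpartition(cost, k)[:k]   # indices of the k smallest costs
    w = np.exp(-cost[idx])
    w /= w.sum()                         # softmax over negated costs
    return float((w * idx).sum())        # expectation over top-k candidates

# Easy example (p_t=0.9) is strongly down-weighted vs. hard one (p_t=0.1)
print(focal_loss(0.9), focal_loss(0.1))

# Cost curve with a clear minimum at disparity index 5
print(topk_soft_argmin([9, 8, 7, 6, 2, 0.5, 2, 6, 8, 9], k=3))
```

Restricting the soft-argmin to the top-k candidates avoids the bias a full soft-argmin incurs when the cost distribution has a second mode far from the true minimum.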