SUNet: Symmetric Undistortion Network for Rolling Shutter Correction

ICCV 2021


Bin Fan, Yuchao Dai, Mingyi He

School of Electronics and Information, Northwestern Polytechnical University, Xi'an, China   

Abstract


The vast majority of modern consumer-grade cameras employ a rolling shutter (RS) mechanism, leading to image distortions if the camera moves during image acquisition. In this paper, we present a novel deep network to solve the generic rolling shutter correction problem with two consecutive frames. Our pipeline is symmetrically designed to predict the global shutter (GS) image corresponding to the intermediate time of these two frames, which is difficult for existing methods because it corresponds to a camera pose that differs most from the two frames. First, two time-symmetric dense undistortion flows are estimated by using well-established principles: pyramidal construction, warping, and cost volume processing. Then, each rolling shutter image is warped towards the common global shutter image in the feature space. Finally, a symmetric consistency constraint is constructed in the image decoder to effectively aggregate the contextual cues of the two rolling shutter images, thereby recovering a high-quality global shutter image. Extensive experiments with both synthetic and real data from public benchmarks demonstrate the superiority of our proposed approach over state-of-the-art methods.
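
As a rough illustration of the idea above, the sketch below (our own simplification in PyTorch, not the released SUNet code; all shapes and the averaging fusion are assumptions) backward-warps each RS frame towards the common middle-time GS image with its own dense undistortion flow and then fuses the two results.

import torch
import torch.nn.functional as F

def backward_warp(feat, flow):
    # Warp feat (B,C,H,W) with a dense flow (B,2,H,W) via bilinear sampling.
    b, _, h, w = feat.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(feat.device)   # (2,H,W) pixel grid
    coords = base.unsqueeze(0) + flow                             # sampling positions
    grid_x = 2.0 * coords[:, 0] / (w - 1) - 1.0                   # normalize to [-1,1]
    grid_y = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid = torch.stack((grid_x, grid_y), dim=-1)                  # (B,H,W,2)
    return F.grid_sample(feat, grid, align_corners=True)

# Toy usage: two RS frames and two time-symmetric undistortion flows (zeros here),
# each warped to the middle-time GS frame; averaging stands in for the learned decoder.
rs0, rs1 = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
flow0, flow1 = torch.zeros(1, 2, 64, 64), torch.zeros(1, 2, 64, 64)
gs_mid = 0.5 * (backward_warp(rs0, flow0) + backward_warp(rs1, flow1))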


Contribution


  • We propose an efficient end-to-end symmetric RS undistortion network to solve the generic RS correction problem with two consecutive frames.
  • Our context-aware cost volume together with the symmetric consistency constraint can aggregate the contextual cues of two input RS images effectively (a minimal cost-volume sketch follows this list).
  • Extensive experiments show that our approach performs favorably against the state-of-the-art methods in both GS image restoration and inference efficiency.
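
The sketch below is our own minimal illustration of a local correlation cost volume, the kind of construction the cost-volume bullet above refers to; the search radius, feature shapes, and the plain dot-product correlation are assumptions rather than the released implementation.

import torch
import torch.nn.functional as F

def correlation_cost_volume(f1, f2, radius=4):
    # Correlate f1 (B,C,H,W) with copies of f2 shifted within a (2r+1)x(2r+1) window.
    b, c, h, w = f1.shape
    f2_pad = F.pad(f2, [radius, radius, radius, radius])
    costs = []
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            shifted = f2_pad[:, :, dy:dy + h, dx:dx + w]
            costs.append((f1 * shifted).mean(dim=1, keepdim=True))  # per-pixel similarity
    return torch.cat(costs, dim=1)                                  # (B, (2r+1)^2, H, W)

cost = correlation_cost_volume(torch.rand(1, 32, 48, 64), torch.rand(1, 32, 48, 64))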

Overview Video



Network Architecture


Architecture

Our pipeline mainly consists of two sub-networks: a PWC-based undistortion flow estimator and a time-centered GS image decoder. We only show the RS correction modules at the top two levels; for the remaining pyramid levels (excluding the first two layers), the RS correction modules have a structure similar to that of the second-to-top level. Note that only the second- to fifth-level pyramid features are warped, followed by a tailored correlation GS image decoder. Our network is designed symmetrically to aggregate two consecutive RS images in a coarse-to-fine manner, and the symmetric convolutional layers of the same color share the same weights.
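
To make the coarse-to-fine, weight-shared design more concrete, the following heavily simplified sketch (our own toy model in PyTorch, not the SUNet architecture; layer widths, the number of levels, and the refinement head are illustrative assumptions) runs one shared feature pyramid and one shared flow head over both RS frames, refining a 2x-upsampled flow level by level.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyFeaturePyramid(nn.Module):
    # Shared-weight feature extractor applied to both RS frames.
    def __init__(self, channels=(16, 32, 64)):
        super().__init__()
        layers, prev = [], 3
        for c in channels:
            layers.append(nn.Sequential(nn.Conv2d(prev, c, 3, stride=2, padding=1),
                                        nn.ReLU(inplace=True)))
            prev = c
        self.stages = nn.ModuleList(layers)

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats                      # fine -> coarse

class SymmetricFlowEstimator(nn.Module):
    # One shared head per level predicts an undistortion flow residual,
    # refining the upsampled flow from the coarser level.
    def __init__(self, channels=(16, 32, 64)):
        super().__init__()
        self.heads = nn.ModuleList(nn.Conv2d(c + 2, 2, 3, padding=1) for c in channels)

    def forward(self, feats):             # feats: list of (B,C,H,W), fine -> coarse
        flow = torch.zeros(feats[-1].size(0), 2, *feats[-1].shape[-2:],
                           device=feats[-1].device)
        for feat, head in zip(reversed(feats), list(self.heads)[::-1]):
            flow = F.interpolate(flow, size=feat.shape[-2:], mode="bilinear",
                                 align_corners=False)
            flow = flow + head(torch.cat([feat, flow], dim=1))
        return flow

# Weight sharing across the two symmetric branches: the same modules process
# both RS frames, mirroring the shared-color layers in the architecture figure.
pyramid, estimator = TinyFeaturePyramid(), SymmetricFlowEstimator()
rs0, rs1 = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
flow0, flow1 = estimator(pyramid(rs0)), estimator(pyramid(rs1))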


Demo Video



Qualitative Comparison



Qualitative results against baseline methods. Even rows: absolute difference between the corresponding image and the ground truth GS image. (a) The original second frame RS images. (b-e) GS images predicted by existing methods and our SUNet, respectively.
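
For reference, the even-row error maps can be reproduced with a per-pixel absolute difference, as in the small NumPy snippet below (array shapes and the [0, 1] value range are assumptions about how the images are stored).

import numpy as np

def error_map(pred, gt):
    # pred, gt: float images in [0, 1] with shape (H, W, 3); returns an (H, W) map.
    return np.abs(pred.astype(np.float32) - gt.astype(np.float32)).mean(axis=-1)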


3D Reconstruction



SfM results by COLMAP for a building. (a) Reconstructed 3D model from the original RS images. (b) Reconstructed 3D model from our corrected GS images. (c) Reconstructed 3D model from the ground truth GS images. This demonstrates that our pipeline removes the undesired RS distortion and produces a 3D model nearly as accurate as the one obtained from the ground truth images.
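
As a rough guide to reproducing such a comparison (not the exact commands used for the paper; folder names are placeholders), the same COLMAP sparse-reconstruction pipeline can be run on the original RS frames and on the frames corrected by SUNet:

import os
import subprocess

def run_sfm(image_dir, workspace):
    # Standard COLMAP sparse pipeline: feature extraction, matching, mapping.
    os.makedirs(f"{workspace}/sparse", exist_ok=True)
    db = f"{workspace}/database.db"
    subprocess.run(["colmap", "feature_extractor",
                    "--database_path", db, "--image_path", image_dir], check=True)
    subprocess.run(["colmap", "exhaustive_matcher", "--database_path", db], check=True)
    subprocess.run(["colmap", "mapper", "--database_path", db,
                    "--image_path", image_dir, "--output_path", f"{workspace}/sparse"],
                   check=True)

# run_sfm("frames_rs", "work_rs")          # original RS frames
# run_sfm("frames_corrected", "work_gs")   # frames corrected by SUNet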


Citation


@inproceedings{fan_SUNet_ICCV21,
  title={SUNet: Symmetric Undistortion Network for Rolling Shutter Correction},
  author={Fan, Bin and Dai, Yuchao and He, Mingyi},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={4541--4550},
  year={2021}
}