Inverting a Rolling Shutter Camera: Bring Rolling Shutter Images to High Framerate Global Shutter Video

ICCV 2021


Bin Fan, Yuchao Dai

School of Electronics and Information, Northwestern Polytechnical University, Xi'an, China   

Abstract


Rolling shutter (RS) images can be viewed as the row-wise combination of global shutter (GS) images captured by a virtual moving GS camera over the camera readout time. The RS effect creates serious difficulties for downstream applications. In this paper, we propose to invert this RS imaging mechanism, i.e., to recover a high framerate GS video from consecutive RS images and thereby achieve RS temporal super-resolution (RSSR). This extremely challenging problem, e.g., recovering 1440 GS images from two RS images of height 720, is far from being solved end-to-end. To address this challenge, we exploit the geometric constraint in the RS camera model, thus achieving geometry-aware inversion. Specifically, we make three contributions: (i) formulating the bidirectional RS undistortion flows under the constant velocity motion model, (ii) building the connection between the RS undistortion flow and optical flow via a scaling operation, and (iii) developing a mutual conversion scheme between the varying RS undistortion flows that correspond to different scanlines. Building upon these formulations, we propose the first RS temporal super-resolution network, with a cascaded structure, to extract high framerate global shutter video. Our method explores the underlying spatio-temporal geometric relationships within a deep learning framework, where no supervision besides the middle-scanline ground truth GS image is needed. Because the undistortion flows are propagated explicitly, our method can efficiently generate the GS image corresponding to any scanline. Experimental results on both synthetic and real data show that our method can produce high-quality GS image sequences with rich details, outperforming state-of-the-art methods.


Contribution


  • We identify the scanline-dependent nature of the bidirectional RS undistortion flows and provide a detailed proof of it, which is essential for understanding the intrinsic geometric properties of the RS correction problem.
  • From the theoretical perspective, this provides a sound motivation for the first learning-based RSSR solution, which extracts a latent GS video sequence from two consecutive RS images and thereby brings RS images alive.
  • Our approach not only outperforms the state-of-the-art methods in both RS effect removal and inference efficiency, but also produces a smooth and continuous video sequence, which is far beyond the reach of existing methods.

Problem Statement


An RS image is formed row by row, with each scanline taken from the GS image at that scanline's exposure time. Our rolling shutter temporal super-resolution (RSSR) pipeline reverses this process, i.e., it extracts the latent GS image sequence from two consecutive RS images.
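The forward process described above can be sketched in a few lines of NumPy. This is only an illustration, assuming one latent GS frame per scanline; the function name `synthesize_rs` and the toy data are hypothetical and are not taken from the paper or its code.

```python
import numpy as np


def synthesize_rs(gs_frames: np.ndarray) -> np.ndarray:
    """Combine GS frames row by row into one RS frame.

    gs_frames has shape (h, h, w) or (h, h, w, c); gs_frames[s] is assumed
    to be the GS image at the instant scanline s is read out.
    """
    h = gs_frames.shape[1]
    if gs_frames.shape[0] != h:
        raise ValueError("expected exactly one GS frame per scanline")
    rows = np.arange(h)
    # Row s of the RS image is copied from row s of GS frame s.
    return gs_frames[rows, rows]


# Toy example: GS frame s is filled with the constant value s,
# so each RS row reveals which GS frame it came from.
h, w = 4, 6
gs = np.stack([np.full((h, w), s, dtype=np.float32) for s in range(h)])
rs = synthesize_rs(gs)
```

RSSR addresses the inverse direction: given `rs` for two consecutive frames, recover the stack `gs`.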


Overview Video



Network Architecture


Given two consecutive input RS frames, we first estimate the bidirectional optical flows. Then, we use a UNet architecture to estimate the correlation maps. Next, the middle-scanline RS undistortion flows can be calculated explicitly, while being certifiable. Finally, we adopt softmax splatting to generate the target middle-scanline GS frames. Note that during training, our main network is designed to predict only the latent GS images corresponding to the middle scanline. At test time, the RS undistortion flows for any scanline $s\in[0,h-1]$ can be propagated explicitly (see dashed arrow), followed by the recovery of the GS image corresponding to scanline $s$.
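The scaling and propagation steps can be illustrated with a closed-form sketch. It assumes constant image-plane velocity and a readout time ratio `gamma` (readout time divided by the frame interval); under these assumptions a pixel in row κ moves by the optical flow scaled by γ(s − κ)/(h + γ·F_v), where F_v is the vertical flow component. This derivation, the symbol `gamma`, and the function names are assumptions made for illustration: in the pipeline above the correlation maps come from a UNet, and the paper's exact formulation may differ.

```python
import numpy as np


def undistortion_flow(flow, s, gamma):
    """Scale optical flow into an undistortion flow toward scanline s.

    flow: (h, w, 2) array of [x, y] optical flow from RS frame 1 to RS frame 2.
    gamma: assumed readout time ratio (readout time / frame interval).
    """
    flow = np.asarray(flow, dtype=np.float64)
    h = flow.shape[0]
    kappa = np.arange(h, dtype=np.float64).reshape(h, 1)  # row index per pixel
    # Elapsed time between the two observations of a point, in units of 1/h
    # of the frame interval (assumed nonzero for realistic motion).
    denom = h + gamma * flow[..., 1]
    scale = gamma * (s - kappa) / denom  # per-pixel scaling map, shape (h, w)
    return scale[..., None] * flow


def propagate(u_m, flow, m, s, gamma):
    """Convert an undistortion flow for scanline m into one for scanline s."""
    flow = np.asarray(flow, dtype=np.float64)
    h = flow.shape[0]
    denom = h + gamma * flow[..., 1]
    # Only the time offset between scanlines m and s has to be added.
    return u_m + (gamma * (s - m) / denom)[..., None] * flow


# Toy example: uniform motion, middle scanline of a 5-row image.
flow = np.zeros((5, 3, 2))
flow[..., 0] = 2.0  # horizontal flow in pixels
flow[..., 1] = 0.5  # vertical flow in pixels
u_mid = undistortion_flow(flow, s=2, gamma=0.8)
u_last = propagate(u_mid, flow, m=2, s=4, gamma=0.8)
```

In this sketch the undistortion flow vanishes on the target scanline and changes sign across it, and propagating the middle-scanline flow to another scanline gives the same result as computing that scanline's flow directly.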


Demo Video



RS Undistortion Flow vs. Optical Flow


Compared with the isotropically smooth regular optical flow map, the RS undistortion flow map exhibits a much stronger scanline dependence. On the one hand, the RS undistortion flows near the target scanline appear in lighter colors (i.e., smaller warping displacements). On the other hand, the RS undistortion flows of pixels in rows above and below the target scanline show different colors (i.e., different warping directions).


Qualitative Comparison


Visual comparisons on the Fastec-RS test set. We zoom in on the correction results in the regions marked by the blue boxes. While other methods produce various artifacts, our method produces the best results.


Citation


@inproceedings{fan_RSSR_ICCV21,
  title={Inverting a rolling shutter camera: bring rolling shutter images to high framerate global shutter video},
  author={Fan, Bin and Dai, Yuchao},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={4228--4237},
  year={2021}
}