School of Electronics and Information, Northwestern Polytechnical University, Xi'an, China
Most modern consumer-grade cameras are equipped with an electronic rolling shutter (RS), which leads to image distortions when the camera moves during image acquisition. We present the first study of the structure and motion estimation problem for a dynamic generalized RS stereo camera. Such a general configuration is commonplace in robotics and autonomous driving applications. We propose a tractable RS stereo differential structure from motion (SfM) algorithm that takes into account the RS effect during consecutive imaging and effectively compensates for the RS stereo image distortion by a linear scaling operation on each optical flow vector. We further embed the cheirality constraint into RANSAC and develop a robust RS-stereo-aware full-motion estimation framework. We demonstrate that the RS stereo motion and depth map, refined by our non-linear optimization schemes under the maximum likelihood criterion, can be used for image correction to recover high-quality global shutter (GS) stereo images. Moreover, with the proposed generalized RS stereo differential SfM pipeline, the corrected images yield an accurate 3D scene structure that closely matches the ground truth. Extensive experiments on both synthetic and real RS stereo data demonstrate the effectiveness of our model and method in various configurations.
Illustration of the exposure, readout, idle, and delay mechanisms of the generalized RS stereo camera across two consecutive frames. The sensor is exposed and read out row by row at a constant speed. Assuming the camera exposure is instantaneous, the frame time $\tau_i$ of each RS camera $i \in \{l, r\}$ consists of the readout time $\tau^a_i$ and the idle time $\tau^b_i$. Moreover, there is a calibrated delay time $\tau^d$ between the exposure start times of the left and right cameras.
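The timing model in this caption can be made concrete with a short sketch. The Python below is only an illustration under stated assumptions, not the authors' implementation: the class and field names are hypothetical, rows are assumed to be spaced uniformly over the readout window (row r starts at fraction r / num_rows of it), and the delay $\tau^d$ is assumed to be applied to the right camera relative to the left.

```python
from dataclasses import dataclass


@dataclass
class RSCameraTiming:
    """Timing parameters of one rolling shutter camera (illustrative names)."""
    num_rows: int              # image height in rows
    readout_time: float        # tau^a_i: seconds needed to read out all rows
    idle_time: float           # tau^b_i: seconds between readout end and next frame
    start_delay: float = 0.0   # tau^d for the delayed camera, 0 for the reference one

    @property
    def frame_time(self) -> float:
        # tau_i = tau^a_i + tau^b_i
        return self.readout_time + self.idle_time

    def row_time(self, frame_index: int, row: int) -> float:
        """Capture time of `row` in frame `frame_index`, measured from the
        exposure start of the reference camera's first frame."""
        if not 0 <= row < self.num_rows:
            raise ValueError("row out of range")
        # Assumption: constant readout speed, so the offset is linear in the row index.
        row_offset = self.readout_time * row / self.num_rows
        return self.start_delay + frame_index * self.frame_time + row_offset


# Hypothetical example values, chosen only for illustration.
left = RSCameraTiming(num_rows=100, readout_time=0.02, idle_time=0.01)
right = RSCameraTiming(num_rows=100, readout_time=0.02, idle_time=0.01,
                       start_delay=0.005)

# Time gap between the same row of the same frame in the two cameras.
gap = right.row_time(1, 50) - left.row_time(1, 50)
```

With identical readout and idle times in both cameras, the gap between corresponding rows reduces to the delay alone; differing frame rates or resolutions make it vary with the frame and row index.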
From two consecutive general RS stereo images (a), we establish a generalized RS stereo model to robustly recover the RS stereo motion (b) and the 3D scene structure (c), and then achieve high-quality RS stereo correction (d), in which the tilted red poles in the foreground are straightened. Note that only an example with standard RS stereo images is shown here.
Quantitative evaluation of the generalized RS stereo configuration under various settings: (a) varying the image resolution of the right RS camera, (b) varying the FPS of the right RS camera, (c) varying the exposure delay time of the right RS camera, (d) randomly varying the image resolution, FOV, and readout time ratio of the right RS camera separately.
Qualitative results on synthetic data with various RS stereo configurations. In the first two columns, we show both left and right RS images to illustrate the generality of our RS stereo configuration. The last two columns show the residual images, i.e., the absolute difference between the original or corrected RS image and the ground-truth GS image. From top to bottom: (a) Standard RS stereo camera with the same orientation. (b) RS stereo camera with a vertical orientation. (c) RS stereo camera with opposite orientation. (d) The right camera has a higher frame rate of 60 Hz. (e) The right camera delays exposure by 1/60 second. (f) The right camera has a larger image resolution of 1200$\times$1200. Across these setups, our method removes the RS distortion and estimates the RS depth maps well, as shown by the darker residual images in the last column. Images have been scaled for visualization.
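The residual images described in this caption are per-pixel absolute differences against the ground-truth GS image. A minimal sketch of that computation follows, assuming grayscale images stored as lists of rows; the function names are hypothetical, and the source does not specify how its residuals were computed or scaled for display.

```python
def residual_image(image, gs_reference):
    """Per-pixel absolute difference between two equally sized grayscale
    images, each given as a list of rows of numeric intensities."""
    if len(image) != len(gs_reference):
        raise ValueError("images must have the same height")
    residual = []
    for row, ref_row in zip(image, gs_reference):
        if len(row) != len(ref_row):
            raise ValueError("images must have the same width")
        residual.append([abs(p - q) for p, q in zip(row, ref_row)])
    return residual


def mean_residual(residual):
    """Average residual over all pixels; lower means closer to the GS reference."""
    values = [v for row in residual for v in row]
    return sum(values) / len(values)


# Tiny made-up 2x2 example: the "corrected" image deviates less from the
# GS reference than the original RS image does.
gs = [[10, 20], [30, 40]]
rs_original = [[16, 20], [22, 40]]
rs_corrected = [[12, 20], [25, 40]]

res_original = residual_image(rs_original, gs)
res_corrected = residual_image(rs_corrected, gs)
```

A darker residual image corresponds to a smaller mean residual, which is how the last column of the figure should be read.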
Qualitative results on real data collected by a UAV. In practice, our method reconstructs an accurate 3D structure and removes the undesired RS distortion for the generalized RS stereo configuration.
We would like to thank DJI for the support of this work.
@article{fan2022differential,
title={Differential {SfM} and image correction for a rolling shutter stereo rig},
author={Fan, Bin and Dai, Yuchao and Zhang, Zhiyuan and Wang, Ke},
journal={Image and Vision Computing},
year={2022},
volume={124},
number={},
pages={104492},
publisher={Elsevier}
}