Learning Dense and Continuous Optical Flow from an Event Camera

IEEE Transactions on Image Processing (TIP 2022)


Zhexiong Wan, Yuchao Dai#, Yuxin Mao

School of Electronics and Information, Northwestern Polytechnical University, Xi'an, Shaanxi, 710129, China.   
# corresponding author
wanzhexiong@mail.nwpu.edu.cn, daiyuchao@nwpu.edu.cn, maoyuxin@mail.nwpu.edu.cn

Abstract


Event cameras such as DAVIS can simultaneously output high temporal resolution events and low frame-rate intensity images, which have great potential for capturing scene motion, e.g., for optical flow estimation. Most existing optical flow estimation methods are based on two consecutive image frames and can only estimate \emph{discrete flow} at a fixed time interval. Previous work has shown that \emph{continuous flow} can be estimated by varying the quantity or time interval of the events, but such methods struggle to estimate reliable \emph{dense flow}, especially in regions without any triggered events. In this paper, we propose a novel deep learning-based dense and continuous optical flow estimation framework that takes a single image together with event streams, which facilitates accurate perception of high-speed motion. Specifically, we first propose an event-image fusion and correlation module to effectively exploit the internal motion from the two different modalities of data. We then propose an iterative update network structure with bidirectional training for optical flow prediction. As a result, our model can estimate reliable dense flow like two-frame-based methods, as well as temporally continuous flow like event-based methods. Extensive experimental results on both synthetic and real captured datasets demonstrate that our model outperforms existing event-based state-of-the-art methods and our designed baselines for accurate dense and continuous optical flow estimation.
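Before a network can consume the asynchronous event stream, the events must be converted into a tensor representation. The sketch below shows one common choice, a bilinearly weighted voxel grid; the bin count, weighting scheme, and the name events_to_voxel_grid are illustrative assumptions here, not necessarily the representation used in our released code.

    import numpy as np

    def events_to_voxel_grid(x, y, t, p, num_bins, height, width):
        """Accumulate events into a (num_bins, H, W) grid.
        x, y: integer pixel coordinates; t: sorted timestamps; p: polarity."""
        voxel = np.zeros((num_bins, height, width), dtype=np.float32)
        # Normalize timestamps to the range [0, num_bins - 1].
        t = (t - t[0]) / max(t[-1] - t[0], 1e-9) * (num_bins - 1)
        left = np.floor(t).astype(int)
        frac = t - left
        pol = np.where(p > 0, 1.0, -1.0).astype(np.float32)
        x, y = x.astype(int), y.astype(int)
        # Split each event's polarity bilinearly between adjacent time bins.
        np.add.at(voxel, (left, y, x), pol * (1.0 - frac))
        np.add.at(voxel, (np.minimum(left + 1, num_bins - 1), y, x), pol * frac)
        return voxel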


DCEIFlow structure


Architecture

We use the feature extractor (left) to obtain event and image features, and compute the matching correlation with our proposed event-image fusion and correlation construction module (middle). We then feed them into the iterative flow updater (right) to refine the flow iteratively. After the last iteration, an up-sampling operation produces the full-resolution output. The structures enclosed by the orange box are updated at every iteration.
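The PyTorch sketch below mirrors this pipeline at a high level. The encoders, fusion module, and updater are placeholders standing in for the blocks in the figure (their names and signatures are assumptions, not the released implementation), and a RAFT-style all-pairs correlation with 8x up-sampling is assumed from the description above.

    import torch
    import torch.nn.functional as F

    def dceiflow_forward(image1, event_voxel, image_encoder, event_encoder,
                         fusion, updater, iters=12):
        # Feature extractor (left): features from the first image and events.
        fmap1 = image_encoder(image1)                      # (B, C, H/8, W/8)
        # Fusion and correlation (middle): fuse event features with image
        # features as pseudo second-frame features, then build an all-pairs
        # matching correlation volume.
        fmap2 = fusion(fmap1, event_encoder(event_voxel))  # (B, C, H/8, W/8)
        B, C, H, W = fmap1.shape
        corr = torch.einsum('bchw,bcuv->bhwuv', fmap1, fmap2) / C ** 0.5
        # Iterative flow updater (right): the hidden state and flow inside
        # this loop correspond to the orange box and are updated every step.
        flow = torch.zeros(B, 2, H, W, device=image1.device)
        hidden = torch.tanh(fmap1)  # recurrent state init (assumption)
        for _ in range(iters):
            hidden, delta = updater(hidden, corr, flow)
            flow = flow + delta
        # Up-sample the final 1/8-resolution flow to full resolution.
        return 8 * F.interpolate(flow, scale_factor=8,
                                 mode='bilinear', align_corners=False)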


Continuous Flow



Visual comparisons of continuous flow prediction with different time intervals. dt denotes the fraction of the frame interval (e.g., dt=1.0 represents the full time interval between two adjacent frames). Two-frame (I1+I2) approaches can only estimate dense optical flow between frames. Event-only (E) approaches can estimate continuous optical flow, but cannot predict accurate dense flow. Our model can estimate both dense and continuous optical flow by fusing events with the first image (I1+E).
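In practice, querying flow at an intermediate time only requires truncating the event stream before building the event tensor. This usage sketch reuses the hypothetical events_to_voxel_grid above and assumes a model callable taking (image, event tensor); the dt semantics follow the caption.

    import torch

    def flow_at(model, image1, events, t1, t2, dt, num_bins=5):
        # events = (x, y, t, p) arrays covering the interval [t1, t2].
        x, y, t, p = events
        keep = t <= t1 + dt * (t2 - t1)       # keep only events up to dt
        voxel = events_to_voxel_grid(x[keep], y[keep], t[keep], p[keep],
                                     num_bins,
                                     image1.shape[-2], image1.shape[-1])
        voxel = torch.from_numpy(voxel)[None].to(image1.device)
        return model(image1, voxel)  # dense flow from t1 to t1 + dt*(t2 - t1)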


Citation


If our work or code helps you, please cite our paper. If our code is useful for your new research, we hope you will also open-source your code, including the training part.


@article{wan2022DCEIFlow,
  author={Wan, Zhexiong and Dai, Yuchao and Mao, Yuxin},
  title={Learning Dense and Continuous Optical Flow from an Event Camera},
  journal={IEEE Transactions on Image Processing},
  doi={10.1109/TIP.2022.3220938},
  year={2022}
}

Acknowledgments


This research was sponsored by Zhejiang Lab.

Thanks to the associate editor and the reviewers for their comments, which were very helpful in improving our paper.

Thanks to the following helpful open-source projects: RAFT, event_utils, EV-FlowNet, mvsec_eval, esim_py, DVS-Voltmeter, Spike-FlowNet.