LEOD: Label-Efficient Object Detection for Event Cameras

Abstract

Object detection with event cameras benefits from the sensor’s low latency and high dynamic range. However, it is costly to fully label event streams for supervised training due to their high temporal resolution. To reduce this cost, we present LEOD, the first method for label-efficient event-based detection. Our approach unifies weakly- and semi-supervised object detection with a self-training mechanism. We first utilize a detector pre-trained on limited labels to produce pseudo ground truth on unlabeled events. Then, the detector is re-trained with both real and generated labels. Leveraging the temporal consistency of events, we run bi-directional inference and apply tracking-based post-processing to enhance the quality of pseudo labels. To stabilize training against label noise, we further design a soft anchor assignment strategy. We introduce new experimental protocols to evaluate the task of label-efficient event-based detection on Gen1 and 1Mpx datasets. LEOD consistently outperforms supervised baselines across various labeling ratios. For example, on Gen1, it improves mAP by 8.6% and 7.8% for RVT-S trained with 1% and 2% labels. On 1Mpx, RVT-S with 10% labels even surpasses its fully-supervised counterpart using 100% labels. LEOD maintains its effectiveness even when all labeled data are available, reaching new state-of-the-art results. Finally, we show that our method readily scales to improve larger detectors as well.
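
To make the pseudo-labeling step in the abstract more concrete, the following is a minimal sketch of how detections from a forward pass and a time-reversed pass over the same frame might be merged and then filtered by confidence. It is not the LEOD implementation: the `Box` type, the function names, the union-with-deduplication merge rule, and both thresholds are assumptions made for illustration, and the tracking-based post-processing and soft anchor assignment are not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical detection: corner coordinates, confidence, class id."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float
    label: int


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union if union > 0 else 0.0


def merge_bidirectional(forward, backward, iou_thresh=0.5):
    """Union of forward and time-reversed detections for one frame.

    Assumed rule: two boxes of the same class with IoU >= iou_thresh are
    treated as duplicates, and the higher-scoring one is kept.
    """
    merged = list(forward)
    for b in backward:
        dup = next(
            (i for i, f in enumerate(merged)
             if f.label == b.label and iou(f, b) >= iou_thresh),
            None,
        )
        if dup is None:
            merged.append(b)
        elif b.score > merged[dup].score:
            merged[dup] = b
    return merged


def select_pseudo_labels(boxes, score_thresh=0.6):
    """Keep only detections confident enough to serve as pseudo labels."""
    return [b for b in boxes if b.score >= score_thresh]


# Toy usage with made-up detections.
fwd = [Box(0, 0, 10, 10, 0.9, 0), Box(20, 20, 30, 30, 0.3, 1)]
bwd = [Box(1, 1, 10, 10, 0.7, 0), Box(50, 50, 60, 60, 0.8, 1)]
pseudo = select_pseudo_labels(merge_bidirectional(fwd, bwd))
```

In a self-training loop of the kind the abstract describes, boxes like `pseudo` would be combined with the real annotations before the detector is re-trained.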

Publication
Conference on Computer Vision and Pattern Recognition (CVPR)

Toronto Intelligent Systems Lab Co-authors

Ziyi Wu
PhD Student

Hi! I am a PhD student working on computer vision. My research interests include representation learning, 3D vision, and event-based vision.

Igor Gilitschenski
Assistant Professor