Conference Papers

Dissecting Deep RL with High Update Ratios: Combatting Value Divergence

We show that deep reinforcement learning can maintain its ability to learn without resetting network parameters in settings where the …

Marcel Hussing, Claas Voelcker, Igor Gilitschenski, Amir-Massoud Farahmand, Eric Eaton

When does Self-Prediction help? Understanding Auxiliary Tasks in Reinforcement Learning

We investigate the impact of auxiliary learning tasks such as observation reconstruction and latent self-prediction on the …

Claas Voelcker, Tyler Kastner, Igor Gilitschenski, Amir-Massoud Farahmand

Accelerating Online Mapping and Behavior Prediction via Direct BEV Feature Attention

Understanding road geometry is a critical component of the autonomous vehicle (AV) stack. While high-definition (HD) maps can readily …

Xunjiang Gu, Guanyu Song, Igor Gilitschenski, Marco Pavone, Boris Ivanovic

Watch Your Steps: Local Image and Scene Editing by Text Instruction

Denoising diffusion models have enabled high-quality image generation and editing. We present a method to localize the desired edit …

Ashkan Mirzaei, Tristan Aumentado-Armstrong, Marcus A. Brubaker, Jonathan Kelly, Alex Levinshtein, Konstantinos G. Derpanis, Igor Gilitschenski

LEOD: Label-Efficient Object Detection for Event Cameras

Object detection with event cameras benefits from the sensor’s low latency and high dynamic range. However, it is costly to fully label …

Ziyi Wu, Mathias Gehrig, Qing Lyu, Xudong Liu, Igor Gilitschenski

Producing and Leveraging Online Map Uncertainty in Trajectory Prediction

High-definition (HD) maps have played an integral role in the development of modern autonomous vehicle (AV) stacks, albeit with high …

Xunjiang Gu, Guanyu Song, Igor Gilitschenski, Marco Pavone, Boris Ivanovic

SPAD: Spatially Aware Multi-View Diffusers

We present SPAD, a novel approach for creating consistent multi-view images from text prompts or single images. To enable multi-view …

Yash Kant, Aliaksandr Siarohin, Ziyi Wu, Michael Vasilkovsky, Guocheng Qian, Jian Ren, Riza Alp Guler, Bernard Ghanem, Sergey Tulyakov, Igor Gilitschenski

Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers

Large-scale multi-task robotic manipulation systems often rely on text to specify the task. In this work, we explore whether a robot …

Vidhi Jain, Maria Attarian, Nikhil J Joshi, Ayzaan Wahid, Danny Driess, Quan Vuong, Pannag R Sanketi, Pierre Sermanet, Stefan Welker, Christine Chan, Igor Gilitschenski, Yonatan Bisk, Debidatta Dwibedi

AvatarOne: Monocular 3D Human Animation

Reconstructing realistic human avatars from monocular videos is a challenge that demands intricate modeling of 3D surface and …

Akash Karthikeyan, Robert Ren, Yash Kant, Igor Gilitschenski

iNVS: Repurposing Diffusion Inpainters for Novel View Synthesis

We present a method for generating consistent novel views from a single source image. Our approach focuses on maximizing the reuse of …

Yash Kant, Aliaksandr Siarohin, Michael Vasilkovsky, Riza Alp Guler, Jian Ren, Sergey Tulyakov, Igor Gilitschenski