I’m a Staff Research Scientist at Waymo, the autonomous driving company formerly known as the Google Self-Driving Car Project. At Waymo, I develop new technologies for autonomous vehicles in areas such as reconstructive sensor simulation (NeRF/3DGS), generative sensor simulation, sensor fusion, multi-task learning and foundation models. I have deployed numerous safety-critical models to Waymo’s fully autonomous vehicle fleet, which has served millions of trips to customers across various markets. A subset of my research is published at CVPR, ICCV, CoRL, IROS and ICRA, and I hold numerous international patents in the autonomous driving domain. I organized the AV industry’s primary academic workshop at CVPR in 2022, 2023, 2024 and 2025.
I enjoy interdisciplinary work, and have broad experience in machine learning, deep learning and computer vision. Before joining Waymo, I pursued research in domains such as computational perception, aerial robotics and biomedical imaging. Some of my previous projects were related to the study of human memory (at MIT), machine learning applications in healthcare (with Massachusetts General Hospital), astronomy (with the Harvard-Smithsonian Center) and electron microscopy (with the Harvard Lichtman Lab).
News
02/26/2025 | New paper at CVPR’25: “SceneCrafter: Controllable Multi-View Driving Scene Editing”
01/01/2025 | I’m organizing the Workshop on Autonomous Driving at CVPR’25 in Nashville, TN
01/29/2024 | New paper at ICRA’24: “LET-3D-AP: Longitudinal Error Tolerant 3D Average Precision for Camera-Only 3D Detection”
01/01/2024 | I’m organizing the Workshop on Autonomous Driving at CVPR’24 in Seattle, WA
06/29/2023 | Recordings of the CVPR WAD 2023 workshop are available now.
01/01/2023 | I’m organizing the Workshop on Autonomous Driving at CVPR’23 in Vancouver, Canada
06/20/2022 | New paper at IROS’22: “Instance Segmentation with Cross-Modal Consistency”
06/20/2022 | Organized the Workshop on Autonomous Driving at CVPR’22
06/14/2022 | Our Block-NeRF dataset is now available.
03/01/2022 | New paper at CVPR’22: “Block-NeRF: Scalable Large Scene Neural View Synthesis” (oral presentation)
01/16/2022 | New preprint: “GradTail: Learning Long-Tailed Data Using Gradient-based Sample Weighting”
07/22/2021 | New paper at ICCV’21: “4D-Net for Learned Multi-Modal Alignment”
03/01/2021 | New paper at CVPR’21: “Taskology: Utilizing Task Relations at Scale” (oral presentation)
10/14/2020 | New paper at CoRL’20: “Unsupervised Monocular Depth Learning in Dynamic Scenes”
07/02/2020 | New paper at ECCV’20: “Multimodal Memorability: Modeling Effects of Semantics and Decay on Video Memorability”
07/01/2020 | New paper at UIST’20: “Predicting Visual Importance Across Graphic Design Types”
04/10/2020 | New paper at Medical Imaging with Deep Learning (MIDL’20): “Fast Mitochondria Segmentation For Connectomics”
02/10/2020 | I am co-organizing the 4D-VISION workshop at ECCV’20
01/22/2020 | I co-organized the ComputeFest Transfer Learning workshop at Harvard
10/02/2019 | New paper at SVRHM, NeurIPS’19: “To Decay or not to Decay: Modeling Video Memorability Over Time”
08/19/2019 | I’m joining Waymo as a Research Scientist
05/30/2019 | Graduated from Harvard University with a Master’s degree in Computational Science and Engineering
04/30/2019 | New paper at Robotics: Science and Systems (RSS’19): “OIL: Observational Imitation Learning”
04/16/2019 | New paper at VOCVALC, CVPR’19: “Unsupervised Monocular Depth and Ego-motion Learning with Structure and Semantics”
04/06/2019 | New paper at UAVISION, CVPR’19: “Learning a Controller Fusion Network by Online Trajectory Filtering”
01/23/2019 | Gave a workshop on “Convolutional Autoencoders for Image Manipulation” at ComputeFest 2019
11/28/2018 | New project released: OIL: Observational Imitation Learning
11/27/2018 | New blog post on our struct2depth work on Google’s AI blog
11/19/2018 | The code for our struct2depth paper is now part of the TensorFlow models repository
11/01/2018 | New paper at AAAI’19: “Depth Prediction Without The Sensors: Leveraging Structure For Unsupervised Learning From Monocular Videos”
10/06/2018 | Joined the MIT Computational Perception & Cognition Lab, led by Aude Oliva
09/08/2018 | We won the best paper award at UAVISION 2018
08/03/2018 | We are presenting our work on autonomous drone racing on Sept 8 at UAVISION, ECCV’18
05/29/2018 | Started internship in the Google Brain Robotics group
05/22/2018 | Our new datasets for connectomics research are now publicly available: Kasthuri++ and Lucchi++
05/21/2018 | Release of new tutorial for Bayesian GAN
02/08/2018 | Started new project on Connectomics with the Visual Computing Group (VCG)
01/22/2018 | Started new collaboration with the Center for Clinical Data Science (CCDS)
11/23/2017 | New paper in IJCV: “Sim4CV: A Photo-Realistic Simulator for Computer Vision Applications” (full text)
10/21/2017 | Official release of Sim4CV, our simulation environment for Computer Vision
09/01/2017 | Started Master’s program in Computational Science and Engineering
08/19/2017 | New paper at UAVision, ECCV’18: “Teaching UAVs to Race: End-to-End Regression of Agile Controls in Simulation”
05/24/2017 | Recipient of German Academic Scholarship Foundation US-Scholarship (Studienstiftung)
03/28/2017 | Recipient of DAAD Graduate Scholarship
Publications
SceneCrafter: Controllable Multi-View Driving Scene Editing
Zehao Zhu, Yuliang Zou, Chiyu “Max” Jiang, Bo Sun, Vincent Casser, Xiukun Huang, Jiahao Wang, Zhenpei Yang, Ruiqi Gao, Leonidas Guibas, Mingxing Tan, Dragomir Anguelov
Conference on Computer Vision and Pattern Recognition (CVPR’25).
LET-3D-AP: Longitudinal Error Tolerant 3D Average Precision for Camera-Only 3D Detection
Wayne Hung, Vincent Casser, Henrik Kretzschmar, Jyh-Jing Hwang, Dragomir Anguelov
IEEE International Conference on Robotics and Automation (ICRA’24).
Block-NeRF: Scalable Neural Rendering
Matthew Tancik, Vincent Casser, Xinchen Yan, Sabeek Pradhan, Ben Mildenhall, Pratul P. Srinivasan, Jon T. Barron, Henrik Kretzschmar
Conference on Computer Vision and Pattern Recognition (CVPR’22). Oral presentation.
Instance Segmentation with Cross-Modal Consistency
Alex Zhu, Vincent Casser, Reza Mahjourian, Henrik Kretzschmar and Soeren Pirk
International Conference on Intelligent Robots and Systems (IROS’22)
4D-Net for Learned Multi-Modal Alignment
AJ Piergiovanni, Vincent Casser, Michael Ryoo and Anelia Angelova
International Conference on Computer Vision (ICCV’21)
Taskology: Utilizing Task Relations at Scale
Yao Lu, Soeren Pirk, Jan Dlabal, Anthony Brohan, Ankita Pasad, Zhao Chen, Vincent Casser, Anelia Angelova and Ariel Gordon
Conference on Computer Vision and Pattern Recognition (CVPR’21). Oral presentation.
Unsupervised Monocular Depth Learning in Dynamic Scenes
Hanhan Li, Ariel Gordon, Hang Zhao, Vincent Casser, Anelia Angelova
Conference on Robot Learning (CoRL’20)
Multimodal Memorability: Modeling Effects of Semantics and Decay on Video Memorability
Camilo Fosco, Anelise Newman, Vincent Casser, Allen Lee, Barry McNamara and Aude Oliva
European Conference on Computer Vision (ECCV’20)
Predicting Visual Importance Across Graphic Design Types
Camilo Fosco, Vincent Casser, Amish K. Bedi, Peter O’Donovan, Aaron Hertzmann and Zoya Bylinskii
ACM User Interface Software and Technology Symposium (UIST’20)
Fast Mitochondria Segmentation for Connectomics
Vincent Casser, Kai Kang, Hanspeter Pfister and Daniel Haehn
Medical Imaging with Deep Learning (MIDL’20)
Depth Prediction Without the Sensors
Vincent Casser, Soeren Pirk, Reza Mahjourian, Anelia Angelova
Thirty-Third AAAI Conference on Artificial Intelligence (AAAI’19)
OIL: Observational Imitation Learning
Guohao Li and Matthias Mueller, Vincent Casser, Neil Smith, Dominik Michels, Bernard Ghanem
Robotics: Science and Systems (RSS’19)
Sim4CV: A Photo-Realistic Simulator for Computer Vision
Matthias Mueller, Vincent Casser, Jean Lahoud, Neil Smith, Bernard Ghanem
International Journal of Computer Vision (IJCV)