Teaching UAVs to Race: End-to-End Regression of Agile Controls in Simulation
V. Casser, M. Mueller, N. Smith, D. Michels, B. Ghanem: “Teaching UAVs to Race: End-to-End Regression of Agile Controls in Simulation.” 2nd International Workshop on Computer Vision for UAVs, ECCV’18, 2018. Best paper award. (full text)
M. Mueller, G. Li, V. Casser, N. Smith, D. Michels, B. Ghanem: “Learning a Controller Fusion Network by Online Trajectory Filtering for Vision-based UAV Racing.” 3rd International Workshop on Computer Vision for UAVs, CVPR’19. (full text)
Abstract: Teaching UAVs to Race: End-to-End Regression of Agile Controls in Simulation.
Automating the navigation of unmanned aerial vehicles (UAVs) in diverse scenarios has gained much attention in recent years. However, teaching UAVs to fly in challenging environments remains an unsolved problem, mainly due to the lack of training data. In this paper, we train a deep neural network to predict UAV controls from raw image data for the task of autonomous UAV racing in a photo-realistic simulation. Training is done through imitation learning with data augmentation to allow for the correction of navigation mistakes. Extensive experiments demonstrate that our trained network (when sufficient data augmentation is used) outperforms state-of-the-art methods and flies more consistently than many human pilots. Additionally, we show that our optimized network architecture can run in real-time on embedded hardware, allowing for efficient onboard processing critical for real-world deployment. From a broader perspective, our results underline the importance of extensive data augmentation techniques to improve robustness in end-to-end learning setups.
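The idea of augmenting imitation data so that the network can correct navigation mistakes can be illustrated with a small sketch. This is not the procedure from the paper. It assumes one common scheme in which each expert sample is copied at several lateral offsets from the expert trajectory, and each copy is labeled with a steering value that pushes the vehicle back toward that trajectory. The names `Sample` and `augment`, the `gain` parameter, and the sign convention are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    # Hypothetical convention: positive offset = right of the expert trajectory.
    lateral_offset: float
    # Steering label in [-1, 1]; positive = steer right (assumed convention).
    steering: float


def augment(sample, offsets, gain=0.5):
    """Return the original sample plus shifted copies with corrective labels.

    Each copy is displaced by d and its steering label is reduced by
    gain * d, so a copy displaced to the right is told to steer left.
    The linear correction and the gain value are illustrative assumptions.
    """
    augmented = [sample]
    for d in offsets:
        corrected = sample.steering - gain * d
        corrected = max(-1.0, min(1.0, corrected))  # keep the label in range
        augmented.append(Sample(sample.lateral_offset + d, corrected))
    return augmented
```

Calling `augment(Sample(0.0, 0.0), [-1.0, 1.0])` yields three training samples: the expert sample and two displaced copies whose labels steer back toward the expert trajectory.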
Abstract: Learning a Controller Fusion Network by Online Trajectory Filtering for Vision-based UAV Racing.
Autonomous UAV racing has recently emerged as an interesting research problem. The dream is to beat humans in this new fast-paced sport. A common approach is to learn an end-to-end policy that directly predicts controls from raw images by imitating an expert. However, such a policy is limited by the expert it imitates and scaling to other environments and vehicle dynamics is difficult. One approach to overcome the drawbacks of an end-to-end policy is to train a network only on the perception task and handle control with a PID or MPC controller. However, a single controller must be extensively tuned and cannot usually cover the whole state space. In this paper, we propose learning an optimized controller using a DNN that fuses multiple controllers. The network learns a robust controller with online trajectory filtering, which suppresses noisy trajectories and imperfections of individual controllers. The result is a network that is able to learn a good fusion of filtered trajectories from different controllers leading to significant improvements in overall performance. We compare our trained network to controllers it has learned from, end-to-end baselines and human pilots in a realistic simulation; our network beats all baselines in extensive experiments and approaches the performance of a professional human pilot.
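For readers unfamiliar with the PID controllers mentioned above, the following is a minimal textbook sketch, not code from the paper. It shows the three gains (`kp`, `ki`, `kd`) that must be tuned per controller, which is the tuning burden the abstract refers to. The class name and the choice of a zero derivative on the first step are illustrative assumptions.

```python
class PID:
    """Textbook discrete PID controller (illustrative, not the paper's code)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None  # no previous error before the first step

    def step(self, error, dt):
        """Return the control output for the current error and time step dt."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # assumption: no derivative term on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

A single set of gains fixes one trade-off between responsiveness and stability, which is one reason a single controller rarely covers the whole state space.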