A high-quality driving dataset is a key ingredient for helping the autonomous vehicle industry thrive in Pittsburgh and for building a smart city for its residents. In this project, we aim to build the world's first scenario-based driving database dedicated to connected and autonomous vehicles. We plan to record and model dynamic traffic information in Pittsburgh from heterogeneous driving data such as lidar point clouds, camera images, and GPS. Dynamic unsupervised learning will then be applied to automatically extract typical driving scenarios. Our data collection platform is equipped with multiple advanced sensors, including lidar, high-resolution cameras, radar, GPS, and IMU units, and also records vehicle signals such as steering wheel angle and brake pedal position. The platform captures complex, informative real-world driving scenarios and represents them as high-dimensional, heterogeneous time series data. An unsupervised learning approach based on nonparametric Bayesian methods will then be applied to learn and recognize driving scenarios through segmentation. A user-friendly web application will be developed to provide the dataset to the public from a scenario perspective. In particular, we plan to work closely with the city's department of mobility and integrate DSRC and smart-city information (e.g., traffic lights, the power grid, events, and weather) into our analysis. Our confidence in success stems from the lab's accumulated experience in developing automated vehicle platforms and unsupervised machine learning theories, supported by Toyota, Uber, Ford, Mcity, and others.
Pittsburgh has become a city of smart mobility. Hundreds of autonomous vehicles (AVs) built by Uber, Argo AI, and Aurora drive through the city daily. The AV industry is a great opportunity to grow the city and create jobs, but it also raises public-safety concerns, as highlighted by recent crashes involving Uber, Waymo, and Tesla. High-quality driving datasets can be the keystone for autonomous vehicle researchers. Most existing traffic datasets contain only processed raw data from the ego vehicle's point of view and provide discretized driving data such as image and data sequences. This can obscure important interaction information in driving behavior and cause defects in autonomous vehicle development. To reduce these risks, a key ingredient is understanding the behavior of multiple agents in traffic scenarios with respect to their dynamic interactions. While the dynamic model of a single vehicle is well developed, modeling driving scenarios that involve a large number of vehicles remains unsolved. The varying number of vehicles, cyclists, and pedestrians leads to high heterogeneity in traffic data and hinders researchers. Moreover, while single- and two-vehicle behaviors can be described by empirical narratives such as 'left turn', 'car following', or 'overtaking', multi-agent traffic behavior defies such definitions, which prevents researchers from recognizing and labeling traffic scenarios that involve an extremely large number of agents. To identify dynamic patterns of vehicle interaction behaviors in large traffic datasets, an unsupervised learning approach can help recognize and classify the data without prior knowledge.

In this project, we propose an automatic way to extract dynamic driving scenarios and build a driving library that contains interaction-scenario information in Pittsburgh. Our data collection vehicle, which is equipped with several advanced sensors including high-resolution cameras, lidar, radar, and a GPS unit, will be used to collect the data. The raw data will be processed into multi-dimensional time series, and encountered vehicles, cyclists, and pedestrians will be detected by perception algorithms. To better understand driving interaction behaviors, an unsupervised learning approach based on nonparametric Bayesian methods will be applied to learn and recognize driving scenarios through segmentation. In our previous research, we defined traffic primitives as the fundamental driving scenarios, akin to words in a language; we applied Bayesian unsupervised learning based on the hierarchical Dirichlet process and successfully extracted traffic primitives from a massive dataset recorded in Ann Arbor. In addition, a web application will be built to provide the dataset to the public from a scenario perspective, allowing users to query driving scenario data from a dynamic interaction perspective.

In summary, the scenario-based dataset establishes a link between an individual autonomous vehicle and the city. The output of the project will facilitate self-driving development and testing, as well as other aspects including policy-making, insurance, business models, security, and privacy.
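The segmentation approach above is based on the hierarchical Dirichlet process. As a rough, non-authoritative illustration only, the sketch below substitutes scikit-learn's Dirichlet-process mixture (BayesianGaussianMixture) for the full HDP model, so it clusters individual frames without the temporal dynamics an HDP-HMM would capture, then merges contiguous frames sharing a label into candidate traffic primitives. The feature layout and all names are hypothetical.

```python
# Simplified stand-in for HDP-based traffic-primitive extraction (illustration only):
# cluster per-frame features with a Dirichlet-process mixture, then merge
# contiguous frames that share a label into candidate "primitive" segments.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture


def extract_primitive_segments(X, max_components=20, seed=0):
    """X: (n_frames, n_features) multi-dimensional driving time series."""
    dpgmm = BayesianGaussianMixture(
        n_components=max_components,  # weak-limit truncation of the DP
        weight_concentration_prior_type="dirichlet_process",
        covariance_type="full",
        max_iter=500,
        random_state=seed,
    )
    labels = dpgmm.fit_predict(X)

    # Group contiguous identical labels into (start_frame, end_frame, label) segments.
    segments, start = [], 0
    for t in range(1, len(labels) + 1):
        if t == len(labels) or labels[t] != labels[start]:
            segments.append((start, t, int(labels[start])))
            start = t
    return segments


if __name__ == "__main__":
    # Hypothetical frame features: e.g. speed, yaw rate, range and range rate to nearest agent.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 4))
    print(extract_primitive_segments(X)[:5])
```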
Tasks are described in the detailed plan section.
Task 1 (Data): 1/1/2019-8/31/2019
Task 2 (Theory): 1/1/2019-6/30/2019
Task 3 (Process): 3/1/2019-10/31/2019
Task 4 (Web): 7/1/2019-12/31/2019
The project will be divided into the following tasks:
Task 1. Integrate a data collection platform by installing multiple advanced sensors such as cameras, radar, lidar, and an IMU, and collect data around the city.
Task 2. Develop dynamic unsupervised learning based on nonparametric Bayesian learning.
Task 3. Design the data processing method to extract traffic scenarios from the data based on the unsupervised learning method (a minimal data-alignment sketch follows this list).
Task 4. Develop and deploy the web application for the public database.
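Tasks 1 and 3 imply fusing asynchronous sensor streams into the multi-dimensional time series described above. The sketch below is a minimal illustration, assuming pandas-style data frames that share a timestamp column; the stream names, signals, sampling rates, and alignment tolerance are placeholders rather than the project's actual pipeline.

```python
# Hypothetical sketch: align asynchronous sensor streams (GPS/IMU, lidar-derived
# object tracks, vehicle CAN signals) onto a common timestamp grid so they can be
# treated as one multi-dimensional time series. All column names are placeholders.
import pandas as pd


def align_streams(imu_df, tracks_df, can_df, tolerance_ms=50):
    """Each input has a 'timestamp' column (datetime64) and is sorted by it."""
    tol = pd.Timedelta(milliseconds=tolerance_ms)
    merged = pd.merge_asof(imu_df, tracks_df, on="timestamp",
                           direction="nearest", tolerance=tol)
    merged = pd.merge_asof(merged, can_df, on="timestamp",
                           direction="nearest", tolerance=tol)
    return merged.set_index("timestamp")


if __name__ == "__main__":
    t0 = pd.Timestamp("2019-01-01 12:00:00")
    imu = pd.DataFrame({"timestamp": pd.date_range(t0, periods=100, freq="10ms"),
                        "yaw_rate": 0.0, "speed_mps": 10.0})
    tracks = pd.DataFrame({"timestamp": pd.date_range(t0, periods=10, freq="100ms"),
                           "nearest_agent_range_m": 25.0})
    can = pd.DataFrame({"timestamp": pd.date_range(t0, periods=50, freq="20ms"),
                        "brake_pedal": 0.0, "steering_deg": 1.5})
    print(align_streams(imu, tracks, can).head())
```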
Deliverables of this project include:
1. Novel theories to automatically and efficiently recognize dynamic driving scenarios.
2. A scenario-based data library for smart mobility in Pittsburgh.
3. A website (traffic-net.org) providing the world's first scenario-based public AV driving database (a minimal query-endpoint sketch follows this list).
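The query interface behind deliverable 3 is not specified here; the following is a minimal sketch, assuming a Flask service and an in-memory index keyed by extracted primitive label, of the kind of scenario-query endpoint traffic-net.org could expose. The route, labels, and metadata fields are illustrative assumptions only.

```python
# Hypothetical sketch of a scenario-query endpoint for the public database.
# The route, labels, and index structure are illustrative assumptions only.
from flask import Flask, jsonify, request

app = Flask(__name__)

# In a real deployment this index would be backed by the scenario database;
# here it is a tiny in-memory example keyed by extracted primitive label.
SCENARIO_INDEX = {
    "unprotected_left_turn": [
        {"clip_id": "pgh_000123", "duration_s": 11.8, "n_agents": 6},
    ],
    "pedestrian_crossing": [
        {"clip_id": "pgh_000456", "duration_s": 7.2, "n_agents": 3},
    ],
}


@app.route("/scenarios")
def query_scenarios():
    """Return scenario clips for a primitive label, e.g. /scenarios?primitive=pedestrian_crossing."""
    label = request.args.get("primitive", "")
    return jsonify(SCENARIO_INDEX.get(label, []))


if __name__ == "__main__":
    app.run(debug=True)
```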
Name | Email | Affiliation | Role | Position
---|---|---|---|---
Zhao, Ding | dingzhao@cmu.edu | Carnegie Mellon School of Engineering | PI | Faculty - Untenured, Tenure Track