#180 "Real Time Traffic Congestion Prediction and Mitigation at the City Scale" (year 2, continuation from year 1)

Principal Investigator
John Shen
Start Date
July 1, 2018
End Date
March 31, 2020
Project Type
Research Applied
Grant Program
FAST Act - Mobility National (2016 - 2022)
Grant Cycle
2018 Mobility21 UTC


For the first time in human history, we have the tools needed to pursue ambitious experimental research on human mobility at a global scale. Researchers in the computer, information, data, behavioral, and social sciences may finally have their “Large Hadron Collider” for sensing, curating, and analyzing an enormous amount of real-world human mobility data, enabled by ubiquitous wireless connectivity and more than six billion mobile devices and connected vehicles.

This project focuses on vehicular traffic in major cities around the world. Our research involves: 1) Sensing: collect and curate GPS traces from large fleets of vehicles in major cities; 2) Analytics: apply data analytics and machine learning techniques to build accurate traffic flow and congestion models from extensive historical data; and 3) Services: a. develop an accurate real-time prediction system that combines historical models with real-time data; and b. develop novel real-time interventions to mitigate the potential onset of traffic congestion.

We plan to partner with the Data Science research group at Uber. CMU PhD students will be able to access real-world data as research interns at Uber; without such access, the proposed research would be impossible. With Uber as our deployment partner, we have the opportunity to deploy our research ideas in a real-world environment and to gather data from such in-situ experiments.

Previous studies of vehicular traffic were based mostly on simulation and on limited field-collected data from taxi fleets in just a few cities. We have the opportunity to compare traffic patterns from major cities around the world and to characterize their similarities and differences. We can research traffic congestion prediction and mitigation techniques that take into account cultural and driver behavior differences. We also want to research the potential of leveraging private enterprises to produce valuable public services for societal good.
1. Background
There has been a great deal of research on vehicular traffic monitoring and modeling, initially focused on major freeways and more recently on arterial roads as well. More recent work has focused on travel time prediction in urban environments. Most of these efforts rely on traffic data from a few cities over limited time periods. It has not been possible to deploy research ideas in a real-world environment and collect in-situ experimental data to validate and extend them.
2. Proposed Research
Our proposed research employs an end-to-end experimental systems approach to address the problem of traffic congestion in major cities around the world by partnering with a ride-sharing company with a global presence (Uber). Our research spans the three domains of Sensing, Analytics, and Services associated with vehicular traffic in urban environments. We now describe the proposed research tasks.
A. Sensing
Our research is driven by the availability of and insights from real-world data on vehicular traffic. CMU PhD students will have access to such data as research interns at Uber. The raw GPS data from Uber vehicles are curated to produce GPS traces for all the rides serviced by these vehicles. Below we illustrate such traces for San Francisco, Pittsburgh, and Bangalore.
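As a rough illustration of the curation step described above, the sketch below groups raw GPS pings into one time-ordered trace per ride. The `Ping` record, its field names, and the `build_traces` helper are hypothetical and chosen only for this example; the actual data schema and curation pipeline are not described in this proposal.

```python
from collections import defaultdict
from typing import NamedTuple


class Ping(NamedTuple):
    """One raw GPS sample (hypothetical schema, for illustration only)."""
    ride_id: str      # assumed identifier linking a ping to a ride
    timestamp: float  # seconds since some epoch
    lat: float
    lon: float


def build_traces(pings):
    """Group pings by ride and sort each group by time to form traces."""
    grouped = defaultdict(list)
    for ping in pings:
        grouped[ping.ride_id].append(ping)
    return {
        ride_id: sorted(points, key=lambda p: p.timestamp)
        for ride_id, points in grouped.items()
    }


# Pings may arrive out of order and interleaved across rides.
pings = [
    Ping("ride-a", 20.0, 37.7751, -122.4183),
    Ping("ride-b", 5.0, 40.4406, -79.9959),
    Ping("ride-a", 10.0, 37.7749, -122.4194),
]
traces = build_traces(pings)
```

A real pipeline would likely also filter noisy fixes and match points to the road network; those steps are omitted here.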
B. Analytics
a. Create Traffic Model Based on Historical Data: Based on the GPS traces, we believe it is possible to create an accurate model of traffic flow in a city. Such a model can capture the temporal dimension, e.g. time of day, day of week, and week of year, as well as the spatial dimension based on the map geometry of a city, and so represent the typical traffic pattern of that city. By querying this model, we can obtain the typical or expected traffic condition for a given time and location.
b. Compare Traffic Models of Different Cities: It is an open research question whether all the major cities in the world exhibit similar traffic flow and congestion patterns. We will investigate the traffic models of various cities to determine if the patterns are similar or different, and to potentially identify major “canonical” categories of city traffic patterns.
c. Extract Macro-Models for Specific Events: We believe the historical model can capture certain traffic patterns that arise from regularity in human and commercial behavior in a city. However, traffic patterns are also affected by irregular events, which may be scheduled or unscheduled. One of our research goals is to explore the feasibility of extracting localized (in both the temporal and spatial dimensions) “macro-models” for such events from the overall historical model. Such macro-models could help anticipate the traffic impact of similar future events.
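To make the idea of a queryable historical model in item a more concrete, here is a minimal sketch that averages observed speeds per (grid cell, day of week, hour of day) bucket and answers queries for the expected speed at a given time and location. The grid size `CELL_DEG`, the observation tuple layout, and all function names are assumptions made for this example; the week-of-year dimension and true map geometry mentioned above are left out for brevity.

```python
import math
from collections import defaultdict
from datetime import datetime
from statistics import mean

CELL_DEG = 0.01  # assumed grid cell size in degrees (illustrative)


def bucket(ts, lat, lon):
    """Map an observation to a (cell, weekday, hour) key."""
    cell = (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))
    return (cell, ts.weekday(), ts.hour)


def build_model(observations):
    """observations: iterable of (datetime, lat, lon, speed_kmh) tuples."""
    samples = defaultdict(list)
    for ts, lat, lon, speed_kmh in observations:
        samples[bucket(ts, lat, lon)].append(speed_kmh)
    # The "model" is simply the mean speed seen in each bucket.
    return {key: mean(values) for key, values in samples.items()}


def expected_speed(model, ts, lat, lon):
    """Return the typical speed for this time and place, or None if unseen."""
    return model.get(bucket(ts, lat, lon))


# Two Monday-morning observations in the same cell (made-up values).
observations = [
    (datetime(2018, 7, 2, 8, 15), 37.7749, -122.4194, 20.0),
    (datetime(2018, 7, 9, 8, 30), 37.7741, -122.4199, 30.0),
]
model = build_model(observations)
typical = expected_speed(model, datetime(2018, 7, 16, 8, 40), 37.7745, -122.4190)
```

A lookup table of means is only the simplest possible stand-in; the machine learning techniques the project proposes would replace it while keeping the same query interface.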
C. Services
a. Develop Travel Time Prediction System: There is significant recent research on travel time prediction. We believe we can leverage the historical model in conjunction with real-time data to achieve very accurate real-time travel time prediction. There is also an opportunity to conduct a larger-scale trial deployment of our prediction system in a real-world environment in order to assess its actual prediction accuracy.
b. Explore the Feasibility of Traffic Intervention: Partnering with Uber gives us the opportunity to explore the feasibility of directly introducing interventions into the traffic flow in real time using Uber vehicles, with the goal of mitigating the potential (predicted) onset of traffic congestion. This can be a very powerful tool for reducing congestion.
c. Explore the Potential of Large Scale Pooling: One possible form of intervention is the dynamic formation of ride pools based on real-time ride requests. Uber has recently launched Uber Pool, which allows multiple riders to share a ride. We plan to assess the potential of ideal ride pooling and to develop ways to approach the ideal bound. Some recent research indicates that this is a promising way to reduce congestion.
d. Develop Potential Solutions for Urban Problems: Having robust predictive traffic models at the city scale can help local government bodies reduce congested commutes. Moreover, cities can use such models to predict the outcomes of “what if” scenarios and so manage transportation resources and air quality more efficiently.
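Two of the service ideas above lend themselves to small sketches. The first function blends a historical travel time estimate with a real-time one (item a); the second greedily pairs ride requests that start close together in time and have nearby pickups and dropoffs, as a crude first estimate of pooling potential (item c). The blending weight, the thresholds, the request tuple layout, and all function names are assumptions for illustration, not the project's method. Note that greedy pairing gives only a lower bound on what an ideal matching could achieve.

```python
import math


def blend_travel_time(historical_s, realtime_s, weight=0.5):
    """Convex combination of two travel time estimates, in seconds."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return weight * realtime_s + (1.0 - weight) * historical_s


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def greedy_pool(requests, max_wait_s=300, max_km=0.5):
    """Pair requests given as (time_s, pickup, dropoff); return index pairs."""
    order = sorted(range(len(requests)), key=lambda i: requests[i][0])
    used, pairs = set(), []
    for pos, i in enumerate(order):
        if i in used:
            continue
        t_i, pick_i, drop_i = requests[i]
        for j in order[pos + 1:]:
            t_j, pick_j, drop_j = requests[j]
            if t_j - t_i > max_wait_s:
                break  # later requests are even further away in time
            if j in used:
                continue
            if (haversine_km(pick_i, pick_j) <= max_km
                    and haversine_km(drop_i, drop_j) <= max_km):
                used.update((i, j))
                pairs.append((i, j))
                break
    return pairs


# Made-up requests: the first two are near each other, the third is not.
requests = [
    (0, (37.7749, -122.4194), (37.7890, -122.4010)),
    (120, (37.7752, -122.4190), (37.7893, -122.4013)),
    (200, (37.8044, -122.2712), (37.8100, -122.2600)),
]
pairs = greedy_pool(requests)
pooled_share = 2 * len(pairs) / len(requests)
blended = blend_travel_time(600.0, 900.0, weight=0.5)
```

An assessment of the ideal bound would replace the greedy loop with an optimal matching over the same compatibility test, and would need to account for detours along the route rather than endpoints alone.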
3. Project Plan and Potential Contributions
This project will involve two PhD students and a partnership with the Data Science research group at Uber. This group is located in San Carlos, only about 15 minutes from the CMU Silicon Valley campus. This is a brand-new partnership that can have strategic significance for the CMU Silicon Valley campus. We plan to launch this 3-year project in October 2016 and end it in September 2019. We will have very strong support from Uber researchers. We are quite open to collaboration with other CMU groups and with external research groups as well. Currently we do not have committed matching funds for this project, but we will be pursuing such funding aggressively.
We expect that this project will lead to unprecedented research results that can impact both the research community and the transportation industry. Our research on traffic data analytics embodies two foundational convictions. First, we strongly believe such research must be based on actual and extensive real-world data; otherwise the results may not have any practical value. Second, we believe that such research must aim at genuinely useful services for the public. We also hope to demonstrate that a private enterprise can be leveraged to produce useful services for the public and to realize common good for cities around the world.
July 1, 2018 - September 30, 2019
Strategic Description / RD&T

Deployment Plan
Currently pursuing potential deployment partners from industry.
Expected Outcomes/Impacts
Expected key accomplishments for the second year of this project:
1. Data-driven assessment of the potential of ride pooling at the city scale.
2. Develop city-scale human mobility patterns using machine learning techniques.
Expected Outputs



Individuals Involved

Email | Name | Affiliation | Role | Position
ajauhri@cmu.edu | Jauhri, Abhinav | ECE/SV | Other | Student - PhD
jpshen@cmu.edu | Shen, John | ECE/SV | PI | Faculty - Tenured


Amount of UTC Funds Awarded
Total Project Budget (from all funding sources)


Type | Name | Uploaded
Data Management Plan | dmp_9RUxyvK_s9foor7.docx | Jan. 12, 2018, 11:44 p.m.
Publication | ICLR_2019_Submission_aFkAd5e.pdf | Sept. 29, 2018, 5:24 p.m.
Presentation | 2018_MASITE_ITSPA_pZyfU3g.pdf | Sept. 29, 2018, 5:24 p.m.
Progress Report | 180_Progress_Report_2018-09-30 | Sept. 29, 2018, 5:24 p.m.
Publication | Generating Realistic Ride-Hailing Data Sets Using GANs and Validating the Synthetic Data Sets Against Real Data Sets | March 30, 2019, 5:06 p.m.
Presentation | Using GANs for Generating Synthetic Data Sets | March 30, 2019, 5:06 p.m.
Progress Report | 180_Progress_Report_2021-03-31 | March 31, 2021, 2:54 p.m.
Final Report | final_report_-_180_TG3E82K.pdf | July 7, 2021, 12:23 p.m.

Match Sources

No match sources!


Name | Type
Ericsson | None