#480 Mitigating crash risks in work zones: causal inference and Crash Modification Factors

Principal Investigator
Sean Qian
Start Date
July 1, 2023
End Date
June 30, 2024
Project Type
Research Advanced
Grant Program
US DOT BIL, Safety21, 2023 - 2028 (4811)
Grant Cycle
Safety21 : 23-24


In the U.S., a work zone crash occurred every five minutes during 2015 – 2019. The causes of those crashes, such as work zone configurations, weather conditions, work zone duration, and roadway characteristics, remain unclear. The causal impact of those potential factors on work zone crashes may vary substantially across different types of roads and traffic flow characteristics. Agencies have been working to mitigate work zone crash risk by implementing work zone countermeasures, such as increasing work zone duration, left-hand merge, downstream lane shift, increasing the inside shoulder width, and two-way two-lane operations. The effectiveness of such countermeasures is typically evaluated by a Crash Modification Factor (referred to as CMF throughout this proposal) and, more generally, by Crash Modification Functions (CMFunctions). However, CMFs are typically unknown for work zones. For example, the Manual on Uniform Traffic Control Devices (MUTCD) provides qualitative recommendations of signs, flaggers, or closure settings for work zones with different characteristics, but no quantitative CMFunctions. The FHWA CMF Clearinghouse only provides CMFunctions for implementing left-hand merge and downstream lane shift in rural areas, and for modifying shoulder width in urban areas.
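
To make the CMF concept concrete, the following is a minimal sketch rather than the project's model. It assumes the usual ratio definition (expected crashes with a countermeasure divided by expected crashes without it) and, for the CMFunction, one common log-linear form. The function names, the coefficient, and all crash numbers are hypothetical and chosen only for illustration.

```python
import math


def cmf_from_expected(crashes_with: float, crashes_without: float) -> float:
    """CMF = expected crashes with the countermeasure / expected crashes without it."""
    if crashes_without <= 0:
        raise ValueError("expected crashes without the countermeasure must be positive")
    return crashes_with / crashes_without


def cmfunction(beta: float, x: float, x_base: float) -> float:
    """Assumed log-linear Crash Modification Function: exp(beta * (x - x_base))."""
    return math.exp(beta * (x - x_base))


# Hypothetical expected crash frequencies for one work zone configuration.
cmf = cmf_from_expected(crashes_with=8.0, crashes_without=10.0)
percent_reduction = 100.0 * (1.0 - cmf)

# Hypothetical coefficient per foot of inside shoulder width, base width 4 ft.
cmf_6ft = cmfunction(beta=-0.05, x=6.0, x_base=4.0)

print(f"CMF: {cmf:.2f}, crash reduction: {percent_reduction:.0f}%")
print(f"CMFunction value at 6 ft: {cmf_6ft:.3f}")
```

A CMF below 1 indicates an expected reduction in crashes, a CMF above 1 an expected increase, and a CMFunction returns a CMF that varies with a site characteristic instead of a single constant.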

It is critical to understand the root causes of work zone crashes and propose effective strategies to reduce crash occurrences. This project will continue the work zone safety analysis models developed by the CMU Mobility Data Analytics Center by extending them to a full set of potential causal factors and deploying the models to a number of state agencies (e.g., PA, MD, CA). Based on the models, the team will also establish a systematic approach to estimate CMFs for work zones under various roadway and work zone characteristics. In addition, an online web-based traffic safety analysis tool will be developed for selected deployment partners. Up-to-date safety data from various data providers can be acquired, archived, and analyzed to enhance the web application over time. The safety data providers include state and local agencies, police departments, Waze, and other private data sources (such as INRIX and TomTom). The team will integrate and analyze large-scale crash and incident data, and develop an online tool to visualize and forecast crash types, frequencies, and severity for actual or hypothetical work zone deployments on each road segment, along with mitigation strategies to support agencies’ decision making.

This research builds on the funded ‘Mobility Data Analytics Center’ research of 2016-2023, which focused on data-driven safety analysis and improvement. In the past two years, we have started building a data engine and a prototype web application to demonstrate the feasibility of multi-source data-driven decision making for state DOTs. We started in PA, where we have close partnerships with many local entities, and have successfully applied our data analytics tools in several case studies. Our intention in this project is to update the safety analysis models for both PA and MD (and possibly CA), and to develop Crash Modification Factors specifically for work zones on state-owned roads. This fills a gap: existing CMFs were rigorously developed for roads without active construction projects.

First, we will implement a rigorous causal inference model (from MAC’s prior research studies) to infer the causal effect of work zones on crash risk across different work zone configurations, roadway functional classifications, weather conditions, and traffic conditions. The causal forest model avoids potentially spurious heterogeneous treatment effects (HTE) by systematically identifying the heterogeneity of treatment effects. In addition, the developed method combines the causal forest method with a fixed-effect variable representing road segments to mitigate unobserved confounding bias in work zone safety studies. The proposed method will be applied to multi-source data sets covering thousands of work zones in PA and MD between 2018 and mid-2022 to control for the complex built and natural environments and reduce the bias of the estimated HTE.
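
The role of the road-segment fixed effect can be illustrated with a much simpler stand-in than a causal forest. The sketch below is not the proposed method: it demeans the work zone indicator and the crash count within each segment (the standard within estimator) and then estimates the effect separately for two predefined subgroups, whereas a causal forest would discover such subgroups from the data. All records, field meanings, and names are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (segment_id, work_zone_active, high_volume, crash_count).
records = [
    ("A", 0, True, 2), ("A", 0, True, 2), ("A", 1, True, 4), ("A", 1, True, 4),
    ("B", 0, True, 5), ("B", 1, True, 7),
    ("C", 0, False, 1), ("C", 1, False, 1),
    ("D", 0, False, 3), ("D", 0, False, 3), ("D", 1, False, 3),
]


def within_effect(rows):
    """Fixed-effect (within-segment) estimate of the work zone effect on crashes."""
    by_segment = defaultdict(list)
    for segment, treated, _, crashes in rows:
        by_segment[segment].append((treated, crashes))

    num = den = 0.0
    for obs in by_segment.values():
        # Subtracting segment means removes anything constant within a segment,
        # including unobserved segment characteristics.
        t_mean = sum(t for t, _ in obs) / len(obs)
        y_mean = sum(y for _, y in obs) / len(obs)
        for t, y in obs:
            num += (t - t_mean) * (y - y_mean)
            den += (t - t_mean) ** 2
    if den == 0:
        raise ValueError("no within-segment variation in work zone status")
    return num / den


# Estimate the effect separately for each (predefined) traffic volume subgroup.
effects = {
    label: within_effect([r for r in records if r[2] is flag])
    for label, flag in (("high_volume", True), ("low_volume", False))
}
print(effects)
```

In this toy data the baseline crash level differs by segment, yet the within estimator recovers an effect of two additional crashes on high-volume segments and none on low-volume segments, because each segment is compared only with itself.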

Second, we will conduct safety analytics of work zones for the City of Pittsburgh and two selected counties in MD (Montgomery and Howard County) to develop initial Crash Modification Factors (CMFs). We will implement our rigorously developed econometric model, Regression Discontinuity, to relate work zone configurations to safety risks, e.g., work zone length, buffer zone, roadway characteristics, traffic volumes, and speed limits. Crash Modification Factors can then be estimated from those causal models, which capture the impact of one factor on work zone safety while holding everything else the same. The results will be compared to the Highway Safety Manual to ensure the CMFs are consistent with those for other road conditions and characteristics. We will build a web-GIS application that provides a user interface to visualize all work zones by crash risk, potential conflicts with traffic flow, buffer zone, roadway characteristics, traffic volumes, and speed limits, to identify safety hot spots, and to highlight potential crash causes. In addition, the web-GIS application will provide suggestions on how to improve safety for each work zone. Users can visualize the current causes of work zone related crashes and be advised on measures that can potentially reduce work zone safety risks. The application will also be aligned with WZDx, the Work Zone Data Exchange, so that the output data format is consistent with WZDx and can be used by all stakeholders following this national standard.
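
As a rough illustration of the Regression Discontinuity idea, the sketch below fits a separate straight line on each side of a cutoff and reports the gap between the two lines at the cutoff. The proposal does not specify the running variable, so the choice here (days relative to the start of a work zone, with the cutoff at day 0) is an assumption, as are the bandwidth, the data, and the helper names `fit_line` and `rd_jump`.

```python
def fit_line(points):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    x_mean = sum(x for x, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    sxx = sum((x - x_mean) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("running variable has no variation")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in points)
    b = sxy / sxx
    return y_mean - b * x_mean, b


def rd_jump(points, cutoff, bandwidth):
    """Sharp RD estimate: gap between the two fitted lines at the cutoff."""
    # Center the running variable so each intercept is the fitted value at the cutoff.
    left = [(x - cutoff, y) for x, y in points if cutoff - bandwidth <= x < cutoff]
    right = [(x - cutoff, y) for x, y in points if cutoff <= x <= cutoff + bandwidth]
    a_left, _ = fit_line(left)
    a_right, _ = fit_line(right)
    return a_right - a_left


# Hypothetical data: (days relative to work zone start, daily crash rate).
points = [
    (-4, 0.6), (-3, 0.7), (-2, 0.8), (-1, 0.9),
    (0, 2.5), (1, 2.6), (2, 2.7), (3, 2.8),
]
jump = rd_jump(points, cutoff=0, bandwidth=4)
print(f"Estimated jump in crash rate at the cutoff: {jump:.2f}")
```

Fitting each side separately lets the underlying trend differ before and after the cutoff, so the estimated jump is not confounded with a smooth change in crash rate over time. A practical analysis would also need standard errors and a data-driven bandwidth, which this sketch omits.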

Third, we will establish an online traffic safety analysis tool for PA and MD using the safety data from 2022-2023. We will continue to collect and archive up-to-date safety data from various data providers in both states and enhance the web application. The safety data providers include PennDOT, MDOT SHA, Waze, and other private data sources (such as INRIX and TomTom). We will integrate and analyze large-scale crash and incident data, and develop an online tool to visualize and forecast crash types, frequencies, and severity for each road segment in PA and MD. The web application allows travelers and agencies to access historical, real-time, and forecasted traffic safety metrics on state-owned roads. The servers hosting the web application will be optimized for load balancing. We will continue to interview various data providers in both states, including governmental agencies, consulting firms, and private data providers, to enhance the quality and quantity of the data.


Strategic Description / RD&T
The research aligns with the USDOT priority of human factors, with the objective of analyzing how human factors, such as socio-demographics and human interaction with work zones, construction projects, and roadway design, affect safety risks, and how to create the best built environment to improve safety. This research also addresses the data-driven system safety priority by analyzing the safety of statewide work zones with the integration of multi-source data from GIS, social media, infrastructure, land use, Census data, and traffic data. Likewise, the project aligns with the Safety21 UTC focus on promoting transportation systems safety.
Deployment Plan
July – September 2023
1.	 Briefs and Demos to Maryland DOT SHA, Caltrans, toXcel Inc. 

October – December 2023
1.	 Briefs and Demos to Maryland DOT SHA, Caltrans, toXcel Inc.

January – March 2024
1.	Briefs and Demos to PennDOT
2.	Briefs and Demos to other areas, such as FHWA Turner-Fairbank research center

April – June 2024 
1.	Briefs and Demos to Maryland DOT SHA, Caltrans, toXcel Inc.
2.	Develop research report.
3.	Develop a prototype dashboard web-GIS application for PA/MD state roads
4.	Develop policy brief for legislators.

Overall plan

We will work closely with the states of PA, MD, and CA to implement this research. The team, consisting of the PI, a research scientist, and PhD students, will hold bi-weekly coordination calls to discuss difficulties encountered and proposed solutions, and to outline plans for completing the scope of work, key milestones, and deliverables. When performing the tasks, we will meet monthly with project managers, engineers, and staff at those state agencies, who will provide feedback and comments, to ensure that model development and testing are consistent with the state DOTs’ views and that the tasks are aligned with the partners’ needs.
In terms of implementation barriers, we will evaluate and prioritize barriers as the project progresses. The main potential barriers for this project are identifying work zone locations and configurations and using them as independent variables to develop the causal safety model. If any risks or barriers are identified during the project, we will use our domain expertise to find alternative methods, or seek professional help from both data and methodology perspectives, using resources from CMU and deployment partners and, more broadly, CMU’s National University Transportation Center on Safety (Safety21). The core CMU team at the Mobility Data Analytics Center (MAC) has sufficient and somewhat overlapping expertise in safety data analytics, travel behavior modeling, and GIS, so we can reallocate personnel if needed. Another advantage is that this research team at MAC has been collaborating intensively on research projects in the past two years, with the necessary data analytics, GIS, and traffic engineering skill sets, including one successful MDOT research project and two successful PennDOT research projects, and is thus able to reduce or eliminate those barriers.

Upon completion of this project, we plan to actively seek both industrial and federal funding based on this initial development. Our framework is applicable to any large traffic network with safety data and work zone information. This generality will attract attention from various public agencies and non-profit organizations seeking to better deploy safe roadway infrastructure and road construction projects. Potential funding agencies and collaborators include the Department of Transportation, Federal Highway Administration, TRB, state DOTs, MPOs/RPOs, local non-profits, and mobility service companies.
Expected Outcomes/Impacts
The expected outcome of this research is a novel framework of work zone safety models and tools that is generally applicable to any regional transportation network. It will also estimate CMFs for work zones, quantifying the safety risks of work zones and optimizing safety metrics of upcoming work zone designs. This framework will be delivered with a set of open-source code shared online, followed by a prototype web application that implements it using multi-modal data collected over many years in the states of PA and MD. The application also provides user interfaces to manage various scenarios of work zone configurations, weather conditions, and driving behaviors, and to visualize the resultant system metrics for any road segments of interest. One case study will be conducted to assist public agencies in setting guidelines for planning and operating work zones in PA. We will actively engage state DOTs (PennDOT, MDOT SHA, and Caltrans) to gauge their interest and deploy those tools for their day-to-day operations.
Expected Outputs
Modeling scripts	The scripts for work zone safety analysis and CMF calculations, with a description of different causal variables used in the model.	
June 30, 2024	
Will be shared with state DOT officials for establishing CMFs for work zones and for the safety analysis dashboard

Final report	A technical report summarizing all data sets, innovations, technical details of proposed econometric models, solution algorithms, CMFs validated case studies, and findings.	
June 30, 2024	
The report will be fully edited and ready for publication in academic journals.
The Manual on Uniform Traffic Control Devices (MUTCD) provides qualitative recommendations of signs, flaggers, or closure settings for work zones with different characteristics, but no quantitative CMFunctions (FHWA, 2019). In addition, the online database of CMFs, the CMF Clearinghouse (FHWA, 2021), which is supposed to ‘‘compile all documented CMFs in a central location’’ by the Federal Highway Administration (FHWA), only provides CMFunctions for implementing left-hand merge and downstream lane shift in rural areas, and for modifying shoulder width in urban areas (FHWA, 2021). Because they do not cover all factors that influence work zone related crashes, the available CMFunctions have limited applicability, calling for a holistic and complete set of CMFunctions or CMFs to ensure work zone safety under various natural and built environments.

Various environmental conditions can affect the crash risk caused by work zones, such as work zone configurations, roadway functional classification, weather conditions, and traffic conditions. With respect to roadway functional classification, the crash rate has been found to decrease with the presence of work zones on urban non-interstate highways and to increase with the presence of work zones on other highways (Jin et al., 2008), while work zones in rural areas have higher crash risk than work zones in urban areas (Harb et al., 2008). Traffic volume is also found to have significant effects on work zone crash occurrence (Yang et al., 2015a). For instance, Zhang et al. (2022a) found that work zones on roadways with higher traffic volume (e.g., larger than 20,000 vehicles per day) are associated with higher crash risk. In comparison, the crash risk caused by work zones on roadways with lower traffic volume (e.g., smaller than 20,000 vehicles per day) is not significant. Finally, weather conditions also affect the crash risk caused by work zones. For instance, Harb et al. (2008) found that drivers are less likely to be involved in (single-vehicle) work zone crashes during rainy weather and more likely to be involved in work zone crashes during foggy weather.

Harb, R., Radwan, E., Yan, X., Pande, A., Abdel-Aty, M., 2008. Freeway work-zone crash analysis and risk identification using multiple and conditional logistic regression. J. Transp. Eng. 134 (5), 203–214.
Jin, T.G., Saito, M., Eggett, D.L., 2008. Statistical comparisons of the crash characteristics on highways between construction time and non-construction time. Accid. Anal. Prev. 40 (6), 2015–2023.
FHWA, 2009. Work zone safety and mobility fact sheet. https://safety.fhwa.dot.gov/wz/wz_awareness/2009/factsht09.cfm.
FHWA, 2017. Work zone facts and statistics. https://web.archive.org/web/20170828102601/https://ops.fhwa.dot.gov/wz/resources/facts_stats/safety.htm. 
FHWA, 2019. Manual on Uniform Traffic Control Devices. Technical Report, USDOT. 
FHWA, 2021. CMF clearinghouse. http://www.cmfclearinghouse.org/results.cfm.
Yang, H., Ozbay, K., Ozturk, O., Yildirimoglu, M., 2013. Modeling work zone crash frequency by quantifying measurement errors in work zone length. Accid. Anal. Prev. 55, 192–201.
Zhang, Z., Akinci, B., Qian, S., 2022a. Inferring the causal effect of work zones on crashes: Methodology and a case study. Anal. Methods Accid. Res. 33, 100203.
Zhang, Z., Akinci, B., Qian, S., 2022b. A novel map-matching algorithm for relating work zones and crashes. In: Construction Research Congress 2022. American Society of Civil Engineers, pp. 366–375.

Individuals Involved

Email Name Affiliation Role Position
seanqian@cmu.edu Qian, Sean Carnegie Mellon University PI Faculty - Tenured


Amount of UTC Funds Awarded
Total Project Budget (from all funding sources)


Type Name Uploaded
Data Management Plan dmp_81tFk6T.docx Sept. 18, 2023, 9:54 p.m.

Match Sources

No match sources!


Name Type
California Department of Transportation Deployment & Equity Partner Deployment & Equity Partner
toXcel Inc Deployment Partner Deployment Partner