#413 Hierarchical Decision Making and Control in RL-based Autonomous Driving for Improved Safety in Complex Traffic Scenarios

Principal Investigator: Keith Redmill
Status: Completed
Start Date: July 1, 2023
End Date: June 30, 2024

Project Type: Research Advanced
Grant Program: US DOT BIL, Safety21, 2023 - 2028 (4811)
Grant Cycle: Safety21 : 23-24
Visibility: Public

Abstract

In this work, we propose a framework for vehicle planning, decision-making, and control. We formulate the vehicle control problem as a Markov Decision Process (MDP) and leverage reinforcement learning to learn high-quality autonomous driving strategies. The motivation behind this research stems from the complexity of highway driving, which requires the seamless coordination of multiple tasks to ensure safe and efficient navigation.

In vehicle control research, various methodologies have been explored, including model-based and rule-based approaches, as well as data-driven methods such as supervised learning and reinforcement learning. Model-based and rule-based methods provide transparency and interpretability in decision-making, but they require tailored controllers for diverse driving scenarios [1-3]. Among data-driven methods, supervised learning requires extensive dataset collection [4], while reinforcement learning generates its own data through simulation and learns driving strategies through interaction with the environment. A common limitation of prior work in reinforcement learning is its reliance on single-layer controllers, which lack higher-level decision-making capabilities [5][6]. Furthermore, existing studies often focus on relatively simplistic traffic scenarios, limiting the generalizability of their findings [7][8].

To address these limitations and the challenges posed by complex highway driving scenarios, we adopt a Hierarchical Deep Reinforcement Learning (HDRL) framework. Specifically, the upper-level controller sets the vehicle’s target speed and desired lane or lane-change behavior, while the lower-level controller handles the fine-grained task of controlling the vehicle’s longitudinal acceleration and lateral steering. This hierarchical framework gives the controller enhanced interpretability and enables it to navigate complex traffic environments more effectively than single-layer counterparts. By incorporating high-level decision-making abilities, our proposed approach presents a significant advancement over traditional reinforcement learning-based controllers.
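
As an illustration of the two-level split described above, the following is a minimal Python sketch. It is not the project's implementation: the learned policies are replaced by simple hand-written placeholders, and every name (VehicleState, upper_policy, lower_policy, hierarchical_step), gain, limit, and the 3.5 m lane width is an assumption chosen for illustration.

```python
from dataclasses import dataclass

LANE_WIDTH = 3.5  # metres; assumed value for illustration


@dataclass
class VehicleState:
    speed: float        # ego speed, m/s
    lane: int           # current lane index (0 = rightmost)
    lateral_pos: float  # lateral position, m (0 = centre of lane 0)
    gap_ahead: float    # distance to the lead vehicle, m
    lead_speed: float   # speed of the lead vehicle, m/s


@dataclass
class HighLevelCommand:
    target_speed: float  # m/s
    target_lane: int


@dataclass
class LowLevelAction:
    acceleration: float  # m/s^2
    steering: float      # rad


def clip(x, lo, hi):
    return max(lo, min(hi, x))


def upper_policy(state, desired_speed=30.0, num_lanes=3, min_gap=30.0):
    """Hand-written placeholder standing in for a learned upper-level policy.

    It does not check whether the adjacent lane is free; a real policy would.
    """
    blocked = state.gap_ahead < min_gap and state.lead_speed < desired_speed
    if blocked and state.lane + 1 < num_lanes:
        # Request a lane change to the left and keep the desired speed.
        return HighLevelCommand(desired_speed, state.lane + 1)
    if blocked:
        # No lane to move into: follow the lead vehicle.
        return HighLevelCommand(state.lead_speed, state.lane)
    return HighLevelCommand(desired_speed, state.lane)


def lower_policy(state, cmd, k_speed=0.5, k_lat=0.05):
    """Placeholder proportional controller standing in for a learned lower-level policy."""
    accel = clip(k_speed * (cmd.target_speed - state.speed), -3.0, 2.0)
    lateral_error = cmd.target_lane * LANE_WIDTH - state.lateral_pos
    steer = clip(k_lat * lateral_error, -0.1, 0.1)
    return LowLevelAction(accel, steer)


def hierarchical_step(state, step, last_cmd=None, decision_interval=10):
    """Query the upper level every `decision_interval` steps, the lower level every step."""
    if last_cmd is None or step % decision_interval == 0:
        last_cmd = upper_policy(state)
    return last_cmd, lower_policy(state, last_cmd)
```

In this sketch the upper level is consulted less often than the lower level, which is one common way to separate the two time scales; the interval of 10 steps is arbitrary.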

We anticipate two key advantages from this approach. First, HDRL facilitates the hierarchical decomposition of decision-making tasks, enhancing the efficiency of task execution. Second, within the HDRL framework, the two controller layers exhibit distinct exploration capabilities in unfamiliar environments. This allows us to use the upper-level controller to map potential driving trajectories in intricate autonomous driving settings, while the lower-level controller makes precise real-time adjustments to vehicle behavior. By leveraging the advantages of hierarchical learning, we aim to achieve better coordination and decision-making capabilities, leading to improved safety and exploration in high-stakes environments.

Through comprehensive simulations and experiments, we aim to demonstrate the superiority of our hierarchical controller in handling complex driving situations and to contribute to the advancement of safe and efficient autonomous driving technology. We will design driving scenarios, including challenging “trap” scenarios, to test our reinforcement learning framework and compare its performance with traditional single-layer reinforcement learning controllers. These traps involve other traffic vehicles obstructing the autonomous vehicle’s desired path, testing the system’s ability to identify and navigate around such obstructions. By evaluating the performance of our controller and refining the HDRL framework, we endeavor to balance safety imperatives against decision-making and operational efficiency.
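
To make the notion of a “trap” concrete, the sketch below shows one way such a scenario could be constructed and detected in a simple lane-based simulation. It is an illustrative assumption, not the scenario definition used in the project: the Vehicle class, make_trap, is_trapped, and all distances and speeds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    lane: int        # lane index
    position: float  # longitudinal position, m
    speed: float     # m/s


def make_trap(ego, num_lanes, slow_speed=15.0, gap=20.0):
    """Place a slow leader ahead of the ego vehicle and a slow vehicle
    alongside it in each existing adjacent lane."""
    blockers = [Vehicle(ego.lane, ego.position + gap, slow_speed)]
    for lane in (ego.lane - 1, ego.lane + 1):
        if 0 <= lane < num_lanes:
            blockers.append(Vehicle(lane, ego.position, slow_speed))
    return blockers


def is_trapped(ego, others, num_lanes, look_ahead=40.0, side_clearance=10.0):
    """Return True if a slower vehicle is close ahead in the ego lane and
    every existing adjacent lane is occupied alongside the ego vehicle."""
    blocked_ahead = any(
        v.lane == ego.lane
        and 0.0 < v.position - ego.position <= look_ahead
        and v.speed < ego.speed
        for v in others
    )
    if not blocked_ahead:
        return False
    for lane in (ego.lane - 1, ego.lane + 1):
        if not 0 <= lane < num_lanes:
            continue  # no such lane, so it cannot serve as an escape route
        occupied = any(
            v.lane == lane and abs(v.position - ego.position) <= side_clearance
            for v in others
        )
        if not occupied:
            return False  # at least one adjacent lane is free
    return True
```

A check like is_trapped could be used to label evaluation episodes, so that escape behavior is measured only on episodes where the ego vehicle was actually boxed in.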

Description

Timeline

Strategic Description / RD&T

Our research plan aims to increase system-level safety with a novel vehicle control framework based on Hierarchical Reinforcement Learning. The overarching objective is to build safety and efficiency in complex highway driving scenarios into the design of autonomous vehicle controllers. This aligns with the U.S. DOT focus on safe technology (page 19), transformative novel automation (pages 50 and 60), and AI and machine learning (page 57 et al.) in the 2022-2026 Research, Development, and Technology Strategic Plan.

To strike a balance between safety and efficiency, we will optimize the Hierarchical Reinforcement Learning framework and reward function. Through iterative experiments and fine-tuning, our objective is to develop a controller that maximizes safety while maintaining smooth and efficient driving performance.
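
The trade-off between safety and efficiency is typically expressed through weighted terms in the reward function. The following is a minimal sketch of such a reward, assuming a per-step scalar signal; the function name driving_reward, the choice of terms, and all weights and thresholds are hypothetical and are not taken from the project.

```python
def driving_reward(speed, target_speed, time_headway, jerk, collided,
                   w_speed=1.0, w_headway=0.5, w_jerk=0.1, w_collision=10.0,
                   min_headway=2.0):
    """Per-step reward combining efficiency, safety, and comfort terms.

    All weights and the 2-second minimum headway are illustrative assumptions.
    """
    # Efficiency: 1.0 when driving exactly at the target speed.
    speed_term = 1.0 - abs(speed - target_speed) / target_speed
    # Safety: penalize time headways shorter than the minimum, scaled to [0, 1].
    headway_penalty = max(0.0, (min_headway - time_headway) / min_headway)
    # Safety: large fixed penalty on collision.
    collision_penalty = 1.0 if collided else 0.0
    # Comfort: penalize abrupt changes in acceleration.
    jerk_penalty = abs(jerk)
    return (w_speed * speed_term
            - w_headway * headway_penalty
            - w_jerk * jerk_penalty
            - w_collision * collision_penalty)
```

Tuning then amounts to adjusting the weights: raising w_headway or w_collision pushes the learned behavior toward caution, while raising w_speed favors progress.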

In conclusion, our research plan aligns with the research priorities of data-driven system safety, with a particular emphasis on safe driving technology development. We strive to achieve safer and more human-friendly autonomous driving, contributing to the advancement of autonomous vehicle technology and its successful integration into the transportation ecosystem.

Deployment Plan

The research envisioned for the project will not immediately lead to deployment. However, we are in discussion with a member of the research and development division of a major automotive manufacturer with the hope of engaging a project advisor from the OEM. This person will participate in review meetings with our team at least twice during the project and provide comments and advice covering both technical considerations and issues related to the relevance of the research and the potential for its use in future production vehicles and vehicle automation systems.

Expected Outcomes/Impacts

The anticipated outcome of our research is the successful development and validation of a hierarchical reinforcement learning framework tailored for complex traffic scenarios. By incorporating safety factors into the design of reward functions, we aim to enhance the safety and decision-making capabilities of autonomous vehicles.

Through our experiments and simulations, we expect to demonstrate that our proposed hierarchical framework enables the autonomous agent to effectively navigate and explore the driving environment while accounting for safety considerations. Specifically, when the vehicle encounters slow-moving traffic and becomes stuck in a trap, we anticipate that the agent will autonomously identify safe escape routes, showcasing its enhanced decision-making skills in critical situations.

Furthermore, we will design and conduct driving scenarios to compare the performance of our hierarchical framework with traditional single-layer reinforcement learning approaches. We anticipate that the hierarchical structure will exhibit superior exploration capabilities, efficiently learning safe and optimal driving behaviors in complex traffic environments.

Overall, we expect our research to yield promising results that highlight the potential of hierarchical reinforcement learning for enhancing vehicle safety and exploration in challenging traffic scenarios. By addressing safety factors and integrating human-centric decision-making processes, our anticipated outcome will contribute to the advancement of safe and efficient autonomous driving technology, ultimately fostering a safer transportation ecosystem.

Expected Outputs

1. We will release an open-source codebase that embodies our novel hierarchical reinforcement learning framework for complex traffic scenarios. This will provide researchers and developers with a valuable resource to study, adapt, and build upon. 

2. Our research findings will be documented in research papers and conference presentations. By publishing our work, we aim to contribute to the advancement of knowledge in the field, sharing insights into the efficacy and innovation of our approach.

TRID

We selected four search terms that seemed to capture the relevant information in the TRID projects database: “vehicle control machine learning” (25 records), “reinforcement learning vehicle” (14 records), “deep learning vehicle” (31 records), and “explainable” (3 records). The search results are shown in the attached document. After discarding projects that were related to infrastructure, traffic control, maintenance, or strictly perception (marked with a dash in the search results) and after identifying duplicate projects among the four searches, we found twenty-six projects of potential relevance (marked A-Z in the search results) and identified nine as being the most relevant (marked D, G, J, K, L, N, R, S, and T).

Our research focuses on the development of an advanced vehicle control framework with an emphasis on machine learning technology. Among data-driven methods, supervised learning requires extensive dataset collection, while reinforcement learning generates its own data through simulation and learns driving strategies through interaction with the environment. A common limitation of prior work in reinforcement learning is its reliance on single-layer controllers, which lack higher-level decision-making capabilities. Furthermore, existing studies often focus on relatively simplistic traffic scenarios, limiting the generalizability of their findings.

To address these limitations, we introduce a novel hierarchical framework that endows the controller with enhanced interpretability and empowers it to effectively navigate intricate traffic environments. By incorporating high-level planning and decision-making abilities, the controller is able to adapt to complex traffic scenarios more effectively than single-layer counterparts, contributing to the advancement of safe and efficient autonomous driving technology.

Individuals Involved

Email	Name	Affiliation	Role	Position
redmill.1@osu.edu	Redmill, Keith	OSU	PI	Faculty - Tenured
yurtsever.2@osu.edu	Yurtsever, Ekim	OSU	Other	Faculty - Research/Systems

Budget

Amount of UTC Funds Awarded

$105,147.00

Total Project Budget (from all funding sources)

$160,749.00

Documents

Type	Name	Uploaded
Data Management Plan	dmp-Redmill-413-2023.pdf	Oct. 16, 2023, 1:37 p.m.
Project Brief	AnticipatedDeploymentActivities-Redmill-413_ULszo7g.pdf	Oct. 16, 2023, 1:41 p.m.
Progress Report	413_Progress_Report_2024-03-31	March 31, 2024, 5:35 p.m.
Final Report	Redmill_Keith_413.pdf	Aug. 28, 2024, 9:31 a.m.

Match Sources

No match sources!

Partners

Name	Type
Ohio Department of Transportation	Deployment Partner