#583 Humanoid as Human Assessment for the Safety of Vulnerable Road Users

Principal Investigator: Ding Zhao
Status: Active
Start Date: July 1, 2025
End Date: June 30, 2026

Project Type: Research Advanced
Grant Program: US DOT BIL, Safety21, 2023 - 2028 (4811)
Grant Cycle: Safety21 : 25-26
Visibility: Public

Abstract

Despite rapid advancements in autonomous driving and connected vehicle technology, significant challenges persist in the safety assessment of these new technologies. Ensuring the functional safety of these complex systems is essential, yet there remains a critical gap in methodologies that can comprehensively cover all necessary testing conditions. This gap is especially pronounced when evaluating scenarios involving vulnerable road users, defined as pedestrians, bicyclists, individuals using personal conveyances, and those in work zones, as outlined by the ANSI D16.1-2007 standard and U.S. safety regulations.

Three main challenges highlight the complexity of this task. First, critical scenarios involving vulnerable road users are rare in real-world settings, making it difficult to gather sufficient data for testing. Simulated environments often lack the fidelity to replicate nuanced interactions, complicating the evaluation of autonomous vehicle performance in high-risk situations. Second, despite advances in simulation technologies, a sim-to-real gap remains due to inherent limitations in simulating vehicle behavior and environmental complexity. Real-world testing is therefore required to validate simulation findings; without it, confidence in autonomous vehicle safety is undermined. Third, most real-world evaluations rely on simple dummies to represent pedestrians, failing to capture diverse, realistic human behaviors and interactions. This limits the ability to thoroughly evaluate autonomous systems under realistic conditions.

To the best of our knowledge, neither the academic community nor industry leaders have converged on a robust, standardized approach for conducting real-world evaluations that accurately reproduce complex, high-risk scenarios involving vulnerable road users and intricate traffic conditions. Addressing these challenges requires multidisciplinary research and innovative solutions that enhance both simulation fidelity and real-world testing methodologies.

Inspired by advancements in large language models (LLMs), we propose to develop a specialized foundation model for the evaluation of autonomous vehicles, focused on scenarios involving vulnerable road users. We aim to create a platform that integrates humanoid robots capable of simulating these road users' movements to test autonomous vehicle responses. Leveraging our expertise in critical scenario generation, we aim to build a foundation model that produces scenarios posing safety challenges to vulnerable road users. These scenarios will be generated randomly, conditionally based on text prompts, or through retrieval-augmented generation using specific scenarios as input.

We will also develop a platform with humanoid robots programmed to replicate the behaviors of pedestrians, scooter users, and individuals in wheelchairs. Our research will include developing algorithms to simulate the movements of pedestrians with specific physical limitations, ensuring a comprehensive range of real-world testing. Additionally, we will apply our prior work in accelerated evaluation to enhance testing efficiency. This initiative seeks to address current validation challenges and improve the safety and reliability of autonomous systems.
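
As a rough illustration of why accelerated evaluation can improve testing efficiency, rare critical events can be estimated with importance sampling: encounters are drawn from a skewed distribution in which critical events are frequent, and each sample is reweighted by a likelihood ratio. The sketch below is only a toy under stated assumptions. The exponential "severity" model, the threshold, and the function names are hypothetical and are not taken from the project.

```python
import math
import random

def crude_estimate(threshold, n, rng):
    """Plain Monte Carlo: fraction of naturalistic samples that are critical."""
    hits = sum(1 for _ in range(n) if rng.expovariate(1.0) > threshold)
    return hits / n

def importance_estimate(threshold, n, rng, proposal_rate):
    """Importance sampling: draw from a heavier-tailed Exponential(proposal_rate)
    and reweight each critical sample by p(x) / q(x)."""
    total = 0.0
    for _ in range(n):
        x = rng.expovariate(proposal_rate)  # skewed (accelerated) distribution q
        if x > threshold:
            # p(x) = exp(-x), q(x) = proposal_rate * exp(-proposal_rate * x)
            total += math.exp(-(1.0 - proposal_rate) * x) / proposal_rate
    return total / n

if __name__ == "__main__":
    rng = random.Random(0)
    true_p = math.exp(-10.0)  # exact tail probability for the toy model
    print("true   :", true_p)
    print("crude  :", crude_estimate(10.0, 20000, rng))
    print("skewed :", importance_estimate(10.0, 20000, rng, 0.1))
```

With the same number of samples, the crude estimator rarely observes a critical event at all, while the reweighted estimator concentrates its samples where the critical events occur.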

We have built a strong connection with PennSTART, which will deploy the humanoid VRU robots on its test track, expected to be ready in 2026. Over the next half-decade, Pittsburgh will face increased national and international competition even as it leverages its strong industrial background and leadership in the autonomy sector. By contributing to autonomous driving evaluation and focusing on vulnerable road user safety, this project aims to reinforce Pittsburgh’s position as a leader in the industry and support its continued innovation and growth.

Description

Timeline

Strategic Description / RD&T

Section left blank until USDOT’s new priorities and RD&T strategic goals are available in Spring 2026.

Deployment Plan

July - August 2025: VRU-Centered Scenario Generation Setup
July: Develop the scenario representation structure and humanoid motion representation framework. Finalize the system architecture for scenario setup in the simulation environment and define requirements and evaluation metrics.
August: Enhance existing critical scenario generation methodologies to create a VRU-centered critical scenario generation algorithm, utilizing a pre-trained vision-language model (VLM). Implement the algorithm to reconstruct collision scenarios from the NHTSA crash dataset, generating diverse VRU-centered critical scenarios based on scenarios without VRU involvement in the original dataset.
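
To make the idea of a scenario representation structure and an evaluation metric concrete, here is a minimal sketch. The `Actor` and `Scenario` classes, their fields, and the constant-velocity closest-approach metric are all assumptions chosen for illustration; the project's actual representation and metrics are not specified in this plan.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Actor:
    actor_id: str
    kind: str                       # e.g. "vehicle", "pedestrian", "wheelchair"
    position: Tuple[float, float]   # metres, in a shared ground frame
    velocity: Tuple[float, float]   # metres per second

@dataclass
class Scenario:
    description: str
    actors: List[Actor] = field(default_factory=list)

def min_separation(a: Actor, b: Actor, horizon_s: float) -> Tuple[float, float]:
    """Closest distance (m) between two constant-velocity actors within the
    horizon, and the time (s) at which it occurs."""
    rx, ry = b.position[0] - a.position[0], b.position[1] - a.position[1]
    vx, vy = b.velocity[0] - a.velocity[0], b.velocity[1] - a.velocity[1]
    vv = vx * vx + vy * vy
    # Unconstrained minimiser of |r + v t|, clamped to [0, horizon_s].
    t = 0.0 if vv == 0.0 else max(0.0, min(horizon_s, -(rx * vx + ry * vy) / vv))
    return math.hypot(rx + vx * t, ry + vy * t), t

if __name__ == "__main__":
    car = Actor("ego", "vehicle", (0.0, 0.0), (10.0, 0.0))
    ped = Actor("p1", "pedestrian", (30.0, -3.0), (0.0, 1.0))
    scenario = Scenario("pedestrian crossing ahead of vehicle", [car, ped])
    print(min_separation(car, ped, horizon_s=5.0))
```

A small closest-approach distance flags the scenario as a candidate critical scenario; a real metric would also account for actor geometry and vehicle response.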

September - November 2025: Foundation Model for Verification and Validation
September: Fine-tune the pre-trained VLM using the generated scenarios to create a robust foundation model, enhancing traffic data analysis and scenario generation capabilities. Employ this refined model to generate VRU-centered critical scenarios in an end-to-end manner, building a comprehensive critical scenario dataset based on NHTSA data.
October: Further fine-tune the foundation model by incorporating traffic policies and regulatory requirements. Ground these rules within generated scenarios to enable policy-aware scenario generation. Implement an algorithm that produces VRU-centered critical scenarios conditioned on traffic rule violations, creating an additional dataset focused on rule-violation scenarios.
November: Develop a user interface that enables users to select pre-generated testing scenarios, configure scenario parameters, and create new scenarios as needed.
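
The rule-violation conditioning planned for October can be pictured as labeling each scenario with the rules it violates and then selecting scenarios by label. The sketch below is a toy under stated assumptions: the flat scenario record, its field names, and the two hand-written rule predicates are hypothetical stand-ins, whereas the plan itself describes grounding rules in a fine-tuned foundation model.

```python
from typing import Callable, Dict, List

Scenario = Dict[str, float]  # hypothetical flat record; 1.0 / 0.0 encode flags

# Hypothetical rule predicates; real traffic rules are far richer than this.
RULES: Dict[str, Callable[[Scenario], bool]] = {
    "speeding": lambda s: s["vehicle_speed_mps"] > s["speed_limit_mps"],
    "failure_to_yield": lambda s: (
        s["pedestrian_in_crosswalk"] == 1.0 and s["vehicle_in_crosswalk"] == 1.0
    ),
}

def violated_rules(scenario: Scenario) -> List[str]:
    """Names of all rules the scenario violates, in sorted order."""
    return sorted(name for name, check in RULES.items() if check(scenario))

def select_by_violation(scenarios: List[Scenario], rule: str) -> List[Scenario]:
    """Keep only the scenarios that violate the requested rule."""
    return [s for s in scenarios if rule in violated_rules(s)]

if __name__ == "__main__":
    data = [
        {"vehicle_speed_mps": 15.0, "speed_limit_mps": 11.0,
         "pedestrian_in_crosswalk": 1.0, "vehicle_in_crosswalk": 0.0},
        {"vehicle_speed_mps": 8.0, "speed_limit_mps": 11.0,
         "pedestrian_in_crosswalk": 1.0, "vehicle_in_crosswalk": 1.0},
    ]
    for s in data:
        print(violated_rules(s))
```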

December 2025 - March 2026: Intelligent Humanoid and Uniform VRU Platform Development
December: Initiate the development of a retrieval-augmented generation (RAG) mechanism for motion synthesis. Segment the humanoid body into distinct parts and learn latent distributions for each. Implement retrieval methods for conditioned motion generation that go beyond text similarity and incorporate manual or automated labeling for VRU-specific conditioned retrieval.
January: Enhance the combination of retrieved motions by developing embeddings for body parts and using a Transformer encoder for whole-body motion reconstruction. Test and adjust the integration of this approach to handle complex and dynamic VRU scenarios.
February: Integrate the open-loop generated motions with physics-based tracking controllers to ensure physically plausible motion. Refine the low-level tracking controller to handle diverse and physically feasible motion scenarios, addressing current limitations in motion tracking.
March: Improve the retrieval-augmented motion synthesis through iterative feedback within simulations, optimizing generated motions and tracking controllers. Begin integrating generation and tracking in a closed-loop control system, drawing on methods such as CLoSD and PhysDiff for robust character control.
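
The part-wise retrieval described for December can be sketched as follows: filter a motion library by a VRU label, then pick, for each body part, the clip whose part embedding is closest to the query. This is a minimal sketch under stated assumptions. The clip dictionary layout, the two-dimensional embeddings, and the cosine-similarity ranking are hypothetical; the plan calls for learned latent distributions and a Transformer encoder to fuse the retrieved parts.

```python
import math
from typing import Dict, List, Optional, Sequence

def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 0.0 if nu == 0.0 or nv == 0.0 else dot / (nu * nv)

def retrieve_per_part(
    query: Dict[str, Sequence[float]],
    clips: List[dict],
    vru_label: str,
) -> Dict[str, Optional[str]]:
    """For each body part in the query, return the name of the best-matching
    clip among those tagged with vru_label (None if no clip carries the label)."""
    candidates = [c for c in clips if vru_label in c["labels"]]
    result: Dict[str, Optional[str]] = {}
    for part, q in query.items():
        best = max(candidates, key=lambda c: cosine(q, c["parts"][part]), default=None)
        result[part] = None if best is None else best["name"]
    return result

if __name__ == "__main__":
    library = [
        {"name": "walk_slow", "labels": {"pedestrian"},
         "parts": {"legs": [1.0, 0.0], "arms": [0.0, 1.0]}},
        {"name": "limp", "labels": {"pedestrian"},
         "parts": {"legs": [0.0, 1.0], "arms": [1.0, 0.0]}},
        {"name": "roll", "labels": {"wheelchair"},
         "parts": {"legs": [0.0, 1.0], "arms": [0.0, 1.0]}},
    ]
    query = {"legs": [0.1, 0.9], "arms": [0.2, 0.8]}
    print(retrieve_per_part(query, library, "pedestrian"))
```

In this toy run the legs and arms are retrieved from different clips, which is the situation the January step (whole-body reconstruction from combined parts) is meant to handle.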

April - June 2026: Real-World VRU-Centered Critical Scenario Deployment
April: Combine the scenario generation module with the humanoid robot platform to create a VRU-centered critical scenario deployment system for real-world autonomous vehicle evaluation. Implement a communication module to facilitate interaction between these components.
May: Fine-tune the foundation model to integrate knowledge from the intelligent humanoid and uniform VRU platform, enabling seamless end-to-end evaluation from scenario generation to real-world scenario replication.
June: Deploy the integrated evaluation system in real-world settings and conduct final calibration and adjustments. Collect feedback from real-world evaluations and further refine the system to enhance its robustness and accuracy.
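
The communication module planned for April could, in its simplest form, exchange serialized motion commands between the scenario generator and the robot platform. The following is a minimal sketch, assuming a JSON payload; the `MotionCommand` fields, the validation rule, and the choice of JSON are hypothetical and not specified by the plan.

```python
import json
from dataclasses import asdict, dataclass
from typing import List

@dataclass
class MotionCommand:
    scenario_id: str
    vru_type: str                  # e.g. "pedestrian", "wheelchair"
    start_time_s: float            # delay before the robot starts moving
    waypoints: List[List[float]]   # [[x, y], ...] in metres

def encode(cmd: MotionCommand) -> str:
    """Serialize a command to a JSON string with stable key order."""
    return json.dumps(asdict(cmd), sort_keys=True)

def decode(payload: str) -> MotionCommand:
    """Parse a JSON payload and reject commands without any waypoint."""
    cmd = MotionCommand(**json.loads(payload))
    if not cmd.waypoints:
        raise ValueError("command must contain at least one waypoint")
    return cmd

if __name__ == "__main__":
    cmd = MotionCommand("demo-001", "pedestrian", 2.5, [[30.0, -3.0], [30.0, 3.0]])
    wire = encode(cmd)
    print(wire)
    print(decode(wire) == cmd)
```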

Expected Outcomes/Impacts

The primary anticipated outcome of this project is the development of a comprehensive foundation model and a uniform VRU platform utilizing a humanoid robot. This integrated system will include a foundation model enabling users to specify scenario requirements; a simulation environment for selecting critical scenarios; a humanoid robot platform capable of executing these scenarios and performing physical evaluations for autonomous vehicles; and an overarching system that connects these components and collects evaluation data. 

The foundation model will empower users to create customized scenarios using multi-modal inputs such as images, text descriptions, or structured reference scenarios. The uniform platform will enable the simulation of various types of vulnerable road users, including pedestrians, cyclists, scooter users, and individuals using wheelchairs or strollers. We will collaborate with PennSTART to design and implement the real-world evaluation system, ensuring comprehensive and realistic assessments of autonomous vehicle performance.

This research will deliver a solution for real-world evaluation of autonomous vehicles, focusing on the safety of vulnerable road users with greater behavioral fidelity than conventional dummies. Additionally, this work will result in an open-sourced foundation model for autonomous vehicle verification and validation, incorporating specialized capabilities for understanding traffic rules and the behavior of vulnerable road users.

Expected Outputs

- We plan to transfer the technology to PennSTART through Trustelligence LLC, a CMU spin-off company led by the PI.
- A dataset comprising real-world crash data paired with reconstructed scenarios, as well as a dataset of generated critical scenarios that present significant risks to vulnerable road users and include corresponding responsibility analyses based on traffic rules.
- A foundation model capable of following textual and visual guidance to conduct scenario analysis, identify responsibility according to traffic rules, and create critical scenarios based on multi-modal inputs.
- An AV evaluation system that leverages the foundation model to generate and execute critical scenarios involving vulnerable road users and process the resulting evaluation data.
- A uniform robot platform designed to autonomously perform the necessary behaviors of vulnerable road users and execute scenarios for AV evaluation, with built-in capabilities to collect and report evaluation results to the evaluation system.

TRID

According to the TRIS database, this project would mark the first foundation-model-powered initiative focused on autonomous vehicle (AV) physical evaluation programs specifically prioritizing the safety of vulnerable road users (VRUs). This project complements the following existing projects; the key differences are listed below.

Comparison with “Vehicle-in-Virtual-Environment (VVE) Method for Developing and Evaluating VRU Safety of Connected and Autonomous Driving” and “Vehicle-in-Virtual-Environment (VVE) Method for Developing and Evaluating VRU Safety of Connected and Autonomous Driving with Year 2 Focus on Bicyclist Safety”:
The first distinction is that, rather than focusing on coarse-grained VRU categories such as pedestrians, cyclists, and persons on personal conveyance, our project will include more fine-grained groups, particularly minor groups and those with different behaviors. Second, unlike VVE’s focus on virtual environments, this project aims to develop both the software and hardware components necessary for a physical evaluation program. Third, our project will utilize the reasoning capabilities of the foundation model to create realistic scenarios that present significant risks to VRUs, enhancing the accuracy and applicability of scenario generation.

Comparison with “Safety in Connected Automated Vehicles in the presence of Vulnerable Road Users”:
Project #451 studied the specific case of “pedestrians who may stop or turn back while crossing the street” and the safety issues associated with this type of scenario. In contrast, this project will focus on building an evaluation system to assess autonomous vehicles’ performance under critical scenarios involving vulnerable road users. Beyond that specific safety issue, we focus on creating diverse, high-fidelity critical scenarios via a foundation model and building a uniform robot platform.

Comparison with “Cooperative connected intelligent vehicles and infrastructure for road safety applications”: 
Rather than improving the safety of vulnerable road users via the cooperation and communication of intelligent vehicles and infrastructure, this project focuses on the evaluation of autonomous vehicles and on building an automatic and uniform evaluation system.

Comparison with “Simulator Evaluation of Emerging Countermeasures for Speed Management to Reduce Conflicts with Vulnerable Users”:
Rather than addressing the speed management of vehicles, this project aims to develop an evaluation system to verify and validate autonomous vehicles’ performance under critical scenarios involving vulnerable road users.

Individuals Involved

Email	Name	Affiliation	Role	Position
dingzhao@cmu.edu	Zhao, Ding	Carnegie Mellon University	PI	Faculty - Untenured, Tenure Track

Budget

Amount of UTC Funds Awarded

$100,000.00

Total Project Budget (from all funding sources)

$200,000.00

Documents

Type	Name	Uploaded
Data Management Plan	data_management_plan_1.docx	Nov. 20, 2024, 2:40 p.m.

Match Sources

No match sources!

Partners

Name	Type
PennSTART	Deployment Partner