Project: #511 Cooperative Sensing of Vulnerable Road Users and Real-time Response to Potential Collisions via Connected Vehicle and Infrastructure Communication
Progress Report - Reporting Period Ending: April 1, 2025
Principal Investigator: Stephen Smith <sfs@cs.cmu.edu>
Status: Overdue Project

Start Date: July 1, 2024
End Date: June 30, 2025

Research Type: None
Grant Type: Research Applied
Grant Program: US DOT BIL, Safety21, 2023 - 2028 (4811)
Grant Cycle: Safety21 : 24-25 


Progress Report  (Last Updated: Aug. 28, 2025, 9:06 p.m.)
% Project Completed to Date: None
% Grant Award Expended: None
% Match Expended & Document: None

USDOT Requirements

Accomplishments
The project has two overarching objectives: (1) development of an end-to-end framework for cooperative perception, prediction and planning at intersections that incorporates data from the sensors of connected autonomous vehicles (CAVs) moving through the intersection, from infrastructure sensors mounted at the intersection, and from vulnerable road users connected to the intersection; and (2) investigation of accident mitigation strategies for CAVs when responding to potential collisions that have been identified by the cooperative perception pipeline. 

Work to date has focused principally on the first objective of developing an end-to-end framework for cooperative perception, prediction and planning. Building on our previous work on cooperative perception by multiple CAVs, we specified a new framework that uses a multi-modal large language model (LLM) to integrate information obtained by various CAV and infrastructure sensors and build a model of the current state of the intersection. The LLM can then answer queries from individual CAVs about, for example, the presence of occluded objects and vehicles and the current trajectories of various vehicles. The LLM is situated centrally at the intersection and is assumed to interact with vehicles and infrastructure sensors via V2X technology. We have extended a contemporary open source data set (developed originally for a setting of an LLM interacting with just a single vehicle) to include multiple vehicles in addition to infrastructure sensors, and have used this extended data set to create a benchmark for future research. This work is described in [1].

[1] Hsu-kuang Chiu, Ryo Hachiuma, Chien-Yi Wang, Stephen F. Smith, Yu-Chiang Frank Wang, and Min-Hung Chen, "V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models", arXiv preprint, arXiv:2502.09980, Feb. 2025.

Impacts
To date, the impact of the work has mainly been to increase the body of scientific knowledge on the use of large language model technology to improve the safety of travelers moving through signalized intersections. LLM technology has emerged as a powerful means of accumulating large knowledge models and bringing them to bear to solve interpretation and prediction problems in new contexts, and hence it is important to understand what utility this technology might have in transportation safety applications. This project is contributing to building this understanding.

Other
Website: https://eddyhkchiu.github.io/v2vllm.github.io/ - this website provides additional information on the V2V-LLM work summarized above, and is also where the extended data set will be made available to other researchers.

Outcomes


New Partners
Through an internship carried out by CMU Ph.D. student Hsu-Kuang Chiu last summer and extending on a part-time basis through the 2024-25 academic year, Nvidia has collaborated with us on this project, contributing GPU computing power and in-kind support for collaborating researchers at Nvidia.


Issues
The focus of the proposed end-to-end perception pipeline was shifted to explore the use of LLMs, following the advent of other work that has shown their advantages in this context.