Project

#403 Robust Automatic Detection of Traffic Activity from Vehicle Perspectives


Principal Investigator
Alex Hauptmann
Status
Completed
Start Date
July 1, 2022
End Date
June 30, 2023
Project Type
Research Advanced
Grant Program
FAST Act - Mobility National (2016 - 2022)
Grant Cycle
2022 Mobility21 UTC
Visibility
Public

Abstract

The accurate detection and prediction of actions by multiple traffic participants such as pedestrians, vehicles, cyclists, and others is a critical prerequisite for enabling self-driving vehicles to make autonomous decisions. Current approaches to teaching an autonomous vehicle how to drive use reinforcement learning, which essentially relies on previously collected situations as examples. These approaches depend purely on visual similarity, with no understanding of the semantics of the situation and therefore no ability to reason about similar situations that happen to look different. This limitation can be overcome by methods that provide situation awareness to the vehicle. The idea is to enable semantically meaningful representations of road scenarios that include the physical layout of the scene and the prior and current activities of the various participants. The ability to abstract this semantic representation and apply it to multiple scenes that are conceptually similar allows much more robust decision-making strategies by autonomous vehicles. In essence, this endows autonomous vehicles with a reasoning process.

    
Description
We proposed to build an efficient, robust spatio-temporal activity detection system for extended video and road activity detection. The proposed system is composed of a four-stage framework: Proposal Generation, Proposal Filtering, Activity Recognition, and Activity De-duplication. The major difference from prior work is the concept of cube proposals. Rather than simply adopting tube proposals, i.e., cropped trajectories of detected and tracked objects, we propose to merge and crop the area of detected objects across frames.
The proposed system will provide real-time activity detection for unconstrained video streams of road scenes and be robust across different road scenarios.
We will implement overlapping spatio-temporal cubes as the core concept of road activity proposals, ensuring coverage and completeness of activity detection through oversampling.
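For illustration, the overlapping cube-proposal idea can be sketched in a few lines of Python. The code below is a minimal, hypothetical sketch that assumes tracked object boxes are already available; the function names, window length, and stride are placeholders chosen for exposition, not details of the project's actual implementation (see the Argus++ publication listed under Documents for the real system).

```python
import numpy as np

# A cube proposal merges the boxes of a tracked object over a window of frames
# into one spatio-temporal crop region (x1, y1, x2, y2, t_start, t_end).

def merge_boxes_to_cube(boxes, t_start, t_end):
    """Merge per-frame boxes (N x 4 array of x1, y1, x2, y2) into one cube."""
    boxes = np.asarray(boxes, dtype=float)
    x1, y1 = boxes[:, 0].min(), boxes[:, 1].min()
    x2, y2 = boxes[:, 2].max(), boxes[:, 3].max()
    return (x1, y1, x2, y2, t_start, t_end)

def overlapping_windows(num_frames, window=64, stride=16):
    """Yield overlapping temporal windows; oversampling the timeline covers
    activity boundaries that a single non-overlapping segmentation could miss."""
    t = 0
    while t < num_frames:
        yield t, min(t + window, num_frames)
        t += stride

def generate_cube_proposals(track_boxes, window=64, stride=16):
    """Stage 1 (Proposal Generation): one cube per track per overlapping window.
    `track_boxes` maps track_id -> {frame_index: box}."""
    num_frames = 1 + max(f for boxes in track_boxes.values() for f in boxes)
    proposals = []
    for t0, t1 in overlapping_windows(num_frames, window, stride):
        for track_id, boxes in track_boxes.items():
            in_window = [b for f, b in boxes.items() if t0 <= f < t1]
            if in_window:
                proposals.append((track_id, merge_boxes_to_cube(in_window, t0, t1)))
    return proposals

# Example: a single tracked vehicle drifting to the right over 100 frames.
track_boxes = {0: {f: (10 + f, 20, 50 + f, 60) for f in range(100)}}
for track_id, cube in generate_cube_proposals(track_boxes)[:3]:
    print(track_id, cube)
```

In the full pipeline described above, such cubes would then pass through Proposal Filtering, Activity Recognition, and Activity De-duplication to produce the final detections.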
An early version of this system, tailored to human activity detection only, has achieved outstanding performance in a series of activity detection benchmarks, such as the TRECVid 2021 challenge on activity analysis in extended surveillance video.
Timeline
Months 1-3: Gather and preprocess data sets, start pretraining on annotated data
Months 4-6: Experiments with implementation of different algorithms for road activity detection
Months 7-9: Improve system, participate in public challenges for automatic road activity analysis
Months 10-12: Improve system for final deliverable and report
Strategic Description / RD&T

    
Deployment Plan
GM will use the algorithms we develop. They will validate our results on our test data and also apply the algorithms to their own proprietary data. 

The research will be adapted based on GM's feedback from their evaluation on proprietary data.
Expected Outcomes/Impacts
Our primary metric will be detection accuracy on established datasets such as the NVIDIA AI City Challenge data and the ICCV Road Challenge data, based on the metrics already in use for these datasets.

Expected Outputs

    
TRID


    

Individuals Involved

Email Name Affiliation Role Position
ablair2@andrew.cmu.edu Blair, Allison CMU Other Staff - Business Manager
alex@cs.cmu.edu Hauptmann, Alex CMU/LTI PI Faculty - Research/Systems

Budget

Amount of UTC Funds Awarded
$100,000.00
Total Project Budget (from all funding sources)
$146,000.00

Documents

Type Name Uploaded
Data Management Plan DMP_-_Robust_Automatic_Detection_of_Traffic_Activity_from_Vehicle_Perspectives.pdf Dec. 7, 2022, 12:24 p.m.
Publication Msnet: A multilevel instance segmentation network for natural disaster damage assessment in aerial videos March 30, 2023, 5:31 a.m.
Publication Scene graphs: A survey of generations and applications March 30, 2023, 5:31 a.m.
Publication Subspace Representation Learning for Few-shot Image Classification March 30, 2023, 5:32 a.m.
Publication Argus++: Robust real-time activity detection for unconstrained video streams with overlapping cube proposals March 30, 2023, 5:33 a.m.
Publication Trm: Temporal relocation module for video recognition March 30, 2023, 5:33 a.m.
Publication Rethinking spatial invariance of convolutional networks for object counting March 30, 2023, 5:34 a.m.
Publication Vehicle and Pedestrian Trajectory and Gap Estimation for Traffic Conflict Prediction March 30, 2023, 5:35 a.m.
Publication Deep discrete cross-modal hashing with multiple supervision March 30, 2023, 5:35 a.m.
Publication Video pivoting unsupervised multi-modal machine translation March 30, 2023, 5:36 a.m.
Publication MAGVIT: Masked Generative Video Transformer March 30, 2023, 5:36 a.m.
Final Report 403_-_Final_Report.pdf July 5, 2023, 12:31 p.m.

Match Sources

No match sources!

Partners

Name Type
General Motors Deployment Partner