
Project

#42 Enhanced pedestrian and vehicle detection using surround-view camera system


Principal Investigator
Vijayakumar Bhagavatula
Status
Completed
Start Date
Jan. 1, 2016
End Date
Dec. 31, 2016
Project Type
Research Advanced
Grant Program
MAP-21 TSET National (2013 - 2018)
Grant Cycle
2016 TSET UTC
Visibility
Public

Abstract

Being informed of the objects in the vicinity of a vehicle is critical to maintaining safety while driving. Vision is one of the major sources of sensing when humans drive. However, many traffic accidents (e.g., backup crashes in parking lots and driveways, lane-merging accidents on urban roads and highways) are caused by inadequate visibility. Vehicle camera systems can help improve drivers' situational awareness. Reduced driver attention is another major cause of accidents: even with adequate visualization of the vehicle's surroundings, drivers may still ignore such information and not take the necessary safety actions. Such attention lapses can be due to many factors, such as fatigue, alcohol, texting, and other distractions. In such situations, computer vision techniques that analyze the images for potential safety problems are likely to significantly reduce the chances and impacts of accidents. Many current automotive vision approaches employ cameras that look in front of or behind the car, while others employ vision systems that monitor the blind zones. While such vision systems can improve safety by detecting objects in their separate fields of view, they work independently. By using an integrated surround-view camera system that employs four synchronized cameras (covering 360 degrees), we can obtain a bird's-eye view of the vehicle's surroundings. The resulting integrated surround-view images allow us to achieve improved object (e.g., vehicle, pedestrian) detection performance in challenging scenarios (e.g., when a pedestrian is too close to the vehicle, or when other vehicles are present in adjacent narrow lanes). The goal of this project is to develop algorithms that detect and track objects (in particular, pedestrians and vehicles) in these integrated surround-view images and videos and to demonstrate the superior object detection performance that such systems offer.
Description
Motivation
Being informed of the objects in the vicinity of a vehicle is critical to maintaining safety while driving. Vision is one of the major sources of sensing when humans drive. However, many traffic accidents (e.g., backup crashes in parking lots and driveways, lane-merging accidents on urban roads and highways) are caused by inadequate visibility. Vehicle camera systems can help improve drivers' situational awareness. Reduced driver attention is another major cause of accidents: even with adequate visualization of the vehicle's surroundings, drivers may still ignore such information and not take the necessary safety actions. Such attention lapses can be due to many factors, such as fatigue, alcohol, texting, and other distractions. In such situations, computer vision techniques that analyze the images for potential safety problems are likely to significantly reduce the chances and impacts of accidents. Many current automotive vision approaches employ cameras that look in front of or behind the car, while others employ vision systems that monitor the blind zones. While such vision systems can improve safety by detecting objects in their separate fields of view, they work independently. By using an integrated surround-view camera system that employs four synchronized cameras (covering 360 degrees), we can obtain a bird's-eye view of the vehicle's surroundings. The resulting integrated surround-view images allow us to achieve improved object (e.g., vehicle, pedestrian) detection performance in challenging scenarios (e.g., when a pedestrian is too close to the vehicle, or when other vehicles are present in adjacent narrow lanes). The goal of this project is to develop algorithms that detect and track objects (in particular, pedestrians and vehicles) in these integrated surround-view images and videos and to demonstrate the superior object detection performance that such systems offer.

Previous Research
The objective of our current research project is to develop automated object detection algorithms for videos captured using rear-view cameras. Accurate detection of undesired objects behind a parked vehicle can be used to trigger alarms, which can be immensely useful in reducing the chances of backover accidents. As part of this effort, we have collected mannequin-based data using a high-definition GoPro Hero video camera mounted on the rear of a 2011 Toyota Camry (above the license plate), as shown in Figure 1 (top). Videos were collected at different locations in Pittsburgh at different times of day (between 10 a.m. and 6 p.m.). The objective of this research is to detect the presence of objects, particularly children, behind a parked vehicle. To simulate this situation, child mannequins were used as subjects for data collection. Some resulting images are shown in Figure 1 (bottom).

We assume that we have access to a collection of images depicting a clear background (e.g., captured at the time of parking). Our goal is to detect objects of arbitrary shapes and sizes close to the rear of the vehicle. We use two different algorithms (background subtraction and ground surface classification) to address this problem and combine the predictions from both frameworks to compute the final prediction. Figure 2 (top) shows the true silhouette of a child mannequin and Figure 2 (bottom) shows the output of the object detection algorithm. In each image, a black pixel denotes a non-object pixel and a white pixel denotes an object pixel.

To quantify the performance of this object detection approach, we used three evaluation metrics: (1) Pixel Accuracy, the percentage of pixels that are predicted correctly; (2) True Positive Rate (TPR), the percentage of object pixels that are classified as object pixels (the higher the better); and (3) False Positive Rate (FPR), the percentage of non-object pixels that are classified as object pixels (the lower the better). When averaged over 121 test cases collected from 11 different locations, the mean pixel accuracy was 86.6% (with a 95% confidence interval of 4.1%), the mean TPR was 93.4% (1.4%), and the mean FPR was 14.6% (3.7%). While detecting objects in rear-view images is beneficial, driving safety will be improved significantly by having a 360-degree view of the vehicle's surroundings and detecting and tracking objects (in particular, vehicles and pedestrians) in such surround-view videos.
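For concreteness, the three metrics can be computed from binary prediction and ground-truth masks as in the following minimal NumPy sketch (an illustration of the metric definitions, not the project's actual evaluation code):

```python
import numpy as np

def detection_metrics(pred, truth):
    """Pixel accuracy, TPR, and FPR for one test case.

    pred, truth: boolean arrays of the same shape, where True marks an
    object pixel and False a non-object (background) pixel.
    """
    tp = np.logical_and(pred, truth).sum()    # object pixels correctly found
    fp = np.logical_and(pred, ~truth).sum()   # background flagged as object
    accuracy = (pred == truth).mean()         # fraction of correct pixels
    tpr = tp / max(truth.sum(), 1)            # higher is better
    fpr = fp / max((~truth).sum(), 1)         # lower is better
    return accuracy, tpr, fpr

# Toy usage with random masks in place of real detector output:
pred = np.random.rand(480, 640) > 0.5
truth = np.random.rand(480, 640) > 0.5
print(detection_metrics(pred, truth))
```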

Proposed Research
The goal of the proposed research is to develop algorithms that detect and track objects in integrated surround-view (i.e., 360-degree) images and videos and to demonstrate the superior object detection performance that such systems offer. Our focus will be on detecting pedestrians and vehicles. Most current computer vision models for pedestrian detection do not work well when pedestrians are close to the vehicle, as can happen in parking lots and driveways. Also, the appearance of a vehicle can change dramatically as it transitions from rear view to side view to front view. Current vehicle detection approaches are mostly aimed at detecting vehicles in the front view and will not work well with vehicles in other views. On the other hand, there are definite benefits to integrating the multiple camera views into a single surround view, since the motion of objects can be tracked more easily and accurately, enabling better prediction of where those objects are likely to be in the near future. This will allow us to predict possible collisions more accurately and alert the driver sooner.

As the first step in the proposed research, we have already set up a surround-view camera system on a Volkswagen Tiguan SUV. The system consists of four 180-degree fisheye wide-view cameras, each of which covers one side of the vehicle. As illustrated in Figure 3, the two side-view cameras are placed under the wing mirrors, while the remaining two are the front-view and rear-view cameras. An electronic control unit (ECU) takes the four channels of video and displays them on a monitor. To gather visual data, a Sensoray 2255 four-channel frame grabber converts the analog video signal from each view into digital form, and a single video is recorded in which each frame contains the four synchronized views. Examples of the recorded fisheye views are shown in Figure 4. The system covers the area surrounding the vehicle with almost no blind zones.

We propose to develop object detection and tracking algorithms using state-of-the-art computer vision and machine learning technologies. Considering the increasing interest in deploying computer vision technologies in vehicles, as well as the prevalence of vehicle cameras, the system to be developed will be not only technically feasible but also cost-effective and easy to promote. Our focus will be mainly on detecting and tracking pedestrians and other vehicles in surround-view images and videos. The initial focus will be on daylight scenarios, but we will later consider more challenging conditions such as rainy weather and nighttime. Examples of frames containing such challenging conditions are shown in Figure 5; note that the frames have been rectified to remove fisheye distortion (a sketch of this preprocessing appears below).

Our algorithms will go beyond object detection, as we intend to incorporate visual tracking. Visual tracking follows objects of interest through video sequences, and advanced tracking algorithms can handle scale changes, rotations, illumination variations, and occlusions. Tracking will significantly reduce the number of false alarms, making the system practical for customer use. Visual tracking can also assist in predicting where objects are likely to be in the near future, thus reducing the probability of collisions.
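As an illustration of that preprocessing, the following Python sketch splits a combined frame-grabber image into its four views and removes fisheye distortion with OpenCV. The file name, the 2x2 quadrant layout, and the intrinsic and distortion parameters are assumptions for illustration only; real values would come from calibrating each of the four cameras.

```python
import cv2
import numpy as np

# Hypothetical intrinsics for one fisheye camera; actual values would be
# estimated with a checkerboard calibration (e.g., cv2.fisheye.calibrate).
K = np.array([[220.0, 0.0, 320.0],
              [0.0, 220.0, 240.0],
              [0.0, 0.0, 1.0]])
D = np.array([0.05, -0.01, 0.002, -0.0005])  # fisheye distortion coefficients

def split_views(frame):
    """Split a combined frame into four synchronized views.
    Assumes a 2x2 quadrant layout; the real layout depends on the grabber."""
    h, w = frame.shape[:2]
    return [frame[:h // 2, :w // 2], frame[:h // 2, w // 2:],
            frame[h // 2:, :w // 2], frame[h // 2:, w // 2:]]

def rectify(view):
    """Remove fisheye distortion from one view using OpenCV's fisheye model."""
    h, w = view.shape[:2]
    map1, map2 = cv2.fisheye.initUndistortRectifyMap(
        K, D, np.eye(3), K, (w, h), cv2.CV_16SC2)
    return cv2.remap(view, map1, map2, interpolation=cv2.INTER_LINEAR)

cap = cv2.VideoCapture("surround_view.avi")  # hypothetical recording
ok, frame = cap.read()
if ok:
    views = [rectify(v) for v in split_views(frame)]
```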

Recent applications of deep learning methods have brought significant advances in the performance of computer vision algorithms. Given the direct correlation between detection performance and safety, we hope to address the detection problems using deep learning techniques and obtain significant improvements over traditional methods. We aim to develop a deep learning framework that takes not only object appearance as input but also motion information, to further boost the performance of the learned detectors. Another advantage of deep learning approaches is that, while training such networks can be computationally demanding, applying a trained network is relatively fast, as inference usually involves only computing inner products and applying pointwise nonlinearities. This speed will be extremely useful, since object detection and tracking must be accomplished in real time.
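As a rough illustration of such a framework, the PyTorch sketch below fuses an appearance stream (an RGB window) with a motion stream (e.g., a frame-difference map) before classification. The architecture, layer sizes, and class set are illustrative assumptions, not the network the project will actually train.

```python
import torch
import torch.nn as nn

class TwoStreamDetector(nn.Module):
    """Minimal sketch: fuse appearance and motion features for per-window
    classification. All architectural choices here are placeholders."""

    def __init__(self, num_classes=3):  # e.g., background, pedestrian, vehicle
        super().__init__()

        def stream(in_ch):
            # Small convolutional feature extractor for one input stream.
            return nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())

        self.appearance = stream(3)  # RGB window
        self.motion = stream(1)      # single-channel motion map
        self.classifier = nn.Linear(64 + 64, num_classes)

    def forward(self, rgb, motion):
        # Concatenate the two feature vectors, then classify the window.
        feats = torch.cat([self.appearance(rgb), self.motion(motion)], dim=1)
        return self.classifier(feats)

# Toy usage on one random 128x128 candidate window:
model = TwoStreamDetector()
scores = model(torch.randn(1, 3, 128, 128), torch.randn(1, 1, 128, 128))
```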
Timeline
0-3 months: Large-scale data collection using the surround-view camera system, and annotation.
4-6 months: Implementing appearance-based vehicle and pedestrian detectors.
7-9 months: Incorporating motion information to improve detection performance in surround-view images.
10-12 months: Optimizing detection jointly with both appearance and motion.
13-15 months: Improving the detection and tracking algorithms to handle challenging conditions such as rain and nighttime.
16-18 months: Speeding up the detection and tracking algorithms to achieve real-time performance.
19-24 months: Real-world testing and optimization of the system.
Strategic Description / RD&T

Deployment Plan
We have worked with GM engineers in the past on computer vision projects. If our research goals are met, we hope to work with them to deploy the results of our efforts. However, there are currently no agreements with GM for deployment of the proposed technology.
Expected Outcomes/Impacts
- Method(s) for creating integrated surround-view images and videos from the videos provided by the four synchronized cameras in the surround-view camera system
- Method(s) for enhanced detection and tracking of pedestrians in surround-view videos
- Method(s) for enhanced detection and tracking of vehicles in surround-view videos
- Method(s) for enhanced pedestrian and vehicle detection from surround-view videos taken in challenging conditions, including rain and nighttime
Expected Outputs

TRID

Individuals Involved

Email: kumar@ece.cmu.edu
Name: Bhagavatula, Vijayakumar
Affiliation: ECE
Role: PI
Position: Faculty - Research/Systems

Budget

Amount of UTC Funds Awarded
$55,219.00
Total Project Budget (from all funding sources)
$55,219.00

Documents

Type Name Uploaded
Progress Report 42_Progress_Report_2016-12-31 Sept. 27, 2017, 3:22 p.m.
Final Report UTC_42.pdf Nov. 30, 2018, 6:50 a.m.

Match Sources

No match sources!

Partners

No partners!