Road Vehicle Detection

Click to view the road vehicle detection video

Vehicle Detection Project

The goals / steps of this project are the following:

  • Train a CNN on vehicle and non-vehicle images.
  • Implement a sliding-window technique and use the trained classifier to search for vehicles in images.
  • Run the pipeline on a video stream and create a heat map of recurring detections frame by frame to reject outliers and follow detected vehicles.
  • Estimate a bounding box for each detected vehicle.

Here I will consider the rubric points individually and describe how I addressed each point in my implementation.


Writeup / README

1. Provide a Writeup / README that includes all the rubric points and how you addressed each one.

You're reading it!

Xception for Classification

I used the Xception network to classify vehicles.

First, I used the Keras predefined model without pre-trained weights.

import keras

# Xception architecture, randomly initialized, with a 2-class output layer
Xception = keras.applications.xception.Xception(
    include_top=True, weights=None, input_tensor=None,
    input_shape=None, pooling=None, classes=2)

Then I augmented the training data with the following settings:

        shear_range=0.1,
        zoom_range=0.1,
        horizontal_flip=True

I trained for 15 epochs.
The final val_acc was 0.9996, i.e. 99.96% accuracy on the whole image set. (I used the training set itself as the validation set, so this figure measures fit to the training data rather than generalization to unseen images.)

Then I tested the model on a single image.

The model output was class 1.

Sliding Window Search

1. Describe how (and identify where in your code) you implemented a sliding window search. How did you decide what scales to search and how much to overlap windows?

I used sliding windows at three scales.

win_list1 = slide_window(test_img,x_start_stop=(700,1300),
y_start_stop=(350,600),xy_overlap=(0.4,0.7))


win_list2 = slide_window(test_img,x_start_stop=(700,1300),
y_start_stop=(350,600),xy_window=(80,80),xy_overlap=(0.4,0.7))


win_list3 = slide_window(test_img,x_start_stop=(700,1300),
y_start_stop=(400,656),xy_window=(128,128),xy_overlap=(0.4,0.7))

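To make the roles of xy_window and xy_overlap concrete, here is a minimal sketch of how such a window generator could work. It is not the project's actual slide_window: the name slide_window_sketch, the 64x64 default window, and the 0.5 default overlap are assumptions made for illustration, and unlike the calls above it takes no image argument and requires explicit search bounds.

```python
def slide_window_sketch(x_start_stop, y_start_stop,
                        xy_window=(64, 64), xy_overlap=(0.5, 0.5)):
    """Return a list of ((x1, y1), (x2, y2)) windows tiling the search region.

    Hypothetical stand-in for the project's slide_window; defaults are assumed.
    """
    x_start, x_stop = x_start_stop
    y_start, y_stop = y_start_stop
    win_w, win_h = xy_window

    # Step size in pixels: a larger overlap fraction means a smaller step.
    x_step = max(1, int(win_w * (1 - xy_overlap[0])))
    y_step = max(1, int(win_h * (1 - xy_overlap[1])))

    windows = []
    # Only emit windows that fit entirely inside the search region.
    for y in range(y_start, y_stop - win_h + 1, y_step):
        for x in range(x_start, x_stop - win_w + 1, x_step):
            windows.append(((x, y), (x + win_w, y + win_h)))
    return windows


# Same region and parameters as win_list2 above.
example_windows = slide_window_sketch(
    x_start_stop=(700, 1300), y_start_stop=(350, 600),
    xy_window=(80, 80), xy_overlap=(0.4, 0.7))
```

Each returned window would then be cropped from the frame, resized to the classifier's input size, and passed to the CNN. Raising the overlap gives denser coverage at the cost of more classifier calls.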

2. Show some examples of test images to demonstrate how your pipeline is working. What did you do to optimize the performance of your classifier?

First, I collect all the positive bounding boxes for one image.

Then I generate a heat map from those boxes.

Next, I threshold the heat map.

I then label the connected regions of the thresholded heat map.

Finally, I draw one bounding box per labeled region to get the output.
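The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the helper names add_heat, apply_threshold, and boxes_from_labels, the toy detections, and the threshold value of 1 are all assumptions. Labeling uses scipy.ndimage.label.

```python
import numpy as np
from scipy import ndimage


def add_heat(heatmap, bbox_list):
    """Add 1 to every pixel inside each detected box (hypothetical helper)."""
    for (x1, y1), (x2, y2) in bbox_list:
        heatmap[y1:y2, x1:x2] += 1
    return heatmap


def apply_threshold(heatmap, threshold):
    """Zero out pixels whose heat is at or below the threshold."""
    result = heatmap.copy()
    result[result <= threshold] = 0
    return result


def boxes_from_labels(labels, n_objects):
    """Return one tight bounding box per labeled blob."""
    boxes = []
    for obj_id in range(1, n_objects + 1):
        ys, xs = np.nonzero(labels == obj_id)
        boxes.append(((int(xs.min()), int(ys.min())),
                      (int(xs.max()), int(ys.max()))))
    return boxes


# Toy data: two overlapping detections on one car, one isolated false positive.
detections = [((10, 10), (50, 50)), ((30, 30), (70, 70)), ((150, 20), (180, 50))]
heatmap = add_heat(np.zeros((100, 200), dtype=np.int32), detections)
hot = apply_threshold(heatmap, threshold=1)   # threshold value is illustrative
labels, n_objects = ndimage.label(hot)
vehicle_boxes = boxes_from_labels(labels, n_objects)
```

In this toy case only the region covered by both overlapping boxes survives the threshold, so the isolated detection is rejected and a single box is returned. A higher threshold rejects more false positives but also shrinks or drops weakly detected vehicles.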

Then I tested it on 6 test images.

Video Implementation

1. Provide a link to your final video output. Your pipeline should perform reasonably well on the entire project video (somewhat wobbly or unstable bounding boxes are ok as long as you are identifying the vehicles most of the time with minimal false positives.)

Here's a link to my video result

2. Describe how (and identify where in your code) you implemented some kind of filter for false positives and some method for combining overlapping bounding boxes.

I recorded the positions of positive detections in each frame of the video. From the positive detections I created a heatmap and then thresholded that map to identify vehicle positions. I then used scipy.ndimage.measurements.label() to identify individual blobs in the heatmap. I then assumed each blob corresponded to a vehicle. I constructed bounding boxes to cover the area of each blob detected.

To smooth the detections over time, I used a deque to store the bounding boxes from the most recent 6 frames.

In the process function, the buffer is updated like this:

    # Keep only the 6 most recent frames of detections
    previous_frame.append(result)
    if len(previous_frame) > 6:
        previous_frame.popleft()

    # Pool the boxes from all buffered frames
    all_result = []
    for frame_boxes in previous_frame:
        all_result += frame_boxes

This way, the boxes from the most recent 6 frames (the current frame included) all contribute to the heat map for the current frame.
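As a possible simplification, collections.deque accepts a maxlen argument that discards the oldest entry automatically, so the explicit length check is not needed. The sketch below assumes a hypothetical helper named pool_detections and made-up per-frame boxes. It is an alternative way to write the same buffering, not the code used in the project.

```python
from collections import deque

N_FRAMES = 6  # number of recent frames to pool, as in the writeup

# With maxlen set, appending to a full deque drops the oldest frame.
previous_frame = deque(maxlen=N_FRAMES)


def pool_detections(result):
    """Store this frame's boxes and return the boxes from all buffered frames."""
    previous_frame.append(result)
    return [box for frame_boxes in previous_frame for box in frame_boxes]


# Simulate 8 frames with one made-up detection each.
for frame_idx in range(8):
    boxes = [((frame_idx, 0), (frame_idx + 10, 10))]
    all_result = pool_detections(boxes)
```

After the eighth frame the buffer holds frames 2 through 7 only, so all_result contains six boxes. Because a real vehicle is detected in most consecutive frames while a false positive usually is not, pooling before thresholding lets the heat map separate the two.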

Discussion

1. Briefly discuss any problems / issues you faced in your implementation of this project. Where will your pipeline likely fail? What could you do to make it more robust?

This approach has to run the CNN on every sliding window, and because neighboring windows overlap heavily, much of that computation is redundant. I kept this simple approach for the project, but as a result the pipeline cannot process video in real time.

However, single-stage object detectors such as YOLO and SSD can do this efficiently and effectively, and I believe these better methods are what will be used on a real self-driving car.