Robot Detection and Tracking

Project 1: Robot vision tracking 

Introduction

      The goal of this project is to build a computer vision system that can detect and track an enemy with specific features. The enemy is a robot vehicle with a blue shell and a green health bar mounted at the rear. The following sections describe the detection and tracking steps in detail.

Detection

        Compared with the RGB model, HSV is a more common color space for computer vision, so the enemy can be detected by color. The program checks the HSV value of every pixel against the green and blue thresholds. Erode and dilate operations are then applied to eliminate noise points. Next, the connected component with the largest area is found, and the tightest rectangle is used to bound it. In this way, the connected components corresponding to the car shell and the health bar are found and bounded with rectangles of different colors.

        Since more rectangles than two may be found, the next step is to judge whether a detected object is really the enemy vehicle. Because of the vehicle's physical structure, it is an enemy only when the centroid of the health bar lies above the centroid of the car shell and the horizontal distance between the two centroids is smaller than a threshold value.
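The geometric check can be written as a small predicate. This is a sketch; the function name and the threshold `max_dx` are assumed placeholders, and the boxes are the (x, y, w, h) rectangles from the detection step.

```python
def is_enemy(shell_box, bar_box, max_dx=20):
    """Return True if the (shell, health bar) rectangle pair matches
    the enemy's structure: the bar centroid is above the shell
    centroid and their horizontal distance is below max_dx."""
    sx = shell_box[0] + shell_box[2] / 2
    sy = shell_box[1] + shell_box[3] / 2
    bx = bar_box[0] + bar_box[2] / 2
    by = bar_box[1] + bar_box[3] / 2
    # Image y grows downward, so "above" means a smaller y value.
    return by < sy and abs(bx - sx) < max_dx
```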

Tracking

        Recently the term optical flow has been co-opted by roboticists to cover related techniques from image processing and the control of navigation. The first method for estimating optical flow was proposed by Horn in 1981. In computer vision, a more widely used method is the Lucas-Kanade method, developed by Bruce D. Lucas and Takeo Kanade. It assumes that the displacement of the image contents between two nearby instants (frames) is small and approximately constant within a neighborhood of the point p under consideration. In practice, however, object motion is not always small enough to satisfy this assumption. J.-Y. Bouguet proposed a pyramidal implementation of the classical Lucas-Kanade algorithm in order to handle large motions of the object. Here I use Bouguet's algorithm, which combines the classic optical flow method with an image pyramid, to get a better tracking result.

Project 2: Robot vision tracking 2

Introduction

      The goal of this project is to build a computer vision system that can detect and track a UAV (ardrone). The key step is to identify the ardrone in a series of video frames taken by a CCD camera. In this project I apply the SURF/SIFT and HOG+SVM algorithms to recognize the target.

SURF/SIFT in tracking

First, I calculate the features of a group of sample images and obtain their descriptors. Second, I calculate the descriptor of every video frame. Third, I match each frame descriptor against the sample descriptors to obtain good matches. However, the results contain many wrong matches, so some work is done to refine them. Each match pairs a point in a sample picture with a point in the video frame. To reduce crossing matches, I set an angle limit on the matching lines. A matching line connects a point on the video frame with its corresponding point on the sample picture, so each line makes an angle with the horizontal. If the difference between the angles of two matching lines is smaller than 3 degrees, the two lines are treated as parallel. The refining method keeps the set of matches that contains the most parallel lines, which helps improve the SURF/SIFT matching result.
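The parallel-line filter can be sketched as follows. This is an illustration of the idea rather than the project's code: it takes matched point pairs (frame point, sample point), computes each matching line's angle as if the sample image were drawn side by side with the frame, bins the angles into 3-degree groups, and keeps the largest group. The function name, the drawing offset `offset_x`, and the use of fixed bins are assumptions.

```python
import math
from collections import Counter

def filter_parallel_matches(point_pairs, offset_x=640, bin_deg=3):
    """point_pairs: list of ((fx, fy), (sx, sy)) matched coordinates,
    frame point first. The sample image is imagined offset_x pixels
    to the right of the frame, as in a side-by-side match drawing.
    Returns the matches whose line angles fall in the most populated
    bin_deg-degree bin (treated as mutually parallel)."""
    def angle(pair):
        (fx, fy), (sx, sy) = pair
        return math.degrees(math.atan2(sy - fy, (sx + offset_x) - fx))

    bins = Counter(int(angle(p) // bin_deg) for p in point_pairs)
    if not bins:
        return []
    best_bin, _ = bins.most_common(1)[0]
    return [p for p in point_pairs if int(angle(p) // bin_deg) == best_bin]
```

Crossing matches produce lines at noticeably different angles, so they land in sparsely populated bins and are discarded.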

HOG+SVM

        The histogram of oriented gradients (HOG) is a feature descriptor used in computer vision and image processing for object detection. According to the survey from Caltech, HOG is among the most powerful descriptors for pedestrian detection. Here I use HOG as the feature descriptor and an SVM as the classifier to locate the ardrone in the video frames. I train my own SVM classifier on 1380 positive samples and 2500 negative samples, and the detection function defined in the HOG class is used to detect the target.