Adaptive key SURF feature extraction and application in unmanned vehicle dynamic object recognition

2015-04-24 05:30:24 DU Mingfang (杜明芳), WANG Junzheng (王军政), LI Jing (李静), LI Nan (李楠), LI Duoyang (李多扬)

Journal of Beijing Institute of Technology, 2015, Issue 1

DU Ming-fang (杜明芳)1,2, WANG Jun-zheng (王军政), LI Jing (李静)1, LI Nan (李楠)1, LI Duo-yang (李多扬)1

(1.Key Laboratory of Intelligent Control and Decision of Complex System, Beijing Institute of Technology, Beijing 100081, China; 2.Automation School, Beijing Union University, Beijing 100101, China)

A new method for finding key SURF (speeded-up robust features) features, based on an adaptive Hessian matrix threshold, is proposed and applied to dynamic object recognition and guided navigation for an unmanned vehicle. First, the object recognition algorithm based on SURF feature matching for unmanned vehicle guided navigation is introduced. Then the standard local invariant feature extraction algorithm SURF is analyzed, with particular attention to the Hessian matrix, and an adaptive Hessian threshold method is proposed that uses feedback of the number of correct matching point pairs within a closed-loop framework. Finally, dynamic object recognition experiments under different weather and lighting conditions are discussed. The experimental results show that the key SURF feature extraction algorithm and the dynamic object recognition method can be used in unmanned vehicle systems.

dynamic object recognition; key SURF feature; feature matching; adaptive Hessian threshold; unmanned vehicle

Image feature matching is a core technology of real-time vision systems for image guidance and robot vision navigation. In such applications, once the template image and the real-time images are selected properly, the feature extraction and matching search algorithms are the keys to real-time performance and robustness. Many studies have pointed out that the more feature points are extracted from images with complex content, the more the accuracy of image matching is affected. Recognizing moving targets in complex scenes is even more difficult. Matching and recognition of moving objects depend mainly on extracting and tracking a stable feature set; a large number or high density of feature points is not a necessary condition for accurate identification. On the contrary, if the key feature points are extracted accurately and are highly discriminative, fewer points make the tracking system more reliable and stable. This resembles the efficient visual data screening mechanism known as visual saliency (also called the selective attention mechanism), an important result of many recent experiments in visual psychology and physiology. Researchers have previously proposed feature maps, saliency estimation, winner-take-all (WTA) neural networks, and the inhibition of return (IOR) method[1-3] to describe salient features, but these are difficult to put into practice. In this paper, a method of extracting key SURF features based on an adaptive Hessian threshold is proposed.

The Hessian threshold is studied against the background of urban road environment sensing for unmanned vehicles. The feature extraction method is applied to dynamic object recognition through feature matching on a C30 unmanned vehicle. Perception experiments on real road scenes in Beijing show that the proposed method can be used for dynamic object recognition and stable object tracking within an allowable error range.

1 Image feature matching in unmanned vehicle system

Guided navigation is an important way to realize autonomous navigation of an unmanned vehicle, and recognition of the navigation object is its basis[4]. In this paper, the navigation object (the car in front) is recognized through image feature matching. Fig.1 shows the unmanned vehicle navigation method using SURF feature tracking.

The flow of the object recognition algorithm based on SURF feature matching for the vehicle-mounted camera is shown in Fig.2.

Image matching technology involves three aspects: feature detection, the feature similarity measure, and the search strategy. In real-time applications, the matching speed can be improved in three ways[5-7]: reducing the total number of features involved in matching, that is, reducing and optimizing the feature space; reducing the amount of similarity computation; and reducing the number of matching search cycles. In this paper, the matching speed is improved through adaptive key SURF feature extraction and the sequential similarity detection algorithm (SSDA).
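The early-termination idea behind SSDA can be illustrated with a minimal sketch. It is a simplified variant and not necessarily the form used in this work: the absolute difference is accumulated row by row, and a candidate position is abandoned as soon as its partial error reaches the best error found so far. The function name `ssda_match` and the row-wise accumulation order are assumptions made for illustration.

```python
import numpy as np

def ssda_match(image, template):
    """Return (row, col, error) of the best match, abandoning poor candidates early."""
    image = np.asarray(image, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)
    th, tw = template.shape
    ih, iw = image.shape
    best = (0, 0, np.inf)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            err = 0.0
            for k in range(th):
                # Accumulate the absolute difference one template row at a time.
                err += np.abs(image[r + k, c:c + tw] - template[k]).sum()
                if err >= best[2]:
                    break  # partial error already too large: skip this position
            else:
                # The loop finished without a break, so this is the new best position.
                best = (r, c, err)
    return best
```

Most candidate positions are rejected after only a few rows, which is where the saving in similarity computation comes from.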

Fig.1 Guided navigation by the SURF feature tracking of leader vehicle

Fig.2 Dynamic object recognition based on SURF feature matching flowchart

The classic SURF feature detection algorithm is designed for static images and does not take into account the speed requirements of real-time image processing. In the SURF algorithm, the scale space is divided into octaves. An octave represents a series of filter response maps obtained by convolving the same input image with filters of increasing size[8]. In real-time applications where both the camera and the object are moving, such as object recognition and tracking for unmanned vehicles, the SURF feature detection algorithm needs to be improved to raise system performance.

2 Key SURF feature extraction algorithm based on Hessian matrix

2.1 SURF feature extraction algorithm

SURF (speeded-up robust features) was first proposed by Bay et al. in 2006 as an accelerated and improved version of SIFT (scale-invariant feature transform)[8-9]. The SIFT algorithm builds the Gaussian pyramid by repeatedly convolving the input image with a Gaussian kernel and down-sampling[10], so each layer depends on the previous one. The SURF algorithm operates on the integral image, uses box filters to approximate the second-order Gaussian derivative filters, and locates extreme points using the determinant of the Hessian matrix. Instead of down-sampling the image, it increases the size of the filter kernel, so that multiple scale-space images can be processed at the same time, which improves the performance of the algorithm.
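The reason box filters are cheap on an integral image can be seen in a short sketch: once the summed-area table is built, the sum over any rectangle costs four lookups regardless of the rectangle size. The helper names `integral_image` and `box_sum` are illustrative and not taken from the paper.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with an extra zero row and column at the top and left."""
    img = np.asarray(img, dtype=np.float64)
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, left, height, width):
    """Sum of img[top:top+height, left:left+width] using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]
```

A box filter response is then a weighted combination of a few such rectangle sums, so enlarging the filter does not increase the cost per pixel.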

SURF feature extraction steps are listed as follows.

① Construct the Hessian matrix and the multi-scale space.

② Detect the extreme points using the Hessian determinant.

③ Refine the positions of the feature points using the Hessian matrix threshold.

④ Determine the main direction of each feature point from the Haar wavelet responses.

In practical applications, if the image is not required to be rotation invariant over the full 360° range, the Haar wavelet responses can be calculated only in the range [-α, +α]. When an unmanned vehicle tracks a vehicle on an urban road, the road is almost flat and the rotation range of the object is small, so α=30° is enough. This can greatly improve the speed of the SURF algorithm.

⑤ Construct the SURF feature descriptor.

2.2 Analysis of Hessian matrix

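The Hessian matrix is the key to the SURF algorithm. For a pixel $\mathbf{x}=(x,y)$ in image $I$, the Hessian matrix $H$ at scale $\sigma$ is defined as

$$H(\mathbf{x},\sigma)=\begin{bmatrix} L_{xx}(\mathbf{x},\sigma) & L_{xy}(\mathbf{x},\sigma)\\ L_{xy}(\mathbf{x},\sigma) & L_{yy}(\mathbf{x},\sigma)\end{bmatrix} \qquad (1)$$

where $L_{xx}(\mathbf{x},\sigma)$ is the convolution of the second-order Gaussian derivative in $x$ with the image $I$ at point $\mathbf{x}$, and $L_{xy}(\mathbf{x},\sigma)$ and $L_{yy}(\mathbf{x},\sigma)$ are defined similarly.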

The determinant of the Hessian matrix is

$$\det(H)=L_{xx}L_{yy}-L_{xy}^{2} \qquad (2)$$

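The convolutions of the box filters with the image are denoted $D_{xx}$, $D_{yy}$, and $D_{xy}$. Using $D_{xx}$, $D_{yy}$, and $D_{xy}$ to replace $L_{xx}(\mathbf{x},\sigma)$, $L_{yy}(\mathbf{x},\sigma)$, and $L_{xy}(\mathbf{x},\sigma)$, respectively, the determinant of the Hessian matrix can be approximated as

$$\det(H)=D_{xx}D_{yy}-(\omega D_{xy})^{2} \qquad (3)$$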

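The weight coefficient $\omega$ is introduced to compensate for and balance the approximation error:

$$\omega=\frac{|L_{xy}(1.2)|_{F}\,|D_{yy}(9)|_{F}}{|L_{yy}(1.2)|_{F}\,|D_{xy}(9)|_{F}}\approx 0.9 \qquad (4)$$

where $|x|_{F}$ is the Frobenius norm. The determinant of the Hessian matrix then becomes

$$\det(H)=D_{xx}D_{yy}-(0.9D_{xy})^{2}$$

In actual use, $\omega$ is taken as a suitable constant.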

The determinant of the Hessian matrix is the product of its eigenvalues. The decision rule for local extreme points at different scales is as follows:

① det(H)<0 → the eigenvalues of H have opposite signs → (x,y) is not a local extreme point;

② det(H)>0 → the eigenvalues of H have the same sign → (x,y) is a local extreme point.
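As a small illustration of the approximated determinant with the weight 0.9 and the sign rule above, the following sketch evaluates the response from precomputed box-filter outputs. The function names are hypothetical, and the inputs are assumed to be scalars or NumPy arrays of equal shape.

```python
import numpy as np

def hessian_response(dxx, dyy, dxy, weight=0.9):
    """Approximate det(H) = Dxx * Dyy - (weight * Dxy)^2, element-wise."""
    dxx, dyy, dxy = (np.asarray(a, dtype=np.float64) for a in (dxx, dyy, dxy))
    return dxx * dyy - (weight * dxy) ** 2

def is_extremum_candidate(dxx, dyy, dxy, weight=0.9):
    """Sign rule: a positive determinant means the eigenvalues share a sign."""
    return hessian_response(dxx, dyy, dxy, weight) > 0
```

For example, `hessian_response(2.0, 3.0, 1.0)` evaluates 2·3 − 0.81 and passes the sign rule, while `Dxx = 1, Dyy = -1, Dxy = 0` gives a negative determinant and is rejected.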

Each candidate extreme point is compared with the other 26 points in its three-dimensional neighborhood (8 in the same scale layer and 9 in each of the two adjacent layers); only when it is greater than (or less than) all 26 points is it kept as a candidate feature.
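The 26-neighbor comparison can be sketched as follows, assuming the responses are stored in a 3-D array indexed as (scale, row, column) and that the tested point is not on the border of the array. The function name `is_local_extremum` is illustrative.

```python
import numpy as np

def is_local_extremum(stack, s, r, c):
    """True if stack[s, r, c] is strictly above or strictly below all 26 neighbours."""
    cube = stack[s - 1:s + 2, r - 1:r + 2, c - 1:c + 2]
    centre = cube[1, 1, 1]
    # Flattened 3x3x3 cube has 27 entries; index 13 is the centre itself.
    neighbours = np.delete(cube.ravel(), 13)
    return bool(np.all(centre > neighbours) or np.all(centre < neighbours))
```

Ties are rejected in this sketch, since a point equal to one of its neighbours is neither a strict maximum nor a strict minimum.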

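The fitting function used to locate the feature point precisely is

$$D(\mathbf{x})=D+\frac{\partial D^{T}}{\partial \mathbf{x}}\mathbf{x}+\frac{1}{2}\mathbf{x}^{T}\frac{\partial^{2}D}{\partial \mathbf{x}^{2}}\mathbf{x} \qquad (5)$$

where $\mathbf{x}=(x,y,\sigma)^{T}$ is the offset from the sample point. Taking the derivative and setting it to zero gives the extreme point

$$\hat{\mathbf{x}}=-\left(\frac{\partial^{2}D}{\partial \mathbf{x}^{2}}\right)^{-1}\frac{\partial D}{\partial \mathbf{x}} \qquad (6)$$

The value of the function at the extreme point is

$$D(\hat{\mathbf{x}})=D+\frac{1}{2}\frac{\partial D^{T}}{\partial \mathbf{x}}\hat{\mathbf{x}} \qquad (7)$$

If $|D(\hat{\mathbf{x}})|\le 0.03$, the point is regarded as a low-contrast feature point and can be eliminated. To improve the real-time performance and stability of matching and tracking, unstable feature points need to be eliminated further so that only key feature points are kept. This can be realized with the Hessian matrix threshold.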
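The precise positioning step can be sketched numerically, assuming the second-order Taylor fit commonly used for SIFT and SURF keypoint localisation: the offset is x̂ = −H⁻¹g and the fitted value is D(x̂) = D + ½ gᵀx̂, where g and H are the gradient and second-derivative matrix of the response at the sample point. The function names and the finite-difference inputs are assumptions for illustration, and the 0.03 limit is only meaningful if the responses are normalised in the way the text presumes.

```python
import numpy as np

def refine_extremum(value, gradient, hessian):
    """Return the sub-sample offset and the fitted response at the extreme point."""
    g = np.asarray(gradient, dtype=np.float64)
    h = np.asarray(hessian, dtype=np.float64)
    offset = -np.linalg.solve(h, g)        # x_hat = -H^{-1} g
    refined = value + 0.5 * g.dot(offset)  # D(x_hat) = D + 0.5 * g^T x_hat
    return offset, refined

def is_low_contrast(refined_value, limit=0.03):
    """Low-contrast rule from the text: discard when |D(x_hat)| <= limit."""
    return abs(refined_value) <= limit
```

With `value = 0.5`, `gradient = [0.2, 0.0, -0.4]` and `hessian = -2 * np.eye(3)`, the offset is `[0.1, 0.0, -0.2]` and the fitted value is 0.55, so the point would be kept.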

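The maximum eigenvalue $\alpha$ and the minimum eigenvalue $\beta$ of $H$ represent the gradients in the $X$ and $Y$ directions, respectively.

The trace of $H$ is

$$\mathrm{tr}(H)=D_{xx}+D_{yy}=\alpha+\beta \qquad (8)$$

The determinant of $H$ is

$$\det(H)=\alpha\beta \qquad (9)$$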

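If $\alpha=r\beta$, then

$$\frac{\mathrm{tr}(H)^{2}}{\det(H)}=\frac{(\alpha+\beta)^{2}}{\alpha\beta}=\frac{(r+1)^{2}}{r} \qquad (10)$$

$$\frac{\mathrm{tr}(H)^{2}}{\det(H)}<\varepsilon_{H},\quad \varepsilon_{H}=\frac{(r+1)^{2}}{r} \qquad (11)$$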

The greater $r$ is, the greater $\varepsilon_{H}$ is, and the looser the elimination condition becomes. Clearly, the fewer feature points are preserved, the better the real-time performance of the algorithm, but too few points can lead to system instability or even algorithm failure. The Hessian threshold should therefore be determined adaptively according to the characteristics and complexity of the application, which is of strong practical significance.
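A small numeric sketch makes the role of r concrete. Substituting α = rβ into tr(H) = α + β and det(H) = αβ gives tr(H)²/det(H) = (r+1)²/r, which grows with r for r ≥ 1. The code below assumes the standard ratio test used in SIFT-style detectors, in which a point is kept when tr(H)²/det(H) stays below this limit; the default r = 10 is an illustrative choice and is not taken from this paper.

```python
def ratio_limit(r):
    """(r + 1)^2 / r: the value of tr(H)^2 / det(H) when alpha = r * beta."""
    return (r + 1.0) ** 2 / r

def passes_ratio_test(dxx, dyy, dxy, r=10.0):
    """Keep a point only if its eigenvalue ratio does not exceed r."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        return False  # eigenvalues of opposite sign: not an extreme point
    return trace * trace / det < ratio_limit(r)
```

For r = 10 the limit is 12.1 and for r = 20 it is 22.05, so a larger r keeps more points, which matches the statement that the elimination condition becomes looser.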

2.3 Method for obtaining the adaptive Hessian threshold

Real road scene images show different characteristics under different weather, illumination, and time of day, so determining the Hessian matrix threshold by experimental testing alone is very complicated and cannot be generalized. A survey of studies using SURF features for object recognition and tracking shows that previous practice is essentially open-loop. Inspired by adaptive control systems, this study proposes the concept of closed-loop SURF and a method that determines the Hessian threshold adaptively from feedback of the number of correct matching point pairs. The principle is shown in Fig.3.

Fig.3 SURF closed-loop based adaptive Hessian threshold determination method

In Fig.3, $TN_{i}$ is the threshold on the total number of correct matching point pairs for different weather and lighting conditions (set according to the corresponding empirical measurements). If the number of correct matching point pairs is lower than $TN_{i}$, the system judges that the object cannot be identified. This reflects the application-driven design of the algorithm.
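The closed-loop idea can be sketched as a simple feedback search. The update rule is not specified in the text, so the version below is an assumption: it starts from a high (sparse) threshold and lowers it in fixed steps until the number of correct matching pairs fed back reaches the required minimum, and reports failure otherwise. The callable `count_correct_matches` and the parameters `start`, `step`, and `floor` are hypothetical.

```python
def adapt_hessian_threshold(count_correct_matches, min_pairs,
                            start=5000, step=500, floor=100):
    """Lower the threshold until enough correct matching pairs are fed back.

    count_correct_matches: hypothetical callable mapping a Hessian threshold
    to the number of correct matching point pairs it yields.
    min_pairs: the required count (TN_i in the text).
    """
    threshold = start
    while threshold >= floor:
        if count_correct_matches(threshold) >= min_pairs:
            return threshold  # sparsest tested setting that satisfies TN_i
        threshold -= step
    return None  # object judged not identifiable
```

Searching from high to low thresholds favours the sparsest feature set that still satisfies the matching requirement, which is in line with the goal of keeping only key features.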

3 Algorithm experiment and analysis

In this section, urban road images sensed by a C30 unmanned vehicle are used to demonstrate the effectiveness of our method. The sensing results can be used for dynamic object recognition and tracking against complex natural backgrounds. A C30 vehicle produced by Beijing Automotive Group was modified into the unmanned vehicle that serves as our research and experimental platform. The algorithms run on a hardware platform with an Intel i5 CPU clocked at 2 GHz.

3.1 Adaptive key SURF extraction

The SURF feature extraction results on a sample image under different Hessian thresholds are shown in Fig.4.

Fig.4 Feature extraction results under different Hessian thresholds(Hessian thresholds and feature point numbers of the first to the last image are: (100, 51), (300,33), (600,24), (1 000,18), (1 500,9), (2 000,7), (2 500,2), (3 000,2), (3 500,2), (4 000,1), (4 500,1), (5 000,1))

The feature extraction results under different Hessian thresholds show that the key SURF features are closely related to the Hessian threshold. The higher the Hessian threshold, the sparser the features. The sparse features that remain are the most discriminative visual features. Based on the experimental results, a new concept called the Hessian threshold node can be defined: a Hessian threshold at which the number of SURF feature points settles to a small constant value, such as 2 500 or 4 000 for the experimental vehicle image above.
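One way to make the threshold node concept operational is sketched below. It uses a working interpretation that is an assumption rather than the paper's definition: a node is a threshold at which the feature count drops relative to the previous sample and then stays the same at the next one. The data are the (threshold, feature count) pairs listed in the caption of Fig.4.

```python
def threshold_nodes(samples):
    """samples: (threshold, feature_count) pairs sorted by increasing threshold."""
    nodes = []
    for i in range(1, len(samples) - 1):
        prev_count = samples[i - 1][1]
        threshold, count = samples[i]
        next_count = samples[i + 1][1]
        # A node starts a plateau: the count has just dropped and then holds.
        if count < prev_count and count == next_count:
            nodes.append(threshold)
    return nodes

FIG4 = [(100, 51), (300, 33), (600, 24), (1000, 18), (1500, 9), (2000, 7),
        (2500, 2), (3000, 2), (3500, 2), (4000, 1), (4500, 1), (5000, 1)]
print(threshold_nodes(FIG4))  # [2500, 4000]
```

On these data the rule returns 2 500 and 4 000, the two values named in the text.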

The adaptive Hessian thresholds under different weather and lighting conditions are shown in Tab.1.

Tab.1 Adaptive Hessian threshold in different weathers with different light conditions

3.2 Key SURF feature matching under regular conditions

To test the effectiveness of the key SURF features extracted by the above method, different Hessian thresholds are selected for object recognition. Key SURF feature matching is used to search for the sample vehicle object in the ROI of the road scene image (PNG format, 512×288 pixels, 222 K). The experimental results are shown in Fig.5.

The feature points finally retained are the most discriminative ones, and they best ensure recognition of objects in the scene. Fig.4 shows that when the Hessian threshold is 4 500, the locations of the extracted SURF feature points are the same as when it is 5 000. These feature points are therefore the most discriminative features.

To clarify the characteristics and meaning of the data in Tab.1, the relations among the data are plotted in Fig.6.

Fig.5 Feature matching results under different Hessian thresholds(Hessian thresholds from the first to the last image are: 1 500, 2 000, 2 500, 3 000, 3 500, 4 000, 4 500, 5 000)

Fig.6 Relation between Hessian threshold and recognition time

The experimental results show that the Hessian threshold helps to find the sparsest and most stable SURF features. In this case matching errors do not arise, because the feature points are few and are the most stable ones. That the most salient low-level features can be determined by adjusting Hessian threshold nodes is a valuable conclusion, because finding salient features is itself a difficult but important task.

3.3 Key SURF feature matching under irregular conditions

The general criteria for testing feature robustness are whether the feature has scale invariance, rotation invariance, illumination invariance, and affine invariance. In this study, different conditions were tested: sunny days with low and high illuminance, rainy days with low and high illuminance, and night with very low illuminance. Interference from several vehicles was also tested. The experimental results show that the key SURF feature matching method is feasible. Surprisingly, with interference from many cars at night, the method can still robustly identify the specific object, with a better recognition effect than during the day. This is because, owing to the lighting, the features of the moving object become more prominent in night scene images and are therefore easier to match accurately. The matching results for the key features are shown in Fig.7.

Fig.7 Sparse saliency features matching when object size changed

Although sparse key SURF features help to recognize objects under most conditions, the method fails when the features are too sparse or the posture of the dynamic object changes. When the object attitude changes, for example when the vehicle in Fig.8 begins to turn right, the template image is no longer applicable and the matching area cannot be found accurately, so the template needs to be updated.

Fig.8 Matching failure examples

After the template is updated, the car object can be correctly identified, as shown in Fig.9.

Fig.9 Object correctly identified using new template

4 Conclusions

In recent years, indoor SLAM (simultaneous localization and mapping) based on SIFT features has been applied successfully to mobile robot navigation, but its application in outdoor environments has not yet been truly realized[11-13]. No successful case of outdoor mobile robot navigation using SURF features has yet been reported. This paper explores a method that uses SURF features for outdoor guided navigation of an unmanned vehicle. In this method, a sparse SURF feature extraction method based on an adaptive Hessian threshold is proposed, and moving object recognition based on feature matching is used to realize vision navigation. When there is no guidance or the perceptual conditions are very poor, the reactive navigation mode is more applicable; when the unmanned vehicle runs along a fixed road, guided navigation is more convenient and easier to realize. The method described in this paper therefore has strong practical value and great potential for improving the autonomous navigation of unmanned vehicles.

[1] Folker W, Harald W, Arjan K. Composing the feature map retrieval process for robust and ready-to-use monocular tracking[J]. Computers & Graphics, 2011, 35(4):778-788.

[2] Abdullah B, Sami A, Tolga C. A clustering-based method to estimate saliency in 3D animated meshes[J]. Computers & Graphics, 2014, 43: 11-20.

[3] Reuter-Lorenz P, Jha A, Rosenquist J N. What is inhibited in inhibition of return[J]. Journal of Experimental Psychology, 1996, 22 (2): 367-378.

[4] Sathiyanarayanan, Mithileysh. Self controlled robot for military purpose[J]. International Journal for Technological Research in Engineering, 2014, 1 (10): 1075-1077.

[5] Wang Shoukun, Li Delong, Guo Junjie, et al. Robot stereo vision calibration method with genetic algorithm and particle swarm optimization[J]. Journal of Beijing Institute of Technology, 2013, 22(2): 213-221.

[6] Bai Tingzhu, Hou Xibao. An improved image matching algorithm based on SIFT[J]. Journal of Beijing Institute of Technology, 2013,33(6):622-627. (in Chinese)

[7] Miao Lingjuan, Zhang Xuemin, Ma Xiaowei. An improved map matching algorithm for embedded vehicle navigation[J]. Journal of Beijing Institute of Technology, 2012,32(3):268-273. (in Chinese)

[8] Lowe D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60(2): 91-110.

[9] Bay H, Ess A, Tuytelaars T, et al. Speeded-up robust features (SURF)[J]. Computer Vision and Image Understanding, 2008, 110(3):346-359.

[10] Juan L, Gwun O. A comparison of SIFT, PCA-SIFT and SURF[J]. International Journal of Image Processing, 2009, 3: 187-245.

[11] Farzan N, Mohammad A B, Saeid P. Robust recognition against illumination variations based on SIFT[J]. Intelligent Robotics and Applications, 7508, 2012: 503-511.

[12] Se S, Lowe D G, Little J. Vision-based mobile robot localization and mapping using scale-invariant features[C]∥Proceedings of International Conference on Robotics and Automation, Seoul, Korea, 2001: 2051-2058.

[13] Se S, Lowe D G, Little J. Global localization using distinctive visual features[C]∥Proceedings of International Conference on Intelligent Robots and Systems, IROS 2002, Lausanne, Switzerland, 2002: 226-231.

(Edited by Wang Yuxia)

10.15918/j.jbit1004-0579.201524.0112

TP 391.41 Document code: A Article ID: 1004-0579(2015)01-0083-08

Received 2013-09-20

Supported by the National Natural Science Foundation of China(61103157); Beijing Municipal Education Commission Project (SQKM201311417010)

E-mail: wangjz@bit.edu.cn