Deep Learning and Metric Learning Based Person Re-identification
HOU Li, LIU Qi
(School of Information Engineering, Huangshan University, Huangshan, Anhui 245041, China)
【Abstract】The appearance of a pedestrian can vary greatly across cameras because of differences in illumination, viewpoint, and pose, which poses a serious challenge for person re-identification. This paper proposes a person re-identification method based on deep learning and metric learning. Features of pedestrian images are first extracted with a feature fusion net (FFN) that combines handcrafted and deep features; a kernel matrix is then applied to KISSME distance metric learning to obtain a better distance metric model. Experiments on two challenging public datasets, VIPeR and PRID450S, demonstrate the effectiveness of the proposed algorithm.
【Key words】Person re-identification; Feature fusion net; Deep learning; Distance metric learning
CLC number: TP391.41  Document code: A  Article ID: 2095-2457(2019)29-0112-002
DOI: 10.19694/j.cnki.issn2095-2457.2019.29.051
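0 Introduction
Person re-identification is an intelligent video analysis technique that matters for cross-camera pedestrian tracking and pedestrian behavior analysis. It asks a computer to decide whether pedestrian images captured by different cameras show the same identity, matching those images by the pedestrians' appearance. Because surveillance scenes vary widely and pedestrian appearance changes in complex ways across cameras, person re-identification remains highly challenging.
Current research on person re-identification focuses on two aspects: extracting discriminative features to describe pedestrian appearance [1-11], and exploring discriminative distance metric learning methods [12-18]. However, most handcrafted features (color, texture, shape, etc.) are either insufficiently discriminative or not robust to viewpoint changes when used for cross-camera matching. Deep features compensate for these shortcomings to some extent, but a good feature model requires supervised learning on a large number of samples. Distance metric learning reduces the effect of cross-camera appearance differences to some extent, but with limited training data it may fail to yield a good cross-camera distance metric.
To better handle the significant changes in pedestrian appearance across cameras, this paper combines deep learning and metric learning for person re-identification; the algorithm flow is shown in Fig. 1. First, the feature fusion net (FFN), which fuses handcrafted and deep features, extracts discriminative features from the pedestrian training samples. Then the kernel matrix K is applied to KISSME distance metric learning to obtain a better distance metric model, improving the accuracy and robustness of person re-identification.
Fig. 1 Algorithm flow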
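1 Discriminative Feature Extraction
To describe pedestrian appearance more accurately, this paper extracts pedestrian image features with the feature fusion net (FFN), which fuses handcrafted and deep features [3], as shown in Fig. 2. FFN consists of two sub-networks. The first uses a conventional CNN (convolution, pooling, activation functions) to process the input pedestrian image; the second uses additional handcrafted features (RGB, HSV, LAB, YCbCr, and YIQ color features and Gabor texture features) to represent the same image. Together the two sub-networks form a more complete description of the pedestrian image. During feature learning, the second sub-network steers the learning direction of the first. Finally, the fusion layer outputs a 4096-dimensional FFN feature vector.
Fig. 2 Illustration of FFN feature extraction [3]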
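The fusion step can be illustrated with a minimal NumPy sketch. It only shows the idea of concatenating the outputs of the two sub-networks and passing them through one fully connected layer; it is not the FFN architecture of [3]. The function name `fuse_features`, the toy dimensions, and the random untrained weights are all assumptions made for illustration.

```python
import numpy as np

def fuse_features(cnn_feat, hand_feat, W, b):
    """Concatenate both descriptors and apply one fully connected layer with ReLU."""
    joint = np.concatenate([cnn_feat, hand_feat])
    return np.maximum(W @ joint + b, 0.0)

rng = np.random.default_rng(0)
# Toy sizes for readability; the paper's fusion layer outputs 4096 dimensions.
cnn_dim, hand_dim, out_dim = 8, 6, 4
cnn_feat = rng.standard_normal(cnn_dim)    # stand-in for the CNN sub-network output
hand_feat = rng.standard_normal(hand_dim)  # stand-in for color + Gabor descriptors
W = 0.1 * rng.standard_normal((out_dim, cnn_dim + hand_dim))  # hypothetical, untrained
b = np.zeros(out_dim)

ffn_feat = fuse_features(cnn_feat, hand_feat, W, b)
print(ffn_feat.shape)
```

In a trained network the weights of this layer are learned jointly with the CNN, which is how the handcrafted branch can influence what the CNN branch learns.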
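2 Kernel Distance Metric Learning
To reduce the effect of cross-camera changes in pedestrian appearance, the matching stage uses a KISSME [12] distance metric learning method based on the kernel trick to obtain an optimal Mahalanobis distance metric model.
Given a pair of samples $(x_i, x_j)$, their Mahalanobis distance is defined in Eq. (1):
$$d_M^2(x_i, x_j) = (x_i - x_j)^T M (x_i - x_j) \qquad (1)$$
where $M = \Sigma_S^{-1} - \Sigma_D^{-1}$ is a positive semidefinite Mahalanobis distance matrix that is easy to learn from the training samples. $\Sigma_S = \frac{1}{|S|}\sum_{(x_i, x_j) \in S}(x_i - x_j)(x_i - x_j)^T$ and $\Sigma_D = \frac{1}{|D|}\sum_{(x_i, x_j) \in D}(x_i - x_j)(x_i - x_j)^T$ are the covariance matrices of the similar pairs S and the dissimilar pairs D of pedestrian images, respectively.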
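The definitions above translate into a few lines of NumPy. The following is a minimal sketch, not the authors' implementation: the function names, the small `ridge` term added before inversion, the eigenvalue clipping used to enforce positive semidefiniteness, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def pair_covariance(X, pairs):
    """Average outer product of x_i - x_j over the given index pairs."""
    diffs = np.stack([X[i] - X[j] for i, j in pairs])
    return diffs.T @ diffs / len(pairs)

def kissme(X, similar_pairs, dissimilar_pairs, ridge=1e-6):
    """Return M = inv(Sigma_S) - inv(Sigma_D), projected onto the PSD cone."""
    d = X.shape[1]
    sigma_s = pair_covariance(X, similar_pairs) + ridge * np.eye(d)
    sigma_d = pair_covariance(X, dissimilar_pairs) + ridge * np.eye(d)
    M = np.linalg.inv(sigma_s) - np.linalg.inv(sigma_d)
    M = (M + M.T) / 2.0                        # remove numerical asymmetry
    eigval, eigvec = np.linalg.eigh(M)
    eigval = np.clip(eigval, 0.0, None)        # drop negative eigenvalues
    return (eigvec * eigval) @ eigvec.T

def mahalanobis_sq(M, xi, xj):
    """Squared Mahalanobis distance of Eq. (1)."""
    diff = xi - xj
    return float(diff @ M @ diff)

# Toy data: 10 identities, 2 noisy views each, 5-dimensional features.
rng = np.random.default_rng(0)
ids = np.repeat(np.arange(10), 2)
centers = rng.standard_normal((10, 5))
X = centers[ids] + 0.1 * rng.standard_normal((20, 5))
similar = [(2 * k, 2 * k + 1) for k in range(10)]
dissimilar = [(i, j) for i in range(20) for j in range(i + 1, 20) if ids[i] != ids[j]]

M = kissme(X, similar, dissimilar)
print(mahalanobis_sq(M, X[0], X[1]))
```

Because M has a closed form, learning requires only two covariance estimates and two matrix inversions, with no iterative optimization.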
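This paper uses the kernel trick to map sample feature vectors from the input feature space to a high-dimensional kernel space. The kernel matrix K is obtained by evaluating the kernel function between sample feature vectors, i.e., $K = \Phi^T(X)\Phi(X)$, where X denotes the sample features and $\Phi(X)$ denotes the nonlinear mapping from the input feature space to the kernel space. Introducing a kernel function avoids the "curse of dimensionality" and greatly reduces the amount of computation; the performance of the algorithm can also be improved by freely choosing a suitable kernel function.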
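A kernel matrix can be computed without ever forming $\Phi(X)$ explicitly. The sketch below shows a linear and an RBF kernel; the choice of kernels, the `gamma` value, and the function names are assumptions, since the text does not state which kernel is used. Note that the code stores samples as rows, so the linear kernel is written `X @ X.T`.

```python
import numpy as np

def linear_kernel_matrix(X):
    """K[i, j] = <x_i, x_j>, with samples stored as rows of X."""
    return X @ X.T

def rbf_kernel_matrix(X, gamma):
    """K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))  # clamp rounding noise

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))          # 6 toy samples, 4-dimensional features
K = rbf_kernel_matrix(X, gamma=0.5)      # gamma is a hypothetical setting
print(K.shape)
```

One common way to combine K with a metric learner, assumed here rather than taken from the text, is to treat the i-th column of K as the kernelized representation of sample i and learn the metric on those vectors.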
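3 Experimental Results
Two challenging public datasets, VIPeR and PRID450S, are used to evaluate the cumulative matching characteristic (CMC) of the proposed person re-identification algorithm. Half of the pedestrians are randomly selected as the training set and the other half as the test set. The training samples are used to learn the distance metric model, and the test samples are used to measure the feature distances between cross-camera pedestrian images.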
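The evaluation protocol can be sketched as follows. This is a minimal illustration under the assumption of a single-shot setting in which probe i and gallery item i show the same person; the function names and the toy distance matrix are hypothetical.

```python
import numpy as np

def random_half_split(num_ids, rng):
    """Randomly split identity indices into equal training and test halves."""
    perm = rng.permutation(num_ids)
    half = num_ids // 2
    return perm[:half], perm[half:]

def cmc_curve(dist, max_rank):
    """CMC in percent; dist[i, j] is the distance from probe i to gallery j."""
    true_dist = np.diag(dist)
    # 0-based rank of the true match: gallery items strictly closer than it.
    ranks = np.sum(dist < true_dist[:, None], axis=1)
    return np.array([100.0 * np.mean(ranks < r) for r in range(1, max_rank + 1)])

rng = np.random.default_rng(0)
train_ids, test_ids = random_half_split(10, rng)

# Toy 3x3 distance matrix: only probe 0 ranks its true match first.
dist = np.array([[0.1, 0.5, 0.9],
                 [0.4, 0.6, 0.2],
                 [0.3, 0.2, 0.8]])
cmc = cmc_curve(dist, max_rank=3)
print(cmc)
```

The value at rank r is the percentage of probes whose true match appears among the r nearest gallery images, so the curve is non-decreasing in r.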
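Table 1 and Fig. 3 give the experimental results on the VIPeR and PRID450S datasets. Table 1 shows that, with the same FFN features, recognition rates are higher on PRID450S: the rank-1 recognition rate is only 26.9% on VIPeR but 49.33% on PRID450S.
Table 1 Top recognition rates (%) on the VIPeR and PRID450S datasets. Cumulative matching scores at ranks 1, 5, 10, and 20 are listed.
Fig. 3 Top recognition rates (%) on the VIPeR and PRID450S datasets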
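4 Conclusion
This paper proposed a person re-identification algorithm based on deep learning and metric learning. The feature fusion net (FFN), which fuses deep and handcrafted features, extracts pedestrian image features, and the kernel matrix K is applied to KISSME distance metric learning to obtain a better distance metric model. Experimental results on two challenging person re-identification datasets, VIPeR and PRID450S, demonstrate the effectiveness of the proposed algorithm.
【References】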
[1]S. Liao, Y. Hu, X. Zhu, and S. Z. Li, “Person re-identification by local maximal occurrence representation and metric learning,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, Massachusetts, USA, 2015.6.7-2015.6.12.
[2]T. Xiao, H. Li, W. Ouyang, and X. Wang, “Learning deep feature representations with domain guided dropout for person re-identification,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, Nevada, USA, 2016.6.26-2016.7.1.
[3]S. Wu, Y. C. Chen, X. Li, A. C. Wu, J. J. You, and W. S. Zheng, “An enhanced deep feature representation for person re-identification,” IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Placid, NY, USA, 2016.3.7-2016.3.9.
[4]D. Cheng, Y. Gong, S. Zhou, J. Wang, and N. Zheng, “Person re-identification by multi-channel parts-based CNN with improved triplet loss function,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, Nevada, USA, 2016.6.26-2016.7.1.
[5]Y. Chen, X. Zhu, and S. Gong, “Person re-identification by deep learning multi-scale representations,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, Hawaii, USA, 2017.7.21-2017.7.26.
[6]H. Zhao, et al., “Spindle net: Person re-identification with human body region guided feature decomposition and fusion,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, Hawaii, USA, 2017.7.21-2017.7.26.
[7]X. Liu, et al., “Hydraplus-net: Attentive deep features for pedestrian analysis,” IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017.10.22-2017.10.29.
[8]Y. Sun, et al., “Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline),” European Conference on Computer Vision (ECCV), Munich, Germany, 2018.9.8-2018.9.14.
[9]L. Zhao, et al., “Deeply-learned part-aligned representations for person re-identification,” IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017.10.22-2017.10.29.
[10]L. He, et al., “Deep spatial feature reconstruction for partial person re-identification: Alignment-free approach,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, Utah, 2018.6.18-2018.6.22.
[11]X. Chang, et al., “Multi-level factorisation net for person re-identification,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, Utah, 2018.6.18-2018.6.22.
[12]M. Koestinger, et al., “Large scale metric learning from equivalence constraints,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, Rhode Island, USA, 2012.6.16-2012.6.21.
[13]S. Pedagadi, et al., “Local fisher discriminant analysis for pedestrian re-identification,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, Oregon, USA, 2013.6.23-2013.6.28.
[14]F. Xiong, M. Gou, O. Camps, and M. Sznaier, “Person re-identification using kernel-based metric learning methods,” European Conference on Computer Vision (ECCV), Zurich, Switzerland, 2014.9.6-2014.9.12.
[15]S. Paisitkriangkrai, C. Shen, A. Hengel, “Learning to rank in person re-identification with metric ensembles,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, Massachusetts, USA, 2015.6.7-2015.6.12.
[16]Y. Yang, S. Liao, Z. Lei, S. Z. Li, “Large scale similarity learning using similar pairs for person verification,” AAAI Conference on Artificial Intelligence (AAAI), Phoenix, Arizona, USA, 2016.2.12-2016.2.17.
[17]L. Hou, K. Han, W. G. Wan, J-N Hwang, H. Y. Yao, “Normalized Distance Aggregation of Discriminative Features for Person Re-identification,” Journal of Electronic Imaging, 2018, 27(2): 023006.
[18]X. Yang, M. Wang, and D. Tao, “Person re-identification with metric learning using privileged information,” IEEE Transactions on Image Processing, 2018, 27(2): 791-805.