High Technology Letters, 2015, No. 3

A matting method based on color distance and differential distance①

Nie Dongdong (聂栋栋)②

(College of Sciences, Yanshan University, Qinhuangdao 066004, P.R.China)

A new matting algorithm based on color distance and differential distance is proposed to deal with the problem that many matting methods perform poorly on complex natural images. The proposed method combines local sampling with global sampling to select foreground and background sample pairs for unknown pixels, and a new cost function based on color distance and differential distance is constructed to further optimize the selected sample pairs. Finally, a quadratic objective function built on the matte Laplacian from KNN matting, augmented with a texture feature, is used for refinement. Experiments on various test images confirm that the results obtained by the proposed method are more accurate than those of traditional methods. A comparison of four error metrics on the benchmark dataset among several algorithms also demonstrates the effectiveness of the proposed method.

natural image matting, local sampling, global sampling, color distance, differential distance, texture feature

0 Introduction

Digital matting refers to the accurate extraction of a foreground object from an image. It plays a fundamental role in image processing, video editing and film production, and has been widely researched and applied. The matting process is mathematically modeled by regarding the observed color Iz of pixel z as a combination of the foreground color Fz and the background color Bz, using the compositing equation given by Porter and Duff[1]:

Iz=αzFz+(1-αz)Bz

(1)

where Eq.(1) is called the matting equation and αz is called the alpha matte, representing the foreground opacity. The alpha matte αz lies in the range [0,1]: pixels with αz=1 belong to the foreground, those with αz=0 belong to the background, and those with 0<αz<1 are mixtures of foreground and background, referred to as "mixed pixels". Mixed pixels arise in many cases, such as translucent objects, foreground edges with soft hair or fur, discontinuities and motion introduced by image discretization, or blurring caused by light.
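As a minimal numerical illustration of Eq.(1) (not part of the original paper; the color values below are arbitrary), a single mixed pixel can be composited as follows:

```python
import numpy as np

# Hypothetical RGB values in [0, 1]; any foreground/background pair works.
F = np.array([0.9, 0.2, 0.1])   # foreground color F_z
B = np.array([0.1, 0.3, 0.8])   # background color B_z
alpha = 0.6                     # opacity alpha_z in [0, 1]

# Compositing equation (1): the observed color is a convex combination of F and B.
I = alpha * F + (1.0 - alpha) * B
print(I)  # a "mixed pixel" lying on the line segment between F and B
```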

For color images, the foreground color, background color and alpha matte must be estimated for each pixel. In RGB color space, the variables Iz, Fz and Bz are each three-dimensional. Since only the pixel color Iz is known, Eq.(1) yields three highly ill-posed equations containing seven unknowns (three for Fz, three for Bz and one for αz), so the number of unknowns far exceeds the number of equations. Therefore, prior assumptions or additional information are often required to solve the matting problem.

Roughly speaking, many matting approaches rely on the strong correlation between nearby image pixels to alleviate this problem. According to how they exploit natural image statistics, existing matting methods can be categorized as sampling based methods and propagation based methods.

Sampling based methods assume that the foreground and background colors of an unknown pixel can be explicitly estimated by examining nearby pixels that have been specified by the user as foreground or background. These color samples are then used to estimate the alpha value directly. These methods perform well when the true foreground and background colors are in the sample set. However, the true foreground/background colors are not always covered, because these methods only collect samples near each unknown pixel and the number of samples is rather limited.

Propagation based methods assume that foreground and background colors vary smoothly in a local window, for example that they are constant or linearly varying, and do not estimate them explicitly. They succeed in many cases, but may fail when these preconditions are not met.

In this paper, a new matting method is proposed to avoid these limitations as much as possible. The method combines local sampling with global sampling so that all available samples are considered, which avoids missing the true samples. Drawing on propagation based methods, a sample selection cost function consisting of color distance and differential distance is proposed to select effective samples when overlapping foreground/background color distributions would otherwise lead to erroneous samples for F and B. Finally, a quadratic objective function based on the matte Laplacian from KNN matting, augmented with a texture feature, is used for post-processing. Experiments show that the proposed matting method provides both visually and quantitatively good results.

1 Related work

Since the proposed method combines sampling based and propagation based matting, this section reviews work related to both approaches. A more thorough review can be found in Ref.[2].

1.1 Sampling based methods

For sampling based methods, similar pixels in most images are statistically correlated, so samples can be selected from similar pixels and the matting parameters F, B and α of pixels in unknown regions can be estimated from the characteristics of these samples. In the Knockout system[3], a weighted sum of known samples is used to estimate the true foreground and background colors of unknown pixels, with weights determined by the spatial distance between each known sample and the unknown pixel. Ruzon and Tomasi[4] were among the first to introduce probability and statistics into digital matting, analyzing the statistical distributions of foreground and background samples for alpha estimation using a Gaussian mixture model (GMM). Bayesian matting[5] further improves their approach and treats the estimation of unknown pixels as a maximum a posteriori problem solved with the Bayesian formula. Since it requires clustering and statistical estimation, the computation is expensive and errors accumulate during the process; nevertheless, it achieved the best results at the time, has been widely cited and improved upon, and is considered a classic digital matting method. To reduce the computation, Chang et al.[6] estimate colors based only on spatial distance and confidence, without clustering and statistics. Juan et al.[7] use a global GMM to cluster and model the foreground and background regions respectively.

For the construction of the candidate sample set, Robust matting[8] relies on a confidence measure based on color distance and spatial distance to select samples that are not only close to the unknown pixels but also spread along the nearby boundaries of the known regions. Shared matting[9] uses different combinations of spatial, photometric and probabilistic image characteristics to find the best samples for unknown pixels. Shahrian and Rajan[10] exploit texture features to complement color for sample selection. Global matting[11] designs a global sampling method and constructs an "FB search space" of candidate samples. Comprehensive matting[12] builds a comprehensive sampling set by sampling from all color distributions in the known regions, which contains highly correlated boundary samples as well as samples from the interiors of the F and B regions.

1.2 Propagation based methods

Propagation based methods assume that neighboring pixels are correlated under some image statistics and use their affinities to propagate alpha values from known regions toward unknown ones. Closed-form matting[13] assumes that foreground and background colors can be fit with locally smooth linear models and obtains a quadratic cost function in alpha that can be minimized globally. Poisson matting[14] assumes that intensity changes in the foreground and background are smooth, operates directly on the gradient of the matte, and then reconstructs the matte by solving Poisson equations. A similar method based on random walks is proposed in Ref.[15]. Nonlocal matting[16] capitalizes on accurate clustering of foreground and background to reduce user input; ideally the user only needs to constrain a single pixel in each cluster to compute the optimal matte. Inspired by nonlocal matting, KNN matting[17] uses an improved matching metric to achieve good results. It does not assume the local color-line model[13], does not require a large kernel[16] to collect good samples, and avoids sophisticated sampling strategies[3-12]. KNN matting capitalizes on the nonlocal principle by using k nearest neighbors and enjoys a closed-form solution that can be computed with the preconditioned conjugate gradient (PCG) method. This method generalizes well to any color or feature space in any dimension.

2 Proposed method

2.1 Gathering a comprehensive sample set

The objective of the sampling process is to find the best linear combination of foreground and background samples to represent the color of a given pixel. Here, local sampling is combined with global sampling to achieve this goal. First, the image is partitioned with the help of user input: the user draws contour lines for the foreground and the background, which divides the image into three regions, known foreground, known background, and an unknown region consisting of a mixture of foreground and background colors. This partition is referred to as a trimap. For local sampling, only foreground and background colors on the boundaries of the known regions are used, and the m known foreground and background colors nearest to each unknown pixel are selected as samples. To obtain a more comprehensive sample set, global sampling is added to ensure that every color distribution within a pixel's sampling range is represented: global samples are selected through GMM clustering on color, with n GMM components. The resulting comprehensive sample set covers each color distribution so that the true samples are not missed. In the experiments, m=5 and n=10.
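A rough sketch of this sampling stage is given below, assuming the trimap is supplied as a label image and using scikit-learn's GaussianMixture for the global clustering; the function and variable names are illustrative, and the local step simply takes the m spatially nearest known pixels rather than an explicit boundary extraction:

```python
import numpy as np
from sklearn.mixture import GaussianMixture  # assumed implementation choice, not from the paper

def gather_foreground_samples(img, trimap, p, m=5, n=10):
    """Collect local plus global foreground samples for one unknown pixel.

    img    : HxWx3 float image
    trimap : HxW labels with 0 = background, 1 = unknown, 2 = foreground
    p      : (row, col) of the unknown pixel
    All names are illustrative; background samples are gathered analogously.
    """
    fg_mask = trimap == 2
    ys, xs = np.nonzero(fg_mask)

    # Local samples: for simplicity, the m known foreground pixels spatially
    # nearest to p (the paper restricts these to the boundary of the known region).
    dist = np.hypot(ys - p[0], xs - p[1])
    nearest = np.argsort(dist)[:m]
    local_fg = img[ys[nearest], xs[nearest]]

    # Global samples: one representative color per GMM component (n color distributions).
    gmm = GaussianMixture(n_components=n, covariance_type='full', random_state=0)
    gmm.fit(img[fg_mask])
    global_fg = gmm.means_

    return np.vstack([local_fg, global_fg])  # (m + n) candidate foreground colors
```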

2.2 Selection of best samples

In this work, the sample selection criterion is based on color distance and differential distance. In RGB color space, two sample points determine a straight line. If this line passes through or close to the color of an unknown pixel, then a convex combination of the foreground and background samples can represent that pixel's color. If a pair of samples (Fi, Bj) denotes the foreground and background colors, the alpha value α of an unknown pixel with color I is computed as

α = ((I - Bj)·(Fi - Bj)) / ‖Fi - Bj‖²

(2)

A color cost function dc is then defined to describe how well a sample pair fits the matting Eq.(1):

dc(Fi, Bj)=‖I-(αFi+(1-α)Bj)‖

(3)

Eq.(3) is the color distance in RGB space, which measures the difference between the unknown pixel's true color I and its estimated color. This color distance was first introduced in Ref.[8]. For each sample pair in the sample set, the smaller the color distance is, the more effective the sample pair is for the unknown pixel. However, when the foreground and background colors of the samples, or the unknown pixel's true and estimated colors, vary widely, a false sample pair may occasionally explain the unknown pixel's color well. Hence, a differential cost function dg is further defined to avoid this:

(4)

where dx and dy are obtained from the partial derivatives of the matting Eq.(1) with respect to x and y:

(5)

(6)

Here, F denotes the pixel foreground colors, and Eq.(6) gives the derivative of F with respect to x. Initially, the F values are set equal to the image colors within the known foreground region dilated by two pixels, and to zero elsewhere.
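Since Eqs.(4)-(6) are not reproduced above, the sketch below only illustrates one plausible way to realize a differential cost: the gradients of the image, foreground and background are estimated with finite differences, and the residual of the differentiated compositing equation is penalized. This assumed form is not quoted from the paper.

```python
import numpy as np

def differential_cost(I_img, F_img, B_img, alpha_map, p):
    """Assumed differential distance at pixel p = (row, col).

    I_img, F_img, B_img : HxWx3 images (F_img/B_img initialized as described in the text)
    alpha_map           : HxW current alpha estimate
    The exact dx and dy are given by Eqs.(4)-(6) in the paper; here the residual of the
    differentiated compositing equation I' = aF' + (1-a)B' + (F-B)a' is used instead.
    """
    r, c = p
    a = alpha_map[r, c]
    ay, ax = np.gradient(alpha_map, axis=(0, 1))     # finite-difference gradients of alpha
    res = 0.0
    for ch in range(3):
        Iy, Ix = np.gradient(I_img[..., ch], axis=(0, 1))
        Fy, Fx = np.gradient(F_img[..., ch], axis=(0, 1))
        By, Bx = np.gradient(B_img[..., ch], axis=(0, 1))
        F, B = F_img[r, c, ch], B_img[r, c, ch]
        dx = Ix[r, c] - (a * Fx[r, c] + (1 - a) * Bx[r, c] + (F - B) * ax[r, c])
        dy = Iy[r, c] - (a * Fy[r, c] + (1 - a) * By[r, c] + (F - B) * ay[r, c])
        res += dx ** 2 + dy ** 2
    return np.sqrt(res)
```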

Our final sample selection cost function d is a combination of the color cost and the differential cost:

d(Fi, Bj)=ωdc+dg

(7)

where ω is a weight that balances the color cost against the differential cost. In the experiments, ω is set to 5.

Obviously, the smaller d is, the better the fit, and the more effective the selected sample pair is for the unknown pixel.

For each pixel in the unknown region, every foreground/background combination in the sample set is evaluated with the cost function d, and the three combinations with the smallest costs are selected. The alpha value of the unknown pixel is then the average of the alpha values of these three combinations computed by Eq.(2), and the pixel's foreground color F and background color B are updated to the averages of the three selected F and B samples.
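The selection step for a single unknown pixel can be prototyped as in the sketch below: every candidate pair is scored with Eq.(7), the three pairs with the smallest cost are kept, and their estimates are averaged. The differential cost dg of Eqs.(4)-(6) is passed in as a callable, since its exact form is defined in the paper rather than reimplemented here.

```python
import numpy as np

def select_best_pairs(I, fg_samples, bg_samples, diff_cost, omega=5.0, top_k=3):
    """Score every (F_i, B_j) pair with Eq.(7) and average the top_k best estimates.

    diff_cost(F, B, alpha) should return the differential distance d_g of Eqs.(4)-(6);
    it is treated as a given callable here.  All names are illustrative.
    """
    scored = []
    for F in fg_samples:
        for B in bg_samples:
            d_fb = F - B
            alpha = np.clip(np.dot(I - B, d_fb) / (np.dot(d_fb, d_fb) + 1e-8), 0, 1)  # Eq.(2)
            dc = np.linalg.norm(I - (alpha * F + (1 - alpha) * B))                    # Eq.(3)
            d = omega * dc + diff_cost(F, B, alpha)                                   # Eq.(7)
            scored.append((d, alpha, F, B))
    scored.sort(key=lambda s: s[0])
    best = scored[:top_k]
    alpha_hat = float(np.mean([s[1] for s in best]))   # averaged alpha for the pixel
    F_hat = np.mean([s[2] for s in best], axis=0)      # averaged foreground color
    B_hat = np.mean([s[3] for s in best], axis=0)      # averaged background color
    return alpha_hat, F_hat, B_hat
```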

2.3 Post-processing

To further improve the matting quality, a quadratic objective function is minimized to refine the initial alpha matte constructed in the last section.

A feature vector is first defined for each pixel, computed similarly to the feature vector in KNN matting[17]. To better enforce the foreground/background constraints, this paper additionally introduces a texture feature, computed as in Ref.[18]. The dimension of the feature vector thus increases from six to seven, and the feature vector X(i) is given by:

X(i) = (cos(h), sin(h), s, v, x, y, t)

(8)

where h, s, v are the HSV coordinates of pixel i (hue, saturation and value respectively), (x, y) are its spatial coordinates, and t is the texture feature of Ref.[18].
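A small sketch of building the 7-dimensional feature vectors of Eq.(8) is shown below, assuming an RGB-to-HSV conversion from scikit-image and a precomputed per-pixel texture map t; the texture feature of Ref.[18] itself is not reimplemented, and the spatial normalization is an assumption:

```python
import numpy as np
from skimage.color import rgb2hsv  # assumed helper for the HSV conversion

def feature_vectors(img, texture):
    """Build X(i) = (cos(h), sin(h), s, v, x, y, t) for every pixel, as in Eq.(8).

    img     : HxWx3 RGB image in [0, 1]
    texture : HxW texture feature map (computed as in Ref.[18]; treated as given here)
    """
    H, W = img.shape[:2]
    hsv = rgb2hsv(img)                      # hue, saturation, value, each in [0, 1]
    h = hsv[..., 0] * 2.0 * np.pi           # hue as an angle so cos/sin wrap around correctly
    ys, xs = np.mgrid[0:H, 0:W]
    X = np.stack([np.cos(h), np.sin(h),
                  hsv[..., 1], hsv[..., 2],
                  xs / W, ys / H,            # normalized coordinates (normalization is an assumption)
                  texture], axis=-1)
    return X.reshape(-1, 7)                  # one 7-D feature vector per pixel
```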

The kernel function is then computed as:

k(i, j) = 1 - ‖X(i) - X(j)‖/C

(9)

where C is the least upper bound of ‖X(i)-X(j)‖, so that k(i, j) lies in [0,1] without biasing toward either foreground or background; the 1-norm is used for ‖·‖.

Furthermore, the affinity matrix A is obtained as:

A=[k(i, j)]N×N

(10)

Then the Laplacian matrix is constructed as:

L=M-A

(11)

where M = diag(Mi) is a diagonal matrix whose diagonal entries are

Mi = Σj k(i, j)

(12)
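The post-processing step can be prototyped with a sparse k-nearest-neighbour affinity matrix and conjugate gradients, as sketched below. The regularized linear system used here, (L + λD)α = λDα̂ with α̂ the initial matte from Section 2.2 and D a diagonal confidence matrix, is an assumed form of the quadratic objective and is not quoted verbatim from the paper.

```python
import numpy as np
from scipy.sparse import coo_matrix, diags
from scipy.sparse.linalg import cg
from sklearn.neighbors import NearestNeighbors  # assumed helper for the k-NN search

def knn_laplacian(X, k=10):
    """Sparse affinity A = [k(i,j)] over k nearest neighbours in feature space,
    with k(i,j) = 1 - ||X(i) - X(j)||_1 / C as in Eq.(9), and L = M - A as in Eq.(11)."""
    n = X.shape[0]
    nn = NearestNeighbors(n_neighbors=k, metric='manhattan').fit(X)
    dist, idx = nn.kneighbors(X)
    C = dist.max() + 1e-8                          # upper bound of the observed 1-norm distances
    rows = np.repeat(np.arange(n), k)
    A = coo_matrix((1.0 - dist.ravel() / C, (rows, idx.ravel())), shape=(n, n))
    A = (A + A.T) / 2                              # symmetrize the affinity matrix
    M = diags(np.asarray(A.sum(axis=1)).ravel())   # M = diag(M_i), M_i = sum_j k(i,j), Eq.(12)
    return M - A                                   # Laplacian of Eq.(11)

def refine_matte(L, alpha_init, confidence, lam=100.0):
    """Solve (L + lam*D) alpha = lam*D*alpha_init with conjugate gradients;
    D = diag(confidence) softly anchors the solution to the initial matte.
    This regularized system is an assumed formulation, not the paper's exact objective."""
    D = diags(confidence)
    alpha, _ = cg(L + lam * D, lam * D @ alpha_init, maxiter=2000)
    return np.clip(alpha, 0.0, 1.0)
```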

3 Experimental results

In this section, the proposed matting method is compared with two other methods: Robust matting[8] and KNN matting[17]. The performance of these methods is evaluated on 5 groups of test images available at www.alphamatting.com. Each group consists of an original image, a trimap image and a ground-truth image (the ground-truth images are available in the benchmark dataset[19]), as shown in Fig.1. The algorithms in Refs[8,17] are implemented by ourselves. In addition, an independent quantitative evaluation is provided in terms of the mean squared error (MSE). The experiments demonstrate the effectiveness of the proposed algorithm in addressing the drawbacks discussed in the introduction. Finally, the proposed method is compared with six other matting methods using four error measures, and the results show that it performs better on the whole.

Fig.1 Test images

Visual comparisons with other methods. Fig.2 presents visual comparisons with the two methods. The alpha mattes extracted from the original images by Robust matting, KNN matting and the proposed method are shown in Fig.2(a), (b) and (c), respectively. In the bear image in the fifth column, the color of the flag in the bear's hand is similar to that of the wall, which causes the flag to be wrongly estimated as foreground by the Robust and KNN matting methods. By using differential distance to complement color distance, the proposed method produces a better matte in which the flag is assigned to the background, as seen in Fig.2(c). The problem is similar for the doll image in the second column, in which some grass on the lawn is mistaken for the small doll's hair. For the brush pot in the third column, its grid structure increases the complexity of the matting process; the proposed method gives a more accurate alpha estimate, while Robust matting performs poorly on this image. The girl (first column) and the plant (fourth column) have discontinuous foregrounds, which makes it hard to discriminate between foreground and background, yet the proposed method, which considers texture features, achieves better results.

Fig.2 Visual comparison between various methods

Quantitative evaluation of performance. Fig.3 shows the MSE values obtained in our experiments. The smaller the MSE, the higher the accuracy. The proposed method, which uses color distance and differential distance together with Laplacian post-processing, outperforms the other two methods.

To further evaluate the performance of the matting algorithm, the proposed algorithm is compared with six other matting methods that largely represent the current state of the art: SVR matting[20], weighted color and texture matting[10], global matting[11], closed-form matting[13], Bayesian matting[5] and Poisson matting[14]. Four metrics are used, MSE, the sum of absolute differences (SAD), gradient error (Grad.) and connectivity error (Conn.); their detailed definitions can be found in Ref.[19]. All of the above algorithms are evaluated on all test images with trimap3 from the benchmark dataset, and the accuracy of the resulting alpha mattes is computed with respect to the four error measures. The results, averaged over all test images, are shown in Table 1, which shows that the proposed method performs slightly worse than SVR matting but considerably better than the others.

Fig.3 MSE of different matting methods for the 5 sets of images

Table 1 Average errors over all test images (trimap3)

Method                               SAD    MSE    Grad.   Conn.
SVR matting                          12.1   0.7    0.9     0.6
Weighted color and texture matting   13.2   0.8    1.1     0.9
Global sampling matting              12.8   0.8    1.0     1.3
Closed-form matting                  17.7   1.6    1.5     0.5
Bayesian matting                     25.6   2.9    2.5     6.3
Poisson matting                      47.4   7.3    4.5     10.2
Ours                                 12.5   0.8    1.0     0.7
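For reference, the MSE and SAD errors over the unknown region can be computed as in the sketch below; Grad. and Conn. follow the more involved definitions in Ref.[19] and are omitted. This is an illustrative snippet, not the benchmark's official evaluation code.

```python
import numpy as np

def mse_sad(alpha_est, alpha_gt, trimap):
    """MSE and SAD of an estimated matte against ground truth, over unknown pixels only.

    alpha_est, alpha_gt : HxW mattes in [0, 1]; trimap uses 1 to mark the unknown region."""
    mask = trimap == 1                      # evaluate only in the unknown region
    diff = alpha_est[mask] - alpha_gt[mask]
    mse = float(np.mean(diff ** 2))
    sad = float(np.sum(np.abs(diff)))
    return mse, sad
```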

4 Conclusion

A new image matting method is proposed that combines local and global sampling to build a comprehensive set of known samples, which avoids missing the true samples. This set includes not only highly correlated local samples from the boundaries of the known regions but also global samples. The proposed method uses differential distance to complement color distance, extracting accurate mattes even when foreground and background colors overlap; this is handled through the partial derivatives of the matting equation and a linear combination of the color and differential costs. The final matte is further refined using the matte Laplacian from KNN matting augmented with a texture feature. Experimental results show that the proposed matting method performs well compared with existing methods.

Future work includes investigating the relationship between sampling based methods and propagation based methods applied in general alpha matting of multiple image layers and video matting.

[ 1] Porter T, Duff T. Compositing digital images. In: Proceedings of Computer Graphics, Annual Conference Series, ACM SIGGRAPH, New York, USA, 1984. 253-259

[ 2] Zhang Z P, Zhu Q S, Xie Y Q. The latest research progress on digital matting. Acta Automatica Sinica, 2012, 38(10): 1571-1584

[ 3] Berman A, Dardourian A, Vlahos P. Method for removing from an image the background surrounding a selected object. US Patent: 6134346, 2000

[ 4] Ruzon M, Tomasi C. Alpha estimation in natural images. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Hilton Head Island, USA, 2000. 18-25

[ 5] Chuang Y, Curless B, Salesin D H, et al. A Bayesian approach to digital matting. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Kauai, USA, 2001. 264-271

[ 6] Chang H, Yang Q, Pan C H. An iterative Bayesian approach for digital matting. In: Proceedings of IEEE International Conference on Pattern Recognition, Hong Kong, China, 2006. 122-125

[ 7] Juan O, Keriven R. Trimap segmentation for fast and user friendly alpha matting. In: Proceedings of the 3rd IEEE Workshop on Variational, Geometric, and Level Set Methods in Computer Vision, Beijing, China, 2005. 186-197

[ 8] Wang J, Cohen M. Optimized color sampling for robust matting. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, USA, 2007. 1-8

[ 9] Gastal E, Oliveira M. Shared sampling for real-time alpha matting. Computer Graphics Forum, 2010, 29(2): 575-584

[10] Shahrian E, Rajan D. Weighted color and texture sample selection for image matting. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Los Alamitos, USA, 2012. 718-726

[11] He K, Rhemann C, Rother C X, et al. A global sampling method for alpha matting. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, USA, 2011. 2049-2056

[12] Shahrian E, Rajan D, Price B, et al. Improving image matting using comprehensive sampling sets. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Portland, USA, 2013. 636-643

[13] Levin A, Lischinski D, Weiss Y. A closed-form solution to natural image matting. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 2008, 30(2): 228-242

[14] Sun J, Jia J, Tang C, et al. Poisson matting. In: Proceedings of Computer Graphics, ACM SIGGRAPH, Los Angeles, USA, 2004. 315-321

[15] Grady L, Schiwietz T, Aharon S, et al. Random walks for interactive alpha-matting. In: Proceedings of Visualization, Imaging and Image Processing, ACTA Press, 2005. 423-429

[16] Lee P, Wu Y. Nonlocal matting. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, USA, 2011. 2193-2200

[17] Chen Q, Li D, Tang C. KNN matting. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Los Alamitos, USA, 2012. 869-876

[18] Yin F, Wu R, Chen D Y, et al. Improved text image segmentation based on spectral clustering. Chinese High Technology Letters, 2013, 23(10): 1024-1029 (In Chinese)

[19] Rhemann C, Rother C, Wang J, et al. A perceptually motivated online benchmark for image matting. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Miami, USA, 2009. 1826-1833

[20] Zhang Z P, Zhu Q S, Xie Y Q. Learning based alpha matting using support vector regression. In: Proceedings of IEEE International Conference on Image Processing, Orlando, USA, 2012. 2109-2112

Nie Dongdong, born in 1977, is an assistant professor at the College of Sciences, Yanshan University. She received her Ph.D degree in computer applications from the Department of Computer Science & Engineering of Shanghai Jiao Tong University in 2007. She has authored more than 25 journal papers. Her current research interests include digital image processing, computer vision, and pattern recognition.

10.3772/j.issn.1006-6748.2015.03.008

①Supported by the National Natural Science Foundation of China (No. 61133009, U1304616)

②To whom correspondence should be addressed. E-mail: NDD1WL2@163.com Received on Mar. 19, 2014, Wang Li