Anti-JPEG Compression Steganography Based on the High Tense Region Locating Method

2019-04-29 03:21:34 Yang Wu, Weiping Shang and Jiahao Chen
Computers, Materials & Continua, 2019, Issue 4

Yang Wu, Weiping Shang and Jiahao Chen

Abstract: Robust data hiding techniques attempt to construct covert communication over a lossy public channel. Existing robust JPEG steganographic algorithms cannot cope with the situation in which side information about the channel is missing. This paper therefore proposes a new robust JPEG steganographic algorithm, based on a high tense region locating method, that needs no side information about the lossy channel. First, a tense region locating method is proposed based on the Harris-Laplacian feature point. Then, the process of generating a robust cover object is described. Last, an improved embedding cost function is proposed. A series of experiments is conducted on various JPEG image sets, and the results show that the proposed steganographic algorithm can resist JPEG compression efficiently, with acceptable performance against the steganalysis feature sets GFR (Gabor Filters Rich model) and DCTR (Discrete Cosine Transform Residual).

Keywords: Robust data hiding, steganography, JPEG compression resistant, Harris-Laplacian feature.

1 Introduction

Steganography is now a fairly standard concept in computer science [Ker, Bas, Böhme et al. (2013)]. It focuses on establishing a stable and effective covert channel inside a public channel [Fridrich (2009)]. With steganographic (stego) technology, secret information can thus be transmitted through a public carrier that is under surveillance (especially in an enemy-controlled environment). At present, social and blog-like networks are becoming part of daily life, and the images transmitted over them (most of which are in JPEG format) are numerous and widely spread. It is therefore easy to hide stego images, and the identities of the covert communication users, when the covert channel is established on such networks. However, such networks tend to apply lossy compression to save computing power and network bandwidth during transmission [Zhang, Luo, Yang et al. (2016)]. How to reduce the influence of such lossy operations on the embedded information is an essential problem in applying steganography to social networks.

On this problem, Zhang et al. [Zhang, Luo, Yang et al. (2016)] first proposed the steganography algorithm DCRAS (Discrete Cosine Relationship Adaptive Steganography), which is based on the relative relationship between DCT coefficients at the same in-block position of adjacent 8×8 blocks of JPEG images. Then the FRAS (Feature Region based Adaptive Steganography) algorithm [Zhang, Luo, Yang et al. (2017)] was proposed based on invariant-feature-point regions, where the modified elements are concentrated in hard-to-detect areas to reach higher resistance against steganographic statistical detection [Pevný and Fridrich (2007); Kodovský, Pevný and Fridrich (2010); Holub and Fridrich (2015); Denemark, Boroumand and Fridrich (2016); Ma, Luo, Li et al. (2018)]. In the JPEG compression channel, the DCRAS and FRAS algorithms can extract the embedded information correctly with much higher probability than traditional adaptive JPEG steganographic algorithms such as NPQ (New Perturbed Quantization) [Huang, Luo, Huang et al. (2012)], UED (Uniform Embedding Distortion) [Guo, Ni and Shi (2012)] and J-UNIWARD (JPEG image UNIversal WAvelet Relative Distortion) [Holub, Fridrich and Denemark (2014)]. It is worth noting that, as stated in the literature [Zhang, Luo, Yang et al. (2016); Zhang, Luo, Yang et al. (2017)], a necessary condition of the DCRAS and FRAS algorithms is that the sender knows the quality factor of the JPEG compression used in the lossy channel (referred to as side information). Furthermore, the generated stego image can only resist JPEG compression whose quality factor equals the side information. When the side information is missing, the DCRAS and FRAS algorithms can hardly work properly in many compression situations [Zhang, Qin, Zhang et al. (2018); Bao, Luo, Zhang et al. (2018)].

To improve the resistance of steganography against JPEG compression without side information, an anti-JPEG compression steganography algorithm is designed in this manuscript. First, a method for locating regions with strong resistance to JPEG compression is proposed based on the Harris-Laplacian transform. Second, the anti-JPEG cover generation method without side information and the corresponding embedding distortion function are given. Last, a concatenated error correction code is added to raise the extraction accuracy. The effectiveness of the proposed algorithm is verified by a series of comparative experiments on the standard steganalysis image library BOSSbase 1.01, in which its resistance to JPEG compression and to statistical detection is compared with that of existing JPEG adaptive steganography, robust watermarking, the DCRAS algorithm and the FRAS algorithm. The results imply that the proposed method can effectively resist JPEG compression when the quality factor information is missing, with acceptable resistance to statistical detection.

The rest of the paper is organized as follows. Section 2 introduces adaptive JPEG steganography, robust steganography and the Harris-Laplacian feature. Section 3 describes the details of the proposed method. The experimental results and conclusions are presented in Section 4 and Section 5, respectively.

2 Related works

This section briefly introduces adaptive JPEG steganography, the robust steganography framework and the Harris-Laplacian feature.

2.1 Adaptive JPEG steganography

The JPEG format is popular in social networks for its high image quality and compression performance. Most JPEG steganographic algorithms, such as NPQ, UED and J-UNIWARD, embed the secret message into the cover JPEG image by modifying its DCT (Discrete Cosine Transform) coefficients.

In general, the original spatial image first undergoes color space conversion (from the RGB domain to the YUV domain) and down-sampling. The three YUV sub-images are then each divided into contiguous non-overlapping 8×8 blocks, and the discrete cosine transform is applied to each block independently. The DCT coefficients that are finally stored are obtained after quantization and rounding. Because the inter-relationship between elements of the U and V sub-images is sensitive, it is suggested to apply the embedding process to the Y sub-image to increase security. Moreover, the Y sub-image stores the luminance information, and researchers tend to simplify the JPEG image to this component. Thus, existing experiments on JPEG steganography focus on grayscale images, for which the effects of color space conversion and down-sampling can be ignored. The experiments of this manuscript follow this setting and also focus on grayscale images.
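The block transform, quantization and rounding chain described above can be illustrated with a minimal sketch. It assumes an orthonormal 2-D DCT-II on a single 8×8 block and one uniform quantization step for every coefficient; a real JPEG encoder uses an 8×8 quantization table that depends on the quality factor and level-shifts the pixels first, so the function names and the `step` parameter here are illustrative only.

```python
import math

N = 8  # JPEG block size

def dct2_block(block):
    """Orthonormal 2-D DCT-II of an 8x8 block given as a list of 8 rows."""
    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out

def quantize(coeffs, step):
    """Divide by a (hypothetical) uniform step and round to the nearest integer."""
    return [[int(math.floor(value / step + 0.5)) for value in row] for row in coeffs]

# A perfectly flat block keeps only its DC coefficient after quantization.
flat_block = [[10] * N for _ in range(N)]
quantized = quantize(dct2_block(flat_block), step=16)
print(quantized[0][0])
```

For the flat block, all AC coefficients vanish, which is the sense in which the transform concentrates the block's energy in the low-frequency corner.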

At present, most popular JPEG adaptive steganographic algorithms consist of an embedding cost function and a steganographic embedding encoder. This framework is based on the minimum distortion model introduced by Fridrich et al. [Fridrich and Filler (2007)]. It concentrates the modifications caused by embedding on the DCT coefficients that have a smaller “cost value” under the embedding cost function, so the anti-statistical detection capability of a JPEG steganographic algorithm under this framework is closely related to how well the embedding cost function is defined. Along this line, Holub et al. [Holub, Fridrich and Denemark (2014)] proposed the JPEG adaptive steganography algorithm J-UNIWARD. Its embedding distortion function is composed of the decomposition coefficients of several two-dimensional wavelets (two of which are perpendicular to each other), which can finely describe the smoothness of pixels in multiple directions, so that the embedding modifications are concentrated in regions where they are difficult to detect. The embedding distortion function can be defined as:

$$D(X, Y) = \sum_{r=1}^{3} \sum_{u,v} \frac{\left| W_{uv}^{(r)}\left(J^{-1}(X)\right) - W_{uv}^{(r)}\left(J^{-1}(Y)\right) \right|}{\varepsilon + \left| W_{uv}^{(r)}\left(J^{-1}(X)\right) \right|} \qquad (1)$$

The symbols X and Y represent the cover object and the stego object respectively, J^{-1}(·) represents the inverse DCT from the frequency domain to the spatial domain, and W_uv^(r)(·) represents the uv-th decomposition coefficient of the r-th (r = 1, 2, 3) wavelet (u and v represent the positions in the two sub-wavelets respectively). ε > 0 is a constant used to prevent division by zero; it is usually set to a small value such as ε = 10^-5.
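To see why a relative wavelet distortion of this kind pushes changes into textured regions, the following sketch sums |W_cover − W_stego| / (ε + |W_cover|) over precomputed coefficient arrays. It assumes the three wavelet decompositions of the decompressed cover and stego images are already available as nested lists; the function name and the toy values are hypothetical, and no wavelet transform is performed here.

```python
def relative_wavelet_distortion(cover_bands, stego_bands, eps=1e-5):
    """Sum of |W_cover - W_stego| / (eps + |W_cover|) over all bands and positions.

    cover_bands / stego_bands: one 2-D list of coefficients per wavelet band.
    """
    total = 0.0
    for cover_band, stego_band in zip(cover_bands, stego_bands):
        for cover_row, stego_row in zip(cover_band, stego_band):
            for wc, ws in zip(cover_row, stego_row):
                total += abs(wc - ws) / (eps + abs(wc))
    return total

# Toy data: the same unit change applied where the cover coefficient is
# large (textured area) and where it is zero (smooth area).
textured_cost = relative_wavelet_distortion([[[4.0]]], [[[5.0]]])
smooth_cost = relative_wavelet_distortion([[[0.0]]], [[[1.0]]])
print(textured_cost, smooth_cost)
```

The unit change costs about 0.25 in the textured position but about 10^5 in the smooth one, so an encoder that minimizes total distortion avoids smooth areas.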

2.2 Robust steganography

Zhang et al. [Zhang, Luo, Yang et al. (2016)] proposed a framework for designing robust steganography algorithms. This framework combines a traditional JPEG adaptive steganography algorithm with a well-known robust watermarking algorithm, and aims to resist both statistical detection and JPEG compression.

Under this framework, the sender performs the following steps:

1 Determining the domain in which the robust steganographic embedding modification is performed.

2 Determining the specific modification measurement on the domain of Step 1 so that the embedded information can effectively resist the lossy JPEG compression operation.

3 Defining the embedding cost function according to the embedding modification measurement to reduce the statistical influence of embedding.

4 Encoding the secret information with an error correction code to improve the robustness.

5 Embedding the secret information with the STCs embedding algorithm and packaging the stego object in JPEG format.

The receiver, after receiving the JPEG image transmitted over the lossy channel, performs the following steps:

1 Reading the JPEG image and recovering the corresponding stego object in the embedding domain.

2 Extracting the information with the STCs extraction algorithm.

3 Correcting the errors in the extracted information with the error correction code, and finally obtaining the original embedded information.
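The sender and receiver steps above can be illustrated with a toy round trip. The sketch below is not the framework's actual machinery: a 5-fold repetition code stands in for the error correction code, parity overwriting stands in for STCs embedding, and the lossy channel is simulated by flipping the parity of a few chosen elements. All function names are hypothetical.

```python
def ecc_encode(bits, repeat=5):
    """Repetition code (stand-in for the error correction code)."""
    return [b for b in bits for _ in range(repeat)]

def ecc_decode(bits, repeat=5):
    """Majority vote over each group of `repeat` received bits."""
    return [1 if sum(bits[i:i + repeat]) * 2 > repeat else 0
            for i in range(0, len(bits), repeat)]

def embed(cover, payload):
    """Stand-in for STCs embedding: force the parity of element j to payload[j]."""
    stego = list(cover)
    for j, bit in enumerate(payload):
        if stego[j] % 2 != bit:
            stego[j] += 1
    return stego

def extract(stego, length):
    """Read the parities of the first `length` elements."""
    return [value % 2 for value in stego[:length]]

def lossy_channel(stego, damaged_positions):
    """Simulated channel damage: flip the parity at the given positions."""
    damaged = list(stego)
    for j in damaged_positions:
        damaged[j] += 1
    return damaged

message = [1, 0, 1, 1]
cover = list(range(100, 120))                      # toy cover elements
stego = embed(cover, ecc_encode(message))          # sender steps 4 and 5
received = lossy_channel(stego, damaged_positions=[0, 1, 7, 12])
raw = extract(received, 20)                        # receiver step 2
recovered = ecc_decode(raw)                        # receiver step 3
print(recovered == message)
```

The raw extracted bits contain four errors, yet the decoded message is intact because no group of five suffers more than two flips; this is the role the error correction stage plays in the framework.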

Under the framework above, the DCRAS and FRAS algorithms were proposed [Zhang, Luo, Yang et al. (2016); Zhang, Luo, Yang et al. (2017)]; they protect the embedded information well against JPEG compression and statistical detection. Nevertheless, the QF (quality factor) of the JPEG compression used by the lossy channel must be known in advance (regarded as side information); otherwise, the resistance to JPEG compression declines significantly. In addition, the generated stego object can only resist the specific JPEG compression whose QF equals the pre-known QF.

2.3 Harris-Laplacian feature

At present, embedding information in regions defined by local image feature points has become one of the hot spots in research on robust watermarking. Such a robust watermarking algorithm calculates the invariant feature points of the watermark carrier image, and then uses each feature point as a center point to generate a watermark embedding region in which the information is hidden. The literature [Lu, Lu and Chung (2010)] uses filtering residuals to calculate and locate the feature points of the carrier image, and gives a normalization method for the regions delimited by the feature points, which better guarantees the robustness of the embedded information. Using the idea of invariant feature points, a robust watermarking algorithm based on image Harris-Laplacian feature points is proposed in Tsai et al. [Tsai, Huang and Kuo (2011)]. It performs the Harris-Laplacian transformation of the image, calculates the corresponding feature points, and selects feature regions that resist specific lossy operations, which improves the ability of the watermark information to resist multiple types of lossy attacks. The literature [Tsai, Huang, Kuo et al. (2012)] further demonstrates the effectiveness of such Harris-Laplacian image features in robust watermarking, and proposes a more robust and secure algorithm based on the literature [Lu, Lu and Chung (2010)].
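To make the feature-point idea concrete, the sketch below computes a Harris-style corner response on a small synthetic image. It follows the commonly used form det(µ) − α·tr²(µ), where µ is the Gaussian-windowed matrix of first-derivative products scaled by σ_D². The constant α = 0.04, the integration scale σ_I = 1.0, the central-difference derivatives and the edge-replication padding are assumptions of this sketch, not values taken from the cited papers.

```python
import math

def gaussian_kernel(sigma):
    """1-D normalized Gaussian kernel truncated at about 3*sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(t * t) / (2 * sigma * sigma)) for t in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def clamp(index, size):
    return min(max(index, 0), size - 1)

def smooth(img, sigma):
    """Separable Gaussian smoothing with edge replication."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    h, w = len(img), len(img[0])
    rows = [[sum(k[t + r] * img[y][clamp(x + t, w)] for t in range(-r, r + 1))
             for x in range(w)] for y in range(h)]
    return [[sum(k[t + r] * rows[clamp(y + t, h)][x] for t in range(-r, r + 1))
             for x in range(w)] for y in range(h)]

def harris_response(img, sigma_d=0.7, sigma_i=1.0, alpha=0.04):
    """Corner response det(mu) - alpha * tr(mu)^2 at every pixel (alpha assumed)."""
    h, w = len(img), len(img[0])
    L = smooth(img, sigma_d)  # scale-space image
    Lx = [[(L[y][clamp(x + 1, w)] - L[y][clamp(x - 1, w)]) / 2 for x in range(w)] for y in range(h)]
    Ly = [[(L[clamp(y + 1, h)][x] - L[clamp(y - 1, h)][x]) / 2 for x in range(w)] for y in range(h)]

    def windowed(A, B):
        product = [[A[y][x] * B[y][x] for x in range(w)] for y in range(h)]
        return smooth(product, sigma_i)

    Sxx, Sxy, Syy = windowed(Lx, Lx), windowed(Lx, Ly), windowed(Ly, Ly)
    s2 = sigma_d ** 2
    response = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            det = s2 * s2 * (Sxx[y][x] * Syy[y][x] - Sxy[y][x] ** 2)
            trace = s2 * (Sxx[y][x] + Syy[y][x])
            response[y][x] = det - alpha * trace * trace
    return response

# Synthetic image: a bright square occupying the lower-right quadrant.
size = 32
image = [[1.0 if y >= 16 and x >= 16 else 0.0 for x in range(size)] for y in range(size)]
R = harris_response(image)
print(R[16][16] > 0, R[24][16] < 0, R[2][2] == 0)
```

The response is positive at the corner of the square, negative on its straight edge and zero in the flat area, which is why thresholding such a response isolates points that can be found again after moderate distortion.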

3 Proposed method

To eliminate the need for QF side information in robust steganography and to resist JPEG compression with multiple QF values, a new anti-JPEG steganography based on a region locating method is proposed in this section. First, the region locating method is presented. Then, the improved cover generation method is described. Last, the proposed embedding cost function and the error correction settings are briefly introduced.

Note that the proposed anti-JPEG steganography follows the framework of Section 2.2; its diagram is shown in Fig. 1.

Figure 1: Diagram of the proposed algorithm

3.1 High tense region locating method based on Harris-Laplacian feature point

A natural thought is that complex areas of an image lose more information than plain areas under JPEG compression. This is reasonable because the discrete cosine transform concentrates the “energy” of the image in the low-frequency area, and the quantization and rounding processes cut image details to achieve compression.

However, Lu et al. [Lu, Lu and Chung (2010)] pointed out that the edges of objects in an image are strongly robust because they carry much more information about the corresponding object. Furthermore, in most adaptive steganographic algorithms, pixels on object edges are more suitable for modification than pixels in plain areas. Thus, this sub-section proposes a high tense region locating method based on the Harris-Laplacian feature point, with which the embedding region can be rebuilt after the stego image is compressed and the modified elements are concentrated in complex areas.

The processes of the method are presented as follows:

Step 1. The function that converts image I into scale space L is defined as:

$$L(a, \sigma_D) = G(a, \sigma_D) * I(a) \qquad (2)$$

where the symbol a = (i, j) denotes the spatial coordinates of a pixel of the image, the function G denotes a standard Gaussian kernel function, σ_D denotes the scale parameter of the kernel function, and “*” denotes the convolution operation.

Step 2. In order to characterize the local structure of the image in the scale space, the autocorrelation matrix µ(a, σ_I, σ_D) is defined in the scale space obtained in Step 1:

where σ_I is the integration scale, and L_x and L_y represent the first-order derivatives in the x-axis and y-axis directions, respectively, in the scale space.

Step 3. An angle response function c(a, σ_I, σ_D) is designed based on µ(a, σ_I, σ_D) to quantify the local curvature amplitude at the pixel in position (i, j) of the image:

where “det” represents the matrix determinant and “tr” represents the trace of the matrix. The larger the value of the angle response function is, the greater the probability that the corresponding image pixel can be relocated after being subjected to a lossy attack.

Step 4. The Laplacian-of-Gaussian operation LoG(a, σ_n) is combined with the angle response function c(a, σ_I, σ_D) to find robust edge pixels of objects at multiple scales in the image. First, the pixels whose angle response values c(a, σ_I, σ_D) are among the largest 1% in the image are selected as candidate points. Then, among them, the extreme points of the absolute value of LoG(a, σ_n) over σ_n ∈ {(1.1)^i × 1.5 | i = 1, 2, ..., n} are selected. With σ_D = 0.7 and n = 15, a point is judged to be an extreme point when f satisfies:

The pixels selected by the above procedure are named “feature pixels with scale value σ_c = (1.1)^i × 1.5”, and they form an ordered set C = {a_1, a_2, ..., a_k}, arranged from left to right and from top to bottom in the image.

Step 5. The radius r of the equal-radius circles centered on the elements of C = {a_1, a_2, ..., a_k} is determined; the embedding area lies within these circles. r is determined iteratively, starting from the initial value r = 1 (the unit is the minimal distance between two pixels). The t-th iteration is:

(1) Counting the number n_t of pixels within the circles whose centers are the feature pixels of the set C and whose radius is the r of this iteration.

(2) If the length m of the information to be embedded satisfies m ≥ n_t/3, then r = r + 1 and iteration t + 1 is entered.

(3) If the length m of the information to be embedded satisfies m < n_t/3, the iteration stops and the current r is used as the radius.

This ensures that there are enough embedded points in the selected area.
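The candidate selection of Step 4 and the radius iteration of Step 5 can be sketched as follows. The sketch assumes that the angle response is available as a 2-D list, that pixels covered by several circles are counted once, and that the iteration stops as soon as m < n_t/3; the function names are hypothetical, and the LoG extremum test itself is not reproduced.

```python
def scale_set(n=15, base=1.1, factor=1.5):
    """Scales sigma_n = base**i * factor for i = 1..n, as listed in Step 4."""
    return [(base ** i) * factor for i in range(1, n + 1)]

def top_fraction(response, fraction=0.01):
    """Coordinates (y, x) of the pixels holding the largest `fraction` of values."""
    ranked = sorted(((value, (y, x)) for y, row in enumerate(response)
                     for x, value in enumerate(row)), reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return [coord for _, coord in ranked[:keep]]

def covered_pixels(centers, r, height, width):
    """Pixels inside at least one circle of radius r (union, an assumption)."""
    covered = set()
    for cy, cx in centers:
        for y in range(max(0, cy - r), min(height, cy + r + 1)):
            for x in range(max(0, cx - r), min(width, cx + r + 1)):
                if (y - cy) ** 2 + (x - cx) ** 2 <= r * r:
                    covered.add((y, x))
    return covered

def find_radius(centers, m, height, width):
    """Grow r from 1 until the message length m is below n_t / 3."""
    if not centers:
        raise ValueError("no feature pixels to center the circles on")
    r = 1
    while True:
        n_t = len(covered_pixels(centers, r, height, width))
        if m < n_t / 3:
            return r
        if n_t == height * width:
            raise ValueError("image too small for the requested payload")
        r += 1

scales = scale_set()
candidates = top_fraction([[y * 20 + x for x in range(20)] for y in range(10)])
radius = find_radius([(10, 10)], m=5, height=21, width=21)
print(len(scales), candidates, radius)
```

With one center and m = 5, circles of radius 1 and 2 cover 5 and 13 pixels, which is not enough, while radius 3 covers 29 pixels and satisfies 5 < 29/3.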

3.2 Generating cover object and modifying method

To resist the JPEG compression operation, a cover generating method (Step 1 and Step 2) and the corresponding modifying method (Step 3 to Step 5), both of which work without side information, are described as follows:

Step 1. A set of non-overlapping 8×8 DCT coefficient blocks is obtained from the JPEG cover image. The symbol D_k = {D_k(i), i = 1, 2, ..., 64}, k = 1, 2, ..., m is used to denote the set, where m is the number of DCT blocks, and the scalars i and k denote the i-th coefficient in the k-th block (in order from left to right, top to bottom).

Step 2. An n-element robust cover object X = {x_1, x_2, ..., x_n} is generated by:

where M_ki is the rounded mean value of the DCT coefficients D_k1(i), D_k2(i), D_k3(i) of three neighboring blocks.

Step 3. The robust virtual stego object Y is obtained by modifying element values of the cover object X, and the rule for mapping the modifications to {D_k}, 1 ≤ k ≤ B, is expressed by:

Step 4. The stego object after the JPEG compression operation is denoted by the symbol Y, whose element (1 ≤ j ≤ n) is:

where the symbols D, D_k = {(i), i = 1, 2, ..., 64} and {}_k,i respectively denote the sets of DCT coefficients, 8×8 blocks and neighboring mean values after JPEG compression.

Step 5. The value of σ_ki is obtained from:

3.3 Design of embedding cost function and error correction method setting

After the cover object is generated as described in Section 3.2, the embedding process and the error correction method are applied to concentrate the modifications in hard-to-detect areas and to increase the robustness.

According to the typical steganographic scheme, the design of the embedding cost function strongly affects the resistance to detection, because the STCs embedding encoder already performs close to the bound. The embedding cost function used in Bao et al. [Bao, Luo, Zhang et al. (2018)] achieves good performance, and it is expressed as:

Even when x_j = y_j, the function value calculated by formula (10) is not 0 if x_j = y_j = 1 and D_k(i) < M_ki + σ_ki, or if x_j = y_j = 0 and D_k(i) > M_ki − σ_ki.

Because the STCs embedding encoder tries to concentrate the modifications on elements with low function values, the elements in the situations above cannot be fully used. Thus, a new embedding cost function DF_pro(x_j, y_j) is proposed as follows:

Then, the RS (Reed-Solomon) error correction encoder is applied to the secret information before the STCs algorithm is used. RS is set to the parameter (40, 90), which means that 40 input elements are encoded into 90 output elements. The error diffusion method proposed in Bao et al. [Bao, Luo, Zhang et al. (2018)] is also used in this process to increase the robustness of the cover object against JPEG compression.
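The cost of this setting can be worked out with a short calculation. The sketch assumes a standard Reed-Solomon code with k = 40 message symbols and n = 90 codeword symbols under bounded-distance decoding, for which up to ⌊(n − k)/2⌋ symbol errors per codeword can be corrected; the symbol size is not stated in the text, and the helper names are hypothetical.

```python
def rs_parameters(k=40, n=90):
    """Basic figures of an (n, k) Reed-Solomon code."""
    return {
        "rate": k / n,                    # fraction of embedded symbols that carry message
        "parity_symbols": n - k,          # redundancy added per codeword
        "correctable_errors": (n - k) // 2,  # symbol errors correctable per codeword
    }

def encoded_length(message_symbols, k=40, n=90):
    """Number of symbols to embed, using whole codewords of length n."""
    codewords = -(-message_symbols // k)  # ceiling division
    return codewords * n

params = rs_parameters()
print(params["correctable_errors"], round(params["rate"], 3), encoded_length(100))
```

Under these assumptions each 90-symbol codeword tolerates up to 25 corrupted symbols, at the price of embedding 2.25 symbols for every message symbol.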

4 Experiments

In order to verify the effectiveness of the proposed method, experiments were conducted based on BOSSbase image library. First, the experimental setups and the used image database are introduced. Then, the anti-JPEG compression performance and anti-statistical detection performance of the proposed method are compared with the JPEG adaptive steganography algorithm J-UNIWARD [Holub, Fridrich and Denemark (2014)], robust watermarking algorithm [Chen, Ouhyoung and Wu (2000)], robust steganography algorithm DCRAS [Zhang, Luo, Yang et al. (2016)] and FRAS [Zhang, Luo, Yang et al. (2017)].

4.1 Setups

All the experiments presented in this section were performed on a personal computer equipped with an Intel Core i7-8700 CPU (3.2 GHz) and the Windows 10 system. The software used in the experiments is MATLAB R2017a, and the spatial image library used is BOSSbase 1.01 (proposed by Patrick Bas, Tomas Filler, Tomas Pevny on ICASSP 2013; the download address is: http://agents.fel.cvut.cz/stegodata/).

In the anti-JPEG compression experiment, the information embedded in the stego image is extracted after lossy JPEG compression. The error rate of information extraction is used to measure the resistance of a robust steganographic algorithm to JPEG compression. It is defined as the ratio of the number of erroneous bits in the extracted information to the total number of bits of the embedded information.
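The extraction error rate defined above amounts to a bit error rate. A minimal sketch, assuming the embedded and extracted messages are equal-length lists of bits (the function name is hypothetical):

```python
def extraction_error_rate(embedded_bits, extracted_bits):
    """Number of differing bits divided by the number of embedded bits."""
    if not embedded_bits:
        raise ValueError("embedded message is empty")
    if len(embedded_bits) != len(extracted_bits):
        raise ValueError("messages must have the same length")
    errors = sum(1 for sent, got in zip(embedded_bits, extracted_bits) if sent != got)
    return errors / len(embedded_bits)

sent = [1, 0, 1, 1, 0, 0, 1, 0]
got = [1, 0, 0, 1, 0, 1, 1, 0]   # two bits damaged by the channel
print(extraction_error_rate(sent, got))
```

A rate near 0.5 means the extracted bits are no better than random guesses, which is how the roughly 50% figures in the compression experiments should be read.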

In the anti-detection performance experiment, two well-known JPEG steganalysis feature sets, GFR (Gabor Filters Rich model [Song, Liu, Yang et al. (2015)]) and DCTR (Discrete Cosine Transform Residual [Holub and Fridrich (2015)]), are used with the ensemble classifier [Kodovský, Fridrich and Holub (2012)]. The test error of the ensemble classifier is used to measure the anti-detection performance of a steganography algorithm: the closer the value is to 50%, the stronger the anti-detection performance.
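One common way to compute such a test error is the average of the false-alarm rate and the missed-detection rate, which equals the misclassification rate when cover and stego images are equally represented. The sketch below assumes that definition and the label convention 0 = cover, 1 = stego; both are assumptions, since the text does not spell out the formula.

```python
def detection_error(labels, predictions):
    """Average of false-alarm and missed-detection rates (0 = cover, 1 = stego)."""
    covers = [p for l, p in zip(labels, predictions) if l == 0]
    stegos = [p for l, p in zip(labels, predictions) if l == 1]
    if not covers or not stegos:
        raise ValueError("need at least one cover and one stego sample")
    false_alarm = sum(1 for p in covers if p == 1) / len(covers)
    missed = sum(1 for p in stegos if p == 0) / len(stegos)
    return (false_alarm + missed) / 2

labels = [0, 0, 0, 0, 1, 1, 1, 1]
predictions = [0, 0, 0, 1, 1, 1, 0, 0]
print(detection_error(labels, predictions))     # imperfect detector
print(detection_error(labels, [1] * 8))         # detector that always says "stego"
```

A detector that ignores its input scores 0.5, so an embedding algorithm whose test error stays close to 50% leaves the classifier with little usable evidence.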

4.2 Anti-JPEG compression experiments

In this section, 10,000 JPEG images compressed with QF=75 are generated from the BOSSbase 1.01 image database. The secret information is embedded by J-UNIWARD [Holub, Fridrich and Denemark (2014)], the robust watermarking algorithm [Chen, Ouhyoung and Wu (2000)], the robust steganography algorithms DCRAS [Zhang, Luo, Yang et al. (2016)] and FRAS [Zhang, Luo, Yang et al. (2017)], and the method proposed in this manuscript. For each of these 5 algorithms, the embedding rate is varied from 0.01 bpnzAC (bits per non-zero Alternating Current coefficient) to 0.10 bpnzAC in steps of 0.01. Each setting generates 10,000 stego images, which are then JPEG compressed with quality factors of 65, 75 and 85. Then, the embedded information is extracted to count the error rate. It is worth noting that, in order to simulate lossy transmission without side information, the side information used by the DCRAS algorithm to generate the carrier in the experiment is QF=90. The extraction error rates under JPEG compression attacks with QF=65, 75 and 85 are shown in Fig. 2.
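The payload implied by an embedding rate in bpnzAC can be computed by counting the non-zero AC coefficients of the cover image. The sketch assumes each 8×8 block is given as a list of 64 quantized coefficients with the DC coefficient at index 0; the helper names and the toy blocks are hypothetical.

```python
def count_nonzero_ac(blocks):
    """Number of non-zero AC coefficients; index 0 of each block is the DC term."""
    return sum(1 for block in blocks for coeff in block[1:] if coeff != 0)

def payload_bits(blocks, rate_bpnzac):
    """Message length in bits for a given rate in bits per non-zero AC coefficient."""
    return int(rate_bpnzac * count_nonzero_ac(blocks))

# Two toy blocks: 10 and 30 non-zero AC coefficients respectively.
block_a = [50] + [3] * 10 + [0] * 53
block_b = [0] + [-1] * 30 + [0] * 33
toy_blocks = [block_a, block_b]
print(count_nonzero_ac(toy_blocks), payload_bits(toy_blocks, 0.10))
```

Because the count depends on image content, the same rate yields a longer message for a textured image than for a smooth one.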

From Figs. 2(a), 2(b) and 2(c), we can see that the extraction error rates of J-UNIWARD, DCRAS and FRAS are about 50%, which means that they can hardly resist the damage that JPEG compression does to the embedded information. DCRAS and FRAS lose the ability to resist JPEG compression when the side information is missing. The proposed algorithm and the robust watermarking algorithm both have low extraction error rates under most JPEG compression conditions. Across the compression settings, the extraction error rate of the proposed algorithm is lower than that of the watermarking algorithm by at most 9.47% (under the JPEG compression attack with QF=65). It is also noted that the extraction error rate decreases as the quality factor of the JPEG compression attack increases, because the higher the quality factor, the less image information is lost during compression. Under JPEG compression attacks with QF=75 and 85, the proposed algorithm also guarantees a low extraction error rate.

Figure 2: Experimental results of anti-JPEG compression attack on different steganographic and robust watermarking algorithms: (a), (b) and (c) present the results under JPEG compression attack of quality factor=65, 75 and 85 respectively

4.3 Statistical detection experiments

In this section, sets of carrier images with quality factors of 75 and 85 are generated from all 10,000 spatial images in the BOSSbase 1.01 image database, and random secret information is then embedded using the robust watermarking algorithm [Chen, Ouhyoung and Wu (2000)], the DCRAS [Zhang, Luo, Yang et al. (2016)] and FRAS [Zhang, Luo, Yang et al. (2017)] algorithms, and the proposed algorithm. The embedding rate is varied from 0.01 bpnzAC to 0.10 bpnzAC in steps of 0.01, and 10,000 stego images are generated for each embedding algorithm at each embedding rate. Then, the two statistical detection feature sets GFR (17,000 features) and DCTR (8,000 features) are extracted from the cover and stego images. The features of 5,000 randomly chosen cover-stego pairs are used to train the ensemble linear classifier, and the trained classifier is then applied to the remaining 10,000 images (5,000 cover images and 5,000 stego images) in the image database. The test error rates of the classifier with GFR and DCTR features are shown in Tab. 1 and Tab. 2 respectively.

Table 1: Experimental results of statistical detection under the GFR feature library against different steganographic and robust watermarking algorithms (the QF of the cover JPEG image is 75 or 85, and the bold numbers denote the results under QF=85)

Table 2: Experimental results of statistical detection under the DCTR feature library against different steganographic and robust watermarking algorithms (the QF of the cover JPEG image is 75 or 85, and the bold numbers denote the results under QF=85)

FRAS [Zhang, Luo, Yang et al. (2017)] 0.4189 0.4328 0.3355 0.3543 0.2738 0.2821 0.2096 0.2245 0.1749 0.1803 Proposed Algorithm 0.3914 0.4108 0.1645 0.1539 Methods Embedding Rate (bpnzAC) 0.2926 0.3104 0.2020 0.2177 0.1619 0.1765 0.06 0.07 0.08 0.09 0.1 Robust Watermarking Algorithm [Chen, Ouhyoung and Wu (2000)] 0.0016 0.0016 0.0013 0.0014 0.0011 0.0013 0.001 0.0001 0.001 0.001 0.0517 0.0529 FRAS [Zhang, Luo, Yang et al. (2017)] DCRAS [Zhang, Luo, Yang et al. (2016)] 0.1370 0.1448 0.1127 0.1202 0.0909 0.1001 0.0694 0.0721 0.0574 0.0627 Proposed Algorithm 0.1101 0.1143 0.1413 0.1487 0.1175 0.1259 0.0992 0.1051 0.0741 0.0793 0.0829 0.0903 0.0613 0.0676 0.0424 0.0471 0.0115 0.0109

From the results in Tab. 1 and Tab. 2, it can be seen that the detection error rate of the robust watermarking algorithm is very low. This is because watermarking is primarily designed so that the embedded message can still be detected after JPEG compression. Therefore, stego images generated by the robust watermarking algorithm are easily detected by statistical detection methods. Compared with the side-informed robust steganography algorithms DCRAS and FRAS, the proposed robust steganography algorithm loses little detection resistance at low embedding rates, while a sharp decline in detection resistance occurs at embedding rates higher than 0.05 bpnzAC. This may be because, at such rates, the embedding regions selected by the method of Section 3.1 cover nearly the full image, so that the region selection strategy becomes useless. Thus, to ensure security, the proposed robust steganography is suggested to work at low embedding rates, where it has strong anti-statistical detection ability.

5 Conclusions

In order to design robust steganography against JPEG compression without side information, this manuscript proposed a robust steganography algorithm based on the high tense region locating method. The region locating method combines the LoG operator with the angle response function, and a cover generating method that needs no side information is described. A new embedding cost function is proposed to overcome the defects of the existing function. Comparative experimental results show that the proposed algorithm reaches a high compression resistance at the cost of a small part of the anti-statistical detection ability. How to overcome the decline in detection resistance at high embedding rates, and how to handle more kinds of lossy operations, are our further research interests.

Acknowledgement: This work was supported by the National Key Research and Development Program (2016YFB0801601) and the Pilot Technology Transfer Innovation Reform of Xidian University (No. 90904180001).