On Hiding Secret Information in Medium Frequency DCT Components Using Least Significant Bits Steganography

2019-02-20 12:17

Sahib Khan, M A Irfan, Arslan Arif, Syed Tahir Hussain Rizvi, Asma Gul, Muhammad Naeem and Nasir Ahmad

Abstract: This work presents a new method of data hiding in digital images in the discrete cosine transform (DCT) domain. The proposed method uses the least significant bits of the medium frequency components of the cover image for hiding the secret information, while the low and high frequency coefficients are kept unaltered. The unaltered low frequency DCT coefficients preserve the quality of the smooth regions of the cover image, while leaving the high frequency DCT coefficients unchanged preserves the quality of the edges. As the medium frequency components contribute less to the image energy and image details, modifying these coefficients for data hiding results in high quality stego images. The distortion caused by the changes in the medium frequency coefficients is too small to be detected by the human visual system. The proposed method demonstrated a hiding capacity of 43.11% with a stego image quality of 36.3 dB peak signal to noise ratio, which is significantly higher than the 30 dB threshold for stego image quality. The proposed technique is immune to steganalysis and has proved to be highly secure against both spatial and DCT domain steganalysis techniques.

Keywords: DCT steganography, image processing, information security, data hiding, steganalysis.

1 Introduction

A steganography technique embeds secret information in a digital cover medium (e.g., image, audio, or video) in such a way that its existence remains imperceptible and does not arouse an intruder’s suspicion. It differs from cryptography in that, in cryptography, the message is encrypted and a third party knows about the secret communication between the two parties, but a decryption method is required to learn what is actually communicated. Cryptography depends entirely on the strength of the encryption algorithm to prevent intruders from accessing the messages exchanged between the sender and the intended receiver [Morsy, Nossair, Hamdy and Amer (2011)].

In steganography, the message is exchanged in such a manner that an intruder cannot detect any secret communication beyond the sending and receiving of media files. To avoid an eavesdropper’s suspicion, the least significant bits of the cover medium are used to hide the secret messages. The use of redundant bits transfers information without affecting the statistical properties of the cover medium [Giri and Bashir (2017); Wong, Qi and Tanaka (2017)]. A medium with a high level of redundant bits is considered most suitable for hiding secret messages and is the medium preferred by steganographers.

Steganography can be performed either in the spatial domain or in a suitable transform of the image, such as the discrete cosine transform (DCT). In spatial domain steganography techniques, the least significant bits (LSB) of the cover medium pixels are used directly to hide the secret messages [Wang, Ni, Zhang et al. (2017); Khan, Arif, Rizvi et al. (2018)]. LSB steganography and 4LSB steganography are among the well-known spatial domain steganography methods. In variable least significant bits (VLSB) steganography, a different number of LSBs of the cover pixels is used for hiding the secret message [Khan, Ismail, Khan et al. (2016)]. VLSB steganography has been implemented using the decreasing distance decreasing bits algorithm (DDDBA) [Khan, Yousaf and Akram (2011)], the modular distance technique (MDT), and varying index varying bits substitution (VIVBS) [Khan, Ahmad and Wahid (2016); Khan and Tiziano (2018)].

In transform domain steganography, the secret message is hidden in the coefficients of the transformed image instead of directly in the image pixels. The discrete cosine transform (DCT) is one of the transform domains most widely used by steganographers. In the method proposed in Hazra et al. [Hazra, Ghosh and Rahman (2018)], the secret message is hidden in the least significant bits of the DCT coefficients. The works presented in Vleeschouwer et al. [Vleeschouwer, Delaigle and Macq (2001)] and Goljan et al. [Goljan, Fridrich and Du (2001)] proposed invertible steganography techniques in the transform domain. However, the reported hiding efficiency was very low, and increasing the hiding efficiency resulted in severe degradation of the image quality. In Xuan et al. [Xuan, Zhu, Chen et al. (2002)], a steganography method with high hiding efficiency was proposed using the wavelet transform; however, the quality of the resultant stego image was very low. A variable data hiding method in the DCT domain is proposed in Khan et al. [Khan, Khan, Iqbal et al. (2013)].

The main goal of steganography techniques is to keep the presence of hidden information unnoticed. A natural characteristic of the human visual system (HVS) is that it is more sensitive to distortions in the smooth areas of an image and less sensitive to those in the complex regions [Zhang and Wang (2005)]. The DCT is a good classifier of the smooth and complex regions of an image. The information about the smooth regions is captured in the low frequency coefficients, while that of the edges is captured in the high frequency coefficients [Qian, Wang and Qiao (2012); Chang, Lin, Tseng et al. (2007)]. Thus, keeping the HVS characteristics in view, researchers prefer to use high frequency DCT coefficients for data hiding while keeping the low frequency coefficients unaltered. However, data hiding in high frequency DCT coefficients affects the edges and results in a blurring effect. To preserve the quality of a stego image, both the smooth areas and the edges are important from the steganography point of view.

In this paper, a new data hiding technique is proposed which uses the medium frequency coefficients of the DCT for data hiding. The proposed method divides the DCT coefficients into three groups, i.e., low frequency coefficients, medium frequency coefficients and high frequency coefficients. The secret information is hidden in the LSBs of the medium frequency coefficients. The low and high frequency coefficients are left unaltered, so both the smooth and complex regions of the cover image are preserved. The results show that the technique provides high data hiding capacity with high quality stego images compared to other state-of-the-art techniques. The key contributions of this work include good quality stego images, 100% recovery of the hidden message, high hiding capacity and strong evaluation metrics.

The rest of the paper is organized as follows. Section 2 describes the proposed data hiding technique. Section 3 presents the experimental results and discusses the robustness of the proposed technique to various steganalysis attacks. The comparison of the proposed technique with state-of-the-art techniques is presented in Section 4. Finally, Section 5 concludes the findings of this research.

2 Proposed technique

The main goal of steganography is to hide secret information in the cover medium in such a way that the medium appears unchanged. In the case of image steganography, the aim is to avoid distortion in the image that is detectable by the HVS. The proposed method attempts to achieve this goal by hiding the secret message in such a way that both the smooth and complex regions of the cover image are least affected [Qian, Wang and Qiao (2012)]. For this purpose, the information relating to the smooth region, the intermediate region and the complex region is de-correlated by taking the DCT of the cover image. The intermediate portion of the image is then used for hiding the secret information.

The main characteristic of the DCT is de-correlation, i.e., the DCT removes the interdependencies between the coefficients. The DCT converts the cover image into an array of statistically independent coefficients, concentrating most of the energy in the top left coefficients and most of the details in the bottom right coefficients. The top left coefficients, i.e., the low frequency coefficients, hold most of the energy and represent the smooth regions of the image, while the bottom right coefficients, i.e., the high frequency coefficients, correspond to the edges. To preserve the edges and the overall view of the image, the high and low frequency coefficients are left unaltered and only the medium frequency coefficients are used for data hiding. The medium frequency coefficients, consisting of the central 50% of the DCT coefficients, are used for hiding secret information.

The data hiding process is divided into four steps: de-correlation of the cover image data using the DCT and segmentation of the DCT coefficients into low, high and medium frequency components; embedding the secret information in the least significant bits of the medium frequency coefficients; converting the modified DCT coefficients back to the spatial domain by taking the inverse discrete cosine transform (IDCT); and calculating the quality measuring parameters and hiding capacity.

2.1 De-correlation and segmentation

To hide secret information in a cover image using the proposed technique, the image is converted to statistically independent coefficients by applying the DCT. The DCT removes the interdependencies between coefficients and gives insight into the image’s details, such as edges and energy. The DCT is used here to extract the intermediate coefficients, i.e., the desired region for data hiding.

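A cover image C(i,j) of size M×N is subjected to the DCT, resulting in an array of coefficients C(u,v) of the same dimension M×N. The DCT coefficients are calculated using Eq. (1), given below in the standard orthonormal DCT-II form [Cintra and Bayer (2011); Kim, Lee, Lee et al. (2018)].

$$C(u,v)=\alpha(u)\,\alpha(v)\sum_{i=0}^{M-1}\sum_{j=0}^{N-1}C(i,j)\cos\left[\frac{(2i+1)u\pi}{2M}\right]\cos\left[\frac{(2j+1)v\pi}{2N}\right] \qquad (1)$$

where

$$\alpha(u)=\begin{cases}\sqrt{1/M}, & u=0\\ \sqrt{2/M}, & u=1,\dots,M-1\end{cases} \qquad (2)$$

and

$$\alpha(v)=\begin{cases}\sqrt{1/N}, & v=0\\ \sqrt{2/N}, & v=1,\dots,N-1\end{cases} \qquad (3)$$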
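As an illustration of the transform step, the following is a minimal sketch of a direct two-dimensional DCT-II in plain Python. It assumes the orthonormal scaling that is standard for the DCT-II; the function names `alpha` and `dct2` are hypothetical, and the quadruple loop is meant only for small arrays, not as an efficient implementation.

```python
import math

def alpha(k: int, size: int) -> float:
    """Orthonormal DCT-II scale factor for index k on an axis of length size."""
    return math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size)

def dct2(image):
    """Direct 2-D DCT-II of an M x N list of lists (assumed orthonormal form)."""
    M, N = len(image), len(image[0])
    coeffs = [[0.0] * N for _ in range(M)]
    for u in range(M):
        for v in range(N):
            total = 0.0
            for i in range(M):
                for j in range(N):
                    total += (image[i][j]
                              * math.cos((2 * i + 1) * u * math.pi / (2 * M))
                              * math.cos((2 * j + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = alpha(u, M) * alpha(v, N) * total
    return coeffs

# A perfectly flat 4 x 4 image: all energy should land in the DC coefficient.
flat = [[100] * 4 for _ in range(4)]
coeffs = dct2(flat)
```

For this flat image only the top left (DC) coefficient is non-zero, which matches the description of the smooth image content being concentrated in the lowest frequencies. Under the assumed scaling the DC value is the mean intensity multiplied by the square root of M×N.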

The DCT arranges the frequency content of an image in order of frequency, from low to high. The first coefficient, also called the DC coefficient, represents the average intensity of the image. The remaining coefficients are called AC coefficients. The low frequency components are concentrated in the upper left corner of the DCT coefficient array, while the high frequency components are placed at the bottom right of the array. The medium frequency components lie in the mid-range of the DCT coefficient array. The DCT coefficient matrix obtained is converted into a vector Cz of size L = N×M using the zigzag pattern in Eq. (4).

The vector Cz is further divided into three sub-vectors: the low frequency coefficients Lc, the medium frequency coefficients Mc and the high frequency coefficients Hc, using Eq. (5), Eq. (6) and Eq. (7), respectively.
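The zigzag scan and the three-way split can be sketched as follows. This is an assumption-laden illustration: it uses a JPEG-style zigzag over the whole M×N coefficient array and the 25% / 50% / 25% split reported in the experiments of Section 3, and the helper names `zigzag_indices`, `zigzag` and `segment` are hypothetical.

```python
def zigzag_indices(M, N):
    """Return (row, col) pairs of an M x N array in JPEG-style zigzag order."""
    order = []
    for s in range(M + N - 1):  # s = row + col identifies one anti-diagonal
        diagonal = [(i, s - i) for i in range(M) if 0 <= s - i < N]
        # Odd diagonals are walked with the row index increasing,
        # even diagonals in the opposite direction.
        order.extend(diagonal if s % 2 else reversed(diagonal))
    return order

def zigzag(matrix):
    """Flatten a coefficient matrix into the vector Cz."""
    M, N = len(matrix), len(matrix[0])
    return [matrix[i][j] for i, j in zigzag_indices(M, N)]

def segment(cz, low_frac=0.25, high_frac=0.25):
    """Split Cz into low, medium and high frequency sub-vectors (Lc, Mc, Hc)."""
    L = len(cz)
    low_end = int(L * low_frac)
    high_start = L - int(L * high_frac)
    return cz[:low_end], cz[low_end:high_start], cz[high_start:]

# Toy 4 x 4 "coefficient" matrix holding 0..15 in row-major order.
toy = [[4 * r + c for c in range(4)] for r in range(4)]
cz = zigzag(toy)
lc, mc, hc = segment(cz)
```

With the default fractions, the first quarter of the zigzag vector is treated as low frequency, the last quarter as high frequency, and the central half is the medium frequency vector available for embedding.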

The hiding capacity of the proposed method depends on the number of coefficients selected for data hiding, while the quality of the resulting image depends on the redundancy in the least significant bits of the selected coefficients. Using more DCT components creates more distortion and hence significant artifacts in the stego images that attract an intruder’s attention. For a good steganography technique, the quality of the stego image should be higher than 30 dB in terms of PSNR. On the other hand, using fewer medium frequency DCT coefficients decreases the hiding capacity. The tradeoff between hiding capacity and image quality was investigated for the proposed technique, and it was found that using 50% of the DCT coefficients as medium frequency components is the best choice, as the quality of the stego image degrades significantly above 50%.

2.2 LSB substitution

The medium frequency components vector obtained in the segmentation step is used for data hiding. The coefficients of the medium frequency components vector, i.e., Mc, are subjected to LSB substitution one by one. The n least significant bits of each coefficient are replaced with n message bits. The number of substituted LSBs, i.e., n, determines the hiding efficiency and the distortion created in the stego image: the greater the number of bits n, the higher the hiding capacity and the greater the distortion. The modified medium frequency coefficient vector Mcm is thus obtained. The substitution operation is expressed as

$$M_{cm}(k)=M_{c}(k)\circ m(k) \qquad (8)$$

where ∘ represents the substitution operation and m(k) represents the message bits substituted in the kth medium frequency coefficient.
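The substitution operation can be illustrated with a short sketch. It assumes that the DCT coefficients have been rounded to integers before embedding and relies on Python's two's complement semantics for bitwise operations on negative integers; both are assumptions of this example rather than details stated in the text, and the function names are hypothetical.

```python
def embed_bits(coefficient: int, bits: int, n: int) -> int:
    """Replace the n least significant bits of an integer coefficient with bits."""
    mask = (1 << n) - 1
    return (coefficient & ~mask) | (bits & mask)

def embed_message(mc, message_bits, n):
    """Substitute n message bits into each coefficient of mc, in order.

    message_bits is a list of 0/1 values. Coefficients beyond the end of the
    message are returned unchanged; a trailing group of fewer than n bits is
    not embedded.
    """
    mcm = list(mc)
    for k in range(min(len(mc), len(message_bits) // n)):
        chunk = message_bits[k * n:(k + 1) * n]
        value = int("".join(str(b) for b in chunk), 2)
        mcm[k] = embed_bits(mc[k], value, n)
    return mcm

# Embed six message bits, three per coefficient, into integer coefficients.
mc = [52, -7, 13]
mcm = embed_message(mc, [1, 0, 1, 0, 1, 1], n=3)
```

Here the first coefficient changes from 52 to 53 and the second from -7 to -5, while the third is untouched because the message has been exhausted.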

2.3 Inverse transformation

The modified medium frequency vector obtained in the LSB substitution step is combined with the unaltered low and high frequency coefficient vectors, in their respective order, to get the final stego vector Sv as given in Eq. (9).

The stego vector is then converted into a stego matrix of DCT coefficients S of size M×N using the inverse zigzag process.
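A minimal sketch of these two steps is given below. It assumes that the stego vector is the plain concatenation of the low, modified medium and high frequency vectors and that the zigzag order is the JPEG-style one; the function names are hypothetical.

```python
def zigzag_indices(M, N):
    """Return (row, col) pairs of an M x N array in JPEG-style zigzag order."""
    order = []
    for s in range(M + N - 1):
        diagonal = [(i, s - i) for i in range(M) if 0 <= s - i < N]
        order.extend(diagonal if s % 2 else reversed(diagonal))
    return order

def inverse_zigzag(vector, M, N):
    """Place the entries of a zigzag-ordered vector back into an M x N matrix."""
    matrix = [[0] * N for _ in range(M)]
    for value, (i, j) in zip(vector, zigzag_indices(M, N)):
        matrix[i][j] = value
    return matrix

# Toy sub-vectors for a 3 x 3 coefficient array.
lc, mcm, hc = [10, 11], [12, 13, 14, 15], [16, 17, 18]
sv = lc + mcm + hc             # stego vector: low, modified medium, high
s = inverse_zigzag(sv, 3, 3)   # stego coefficient matrix S
```

Because the low and high frequency vectors are concatenated unchanged, only the positions covered by the medium frequency vector differ from the original coefficient matrix.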

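To get the final stego image, the modified DCT coefficient matrix S is transformed back to the spatial domain by applying the inverse discrete cosine transform (IDCT). In its standard orthonormal form, the IDCT is given mathematically as [Cintra and Bayer (2011); Kim, Lee, Lee et al. (2018)]:

$$SI(i,j)=\sum_{u=0}^{M-1}\sum_{v=0}^{N-1}\alpha(u)\,\alpha(v)\,S(u,v)\cos\left[\frac{(2i+1)u\pi}{2M}\right]\cos\left[\frac{(2j+1)v\pi}{2N}\right]$$

where

$$\alpha(u)=\begin{cases}\sqrt{1/M}, & u=0\\ \sqrt{2/M}, & u=1,\dots,M-1\end{cases}$$

and

$$\alpha(v)=\begin{cases}\sqrt{1/N}, & v=0\\ \sqrt{2/N}, & v=1,\dots,N-1\end{cases}$$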

The final stego image SI with the embedded secret message is of the same size as the cover image CI.
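The inverse transform can be sketched in the same direct style as the forward transform. The orthonormal scaling and the function names `alpha` and `idct2` are assumptions of this example, and the result is left as floating point values; how the stego pixels are rounded or clipped to 8 bits is not specified here.

```python
import math

def alpha(k: int, size: int) -> float:
    """Orthonormal scale factor for index k on an axis of length size."""
    return math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size)

def idct2(coeffs):
    """Direct 2-D inverse DCT of an M x N coefficient matrix (assumed orthonormal)."""
    M, N = len(coeffs), len(coeffs[0])
    image = [[0.0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):
            total = 0.0
            for u in range(M):
                for v in range(N):
                    total += (alpha(u, M) * alpha(v, N) * coeffs[u][v]
                              * math.cos((2 * i + 1) * u * math.pi / (2 * M))
                              * math.cos((2 * j + 1) * v * math.pi / (2 * N)))
            image[i][j] = total
    return image

# A coefficient matrix with only a DC term maps back to a flat image.
s = [[0.0] * 4 for _ in range(4)]
s[0][0] = 400.0
stego = idct2(s)
```

The output has the same M×N shape as the input, consistent with the stego image being the same size as the cover image.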

Fig. 1 shows the implementation of the proposed data hiding technique. The processes of de-correlating the cover image into DCT coefficients, extracting the medium frequency components, hiding the secret message in these components and obtaining the resultant stego image are illustrated with a block diagram.

Figure 1: The process of hiding a secret message in the medium frequency components of the cover image

2.4 Message retrieval

After a secret message is hidden in the least significant bits of the medium frequency components, the stego image can be stored or transmitted to deliver the hidden information to the intended receiver. On receiving the stego image, the receiver retrieves the hidden message by following the retrieval process.

To retrieve the hidden message, the stego image is transformed to DCT coefficients using the DCT-II. As the secret message is hidden in the LSBs of the medium frequency components, the medium frequency components are extracted from the array of coefficients. The hidden message is obtained by reading the LSBs of the medium frequency components. The process of message retrieval is illustrated by the block diagram shown in Fig. 2.
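The final retrieval step, reading the n least significant bits of each medium frequency coefficient, can be sketched as follows. The example assumes that the receiver's DCT reproduces the embedded integer coefficients exactly and that n is known to the receiver; the function names are hypothetical.

```python
def extract_bits(coefficient: int, n: int):
    """Return the n least significant bits of a coefficient, most significant first."""
    value = coefficient & ((1 << n) - 1)
    return [(value >> shift) & 1 for shift in range(n - 1, -1, -1)]

def extract_message(mc_stego, n):
    """Concatenate the n LSBs of every medium frequency coefficient."""
    bits = []
    for coefficient in mc_stego:
        bits.extend(extract_bits(coefficient, n))
    return bits

# Medium frequency coefficients of a stego image carrying three bits each.
recovered = extract_message([53, -5], n=3)
```

The recovered list is the bit sequence 1, 0, 1, 0, 1, 1, i.e., three bits from each of the two coefficients.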

Figure 2: Retrieval of secret message

2.5 Hiding capacity

The hiding capacity is the ratio of the number of bits hidden to the number of bits of the cover image, and is expressed mathematically as in Eq. (15).

$$hc=\frac{\text{number of bits hidden}}{\text{number of bits of the cover image}}\times 100\% \qquad (15)$$

Let n be the number of bits substituted in a single medium frequency coefficient and Nmf be the number of medium frequency coefficients selected for LSB substitution. As each DCT coefficient is represented with 16 bits, the hiding capacity in the transform domain, hct, can be calculated as in Eq. (16).

$$hc_{t}=\frac{n\,N_{mf}}{16\,MN}\times 100\% \qquad (16)$$

In the spatial domain, the gray-scale image has a bit depth of 8 bits. So, the spatial domain hiding capacity hcs is given mathematically by Eq. (17).

$$hc_{s}=\frac{n\,N_{mf}}{8\,MN}\times 100\% \qquad (17)$$

The spatial domain hiding capacity is the effective hiding capacity and is directly proportional to the number of bits n hidden in each medium frequency coefficient and to the number of medium frequency coefficients Nmf.
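As a worked example of these definitions, the sketch below computes the capacity as n·Nmf divided by the total number of bits of the M×N image, using 16 bits per coefficient in the transform domain and 8 bits per pixel in the spatial domain. The 512×512 image size and the function name are hypothetical choices for illustration; the 50% medium frequency share follows the text.

```python
def hiding_capacity(n, n_mf, M, N, bits_per_sample):
    """Hiding capacity in percent: hidden bits over total bits of the image."""
    return 100.0 * n * n_mf / (bits_per_sample * M * N)

M = N = 512                  # hypothetical cover image size
n_mf = (M * N) // 2          # 50% of the coefficients are medium frequency
n = 14                       # bits substituted per medium frequency coefficient

hc_t = hiding_capacity(n, n_mf, M, N, bits_per_sample=16)  # transform domain
hc_s = hiding_capacity(n, n_mf, M, N, bits_per_sample=8)   # spatial domain
bpp = n * n_mf / (M * N)     # hidden bits per pixel
```

With n = 14 the spatial domain capacity evaluates to 87.5%, or 7 bits per pixel, and the transform domain capacity to 43.75%. Because Nmf is a fixed fraction of M×N, these percentages do not depend on the image size chosen.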

2.6 Evaluation measures

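The quality of the stego image with respect to the original cover image can be measured using different quality parameters, for example the mean square error (MSE), the peak signal to noise ratio (PSNR) and the mean structural similarity index (MSSIM) [Hore and Ziou (2010); Amirtharajan and Rayappan (2012)]. The MSE measures the difference between the original image and the stego image. An MSE of zero means that there is no difference and the two images are identical. From the perspective of steganography, the MSE should be as small as possible. The PSNR, expressed in dB, measures the closeness of the stego image to the original cover image: the higher the PSNR, the closer the images are. However, it is worth mentioning that the PSNR is an intensity comparison and does not provide any structural information. Therefore, the MSSIM is used to take structural information into account when comparing the stego image with the cover image. In their standard forms for an 8-bit image, the MSE, PSNR and MSSIM are mathematically expressed as:

$$\mathrm{MSE}=\frac{1}{MN}\sum_{i=0}^{M-1}\sum_{j=0}^{N-1}\left[CI(i,j)-SI(i,j)\right]^{2}$$

$$\mathrm{PSNR}=10\log_{10}\left(\frac{255^{2}}{\mathrm{MSE}}\right)$$

$$\mathrm{MSSIM}=\frac{1}{W}\sum_{w=1}^{W}\frac{(2\mu_{c}\mu_{s}+C_{1})(2\sigma_{sc}+C_{2})}{(\mu_{c}^{2}+\mu_{s}^{2}+C_{1})(\sigma_{c}^{2}+\sigma_{s}^{2}+C_{2})}$$

where µc and µs are the means of the cover and stego images, σc² and σs² are their variances, and σsc is the covariance of the cover and stego images, each computed over the w-th of W local windows, and C1 and C2 are constants that stabilize the division when the denominator is weak.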
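The MSE and PSNR described above can be computed with a few lines of Python. This sketch assumes 8-bit gray-scale images stored as lists of lists and a peak value of 255; the function names are hypothetical, and the MSSIM is omitted because its window and constant choices are not given in the text.

```python
import math

def mse(cover, stego):
    """Mean square error between two equally sized gray-scale images."""
    M, N = len(cover), len(cover[0])
    return sum((cover[i][j] - stego[i][j]) ** 2
               for i in range(M) for j in range(N)) / (M * N)

def psnr(cover, stego, peak=255):
    """Peak signal to noise ratio in dB; infinite for identical images."""
    error = mse(cover, stego)
    return math.inf if error == 0 else 10 * math.log10(peak ** 2 / error)

cover = [[100, 100], [100, 100]]
stego = [[101, 100], [100, 99]]
error = mse(cover, stego)      # two pixels differ by 1 out of four pixels
quality = psnr(cover, stego)   # in dB
```

In this toy case the MSE is 0.5 and the PSNR is a little over 51 dB, comfortably above the 30 dB threshold used in the paper.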

3 Experimental results

The proposed method is tested using different cover images taken from the USC-SIPI image database [Wu, Han, Niu et al. (2018)] and other images taken using different cell phones. The results obtained for one cover image from the USC-SIPI image database, i.e., House, shown in Fig. 3(a), are presented here. The image shown in Fig. 3(b) is used as the secret message. For complete message hiding, the message image is resized according to the hiding capacity. The image is first converted into a gray-scale image. The cover image data is de-correlated using the DCT and the coefficients are segmented into low, medium and high frequency components as explained in Section 2. In these experiments, the first 25% of the coefficients are labeled as low frequency coefficients and the last 25% as high frequency coefficients. The 50% of the coefficients in the middle are declared as medium frequency components.

Figure 3: Images used, (a) Cover image, (b) The secret message

The low and high frequency coefficients are not subjected to data hiding, because hiding messages in the low frequency coefficients creates significant distortion, while hiding messages in the high frequency coefficients affects the texture of the image. To check experimentally the effect of hiding in the low and high frequency components, secret messages are hidden in these coefficients separately. Only 3 least significant bits are used for embedding the secret messages. The stego images obtained are shown in Fig. 4. It can be clearly observed that the resultant stego images are significantly distorted.

Figure 4: Stego images of (a) 3 bits substitution in 25% lower frequency coefficients, (b) 3 bits substitution in 25% higher frequency coefficients

To analyse experimentally the effect of data hiding in the medium frequency coefficients, these coefficients are subjected to the data hiding process, i.e., LSB substitution. Different numbers of least significant bit substitutions, ranging from 1 bit to 15 bits, have been used in each of the medium frequency coefficients. The hiding capacity, MSE, PSNR and MSSIM are calculated for each case. The resultant stego images for 1 to 15 bits substitution are shown in Fig. 5. The results show that when up to 12 bits of data are hidden in each medium frequency DCT coefficient of the cover image, i.e., Fig. 5(a-l), the quality of the stego image is quite good. However, hiding more than 12 bits of data per coefficient, i.e., Fig. 5(m, n, o), results in visually significant distortion.

Figure 5: Stego images of (a) 1 bit substitution, (b) 2 bits substitution, (c) 3 bits substitution, (d) 4 bits substitution, (e) 5 bits substitution, (f) 6 bits substitution, (g) 7 bits substitution, (h) 8 bits substitution, (i) 9 bits substitution, (j) 10 bits substitution, (k) 11 bits substitution, (l) 12 bits substitution, (m) 13 bits substitution, (n) 14 bits substitution, (o) 15 bits substitution

Figure 6: Retrieved secret message

In each experiment, the secret message image, Fig. 3(b), is resized to match the hiding capacity. In this way the message is completely hidden in the cover image. In practice, however, the situation is different and the message size may be larger or smaller than the hiding capacity. For a large message, the secret message is hidden in more than one cover image, while for a small message some coefficients are left unaltered. The proposed data hiding algorithm is reversible, and the hidden message is recovered in full. The secret message recovered from a stego image with 8 bits hiding is shown in Fig. 6.

The resultant hiding capacity, MSE, PSNR and MSSIM for different numbers of LSB substitutions are listed in Tab. 1. The results in Tab. 1 reveal that the hiding capacity and MSE increase with the number of bits used for data hiding, while the PSNR and MSSIM decrease as the number of hiding bits per coefficient increases. This is expected, because increasing the number of bits used for data hiding modifies the medium frequency coefficients of the cover image more, which accommodates a larger number of secret message bits but also contributes to the distortion of the resultant stego image.

Steganalysis tries to detect and/or retrieve the hidden information. It is very important for a steganography technique to resist steganalysis attacks. The results of different steganalysis attacks on the proposed technique are presented in this section. Some well-known images, such as Lena, Mandrill, Tiffany, Peppers, Jellybeans and others, from the USC-SIPI image database were used [Wu, Han, Niu et al. (2018)]. The images were first converted to gray-scale images in different formats, e.g., JPEG, PNG and BMP, as different steganalysis techniques work only on particular formats. These images were subjected to data hiding using the proposed technique. The stego images are generated for 1 bit, 2 bits, 3 bits and so on up to 16 bits data hiding in the medium frequency components. All these images are tested for steganography using the StegExpose [Boehm (2014)] and StegSecret [Muñoz (2015)] tools.

StegExpose is a steganalysis tool for detecting LSB steganography. It analyses digital images in bulk and generates a comprehensive report for steganalysis experts. StegExpose not only detects steganography but also estimates the length of the hidden message [Boehm (2014)]. First, StegExpose is applied to a group of 391 images, including 17 cover images and 17 stego images for each of the 1 bit, 2 bits, 3 bits and so on up to 16 bits data hiding cases. A steganalysis report is generated in the form of a .CSV file. The report shows that no steganography is detected in the stego images of 1 bit up to 14 bits data hiding. Only a few images were detected for 15 bits and 16 bits hiding. However, 15 bits and 16 bits substitution also creates significant visible distortion, as shown in the previous section, so these two settings are not used for data hiding; their stego images are included only to test the strength of the proposed method. The results show that, even for these two settings, steganography is not detected in all of the stego images. The negative results may be due to the fact that the proposed technique is based on DCT coefficients and is not a spatial domain LSB steganography technique.

Table 1: Hiding capacity, PSNR, MSSIM and MSE for different number of hiding bits using 50% DCT coefficients

To further test the strength of the proposed technique against steganalysis, another tool called StegSecret was used. StegSecret is a Java-based, open source (GNU/GPL) steganalysis project used to detect hidden information in digital media. StegSecret is capable of detecting EOF, LSB, DCT and other techniques [Muñoz (2015)]. The tool was applied to the set of images, including cover and stego images in JPEG and PNG formats; these experiments also gave negative results, and steganography was not detected in a single case. The result is shown in Fig. 7. The result shows that the proposed technique is resistant not only to spatial domain steganalysis techniques but also to DCT domain techniques.

Figure 7: Steganalysis results of StegSecret

4 Comparison with other techniques

The main purpose of steganography is to ensure the secrecy of the secret information, and the selection of a suitable steganography technique plays a key role in achieving this goal. The steganography technique should be able to hide a large amount of information in the cover file without creating any visible artifacts perceptible by an intruder. The overall performance of a steganography technique is measured by its hiding capacity and the quality of the resultant stego image.

The proposed data hiding technique is compared with state-of-the-art techniques. The hiding capacity and stego image quality of the proposed technique are compared with those of previously reported steganography techniques, and the results are shown in Tab. 2. From these results, it is evident that the hiding capacity of the proposed technique is higher than that of all these techniques. The proposed technique and the state-of-the-art techniques are applied to a set of images from the USC-SIPI image database [Wu, Han, Niu et al. (2018)] and average values are reported for comparison. The experimental results show that the technique of [Alam, Zakariya and Akhtar (2014)] gives the best quality stego images, with a PSNR of 51.098 dB, but its hiding capacity is very low, equal to 10.96% on average. The proposed algorithm, in contrast, yields different hiding capacities and stego image qualities depending on the number of LSBs used for substitution. The proposed technique can achieve a hiding capacity of 87.5% with a stego image PSNR of 36 dB, which is well above the required threshold of 30 dB.

Table 2: Hiding capacity, PSNR, MSSIM and MSE for different number of hiding bits using 50% DCT coefficients

5 Conclusion

The medium frequency coefficients based LSB image steganography technique proposed in this research uses the medium frequency coefficients of the DCT domain representation of the image for data hiding. The technique preserves both the edges and the smooth areas of images, as only the medium range DCT coefficients are subjected to LSB substitution. It provides a hiding capacity of 0.5 bpp to 7 bpp with reasonably high quality stego images, with PSNR ranging from 39.08 dB to 36.30 dB. The hiding capacity can be changed according to user needs by varying the number of bits used for LSB substitution or by changing the number of coefficients in the medium frequency range. The variations created in the stego image are visually insignificant and cannot be detected by the HVS. The proposed technique gives quite a high hiding capacity compared to other steganography techniques. Moreover, the medium frequency coefficients based LSB image steganography technique is immune to steganalysis and shows significant resistance to steganalysis methods such as Sample Pairs, RS Analysis, Chi Square Attack and Primary Sets attacks, as well as EOF, LSB and DCT detection techniques.