Performance Evaluation of Different Universal Steganalysis Techniques in JPG Files

Abstract. Steganalysis is the art of detecting the presence of hidden data in files. In the last few years, many steganalysis methods have been proposed, and how well each performs depends on the hiding method. This paper evaluates five universal steganalysis techniques: “Wavelet based steganalysis”, “Feature Based Steganalysis”, “Moments of characteristic function using wavelet decomposition based steganalysis”, “Empirical Transition Matrix in DCT Domain based steganalysis”, and “Statistical Moment using jpeg2D array and 2D characteristic function”. A large dataset of 1000 images is subjected to three steganographic techniques, “Outguess”, “F5” and “Model Based”, with embedding rates of 0.05, 0.1, and 0.2. The features used by each steganalysis technique are then extracted from the stego images as well as the cover images. Half of the images are used to train the classifier, a support vector machine with a linear kernel. The trained classifier is then used to test the other half of the images, and the results are reported. The “Empirical Transition Matrix in DCT Domain based steganalysis” achieves the highest values for all the properties measured, making it the first choice as a universal steganalysis technique.

A steganographic scheme succeeds only under two conditions: the two parties must have a reason for communicating, and their communication behaviour must not change.
Today's steganographic methods use images or audio files to hide data. The information that needs to be concealed is dispersed within the least significant bits of a carrier file, which serves as a hiding place. It is important that the carrier files do not lose their actual appearance during the embedding process.
All digital file formats can be used for Steganography, but the formats that are more suitable are those with a high degree of redundancy [2]. The redundant bits of an object are those bits that can be altered without the alteration being detected easily [3].
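The idea of hiding data in the least significant bits of a carrier can be illustrated with a minimal sketch. It assumes the carrier is a plain sequence of 8-bit samples and writes the message bits sequentially rather than dispersing them; the function names and sample values are hypothetical and do not correspond to any of the tools evaluated in this paper.

```python
def embed_lsb(carrier: bytes, bits: list[int]) -> bytes:
    """Write each message bit into the least significant bit of one carrier byte."""
    if len(bits) > len(carrier):
        raise ValueError("message is longer than the carrier capacity")
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | (bit & 1)  # clear the LSB, then set it to the message bit
    return bytes(out)


def extract_lsb(stego: bytes, n_bits: int) -> list[int]:
    """Read the message back from the first n_bits carrier bytes."""
    return [b & 1 for b in stego[:n_bits]]


cover = bytes([200, 201, 54, 55, 128, 129, 17, 16])  # hypothetical 8-bit samples
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_lsb(cover, message)

# Each sample changes by at most 1, which is why the carrier keeps its appearance.
max_change = max(abs(a - b) for a, b in zip(cover, stego))
```

Because only the lowest bit of each sample is touched, the redundant bits carry the message while the visible content is left essentially unchanged.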

Image Steganographic techniques
Due to the high degree of redundancy present in digital images (despite compression), there has been an increased interest in using digital images as cover-objects for the purpose of steganography. There has been much more work on embedding techniques which make use of the transform domain or more specifically JPEG images due to their wide applicability.
Image steganography techniques can be divided into two groups: those in the Image Domain and those in the Transform Domain [4].
The image (spatial) domain techniques embed messages in the intensity of the pixels directly, while for the transform (frequency) domain, images are first transformed and then the message is embedded in the image [5].
The image domain techniques encompass bit-wise methods that apply bit insertion and noise manipulation and are sometimes characterized as "simple systems" [6]. The image formats that are most suitable for image domain steganography are lossless. Steganography in the transform domain involves the manipulation of algorithms and image transforms [6]. These methods hide messages in more significant areas of the cover image, making them more robust [7]. Many transform domain methods are independent of the image format, and the embedded message may survive conversion between lossy and lossless compression.

JPEG Steganography (Transform Domain)
Originally it was thought that steganography could not be used with JPEG images, because they use lossy compression, which alters parts of the image data. One of the most important characteristics of steganography is that information is hidden in the redundant bits of an object, and since redundant bits are left out when using JPEG, it was feared that the hidden message would be destroyed.
One of the properties of JPEG is exploited to make the changes to the image invisible to the human eye. During the DCT transformation phase of the compression algorithm, rounding errors occur in the coefficient data that are not noticeable [8]. Although this property is what classifies the algorithm as lossy, it can also be used to hide messages. It is not feasible to embed information in the image before the lossy compression is applied, since the compression would destroy that information in the process; the message is therefore embedded in the DCT coefficients themselves.
Downloaded from the journal Annales AI-Informatica, http://ai.annales.umcs.pl. Date: 07/06/2022 06:41:28
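The rounding errors mentioned above can be made concrete with a small sketch. It assumes an orthonormal 8 x 8 DCT and a single quantisation step `q` for every coefficient, which is a simplification (real JPEG uses an 8 x 8 quantisation table); the block values and `q` are hypothetical.

```python
import math

N = 8  # JPEG block size


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct O(N^4) form, for clarity)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * s
    return out


# Hypothetical smooth 8 x 8 block of pixel values, level-shifted by 128.
block = [[(4 * x + 2 * y + 100) - 128 for y in range(N)] for x in range(N)]
q = 16  # one illustrative quantisation step for all coefficients (assumption)

coeffs = dct_2d(block)
quantised = [[round(coef / q) for coef in row] for row in coeffs]

# Rounding error introduced by quantisation, measured on the coefficient scale.
errors = [[coeffs[u][v] - q * quantised[u][v] for v in range(N)] for u in range(N)]
max_error = max(abs(e) for row in errors for e in row)
```

The error on every coefficient is bounded by half the quantisation step. The quantised values are integers, and it is these integers that the JPEG embedding techniques described next modify.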

Outguess
Outguess proposed by Provos [9] performs the embedding process in two steps. First, it identifies the redundant DCT coefficients that have minimal effect on the cover image, and then chooses bits in which it would embed the message.

F5
F5 [10] was proposed by Westfeld and embeds messages by modifying the DCT coefficients. The most important operation done by F5 is matrix embedding with the goal of minimizing the amount of changes made to the DCT coefficients.
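Matrix embedding can be sketched with the binary Hamming code, which hides k message bits in 2^k - 1 carrier bits while changing at most one of them. The following is a minimal illustration of that general principle for k = 3, not the F5 implementation: it works on a hypothetical list of coefficient LSBs and does not model how F5 actually modifies coefficient values.

```python
def syndrome(lsbs):
    """XOR of the 1-based positions whose bit is 1 (the Hamming parity-check product)."""
    s = 0
    for pos, bit in enumerate(lsbs, start=1):
        if bit:
            s ^= pos
    return s


def matrix_embed(lsbs, message):
    """Embed a k-bit message (given as an int) into 2**k - 1 bits, flipping at most one."""
    out = list(lsbs)
    flip = syndrome(out) ^ message  # 0 means the block already encodes the message
    if flip:
        out[flip - 1] ^= 1
    return out


cover_lsbs = [1, 0, 0, 1, 1, 0, 1]  # hypothetical LSBs of 7 DCT coefficients
message = 0b101                     # 3 message bits
stego_lsbs = matrix_embed(cover_lsbs, message)
changes = sum(a != b for a, b in zip(cover_lsbs, stego_lsbs))
```

The receiver recovers the message by computing `syndrome(stego_lsbs)`. Three bits are carried at the cost of at most one change, which is how matrix embedding reduces the number of modifications.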

Model Based Embedding Technique
The model-based technique, proposed by Sallee [11], tries to model the statistical properties of an image and to preserve them during the embedding process. Sallee breaks down the transformed image coefficients into two parts and replaces the perceptually insignificant component with the coded message bits.

Image Steganalysis
Steganalysis is the art and science of detecting messages hidden using steganography. The art of steganalysis plays a major role in the selection of the features or characteristics that a typical stego message might exhibit, while the science helps in reliably testing the selected features for the presence of hidden information [12].

Steganalysis Techniques
The Steganalysis Techniques are classified into two categories:

Specific Steganalysis
The specific detection consists of subjective and statistical methods. The subjective methods make use of human eyes to look for suspicious artifacts in the image.
The statistical methods perform mathematical analysis of the images to find the discrepancy between the original and stego images.

Universal Steganalysis
Universal steganalysis methods provide detection regardless of the steganographic technique used. They involve extracting image features and classifying the input image as either containing an embedded message or not. The following describes the five state-of-the-art universal steganalysis techniques that are used in this research to detect hidden data in JPEG files.

Wavelet based Steganalysis
Farid et al. [13] provide a steganalysis approach that detects hidden messages in images by using a wavelet-like decomposition to build higher order statistical models of natural images.
The decomposition is based on separable quadrature mirror filters (QMFs) [14,15,16]. This decomposition splits the frequency space into multiple scales and orientations. This is accomplished by applying separable lowpass and highpass filters along the image axes generating vertical, horizontal, diagonal and lowpass subbands. Subsequent scales are created by recursively filtering the lowpass subband.
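The multi-scale decomposition described above can be sketched as follows. As an assumption made for brevity, the Haar average/difference pair stands in for the separable QMF lowpass and highpass filters, so this shows the structure of the decomposition rather than the filters of [13]; the test image is hypothetical.

```python
def haar_level(image):
    """One decomposition level: returns (lowpass, horizontal, vertical, diagonal)."""
    rows, cols = len(image), len(image[0])
    low, horiz, vert, diag = [], [], [], []
    for r in range(0, rows - 1, 2):
        l_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols - 1, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            l_row.append((a + b + d + e) / 4)  # lowpass along both axes
            h_row.append((a + b - d - e) / 4)  # top row minus bottom row
            v_row.append((a - b + d - e) / 4)  # left column minus right column
            d_row.append((a - b - d + e) / 4)  # highpass along both axes
        low.append(l_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return low, horiz, vert, diag


def decompose(image, levels):
    """Create subsequent scales by recursively filtering the lowpass subband."""
    details = []
    low = image
    for _ in range(levels):
        low, h, v, d = haar_level(low)
        details.append((h, v, d))
    return low, details


# Hypothetical 8 x 8 image with a vertical edge between columns 2 and 3.
image = [[0] * 3 + [100] * 5 for _ in range(8)]
lowpass, details = decompose(image, levels=2)
```

For this image only the vertical subbands are non-zero, since the only intensity change is across columns. Statistics computed on such subbands at each scale would then serve as the steganalysis features.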

Feature Based Steganalysis
In her paper [17], Fridrich provides a steganalysis method which combines the concept of calibration with feature-based classification to devise a blind detector specific to JPEG images. By calculating the features directly in the JPEG domain rather than in the wavelet domain, the detection can be made more sensitive to a wider range of embedding algorithms, because the calibration process increases the features' sensitivity to the embedding modifications while suppressing image-to-image variations. Another advantage of calculating the features in the DCT domain is that it enables a more straightforward interpretation of the influence of individual features on detection, as well as an easier formulation of design principles leading to more secure steganography.

Moments of characteristic function using wavelet decomposition based steganalysis
Shi et al. [18] proposed a steganalysis technique in which the statistical moments of the characteristic functions of the image, of its prediction-error image, and of their discrete wavelet subbands are selected as features. The moments are taken from all of the wavelet subbands, including the low-low (LL) subbands.
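A moment of a characteristic function can be computed as in the sketch below. It assumes the commonly used definition in which the characteristic function is the discrete Fourier transform of the histogram and the n-th moment is the sum of f^n |H(f)| divided by the sum of |H(f)| over the first half of the spectrum; the function name, bin count and sample values are hypothetical, and the wavelet and prediction-error steps of [18] are not reproduced.

```python
import cmath


def cf_moments(samples, n_bins, orders=(1, 2, 3)):
    """Moments of the characteristic function of a histogram of integer samples."""
    hist = [0] * n_bins
    for s in samples:
        hist[s] += 1
    half = n_bins // 2
    freqs = [k / n_bins for k in range(1, half + 1)]  # zero frequency excluded
    mags = []
    for k in range(1, half + 1):
        h_k = sum(hist[x] * cmath.exp(-2j * cmath.pi * k * x / n_bins)
                  for x in range(n_bins))
        mags.append(abs(h_k))
    total = sum(mags)
    if total == 0:  # perfectly flat histogram: no energy outside frequency zero
        return [0.0 for _ in orders]
    return [sum(f ** n * m for f, m in zip(freqs, mags)) / total for n in orders]


peaked = cf_moments([3] * 4, n_bins=8)       # all samples fall in one bin
spread = cf_moments([2, 3, 3, 4], n_bins=8)  # same mean, smoother histogram
```

The smoother histogram concentrates the magnitude of its characteristic function at low frequencies, so its moments are smaller. This is the kind of change that makes such moments usable as features when embedding alters the histogram.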

Empirical Transition Matrix in the DCT Domain based steganalysis
Fu et al. [19] proposed a steganalysis technique in which Markov empirical transition matrices capture both the intra-block and the inter-block dependencies between the block-DCT coefficients of a JPEG image. Since hidden messages are sometimes independent of the cover data, the embedding process often decreases, to some extent, the dependencies existing in the original cover data. The proposed second-order statistics can capture such changes. To reduce the high dimensionality of the empirical transition matrices, a threshold technique is applied to generate efficient features. Some steganographic methods make great efforts to maintain the marginal histogram of the block-DCT coefficients (first-order statistics), or try to keep the histogram apparently unchanged by decreasing or increasing the DCT coefficient values by only one. This suggests that steganalysis schemes based only on first-order statistics are not sufficient, so the method employs higher-order statistics for steganalyzing JPEG steganography.
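The thresholded transition matrix can be illustrated with a minimal sketch. It assumes one common construction: differences between the absolute values of horizontally adjacent coefficients are clipped to [-T, T], and the transition probabilities between consecutive differences are estimated by counting. Only this single direction is shown; the inter-block matrices and the exact feature layout of [19] are not reproduced, and the coefficient values are hypothetical.

```python
def transition_matrix(coeffs, threshold):
    """Empirical transition matrix of the clipped horizontal difference array.

    coeffs: 2-D list of quantised DCT coefficients (integers).
    The result has (2 * threshold + 1) ** 2 entries whatever the coefficient range.
    """
    t = threshold
    size = 2 * t + 1
    counts = [[0] * size for _ in range(size)]
    for row in coeffs:
        diffs = [abs(row[i]) - abs(row[i + 1]) for i in range(len(row) - 1)]
        clipped = [max(-t, min(t, d)) for d in diffs]
        for cur, nxt in zip(clipped, clipped[1:]):
            counts[cur + t][nxt + t] += 1
    # Normalise each row to estimate P(next difference | current difference).
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


coeffs = [
    [5, 3, 3, 0, -1, 0, 0, 0],  # hypothetical quantised coefficients
    [4, -2, 1, 0, 0, 0, 0, 0],
]
T = 2
features = transition_matrix(coeffs, T)  # 5 x 5 matrix, i.e. 25 features
```

With the threshold, the feature dimension depends only on T, which is what keeps the matrices manageable as classifier input.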

Statistical Moment using jpeg2D array and 2D characteristic function
Chen et al. [20] developed a universal steganalysis method based on statistical moments derived from both the image 2-D array and the JPEG 2-D array. In addition to the first-order histograms, the second-order histograms are considered. Consequently, the moments of the 2-D characteristic functions are also utilized for steganalysis.

Classifiers
There are a number of detectors for steganography. Some return a binary decision (something embedded / nothing embedded). In most cases this decision is based on comparison with a predefined threshold. The reliability can be judged by the detector's error rate.
The feature vectors calculated by each universal steganalysis technique are used to train a classifier. In this study, the Support Vector Machine is used.

Support Vector Machine (SVM)
Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression [21]. An SVM is a classification and regression prediction tool that uses machine learning theory to maximize predictive accuracy while automatically avoiding over-fitting to the data. The formulation uses the Structural Risk Minimization (SRM) principle, which is superior to the traditional Empirical Risk Minimization (ERM) principle used by conventional neural networks. SVMs were developed to solve the classification problem [22]. A classification task usually involves training and testing data which consist of some data instances. Each instance in the training set contains one "target value" (class label) and several "attributes" (features). The goal of the SVM is to produce a model which predicts the target values of the data instances in the testing set, for which only the attributes are given.

Prepare a set of Images to act as cover images
The data set used is a collection of 1000 images with a quality factor of 80% and a size of 512 × 768 or 768 × 512.

Embed messages in the cover images to get the stego images
The DCT domain embedding techniques are very popular due to the fact that the DCT-based image format gives high compression and a small size image. JPEG is widely used in the public domain in addition to being the most common output format of digital cameras. Various steganographic embedding methods are proposed, with the purpose of minimizing the statistical artifacts introduced to the DCT coefficients.
Here, three embedding techniques (Outguess, F5 and Model Based) are used with embedding rates of 0.05, 0.1 and 0.2.

Get the Statistical properties of all images (cover and stego)
Each steganalysis technique is implemented in order to extract the statistical properties that it measures from every cover and stego image.

Choosing a classifier
Among the available classifiers, the Support Vector Machine is used in this research because it is more powerful; on the other hand, it requires more computation, especially if a nonlinear kernel is employed. To avoid high computational cost while still obtaining reasonable success, a linear SVM has been used. There are two classes: one for cover images and the other for stego images. LIBSVM [23], a library for SVMs, is used to train and apply the classifier. It contains classes that perform training and classification:
• The Model class encapsulates the SVM model. It has no constructor; its object is always created using the static member Train of the class Training.
• The Train method (a static member of the Training class) takes two objects, one from the Problem class and one from the Parameter class, and returns an object of the Model class which contains the trained SVM model.
• The class Prediction has a static method named Predict, which takes an object of the Model class containing the trained SVM model and the features of the image to be classified. The Predict method returns a value representing a cover or a stego image.
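The train-then-predict workflow described above can be sketched without the library. The example below is not LIBSVM and does not use its API: as a stand-in, it fits a linear SVM by stochastic subgradient descent on the regularised hinge loss. The function names, hyperparameters and two-dimensional feature vectors are hypothetical and chosen only to keep the sketch short.

```python
import random


def train_linear_svm(features, labels, epochs=50, eta=0.1, lam=0.01, seed=0):
    """Fit weights w and bias b; labels must be +1 (stego) or -1 (cover)."""
    rng = random.Random(seed)
    w = [0.0] * len(features[0])
    b = 0.0
    order = list(range(len(features)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = features[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            w = [wj * (1 - eta * lam) for wj in w]  # step on the L2 regulariser
            if margin < 1:                          # hinge loss is active
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b


def predict(w, b, x):
    """Return +1 (stego) or -1 (cover) for one feature vector."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1


# Hypothetical, well-separated 2-D feature vectors for the two classes.
stego = [[2, 1], [1, 2], [2, 2], [1, 1]]
cover = [[-2, -1], [-1, -2], [-2, -2], [-1, -1]]
train_x = stego + cover
train_y = [1] * len(stego) + [-1] * len(cover)

w, b = train_linear_svm(train_x, train_y)
predictions = [predict(w, b, x) for x in train_x]
```

In the experiments, the feature vectors would be those extracted by each steganalysis technique, with the training half of the images supplying `train_x` and `train_y`.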

Analyzing the results
Most universal steganalysis techniques return a binary decision (contains hidden data / does not contain hidden data). In most cases the decision is based on comparison with a predefined threshold.
In this research, after the classifier has been trained and an image has been tested, one of the following outcomes is obtained. Either the classifier is right, or it is wrong in one of two ways: it classifies the image as containing hidden data when it actually does not, or it classifies the image as free of hidden data when it actually contains some. This can be summarized in the contingency table.

False Positive Rate (FPR)
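The proportion of negative instances that were reported as being positive, where FP is the number of false positives and TN the number of true negatives, is:
\[ \mathrm{FPR} = \frac{FP}{FP + TN} \tag{1} \]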

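True Positive Rate (TPR) "Sensitivity"
This is equivalent to sensitivity, which measures the proportion of actual positives which are correctly identified, where TP is the number of true positives and FN the number of false negatives:
\[ \mathrm{TPR} = \frac{TP}{TP + FN} \tag{3} \]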

Accuracy
Accuracy is the degree of conformity of a measured or calculated quantity to its actual (true) value. Accuracy is closely related to precision.
Sensitivity and specificity are statistical measures of the performance of a binary classification test. The relationship between sensitivity and specificity, as well as the performance of the classifier, can be visualized and studied using the ROC curve.
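The measures discussed in this section follow directly from the four counts of the contingency table. The sketch below uses the standard definitions and treats a stego image as the positive class; the function name and the counts are hypothetical and are not results from this study.

```python
def rates(tp, tn, fp, fn):
    """Derive the evaluation measures from true/false positive/negative counts."""
    return {
        "sensitivity": tp / (tp + fn),               # true positive rate
        "specificity": tn / (tn + fp),               # equals 1 - false positive rate
        "false_positive_rate": fp / (fp + tn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),  # correct decisions over all tests
    }


# Hypothetical counts for 500 stego and 500 cover test images.
m = rates(tp=400, tn=450, fp=50, fn=100)
```

With these counts, 400 of the 500 stego images and 450 of the 500 cover images are classified correctly, so the accuracy is 850 correct decisions out of 1000.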

Experimental Results
Each experiment has been repeated 10 times and the average is taken. 1000 pictures are used, and hiding data in them at each embedding rate gives 2000 pictures per rate (1000 cover images and 1000 stego images). Half of these samples are used for training, and the other half for testing.
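The evaluation protocol can be sketched as below. It assumes that the half/half split is redrawn at random for each of the 10 repetitions, which the text does not state explicitly; `repeated_half_split` and `dummy_evaluate` are hypothetical names, and the placeholder evaluation stands in for training and scoring a classifier.

```python
import random


def repeated_half_split(n_items, evaluate, repeats=10, seed=0):
    """Average evaluate(train_idx, test_idx) over random half/half splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        idx = list(range(n_items))
        rng.shuffle(idx)
        half = n_items // 2
        scores.append(evaluate(idx[:half], idx[half:]))
    return sum(scores) / len(scores)


def dummy_evaluate(train_idx, test_idx):
    """Placeholder: a real run would train on train_idx and score on test_idx."""
    return len(test_idx) / (len(train_idx) + len(test_idx))


# 2000 images per embedding rate: 1000 covers and 1000 stego images.
average = repeated_half_split(2000, dummy_evaluate)
```

Averaging over repeated splits reduces the dependence of the reported figures on any one particular choice of training images.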

The Wavelet Based Steganalysis
Table 2 presents the data obtained when applying the wavelet based steganalysis to the different steganographic techniques used (Outguess, F5, and Model Based) with the embedding rates (0.05, 0.1, 0.2).

The Moments of characteristic function using the wavelet decomposition based steganalysis
Table 4 presents the data obtained when applying the moments of characteristic function using the wavelet decomposition based steganalysis to the different steganographic techniques used (Outguess, F5, and Model Based) with the embedding rates (0.05, 0.1, 0.2).
Table 4. The data of the moments of characteristic function using the wavelet decomposition based steganalysis.

Statistical Moment using the jpeg2D array and 2D characteristic function
Table 6 presents the data obtained when applying the Statistical Moment using the jpeg2D array and 2D characteristic function to the different steganographic techniques used (Outguess, F5, and Model Based) with the embedding rates (0.05, 0.1, 0.2).

Parameter calculation
The data recorded in Tables 2 to 6 describe the effects of the different steganalysis techniques on the different steganographic techniques in terms of "True Negative", "True Positive", "False Negative" and "False Positive". These counts are not by themselves an indication of performance, so they are converted into distinguishing properties such as "Sensitivity", "Accuracy", and "Specificity".

Sensitivity
As indicated before, sensitivity, or "True Positive Rate", measures the proportion of actual positives which are correctly identified.

Accuracy
Accuracy is the degree of conformity of a measured or calculated quantity to its actual (true) value. Accuracy is closely related to precision.

Specificity
It measures the proportion of negatives which are correctly identified; it is equal to 1 - False Positive Rate.

For the Outguess embedding technique, a great difference in sensitivity is found between the "Wavelet Based Steganalysis" and the "Moments of CF using Wavelet" on the one hand, and the "Feature based steganalysis", "Empirical Transition Matrix" and "2D array and 2D CF" on the other.
• At the low embedding rate "0.05", the "Feature based steganalysis" has higher sensitivity than the two others in the same group.
• At the embedding rate of "0.1", the "Empirical Transition Matrix" and "2D array and 2D CF" are more sensitive, and this continues to the embedding rate "0.2".
For the F5 embedding technique, a difference in sensitivity is found between the "Wavelet Based Steganalysis" and the "Moments of CF using Wavelet" on the one hand, and the "Feature based steganalysis", "Empirical Transition Matrix" and "2D array and 2D CF" on the other, at the embedding rate "0.1" and above.
• At the low embedding rate "0.05", the "Feature based steganalysis" has higher sensitivity than the two others in the same group.
• At the embedding rate of "0.1", the "Empirical Transition Matrix" sensitivity continues to increase with the embedding rate and is higher than that of the others. The "Feature based" sensitivity shows only a very small increase, and the "2D array and 2D CF" sensitivity increases with the embedding rate but does not reach that of the "Feature based".
For the Model Based steganographic embedding technique, a great difference in sensitivity is found between the "Empirical Transition Matrix" and the other steganalysis methods.
• As the embedding rate increases, the sensitivity of all steganalysis methods increases except for the "Wavelet based", which decreases until the embedding rate "0.1" is reached and then increases.
• The sensitivity of the "2D array and 2D CF" increases faster than that of the other methods, but still does not reach the sensitivity of the "Empirical Transition Matrix".

Accuracy
It measures the true indication of the classifier relative to the entire tested sample. When using Outguess as the hiding technique, a great difference in accuracy is found between the "Wavelet Based Steganalysis", "Moments of CF using Wavelet" and "Feature based steganalysis" on the one hand, and the "Empirical Transition Matrix" and "2D array and 2D CF" on the other.
• At the low embedding rate "0.05", the "Empirical Transition Matrix" has higher accuracy than the "2D array and 2D CF".
• With the increase in the embedding rate, the accuracy of all the steganalysis techniques increases.
• The "Empirical Transition Matrix" and the "2D array and 2D CF" appear to increase at the same rate.
• The "Feature based" accuracy increases very quickly, and at the "0.2" embedding rate it almost reaches that of the "Empirical Transition Matrix" and "2D array and 2D CF".
When using F5 as the hiding technique, the accuracy of all steganalysis techniques is very similar at the embedding rate "0.05", differing by about 2%.
• With the increase of the embedding rate, the accuracy of the steganalysis techniques behaves differently:
- "Wavelet based" shows a small increase, and "Moments of CF using wavelet" remains almost constant.
- The "Feature based" and "2D array and 2D CF" increase at the same rate until the embedding rate reaches "0.1"; then, at the embedding rate "0.2", the "Feature based" gains a higher increase in accuracy than the "2D array and 2D CF".
- The accuracy of the "Empirical transition matrix" increases at a high rate as the embedding rate increases.
When using MB as the hiding technique, the "Empirical Transition Matrix" has the highest accuracy of all the techniques.
• The "2D array and 2D CF" and "Feature Based" have almost the same accuracy at the embedding rate "0.05", but the rate of increase of the "2D array and 2D CF" is greater than that of the "Feature Based".
• The "Moment of CF using wavelet" has a low increasing rate at the embedding rate from "0.05" to "0.1", which is by "0.1" to "0.2".

Specificity
It measures the proportion of negatives which are correctly identified by the classifier.
Using Outguess as the hiding technique, the "Empirical Transition Matrix" and the "2D array and 2D CF" are the highest in specificity and show almost the same rate of increase.
The "Feature Based" is the worst in specificity at the low embedding rate, but its specificity increases significantly with the embedding rate.
Using F5 as the hiding technique:
• All the steganalysis techniques except the "Feature Based" have almost the same specificity at the low embedding rate "0.05".
• The "Moments of CF using wavelet" specificity decreases as the embedding rate increases up to "0.1", then increases with the embedding rate.
• The "Empirical Transition Matrix" and "2D array and 2D CF" specificity increases gradually with the embedding rate.
• The "Feature Based" has low specificity at a low embedding rate, but it increases significantly with the embedding rate. It reaches the "2D array and 2D CF" specificity at the embedding rate of "0.1".
Using MB as the hiding technique:
• The "Empirical Transition Matrix" has the highest specificity for the three embedding rates.
• The "2D array and 2D CF" specificity is less than that of the "Feature Based" at the embedding rate of "0.05", but it increases more than the "Feature Based" as the embedding rate increases.
• The "Moments of CF using wavelet" and "Wavelet based" specificities are very low relative to the other steganalysis techniques.
From the above observations, the "Empirical Transition Matrix in the DCT domain based Steganalysis" gives the best values compared with the other four methods. This result matches those of the experiments performed by Fu et al. [19], although a different image data set is used.
The "Empirical Transition Matrix in DCT domain based Steganalysis" is better than the others not only because it uses high order statistics, but also because its features are built on the dependencies among the quantized block-DCT coefficients. It uses the Markov empirical transition matrices to capture the intra-block and inter-block dependencies between the block-DCT coefficients.
The "Feature Based Steganalysis" comes second for F5 because the co-occurrence matrix is used to capture the dependency between the DCT coefficient pairs from neighbouring blocks. This method is successful only to some extent, because it takes into consideration only the inter-block dependency; the intra-block dependency is not considered.
The "Statistical Moments Based Universal Steganalysis Using JPEG 2-D Array and 2-D characteristic function" comes second for Outguess and MB because it uses the Discrete Wavelet Transform (DWT) subbands and the characteristic function of each of these subbands. It also employs the second-order histogram. This success with Outguess and MB depends on the many factors calculated by this method, and also on the methodology of Outguess and MB, which attempt to make less change in the image histogram. It does not give a good result with F5, however, because F5 tries to keep the histogram unchanged by decreasing or increasing the coefficient value by one.