CONTENT BASED IMAGE RETRIEVAL METHOD WITH DISCRETE COSINE FEATURE EXTRACTION IN NATURAL IMAGES

Data and information are now presented not only as text but also as images, which require more storage. Most digital images use the JPEG format, and the Discrete Cosine Transform (DCT) is the heart of that format. Using all DCT coefficients for indexing and image retrieval slows retrieval, because more coefficients are processed than in the DC coefficient method, which uses only 1/64 of the DCT coefficients (1 DC coefficient per block). In this research, we perform Content Based Image Retrieval with DC feature extraction on 15,000 natural images and then calculate the distance between images using the Manhattan Distance method. The final precision and recall calculation shows a value of 0.6624, and retrieval takes less than 2 seconds, with a maximum of 1.876 seconds.


INTRODUCTION
Data and information are currently presented not only as text but also in other forms, such as images. Still images and moving images require much more storage than text. Most of the images circulating in the digital world and held on storage media use the JPEG format. One advantage of the JPEG format is that its files are smaller than those of other image formats. More than 95% of images on the web are compressed in JPEG format, and the Discrete Cosine Transform (DCT) is the heart of JPEG images. DCT-based feature extraction remains a promising method for directly processing and retrieving compressed images [1].
A JPEG image is divided into a grid of 8 x 8 pixel blocks, and the image is indexed on the basis of these blocks so as to form a flat plane. Each 8 x 8 block contains 64 pixels, each with its own value, so transforming a block yields 64 coefficients. The first coefficient, located at the upper left of the block, is called the DC (direct current) coefficient, and the remaining 63 coefficients are called AC coefficients [2].
Previous research used the Content-Based Image Retrieval (CBIR) method to search for matching images by comparing RGB color histograms. A color histogram is created by taking the RGB colors of the image and converting them to HSV values. After conversion, the colors are quantized to 112 colors and the contribution of each color is calculated; each contribution is then divided by the total number of pixels. The color histograms of two digital images are compared by subtracting the histogram of the database image from that of the query image, color by color. The lowest difference value is the best result. That image-matching application implemented all of these calculations in Visual Studio 2008 [13], whereas our study uses Python 3.
Research conducted in recent years used all DCT coefficients for indexing and image retrieval. This makes retrieval slower, because more coefficients are processed than in the DC coefficient method, which uses only 1/64 of the DCT coefficients (1 DC coefficient per block) [3]. One study used the DC coefficient (out of the 1 DC coefficient and 63 AC coefficients in each block) to match 100 query images against a database of 500 images. It reported that JPEG images were much smaller without reducing the information displayed, and that effectiveness remained fairly high (around 0.65) [4].
Based on this description, this study applies the Content Based Image Retrieval method to 15,000 natural images with a resolution of 256x256 pixels, using only the DC coefficients among the DCT coefficients. Similarity is then calculated with Manhattan Distance.
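As a worked count for the 256x256 resolution stated above, and assuming a single image channel partitioned into non-overlapping 8 x 8 blocks (the channel handling is an assumption, since the text does not state it), the size of the DC-only feature compared with the full set of DCT coefficients is:

```latex
\[
\frac{256}{8} \times \frac{256}{8} = 32 \times 32 = 1024 \ \text{blocks}
\]
\[
\text{all DCT coefficients: } 1024 \times 64 = 65{,}536,
\qquad
\text{DC coefficients only: } 1024 \times 1 = 1024,
\qquad
\frac{1024}{65{,}536} = \frac{1}{64}
\]
```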

Image
An image is a matrix in which the row and column indexes identify a point in the image and the matrix elements (referred to as image elements, or pixels) represent the gray level at that point [5]. Image compression is an image processing operation that involves many methods. Both its input data and its output information take the form of images. Image compression is divided into two types, lossy compression and lossless compression. Lossy compression is usually used for images that do not require high accuracy, such as landscape photos, while lossless compression is usually used for images that require high accuracy, such as images used for medical purposes.

Content Based Image Retrieval (CBIR)
Content based image retrieval (CBIR) is a process for obtaining a number of images based on one input image. The term was first proposed by Kato in 1992 [6]. Image retrieval, or image querying, is an image processing application that helps users quickly retrieve or search for an image in an image database based on user queries or requests [7].
The initial stage in a content-based image retrieval system is to perform feature extraction and description on the images in the database, producing a feature vector for each. The same extraction and description process is then carried out on the query image entered by the user. Next, a similarity comparison is carried out between the query image and the images in the database. The similarity distances between the query image and the database images are sorted and displayed as output [8].
CBIR aims to retrieve images in databases that have visually similar content or features. In one study, HID was tested and validated as a feature extraction algorithm for this purpose. The HID feature extraction algorithm was combined with the DCT feature to increase the precision of the CBIR system. Manhattan and Euclidean distances were considered as distance metrics for the feature matching process, and the system was found to perform best when Manhattan distance was used [12]. In another study, extensive experiments showed that the proposed technique achieves competitive performance compared with existing DCT-based methods. That method has a significant advantage over pixel-domain methods because it requires only partial decompression, and its content descriptor is also suitable for real-time implementation and application [11].
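The retrieval stages described in this section (compare the query feature with every database feature, sort by distance, return the closest images) can be sketched as follows. This is a minimal illustration, not the implementation of any cited study: the function name `rank_database`, the dictionary layout, and the toy one-number features are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple, TypeVar

Feature = TypeVar("Feature")


def rank_database(
    query_feature: Feature,
    database: Dict[str, Feature],
    distance: Callable[[Feature, Feature], float],
    top_k: int = 20,
) -> List[Tuple[str, float]]:
    """Return up to top_k (image_id, distance) pairs, smallest distance first."""
    scored = [
        (image_id, distance(query_feature, feature))
        for image_id, feature in database.items()
    ]
    # A smaller distance means a more similar image, so sort ascending.
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]


# Toy example (hypothetical data): each feature is one number, distance is |a - b|.
toy_database = {"img_a": 10.0, "img_b": 50.0, "img_c": 12.0}
ranking = rank_database(10.0, toy_database, lambda a, b: abs(a - b), top_k=2)
print(ranking)  # [('img_a', 0.0), ('img_c', 2.0)]
```

Passing the distance as a parameter keeps the ranking step independent of the metric, so Manhattan and Euclidean distances can be swapped without changing the rest of the pipeline.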

Calculation of the distance between two images
Distance is a commonly used approach to image search. Its function is to determine the similarity or dissimilarity of two feature vectors. The level of similarity is expressed by a score or ranking. The smaller the value, the more similar the two vectors are [9]. One method of measuring the distance between two images is the Manhattan Distance.

Manhattan Distance
Manhattan distance is a formula for calculating the distance between two points as the sum of the absolute differences of their coordinates. For two points $(x_1, y_1)$ and $(x_2, y_2)$, the distance is $|x_2 - x_1| + |y_2 - y_1|$ [10]. The general formulation of Manhattan Distance can be written as follows:
$$d(i, j) = \sum_{k=1}^{n} |x_{ik} - x_{jk}|$$
Where:
n : Data Dimension
|…| : Absolute Value
i : Testing Data
j : Training Data
$x_{ik}$, $x_{jk}$ : the k-th feature values of i and j
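A minimal sketch of this calculation in Python, assuming each image is already represented as a feature vector of equal length (the function name `manhattan_distance` is chosen here for illustration):

```python
def manhattan_distance(u, v):
    """Sum of absolute differences between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(u, v))


# Two-dimensional example: |4 - 1| + |5 - 1| = 3 + 4 = 7
print(manhattan_distance([1, 1], [4, 5]))  # 7
```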

Precision and Recall
Recall is the ratio of the number of relevant documents retrieved for a given query to the total number of documents in the collection that are relevant to the query. Precision is the ratio of the number of retrieved documents that are relevant to the query to the total number of documents retrieved. Precision can be interpreted as the exactness or match between the request for information and the answer to that request [11] (Kurniawan 2010). If someone searches a system for information and the system offers some documents, this exactness is in fact also a matter of relevance. That is, how precise or suitable a document is for the needs of the information seeker depends on how relevant the document is to the seeker [1].
pp. 171-178, p-ISSN: 2339-1103, e-ISSN: 2579-4221
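The two definitions above can be expressed as a short Python sketch. The function name and the use of identifier sets are assumptions made for illustration; the identifiers in the example are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: identifiers returned by the system
    relevant:  identifiers in the collection that are relevant to the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)  # relevant items actually retrieved
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# 4 items retrieved, 5 relevant items exist, 2 of the retrieved items are relevant.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "c", "e", "f", "g"])
print(p, r)  # 0.5 0.4
```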

III. RESEARCH METHODS
The precision of image retrieval depends on (1) the feature extraction process and (2) the feature similarity method. Some CBIR algorithms use shape features extracted from the shape of an object, and objects are classified with higher accuracy than with conventional features such as texture and color. The color content of an image plays a very important role in content-based image retrieval. The global histogram is a popular representation of color content, with the three RGB color channels used to build the global color histogram. These three individual color histograms provide a measure of similarity between different images, because they are scale- and rotation-invariant features.
In this paper, hybrid features that combine three types of feature descriptors, including spatial, frequency, CEDD and BSIF features, are used to develop an efficient CBIR algorithm. Individual analysis of the descriptors is also studied and the results are presented. The rest of the paper is organized as follows. Section 2 briefly reviews important algorithms of CBIR techniques. Various spatial, frequency and hybrid domain feature extraction methods are explained in Section 3. Section 4 presents simulation results and discussions. Finally, Section 5 concludes the paper.
This study uses an experimental model. An experimental model is a research model that tests, manipulates, and influences matters related to all variables or attributes of the research. The stages of the research are as follows: The hardware and software used are as follows:

Data Collection
The dataset consists of 15,000 natural images, namely photos of animals (birds, dogs, cats) and plants (fruits), in JPEG format, downloaded from www.robots.ox.ac.uk/~vgg/data/ and http://chaladze.com/l5/.

Data Preprocessing
Color is considered one of the important low-level visual features, as the human eye can differentiate between visuals on the basis of color.
The images of the real-world object that are taken within the range of human visual spectrum can be distinguished on the basis of differences in color [24][25][26][27]. The color feature is steady and hardly gets affected by the image translation, scale, and rotation [28][29][30][31]. Through the use of dominant color descriptor (DCD) [24], the overall color information of the image can be replaced by a small amount of representing colors. DCD is taken as one of the MPEG-7 color descriptors and uses an effective, compact, and intuitive format to narrate the indicative color distribution and feature. Shao et al. [24] presented a novel approach for CBIR that is based on MPEG-7 descriptor. Eight dominant colors from each image are selected, features are measured by the histogram intersection algorithm, and similarity computation complexity is simplified by this.
According to Duanmu [25], classical techniques retrieve images by using their labels and annotations, which cannot meet the requirements of customers; therefore, researchers focused on another approach, retrieving images based on their content. The proposed method uses a small image descriptor that adapts to the context of the image through a two-stage clustering technique. The COIL-100 image library is used for the experiments. Results obtained from the experiments proved the proposed method to be efficient [25].
Wang et al. [26] proposed a method based on color for retrieving image on the basis of image content, which is established from the consolidation of color and texture features. This provides an effective and flexible estimation of how early human can process visual content [26]. The fusion of color and texture features offers a vigorous feature set for color image retrieval approaches. Results obtained from the experiments reveal that the proposed method retrieved images more accurately than the other traditional methods. However, the feature dimensions are not higher than other approaches and require a high computational cost. A pairwise comparison for both low-level features is used to calculate similarity measure which could be a bottleneck [26].
Various research groups carried out studies on the completeness property of invariant descriptors [27]. Zernike and pseudo-Zernike polynomials, which are orthogonal basis moment functions, can represent the image by a set of mutually independent descriptors, and these moment functions hold orthogonality and rotation invariance [27]. Pseudo-Zernike moments (PZMs) proved to be more robust to image noise than Zernike moments. Zhang et al. [27] presented a new approach to derive a complete set of pseudo-Zernike moment invariants. The link between the pseudo-Zernike moments of the original image and those of images with the same shape but distinct orientation and scale is formed first. A complete set of scale and rotation invariants is obtained from this relationship. The proposed technique proved to perform better at pattern recognition than other techniques [27].
Guo et al. [28] proposed a new approach for indexing images based on the features extracted from the error diffusion block truncation coding (EDBTC). To originate image feature descriptor, two color quantizers and a bitmap image using vector quantization (VQ) are processed which are produced by EDBTC. For assessing the resemblance between the query image and the image in the database, two features Color Histogram Feature (CHF) and Bit Pattern Histogram Feature (BHF) are introduced. The CHF and BHF are calculated from the VQ-indexed color quantizer and VQ-indexed bitmap image, respectively. The distance evaluated from CHF and BHF can be used to assess the likeliness between the two images. Results obtained from the experiments show that the proposed scheme performs better than former BTC-based image indexing and other existing image retrieval schemes. The EDBTC has good ability for image compression as well as indexing images for CBIR [28].
Liu et al. [29] proposed a novel method for region-based image learning which utilizes a decision tree named DT-ST. Image segmentation and machine learning techniques are the base of this proposed technique. DT-ST controls the feature discretization problem, which frequently occurs in contemporary decision tree learning algorithms, by constructing semantic templates from low-level features for annotating the regions of an image. It presents a hybrid tree which is good at handling the noise and tree fragmentation problems and reduces the chances of misclassification. In semantic-based image retrieval, the user can query images through both labels and regions of images. Results obtained from the experiments conducted to check the effectiveness of the proposed technique reveal that it provides higher retrieval accuracy than traditional CBIR techniques, and that the semantic gap between low- and high-level features is reduced to a significant level. The proposed technique performs better than the two well-established decision tree induction algorithms ID3 and C4.5 in image semantic learning [29]. Islam et al. [30] presented a supreme color-based vector quantization algorithm that can automatically categorize the image components. The new algorithm handles variable feature vectors such as the dominant color descriptors more efficiently than the traditional vector quantization algorithm. This algorithm is accompanied by novel splitting and stopping criteria. Through these criteria, the number of clusters can be learned and unnecessary overfragmentation of region clusters can be avoided.
Jiexian et al. [31] presented a multiscale distance coherence vector (MDCV) for CBIR. The purpose behind this is that different shapes may have the same descriptor and distance coherence vector algorithm may not completely eliminate the noise.
The proposed technique first uses the Gaussian function to develop the image contour curve. The proposed technique is invariant to different operations like translation, rotation, and scaling transformation.
Before the steps of the DC coefficient extraction process, data preprocessing is carried out so that each image has a size of 256x256 pixels and is easy to compute. The sample images used in this study can be seen in Figure 2.

DC Coefficient Feature Extraction
The equation or algorithm used for DC extraction can be written as follows: Where H is the indexing key and
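As an illustration only, and not a reconstruction of the equation for H referred to above, the following sketch computes one DC coefficient per 8 x 8 block using the standard orthonormal 8 x 8 DCT-II, for which the DC term reduces to $F(0,0) = \frac{1}{8}\sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y)$. The function name `dc_coefficients`, the grayscale list-of-rows input, and the omission of JPEG's 128 level shift are assumptions made for this example.

```python
def dc_coefficients(image, block=8):
    """Return one DC coefficient per block, in row-major block order.

    image: 2D list of grayscale pixel rows whose height and width are
           multiples of `block` (assumed; e.g. 256x256 after preprocessing).
    """
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("image dimensions must be multiples of the block size")
    features = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            total = sum(
                image[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            )
            # Orthonormal N x N DCT-II: F(0, 0) = (1 / N) * sum of block pixels.
            features.append(total / block)
    return features


# Hypothetical 16x16 image of constant value 100: 4 blocks, each DC = 6400 / 8.
flat_image = [[100] * 16 for _ in range(16)]
print(dc_coefficients(flat_image))  # [800.0, 800.0, 800.0, 800.0]
```

Because the DC term is proportional to the block mean, it can be computed without evaluating the other 63 cosine terms. For a 256x256 single-channel image this sketch yields a feature vector of 1024 values.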

Distance Measurement Method
The distance measurement method used in this study is the Manhattan Distance method. It is used to decide whether two images are the same by determining the distance between the two images being tested. The level of similarity is expressed by a value: the smaller the resulting value, the more similar the two images are.

Effectiveness of Image Search (image retrieval)
In this research, 25 query images are retrieved from each database. For each query, 20 images are retrieved (displayed), and precision and recall are then calculated.
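A hedged sketch of how such an evaluation could be averaged over queries is given below. It assumes that a retrieved image counts as relevant when it belongs to the same class as the query, which the text does not state explicitly; the function name, data layout, and example identifiers are likewise hypothetical.

```python
def mean_precision_at_k(rankings, labels, query_labels, k=20):
    """Average, over all queries, of the fraction of the top-k results
    that share the query's class.

    rankings:     {query_id: [image_id, ...]} ordered by ascending distance
    labels:       {image_id: class_name} for database images
    query_labels: {query_id: class_name}
    """
    scores = []
    for query_id, ranked_ids in rankings.items():
        top = ranked_ids[:k]
        hits = sum(
            1 for image_id in top if labels[image_id] == query_labels[query_id]
        )
        scores.append(hits / k)
    return sum(scores) / len(scores)


# Toy example with k = 2 instead of 20 (all identifiers are hypothetical).
labels = {"a": "cat", "b": "dog", "c": "dog"}
query_labels = {"q1": "cat", "q2": "dog"}
rankings = {"q1": ["a", "b"], "q2": ["c", "b"]}
average = mean_precision_at_k(rankings, labels, query_labels, k=2)
print(average)  # 0.75
```

In the toy example, query q1 retrieves one cat out of two results (0.5) and q2 retrieves two dogs out of two (1.0), giving an average of 0.75.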

DC Coefficient Extraction Results
After the extraction process is carried out, the result is a new image: where the image previously contained all DCT coefficients (1 DC coefficient and 63 AC coefficients per block), it now contains only DC coefficients. The results of the image extraction can be seen in Table 1.
Table 1. DC Coefficient Feature Extraction Results (columns: No, Original Image, Image Extraction DC; rows 1 to 5)

Content Based Image Retrieval Result
The images from which the DC coefficients have been extracted are used for the Content Based Image Retrieval process with the Manhattan Distance method, as shown in Figure 4.

Manhattan Distance Precision and Recall Results
The results of precision and recall using the Manhattan Distance method are shown in Figure 5. The accompanying graph shows the image processing time for the CBIR test using the Manhattan distance calculation.

Discussion
The Manhattan Distance method shows interesting results: the best precision value is 1, for the fruit and cat image classes, and the worst precision value is 0.24, for the dog image class. The average precision with the Manhattan Distance method is 0.66.
Content Based Image Retrieval also becomes faster because it uses only the DC coefficients instead of all DCT coefficients. This makes indexing and image retrieval much faster, taking less than 2 seconds with a maximum of 1.876 seconds, as shown in Figure 6.

V. CONCLUSION
Based on the results obtained, it can be concluded that extracting DC features from the image can reduce image storage, and that applying the Content Based Image Retrieval method to the extracted features yields a precision and recall result of 0.6624 with Manhattan Distance. Using images reduced to the DC coefficient feature can increase the effectiveness of Content-Based Image Retrieval. With the Euclidean Distance method, the precision and recall results are higher than with the Manhattan Distance method. This study uses only digital images, so it can be developed further with other types of images, such as artificial images.
The dataset used in this study contains only natural images. For this reason, future research is expected to use artificial images as well. In addition, methods other than Manhattan Distance can be used to measure the distance between two images in order to obtain better results.