Data binary encoding has proven to be a versatile tool for optimizing data processing and memory efficiency in various machine learning applications. This includes deep barcoding, i.e., generating barcodes from features extracted by deep learning models to retrieve similar cases among millions of indexed images. Despite recent advances in barcode generation methods, converting high-dimensional feature vectors (e.g., deep features) into compact and discriminative binary barcodes remains an unresolved problem of pressing practical importance. Difference-based binarization of features is one of the most efficient binarization methods, transforming continuous feature vectors into binary sequences while capturing trend information. However, the performance of this method depends heavily on the ordering of the input features, which poses a significant combinatorial challenge. This research addresses the problem by optimizing feature sequences based on retrieval performance metrics. Our approach identifies optimal feature orderings, leading to substantial improvements in retrieval effectiveness compared to arbitrary or default orderings. We assess the performance of the proposed approach on various medical and non-medical image retrieval tasks. This evaluation includes medical images from The Cancer Genome Atlas (TCGA), a comprehensive publicly available dataset, as well as a COVID-19 chest X-ray dataset. In addition, we evaluate the proposed approach on non-medical benchmark image datasets such as CIFAR-10, CIFAR-100, and Fashion-MNIST. Our findings demonstrate that optimizing the binary barcode representation significantly enhances accuracy for fast image retrieval across a wide range of applications, highlighting the applicability and potential of barcodes in various domains.
Introduction
Digital imaging has emerged as a transformative technology, revolutionizing numerous fields such as healthcare1. Clinical imaging techniques, including X-rays, magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound, are central to the digitalization of healthcare. The application of these technologies in visualizing diseases within different organs has led to the accumulation of vast medical image databases2. This revolution hinges on the ability to convert traditional glass microscope slides, containing formalin-fixed paraffin-embedded (FFPE) tissue sections, into high-resolution digital representations. These large, detailed digital images are known as whole slide images (WSIs), and they form the foundation of modern computational pathology3. The advancement of deep learning4,5 has empowered researchers to perform complex analyses in computational pathology, such as precise tissue analysis, diagnostic support, and efficient image representation through feature extraction3. Trained deep learning models enable quick and efficient feature extraction, facilitating the creation of compact image representations for storage and processing. Their impact is particularly evident in the widespread adoption of large pre-trained deep networks (both conventional and foundation models)3, which generate feature representations, so-called embeddings, utilized in tasks like image search and retrieval6,7. This integration of large image data, especially when paired with the powerful feature extraction capabilities of deep learning, can transform medical practice, enabling more effective and personalized patient care through improved image analysis, retrieval, and evidence-based decision making. Image search enables a novel mode of visual information retrieval, effectively linking textual descriptions, molecular data, and the visual characteristics of tissue samples8. Content-based image retrieval (CBIR) is a specific type of image search that finds the semantically closest neighbors in a high-dimensional image representation archive, with various applications in libraries, manufacturing, and histopathology2,9. Such systems can incur high storage and computation costs when operating on full-precision values. To tackle this issue, hashing methods have been suggested to encode deep feature vectors into N-bit binary codes (i.e., barcoding)10. However, end-to-end training of deep networks to learn "hash codes" can be challenging due to the lack of sufficient data, which limits the model's ability to generalize and perform accurately. Training-free barcoding, in contrast, not only makes deep learning outputs more accessible but also ensures scalability and practicality in real-world applications where storage and data limitations are prevalent.
Data encoding is an efficient method for reducing data size, which can enhance storage efficiency and stability11. Barcodes are binary-formatted codes that enable efficient product tracking and information retrieval, streamlining operations and reducing human error. In manufacturing, for example, they support real-time tracking of goods from production to distribution, significantly improving logistics. In libraries and asset management, barcodes simplify inventory control and equipment tracking, speeding up processes and minimizing losses. Businesses can also customize barcode graphics to suit specific needs, whether for creating distinctive product labels, incorporating security features, or optimizing packaging materials. This adaptability allows enterprises to seamlessly integrate barcode systems into their workflows, enhancing inventory management, accelerating supply chain processes, and improving product traceability across various non-medical sectors12,13. In healthcare, ordinary barcodes are commonly used to uniquely assign a file or a specimen to a specific patient14. Such barcodes may be used for locating patient-related information, enabling access from multiple locations simultaneously without the need for physical records. This can significantly speed up treatment planning15. In contrast, feature-driven barcodes in medical image analysis can support the efficient retrieval of complex patterns, aiding medical practitioners in diagnosing abnormalities by comparing patient records with similar cases, thereby improving diagnostic accuracy12,13,16,17. CBIR systems allow users to search through large databases by querying specific data features to find closely related entries. This method is beneficial for refining initial analyses and gaining deeper insights into the data, which can lead to more accurate conclusions and decisions18,19.
Hashing algorithms are broadly segmented into label-agnostic and label-dependent types. The former derives binary representations without utilizing class annotations, while the latter incorporates such annotations for code generation10. Various methods, such as local binary patterns (LBP)20, iterative quantization (ITQ)21, and MinMax barcoding22, are label-agnostic techniques. Label-dependent deep hashing approaches, such as Deep Supervised Hashing (DSH)23 and HashNet24, have recently gained attention due to their ability to learn compact and semantically meaningful binary codes directly from raw data. These methods leverage deep neural networks to jointly optimize feature extraction and hash code generation, often outperforming traditional label-agnostic techniques in image retrieval and classification tasks. However, these methods are often complex due to their reliance on deep neural networks that simultaneously learn features and hash functions. One key limitation is that the networks typically produce continuous outputs, which are then binarized through indirect approximations rather than learning the binary codes directly. As a result, the final binary codes are not explicitly optimized during training.
Furthermore, integrating retrieval-specific evaluation metrics such as Precision@k, mean Average Precision (mAP), or F1-score into the training objective is not straightforward in deep hashing frameworks. These metrics depend on the ranking of retrieved items and are non-differentiable, making them difficult to incorporate into gradient-based learning processes.
On the other hand, non-learnable hashing methods such as dHash25 and DFT-Hash offer a label-independent way to convert feature vectors into binary representations. dHash, for instance, compares the direction of change between adjacent features to determine the binary values. While computationally efficient, such encodings are highly sensitive to the order of features: the resulting code depends entirely on the sequence in which the features are arranged. This makes the problem fundamentally combinatorial.
In this work, we introduce an evolutionary optimization approach to improve dHash25 by finding an effective permutation of feature indices. The goal is to directly maximize neighborhood ranking performance, measured with mAP. Our method preserves the simplicity of traditional barcoding while enhancing retrieval accuracy, offering a practical alternative to deep hashing models without the overhead and limitations of neural network-based training. The proposed method can be applied to any hashing method where feature order significantly influences performance and where achieving an optimal order is essential. Notably, our study is the first to explore the effect of feature vector order in barcoding, a topic that has not been previously investigated in the literature; previous works have only addressed converting real-valued deep features to binary codes. We define this challenge as a combinatorial optimization problem and propose an evolutionary framework to address it. Our scheme demonstrates improvements in the accuracy of CBIR tasks across both medical and non-medical domains. The remainder of this paper is organized as follows. Section 2 reviews the related literature within the scope of the paper. Section 3 explains the proposed barcode optimization using the CGA framework. Section 4 analyzes our proposed solution's improved performance on medical and non-medical datasets. Finally, the paper is concluded in Section 5.
Background review
As mentioned earlier, the key concept of this research paper is to find the optimal order of features to generate more accurate barcodes. In this section, we review papers that conduct experiments on encoding various data formats, including images, video, and text, in barcode representation for CBIR. Additionally, this research investigates the importance of the order of features extracted by learning models.
Content-based image retrieval
In recent years, deep learning-based hashing methods have gained popularity for image retrieval tasks, particularly due to their ability to learn task-specific representations that preserve semantic similarity in Hamming space2. These include supervised approaches such as DSH23, deep hashing network (DHN)26, deep supervised hashing with triplet labels (DTSH)27, deep pairwise-supervised hashing (DPSH)27, central similarity quantization (CSQ)28, and unsupervised quantization-based network hashing (Quantization)29, which utilize neural networks to generate compact binary codes from image data. While such techniques have shown strong performance in various benchmarks, they also come with significant drawbacks that limit their practicality and generalization.
In contrast, traditional hashing methods, such as difference hashing (dHash)25, average hashing (aHash)30, MinMax hashing22, discrete Fourier transform hashing (DFT-Hash)31, LBP20, ITQ21, and locality-sensitive hashing (LSH)32, operate independently of task-specific training and instead apply fixed transformations to hand-crafted or pre-extracted features. For instance, dHash25 converts images into compact binary signatures based on gradients or pixel differences. The core idea is to capture relative patterns by comparing adjacent pixel intensities and generating a binary hash representing the structural content of an image. In addition, DFT-Hash transforms the input, whether an image or a deep feature vector, into the frequency domain using the discrete Fourier transform. This transformation encodes the global ordering and oscillatory trends of the input signal. A binary hash is then generated by thresholding either the real components or the magnitude spectrum of the DFT coefficients. Unlike dHash25, which focuses on local differences, DFT-Hash captures global structural characteristics, making it particularly sensitive to the ordering and distribution of features. Although these approaches are highly efficient and do not rely on end-to-end learning, they typically ignore the underlying data distribution and feature shifting, which limits their ability to adapt to semantic structure in the feature space. As a result, their retrieval performance, especially on large-scale or complex datasets, tends to lag behind learning-based methods that use convolutions to capture feature representations.
However, the gap between traditional and learning-based hashing may not be as wide as previously thought. Most deep hashing models extract features from intermediate layers of DNNs and then apply learned hash projections optimized for retrieval-specific loss functions33. These retrieval losses (e.g., contrastive, triplet, or ranking loss) differ significantly from classification objectives such as cross entropy34,35, which are commonly used in pre-training large-scale neural networks for tasks like image classification.
This raises an important question: Is it necessary to retrain deep networks with specialized retrieval losses when high-quality feature representations already exist from pre-trained classification models? In practice, many pre-trained deep neural networks, trained solely on classification objectives, are capable of producing feature embeddings that encapsulate semantic relationships between images36. These embeddings, particularly those extracted from layers before the classification head, can serve as a strong foundation for retrieval without any further fine-tuning37–40. If combined with an effective transformation or encoding strategy, such as optimized feature ordering prior to hashing41, these representations may allow traditional hashing techniques to bridge the performance gap.
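As an illustration, the following minimal sketch (assuming PyTorch and torchvision with an ImageNet-pretrained ResNet-50; the helper name `embed` is ours) shows how such classification-trained embeddings can be harvested without any retrieval-specific fine-tuning:

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Load a classification-pretrained backbone and drop its classification head,
# keeping everything up to and including the global average pooling layer.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-1]).eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(image_path: str) -> torch.Tensor:
    """Return the pooled embedding (2048-d for ResNet-50) of one image."""
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    return feature_extractor(x).flatten(1).squeeze(0)
```

Such embeddings can then be binarized by any of the order-sensitive hashing schemes discussed below.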
This observation challenges the prevailing notion that effective image retrieval requires task-specific deep supervision. Instead, it opens the door for hybrid approaches that decouple feature extraction from binary encoding42, eliminating the need for expensive retraining pipelines while retaining high retrieval quality43. By rethinking the role of feature representations in the hashing process, one can exploit the representational power of classification-trained models in conjunction with computationally efficient hashing strategies, thereby achieving scalability, robustness, and strong semantic preservation in retrieval tasks44,45.
Evaluation Metrics Retrieval quality is assessed on the ranked list returned by the nearest neighbor search in Hamming space using three complementary criteria:
Precision@k: For a given query image q, Precision@k measures the fraction of the top-k retrieved items that share the query's ground-truth label:

$$\mathrm{Precision@}k(q) \;=\; \frac{1}{k}\sum_{i=1}^{k}\mathbb{1}\left[y_i = y_q\right] \tag{1}$$

where $y_i$ denotes the label of the $i$-th neighbour and $\mathbb{1}[\cdot]$ is the indicator function. Averaging Precision@k over all queries summarises "first-page" retrieval quality.

mean Average Precision (mAP): Let $R$ be the total number of relevant instances for a query and $n$ the database size. Average Precision (AP) is defined as

$$\mathrm{AP}(q) \;=\; \frac{1}{R}\sum_{i=1}^{n}\mathrm{Precision@}i(q)\,\mathbb{1}\left[y_i = y_q\right] \tag{2}$$

and mAP is the mean of AP over all $Q$ query images:

$$\mathrm{mAP} \;=\; \frac{1}{Q}\sum_{q=1}^{Q}\mathrm{AP}(q) \tag{3}$$

mAP reflects ranked-list quality across the entire ranking and is robust to class imbalance.

F1-score@k: Treating retrieval as a classification task, each query is assigned the class label that appears most frequently among its k-nearest neighbors in Hamming space. Let $\hat{y}_q$ be this predicted label and $y_q$ the ground-truth label for query q. Over the full query set, we compute precision and recall,

$$P \;=\; \frac{TP}{TP+FP}, \qquad R \;=\; \frac{TP}{TP+FN} \tag{4}$$

and the harmonic mean:

$$F1 \;=\; \frac{2PR}{P+R} \tag{5}$$

This statistic gauges how well the binary codes support downstream classification when combined with a simple k-NN rule. Using this trio of metrics provides complementary insights: early-rank fidelity (Precision@k), global ranking coherence (mAP), and classification robustness (F1).
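For concreteness, a small numpy sketch of these criteria (function names are ours; `neighbor_labels` holds the database labels sorted by ascending Hamming distance to the query):

```python
import numpy as np
from collections import Counter

def precision_at_k(neighbor_labels: np.ndarray, query_label, k: int) -> float:
    """Eq. (1): fraction of the top-k retrieved items sharing the query label."""
    return float(np.mean(neighbor_labels[:k] == query_label))

def average_precision(ranked_labels: np.ndarray, query_label) -> float:
    """Eq. (2): precision accumulated at every relevant position in the
    full ranking, averaged over the R relevant items in the database."""
    rel = (ranked_labels == query_label).astype(float)
    if rel.sum() == 0:
        return 0.0
    ranks = np.arange(1, len(ranked_labels) + 1)
    return float(np.sum(np.cumsum(rel) / ranks * rel) / rel.sum())

def knn_predict(neighbor_labels: np.ndarray, k: int):
    """Majority vote over the k nearest neighbors, as used for F1-score@k (Eqs. 4-5)."""
    return Counter(neighbor_labels[:k].tolist()).most_common(1)[0][0]
```

mAP is then the mean of `average_precision` over all queries (Eq. 3), and F1-score@k follows from comparing `knn_predict` outputs with the ground-truth labels (e.g., via `sklearn.metrics.f1_score`).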
Data barcoding across multiple domains
In this section, we explore a range of innovative barcoding techniques for retrieving data. Our focus spans several key areas: image analysis, genomic data analysis, video barcoding, and text barcoding. Each subsection investigates how different studies have applied these techniques to enhance accuracy, efficiency, and effectiveness in data retrieval across various domains.
Image Barcoding Mobile advertising increasingly uses barcodes to turn printed materials into interactive mobile commerce platforms. Traditional barcodes are often visually unappealing. To address this, a novel method46 is proposed for embedding data in halftone images. This approach combines data-hiding and halftoning algorithms, utilizing the entire image to embed data without standard barcode formats. The paper47 discusses the development of a highly accurate and portable artificial scent screening system for real-time monitoring of meat freshness. The system combines cross-reactive colorimetric barcode combinatorics with deep convolutional neural networks to achieve robust scent fingerprint recognition. The proposed method, Viscode by Zhang et al.48, embeds data into visualization images using deep neural networks, similar to QR code barcoding, without distorting the visuals. It employs an encoder-decoder network and a saliency-based layout algorithm to ensure high-quality encoding and decoding.
Turning to medical imaging, numerous studies focus on converting medical images into barcodes to facilitate image retrieval. Tizhoosh et al.22 proposed a method that uses the Radon transform to retrieve X-ray images in the IRMA dataset, which contains 14,410 images of various body parts. The proposed MinMax22 method utilizes the Radon transform to generate Radon barcodes from the transitions between min and max values, an approximation of the 1D derivative of a vector. Kalra et al.49 presented a similarity-based search engine called Yottixel for large archives of histopathology WSIs. The first stage of this engine employs a convolutional neural network (CNN) to extract features of the query images, followed by the MinMax algorithm to create a selection pool for indexing the features into a memory-saving set of barcodes. Owing to these binary representations, Yottixel has emerged as one of the fastest and most storage-efficient search engines17. Zhu and Tizhoosh50 proposed a model that utilizes normalized Radon projections with associated image labels to build a support vector machine (SVM) classifier and concurrently generate Radon barcodes. Babaei et al.51 suggested a novel local Radon descriptor for content-based pathology image search. Tizhoosh and Rahnamayan52 introduced a Radon transform-based method to generate an optimal number of projections for barcoding the medical IRMA dataset in an expressive way. Considering non-medical datasets, Angadi and Purad53 applied the Radon transform to an astronomy dataset to match and retrieve images. To determine ocean wavelength and wave direction in SAR images, Zhao et al.54 used feature detection in Radon-transformed ocean images. Regarding further image retrieval methods, Ozturk55 presented a content-based medical image retrieval (CBMIR) method that generates hash codes using a stacked autoencoder and deep discriminative features extracted from the last fully connected layer of their proposed deep neural network. To address the low accuracy of medical image retrieval in traditional models, which fail to capture latent features in multi-dimensional datasets, Karthik et al.56 introduced a deep CNN for the CBMIR task, evaluated on multi-view classification.
Genomic Data Barcoding Choi et al.57 provided visions toward using barcoding for oomycetes identification and emphasized the significance of choosing the right DNA areas for barcoding in various species. In another research58, the authors outlined a technique for barcoding and identifying several mycobacterium tuberculosis (MTB) lineages for use in epidemiological studies. The research shows how barcoding techniques can be used to identify bacterial lineages in extensive studies.
Video Barcoding Given that full-length videos consume far more memory than images, efficient methods for barcoding videos are a promising research area. Despite widespread video processing applications in medicine, including surgery, endoscopy, and radiology, few works have studied this challenge. Eminaga et al.59 presented a surgically-directed framework to extract brief cystoscopy video clips of the bladder for more effective content visualization. A screenshot of the lesion is taken and then labeled with the corresponding spot in the bladder. A search algorithm then retrieves identified lesions by estimating the mean square error (MSE) between the screenshot and the video frames. Finally, a QR code containing pathologic information, frame annotations, and quality-control resources is added to the beginning of each video clip. Another study60 introduced an innovative method to address video indexing challenges using face recognition, aiming to reduce space and time complexity. The approach involves key frame extraction from the input video, face recognition within these key frames, face identification using barcodes, and finally, indexing the video using EAN-8 linear barcodes representing human faces. In another study, Ghatak et al.61 presented a method for video indexing using Local Gradient Feature Analysis (LGFA) and a sliding window technique to detect human faces. To represent the human faces in the video, a barcode is then generated by the EAN-8 linear video indexing technique, which lowers bandwidth and storage requirements. Ben-Artzi et al.62 addressed retrieval of videos with dynamic appearance changes using motion barcodes. The method first separates the video into shots, after which the optical flow field is used to create a motion barcode for each shot.
Text Barcoding Piro et al.63 introduced layout-based text retrieval using the Radon transform along with Dynamic Time Warping (DTW). The outcome of the model is a barcode representation of the document database. Sharif et al.64 proposed a joint barcode and text orientation detection scheme. The scheme employs a CNN-based barcode localization model to calculate four arbitrarily aligned 1D code vertices, which a post-processing module then uses to determine the angle at which the text should be aligned.
Does feature ordering matter?
Feature ordering refers to the arrangement of extracted features based on their relative importance, particularly in the context of deep neural networks (DNNs). This concept becomes especially pertinent when utilizing features extracted from pre-trained DNNs before inputting them into a fully connected decoder head. The decoder’s role is to discern both global and local relationships within the feature representations. However, traditional nearest neighbor-based retrieval algorithms, which often rely on Hamming distance, may not effectively capture these global feature relationships. Consequently, in content-based image retrieval (CBIR) systems, retrieving the k-nearest neighbors for a given query might lead to a high false positive rate.
Recent studies have underscored the significance of feature selection and ordering in enhancing the performance of CBIR systems. For instance, Deep Feature Screening (DeepFS) employs a two-step approach: first, it uses a deep neural network for feature extraction, and then it applies multivariate feature screening to select the most informative features. This method has demonstrated effectiveness in handling ultra high-dimensional data, ensuring that the most relevant features are prioritized65.
Moreover, the stability and reliability of feature selection can be improved through ensembling techniques. By aggregating feature importance scores across multiple models or training epochs, ensembling mitigates the variability inherent in deep learning models, leading to more consistent and robust feature selection66. In addition, recent advances in decision tree67 models have emphasized that the ordering and selection of features critically affect model behavior and performance. For example, methods such as UnbiasedGBM aim to correct bias in feature selection and emphasize the importance of careful feature handling during tree construction68. Similarly, new frameworks combining feature learning with decision tree training further illustrate that the structure and performance of decision trees are highly sensitive to feature representations and their ordering69. These findings reinforce that feature ordering plays a fundamental role not only in deep learning-based hashing methods but also in broader machine learning contexts, motivating further investigation into order-aware optimization techniques.
In CBIR, the choice and arrangement of features play a pivotal role in bridging the semantic gap, the disconnect between low-level visual features and high-level semantic concepts. Utilizing high-level features derived from pre-trained deep learning models has been shown to enhance retrieval accuracy, as these features capture more abstract and semantically meaningful information70.
In summary, the ordering and selection of features are critical components in the design of effective CBIR systems. By prioritizing the most informative features and leveraging advanced techniques like DeepFS and ensembling, it is possible to enhance retrieval accuracy and reduce false positives, thereby improving the overall performance of these systems.
Proposed method
The idea presented in this paper is to produce an optimal order of features before encoding data in a barcode. We aim to improve the dHash technique, which is heavily affected by directional changes in the feature vectors. To this end, we design an evolutionary process that finds the best order of features for a given CBIR task by optimizing a retrieval-based fitness function. In an iterative procedure, we employ the combinatorial genetic algorithm (CGA) to search for the optimal permutation of features by evaluating the generated barcodes on the CBIR task. Our proposed method consists of three components: 1) the combinatorial genetic algorithm (CGA), 2) hashing of deep features with respect to a permutation, and 3) the fitness function. Figure 2 illustrates the overall flow of the proposed optimization process, which generates a set of optimal feature orders to increase the classification accuracy of the binary barcodes generated from each data sample. As demonstrated, specifically for image search tasks on WSIs, deep features are extracted by feeding images into pre-trained backbone deep models. These deep features undergo binarization through an optimization process, and the optimal order is determined using an evolutionary algorithm. We explain each component in detail below; first, however, we describe how a barcode can be generated from a feature vector.
Barcoding feature vectors
To turn continuous deep features into compact binary representations, we apply a hashing method (dHash25 or DFT-Hash) to the pre-trained DNN's pooling output. Given an image (or sample) x, let $\mathbf{f} \in \mathbb{R}^{d}$ denote its d-dimensional feature vector. In dHash25, the associated deep barcode is obtained by evaluating the gradient of adjacent features:

$$b_j \;=\; \begin{cases} 1 & \text{if } f_{j+1} > f_j \\ 0 & \text{otherwise,} \end{cases} \qquad j = 1, \dots, d-1 \tag{6}$$

where $f_j$ denotes the $j$-th element of the deep feature vector $\mathbf{f}$. This encoding is sensitive to the local ordering of feature magnitudes and captures gradient-like structures within the representation.

In contrast, DFT-Hash applies the discrete Fourier transform to the deep feature vector, projecting it into the frequency domain. This operation reveals the global ordering and periodic structure of the features by converting the signal into its complex spectral components. To generate the deep barcode, the real part of the transformed signal is binarized using the sign function:

$$b_j \;=\; \operatorname{sign}\!\bigl(\Re\bigl(\mathcal{F}(\mathbf{f})_j\bigr)\bigr) \tag{7}$$

where $\mathcal{F}(\mathbf{f})$ denotes the discrete Fourier transform of $\mathbf{f}$, and $\Re(\cdot)$ extracts the real component of the complex value. The resulting binary representation captures global frequency characteristics and is highly sensitive to the ordering of input features.

dHash25 and DFT-Hash are both sensitive to the ordering of features; the local position of each feature strongly influences the resulting binary code. In nearest-neighbor retrieval using the Hamming distance, an unfavorable feature ordering can distort the true similarity between samples, leading to degraded retrieval performance. This is particularly critical when binary codes are used to approximate semantic similarity, as even minor misalignments in feature order can result in significant Hamming distance deviations.
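A minimal numpy sketch of both encodings (assuming the convention that sign maps non-negative values to 1; function names are ours) makes the order sensitivity concrete:

```python
import numpy as np

def dhash(f: np.ndarray) -> np.ndarray:
    """Difference hash (Eq. 6): bit j is 1 iff the (j+1)-th feature exceeds
    the j-th one, yielding a (d-1)-bit barcode of local trends."""
    return (f[1:] > f[:-1]).astype(np.uint8)

def dft_hash(f: np.ndarray) -> np.ndarray:
    """DFT hash (Eq. 7): binarize the real part of the Fourier spectrum,
    which encodes global ordering and periodic structure."""
    return (np.real(np.fft.fft(f)) >= 0).astype(np.uint8)

# Reordering the same features changes the barcode entirely.
f = np.array([0.3, 1.2, 0.7, 0.9, 0.1])
print(dhash(f))                    # [1 0 1 0]
print(dhash(f[[4, 2, 0, 3, 1]]))   # a different code for the same data
```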
Combinatorial optimization formulation
Let $X = \{\mathbf{f}^{(1)}, \dots, \mathbf{f}^{(n)}\}$ be a fixed training set of feature vectors with $\mathbf{f}^{(i)} \in \mathbb{R}^{d}$, and let $\pi$ be a permutation of $\{1, \dots, d\}$. For each sample, we write:

$$b_\pi(\mathbf{f}) \;=\; h\bigl(f_{\pi(1)}, f_{\pi(2)}, \dots, f_{\pi(d)}\bigr) \tag{8}$$

where $h(\cdot)$ is the hashing function (dHash or DFT-Hash). Define an error functional $E(\pi)$ that aggregates the retrieval loss (e.g., the mean average precision error) over the training set,

$$E(\pi) \;=\; 1 - \mathrm{mAP}\bigl(\{\,b_\pi(\mathbf{f}^{(i)})\,\}_{i=1}^{n}\bigr) \tag{9}$$

where $E(\pi)$ measures how well the barcode produced under $\pi$ preserves neighborhood structure. The task is the combinatorial optimization of the following:

$$\pi^{*} \;=\; \arg\min_{\pi \in S_d} E(\pi) \tag{10}$$
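In code, evaluating a candidate permutation amounts to reordering each vector before hashing and scoring the resulting barcodes (a sketch reusing `dhash` from above; `mean_average_precision` is an assumed helper that ranks the training barcodes by Hamming distance, sketched in full with the fitness function later):

```python
import numpy as np

def barcode_under_permutation(f: np.ndarray, pi: np.ndarray) -> np.ndarray:
    """Eq. (8): permute the features first, then apply the hash h."""
    return dhash(f[pi])

def error(pi: np.ndarray, features: np.ndarray, labels: np.ndarray) -> float:
    """Eq. (9): retrieval loss of permutation pi on the training set."""
    codes = np.stack([barcode_under_permutation(f, pi) for f in features])
    return 1.0 - mean_average_precision(codes, labels)  # assumed helper
```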
Because $|S_d| = d!$ grows factorially, exhaustive search is infeasible even for moderate $d$.

Combinatorial Genetic Algorithm (CGA)
The purpose of this step is to rearrange the extracted feature vector via a permutation $\pi$ to increase retrieval performance. Hence, we employ a genetic algorithm (GA), a stochastic search and optimization method based on the theory of natural evolution, which articulates survival of the fittest. The reason for adopting a GA lies in its effective global and local search: it explores the space of potential solutions with a population rather than evaluating a single candidate at a time, as a grid search would, which is computationally prohibitive. The main operators of a GA are 1) selection, 2) crossover, and 3) mutation. To adapt the GA to our combinatorial optimization problem, we formulate it as a CGA with specialized operators that explore the permutation space efficiently. First, a population P of random permutations is initialized as candidate solutions. Pairs of parents are then selected by tournament selection and mated using order crossover (OX) and inversion mutation operators, as illustrated in Figure 1, to produce new permutations.
Fig. 1 [Images not available. See PDF.]
Illustration of operators in CGA. Order crossover (a) builds a child by taking values from the parents in the order in which they appear in each parent chromosome. Inversion mutation (b) reverses a contiguous segment of genes.
The above operations are repeated until the stopping criteria are met. Finally, the optimal permutation $\pi^{*}$ is returned; a sketch of both operators is given below.
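A compact sketch of the two permutation-preserving operators (one common formulation of OX; variable and function names are ours):

```python
import random

def order_crossover(p1: list, p2: list) -> list:
    """Order crossover (OX): copy a random slice from parent 1, then fill the
    remaining positions with parent 2's genes in their original order,
    skipping genes already copied, so the child is a valid permutation."""
    n = len(p1)
    a, b = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]
    copied = set(child[a:b + 1])
    fill = [g for g in p2 if g not in copied]
    for i, gene in zip(list(range(b + 1, n)) + list(range(0, a)), fill):
        child[i] = gene
    return child

def inversion_mutation(perm: list) -> list:
    """Inversion mutation: reverse one contiguous segment of the permutation."""
    a, b = sorted(random.sample(range(len(perm)), 2))
    return perm[:a] + perm[a:b + 1][::-1] + perm[b + 1:]
```

Both operators map valid permutations to valid permutations, which is what allows the GA to search $S_d$ directly.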
Algorithm 5 [Images not available. See PDF.]
Fitness Function for Candidate Permutation
Fitness function
As stated before, the optimization algorithm is responsible for finding the optimal permutation of features. Algorithm 5 shows the fitness function proposed for evaluating candidate solutions in the CGA. For each data point x in the training dataset, binary barcodes are calculated using the dHash method. During the search, the training data serve as both the database and the query set, the major components of information retrieval. To obtain the retrieved results for a query, we use nearest neighbor search with the Hamming distance, as described in Algorithm 3. We use the mAP metric, one of the key CBIR evaluation metrics, directly as the fitness to increase retrieval performance.
Crucially, both the barcode construction and the retrieval simulation that underpin Algorithm 5 are computed exclusively on the training dataset. This strict data partitioning follows best practice in machine learning evaluation, where test data are reserved for a single, final assessment to avoid optimistic bias caused by information leakage. By confining the GA's selection pressure to the training set, we ensure that the discovered permutation generalizes beyond the data seen during optimization.
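The following sketch captures the essence of this fitness evaluation (our own rendering of Algorithm 5 under the stated setup, with leave-one-out querying on the training barcodes):

```python
import numpy as np

def fitness(pi: np.ndarray, train_features: np.ndarray, train_labels: np.ndarray) -> float:
    """mAP-based fitness of a candidate permutation: encode every training
    sample under pi with dHash, then simulate retrieval where the training
    set serves as both database and query (the query itself is excluded)."""
    permuted = train_features[:, pi]
    codes = (permuted[:, 1:] > permuted[:, :-1]).astype(np.uint8)
    ap_values = []
    for q in range(len(codes)):
        dist = np.count_nonzero(codes != codes[q], axis=1)  # Hamming distances
        order = np.argsort(dist, kind="stable")
        order = order[order != q]                            # drop the query itself
        rel = (train_labels[order] == train_labels[q]).astype(float)
        if rel.sum() > 0:
            ranks = np.arange(1, len(rel) + 1)
            ap_values.append(np.sum(np.cumsum(rel) / ranks * rel) / rel.sum())
    return float(np.mean(ap_values))  # the CGA seeks the permutation maximizing this
```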
Fig. 2 [Images not available. See PDF.]
The overall procedure of barcode optimization framework. (A) WSI (whole slide image) is a large image consisting of many patches, (B) patches can be selected based on stain and location, (C) patches are fed into a deep network, (D) deep feature vectors are extracted for all patches, (E) average deep features are calculated to create one vector per WSI, (F) optimization generates optimal barcodes.
Experimental analysis
To comprehensively evaluate our method, we utilized the generated barcodes to perform image search across four distinct datasets, encompassing both medical and non-medical domains. Furthermore, we diversified the selection of conventional network types for feature extraction, aiming to demonstrate the model's flexibility in handling various data types and its potential in practical situations. The overall pipeline extracts features from each experimental dataset and then encodes them into barcodes for retrieval. Importantly, we adhere strictly to standard machine learning practice by performing optimization solely on the training sets. The test sets were used exclusively to evaluate the final performance of the optimized barcodes, ensuring that no information from the test sets influenced the optimization process. This distinction underlines the validity of our approach and its ability to generalize to unseen data. This section describes the details of the experiments that were conducted.
Table 1. WSI split sizes per tumor type in the TCGA dataset.
Tumor Type | Subtypes | #Train WSIs | #Test WSIs |
|---|---|---|---|
Brain | GBM, LGG | 1324 | 74 |
Endocrine | ACC, PCPG, THCA | 746 | 72 |
Gastrointestinal | COAD, READ, ESCA, STAD | 620 | 88 |
Gynecologic | CESC, OV, UCS | 277 | 30 |
Liver | CHOL, LIHC, PAAD | 416 | 51 |
Mesenchymal | UVM, SKCM | 230 | 28 |
Pulmonary | LUAD, LUSC, MESO | 719 | 86 |
Urinary Tract | BLCA, KICH, KIRC, KIRP | 1048 | 123 |
Datasets
TCGA
The TCGA dataset offers a public digital pathology repository71, initiated by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI). The TCGA database contains 30,072 histopathological whole slide images (WSIs) of 32 cancer subtypes and corresponding metadata. This pathology dataset has been used extensively in deep learning for cancer research. The annotations are at the WSI level (no pixel-level delineations), accompanied by morphology, primary diagnosis, and tissue or organ of origin metadata72. Due to the gigapixel size of WSIs (often much larger than 50,000 by 50,000 pixels), analyzing and processing them can be challenging. One way to overcome this is to extract WSI portions, called patches, of manageable dimensions, say 1,000 by 1,000 pixels, with less computational complexity. Since we needed to apply our model to extracted features, we used the features of TCGA images extracted by KimiaNet, a DenseNet-121 pre-trained by Riasatian et al.73, to simplify the feature extraction step. In this paper, tissue patches derived from WSIs were fed to KimiaNet to extract deep features. The average feature pooling introduced in74 then yields one feature vector of length 1024 for the entire WSI.
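A sketch of this pooling step (assuming `patch_features` holds the KimiaNet embeddings of all patches from one WSI):

```python
import numpy as np

def wsi_vector(patch_features: np.ndarray) -> np.ndarray:
    """Average feature pooling: collapse an (n_patches, 1024) matrix of
    KimiaNet patch embeddings into one 1024-d vector for the whole WSI."""
    return patch_features.mean(axis=0)
```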
Given the curse of dimensionality in the 1024-dimensional feature space, several works, such as Rasool et al.75, recommend feature selection using a GA, which significantly improves accuracy by reducing redundancy. Recent work by Asilian et al.76 proposed an evolutionary multi-objective optimization framework to generate highly compact and discriminative WSI representations. Their method begins by extracting patch-level features using KimiaNet and applies a two-stage feature selection strategy. To further explore the impact of barcoding and CBIR on a reduced feature space, we also experiment with our proposed method on the reduced feature subset derived by Asilian et al.76 in addition to the full KimiaNet features.
In our experiments, we utilized the extracted deep features for multiple cancer subtypes in the TCGA dataset. Details of the dataset split used in training, validation, and testing are summarized in Table 1.
COVID-19 radiography
In the second test case, we employed a publicly available COVID-19 dataset77,78 comprising four classes: COVID-19, Pneumonia, Lung Opacity, and Normal. We trained a ResNet-50 architecture from scratch for feature extraction. ResNet-50, recognized for its deep network layers, has residual blocks that allow the training of large networks while overcoming gradient-related issues. These blocks, combined with convolutional and pooling layers, capture increasingly complex visual characteristics. The architecture culminates in global average pooling and fully connected layers, which turn these features into a compact representation for categorization. The resulting feature vectors of length 128 serve as input to the image search system79,80.
Fashion-MNIST
The third dataset, Fashion-MNIST, is a popular non-medical machine learning and computer vision benchmark. It consists of 28×28 grayscale images from ten fashion categories: T-shirts, trousers, pullovers, dresses, coats, sandals, shirts, sneakers, bags, and ankle boots81,82. In this study, we fine-tuned EfficientNet-B083 by adding a customized feature extraction head: global average pooling to summarize spatial information, followed by a dropout layer to prevent overfitting and, lastly, a dense layer with softmax activation matching the Fashion-MNIST classes. Subsequently, we utilized the 1280-length feature vector from the second-to-last layer as the non-medical test case input for our CBIR system.
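This head can be sketched as follows (a Keras rendering of the description above; the dropout rate and the input handling for the grayscale 28×28 images are illustrative assumptions):

```python
import tensorflow as tf

# EfficientNet-B0 backbone with the customized feature-extraction head described
# above (input resizing / channel replication for grayscale Fashion-MNIST images
# is omitted for brevity).
base = tf.keras.applications.EfficientNetB0(include_top=False, weights="imagenet")
x = tf.keras.layers.GlobalAveragePooling2D()(base.output)  # summarize spatial info -> 1280-d
x = tf.keras.layers.Dropout(0.3)(x)                        # illustrative dropout rate
out = tf.keras.layers.Dense(10, activation="softmax")(x)   # ten Fashion-MNIST classes
model = tf.keras.Model(base.input, out)

# After fine-tuning, the 1280-d pooled output (second-to-last layer) feeds the CBIR system.
feature_extractor = tf.keras.Model(base.input, model.layers[-3].output)
```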
CIFAR-10/100
The CIFAR-10 and CIFAR-100 datasets consist of 32×32 color images in 10 and 100 distinct classes84. To obtain the requisite feature vectors, we fine-tuned DenseNet-12185, augmenting it with a global average pooling layer and a flattening layer, followed by a sequence of dense layers with output sizes of 1000 and 10. As with the previous datasets, we employed the features extracted from the dense layer preceding the final layer for our specific objective.
Implementation details
We compare our proposed CGA method to various hashing approaches in image retrieval: dHash25 and DFT-Hash serve as baselines for order-sensitive hashing, while aHash30, MinMax22, LBP86, and ITQ21 are further non-learnable hashing methods. Learnable hashing methods, which have become widespread with the rise of neural networks, include DHN26, DPSH87, Quantization29, LSH32, CSQ28, DTSH27, and DSH23. For a fair comparison, we use pre-trained DNNs, KimiaNet and DenseNet-121 for the TCGA dataset, ResNet-18 for the CIFAR-10 and CIFAR-100 datasets, EfficientNet-B0 for the Fashion-MNIST dataset, and ResNet-50 for the COVID-19 dataset, as backbone models and fine-tune them on the classification task using the cross entropy loss function. We then extract feature representations from their convolutional layers by removing the classification head. Therefore, all hashing methods receive the same extracted features as input and transform them into binary values. We evaluate the methods on three metrics, using the training data as the database and the test data as queries after transforming both into barcodes. The test set is used only in the evaluation phase, where each test barcode queries the full training barcode database and the most related samples are returned. For learnable neural network-based hashing methods, the number of epochs is set to 100 and the batch size to 512. In our CGA method, the number of iterations is set to 100, and the population size is chosen to keep the maximum number of function calls consistent across methods. Furthermore, the crossover probability is set to 0.9, and the mutation probability of the inversion mutation is set to 0.9. Due to the inherent stochasticity of all hashing methods, including ours, each method is independently repeated 15 times with different random seeds. To evaluate statistical significance, we perform paired t-tests between each baseline method and our proposed CGA approach; the resulting p-values are reported in parentheses alongside each metric in the tables.
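The significance test can be reproduced in a few lines (a sketch; `baseline_scores` and `cga_scores` are the per-seed metric values over the 15 matched runs):

```python
from scipy import stats

def paired_p_value(baseline_scores, cga_scores) -> float:
    """Paired t-test between a baseline and CGA over matched random seeds;
    these are the p-values reported in parentheses in the tables."""
    _, p_value = stats.ttest_rel(baseline_scores, cga_scores)
    return p_value
```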
Results and analysis
In this section, we discuss the experiments carried out to evaluate the proposed barcode optimization algorithm. The experiments compare hashing techniques for generating barcodes for image search tasks, assessed across disparate types of datasets, to show how our proposed method affects data retrieval performance against other methods on the F1-score, Precision@k, and mAP evaluation metrics in the following analyses.

Table 2. Performance comparison of different methods on the KimiaNet-Pulmonary dataset (15 runs). Results show mean values with ± standard deviation in subscripts (standard deviations rounded to 2 decimal places). The best results are shown in bold and the second best are underlined. Ranks are shown for each metric and their average. The Avg column shows the mean of the three metrics.
Method | F1-score | Precision@k | mAP | Avg | Avg Rank |
|---|---|---|---|---|---|
dHash25 | 0.8838 (0.004) | 0.8384 (0.000) | 0.7760 (0.000) | 0.8327 | 6.0 |
MinMax22 | 0.8487 (0.000) | 0.7942 (0.000) | 0.7285 (0.000) | 0.7905 | 10.3 |
aHash30 | 0.8487 (0.000) | 0.8047 (0.000) | 0.7454 (0.000) | 0.7996 | 9.3 |
LSH32 | 0.8146 (0.000) | 0.7883 (0.000) | 0.7236 (0.000) | 0.7755 | 12.0 |
LBP86 | 0.8698 (0.533) | 0.8498 (0.455) | 0.8265 (0.000) | 0.8487 | 4.0 |
ITQ21 | 0.8528 (0.010) | 0.8319 (0.000) | 0.8069 (0.001) | 0.8305 | 6.3 |
DPSH87 | 0.8188 (0.096) | 0.8209 (0.044) | 0.8428 | 0.8275 | 7.3 |
CSQ28 | 0.3191 (0.000) | 0.4758 (0.000) | 0.5197 (0.000) | 0.4382 | 15.0 |
DTSH27 | 0.8465 (0.003) | 0.8252 (0.000) | 0.8044 (0.131) | 0.8254 | 8.0 |
DSH23 | 0.6184 (0.000) | 0.5809 (0.000) | 0.5389 (0.000) | 0.5794 | 13.7 |
DHN26 | 0.8625 (0.168) | 0.8487 (0.431) | 0.8393 (0.002) | 0.8502 | 4.3 |
Quantization29 | 0.5101 (0.000) | 0.5913 (0.000) | 0.6602 (0.007) | 0.5872 | 13.3 |
DFT-Hash | 0.8954 (0.000) | 0.8407 (0.001) | 0.7948 (0.481) | 0.8437 | 4.7 |
CGA-dHash | 0.8861 (-) | 0.8743 (-) | 0.8620 (-) | 0.8741 | 1.3 |
CGA-DFT | 0.8722 (-) | 0.8522 (-) | 0.7963 (-) | 0.8402 | 4.3 |
Table 3. Performance comparison of different methods on the KimiaNet-Endocrine feature-selected dataset (15 runs). Results show mean values with ± standard deviation in subscripts and p-values in parentheses (comparing against CGA). The best results are shown in bold and the second best are underlined. Ranks are shown for each metric and their average. The Avg column shows the mean of the three metrics.
Method | F1-score | Precision@k | mAP | Avg | Avg Rank |
|---|---|---|---|---|---|
dHash25 | 0.9315 (0.000) | 0.9222 (0.000) | 0.8780 (0.000) | 0.9106 | 10.3 |
MinMax22 | 0.9602 (0.000) | 0.9486 (0.000) | 0.9267 (0.000) | 0.9452 | 4.7 |
aHash30 | 0.9336 (0.000) | 0.9292 (0.000) | 0.8525 (0.000) | 0.9051 | 10.0 |
LSH32 | 0.8605 (0.000) | 0.8310 (0.000) | 0.7634 (0.000) | 0.8183 | 14.0 |
LBP86 | 0.9590 (0.000) | 0.9494 (0.000) | 0.9095 (0.000) | 0.9393 | 5.3 |
ITQ21 | 0.9711 | 0.9648 (0.000) | 0.9609 (0.000) | 0.9656 | 2.7 |
DPSH87 | 0.9489 (0.000) | 0.9382 (0.000) | 0.9181 (0.000) | 0.9351 | 6.7 |
CSQ28 | 0.8754 (0.000) | 0.8899 (0.000) | 0.8271 (0.000) | 0.8641 | 12.3 |
DTSH27 | 0.9362 (0.000) | 0.9255 (0.000) | 0.8890 (0.000) | 0.9169 | 9.0 |
DSH23 | 0.8145 (0.000) | 0.8157 (0.000) | 0.7592 (0.000) | 0.7965 | 15.0 |
DHN26 | 0.9475 (0.000) | 0.9469 (0.000) | 0.9208 (0.000) | 0.9384 | 6.3 |
Quantization29 | 0.9645 (0.161) | – | – | 0.9663 | 2.3 |
DFT-Hash | 0.9450 (0.000) | 0.9486 (0.000) | 0.8552 (0.000) | 0.9163 | 7.7 |
CGA-dHash | 0.9935 (-) | 0.9898 (-) | 0.9798 (-) | 0.9877 | 1.0 |
CGA-DFT | 0.8776 (-) | 0.8628 (-) | 0.7969 (-) | 0.8458 | 12.7 |
Table 4. Performance comparison of different methods on the DenseNet-121-Pulmonary dataset (15 runs). Results show mean values with ± standard deviation in subscripts (standard deviations rounded to 2 decimal places). The best results are shown in bold and the second best are underlined. Ranks are shown for each metric and their average. The Avg column shows the mean of the three metrics.
Method | F1-score | Precision@k | mAP | Avg | Avg Rank |
|---|---|---|---|---|---|
dHash25 | 0.6749 (0.001) | 0.6012 (0.000) | 0.5106 (0.000) | 0.5956 | 3.3 |
MinMax22 | 0.6655 (0.010) | 0.5965 (0.000) | 0.5116 (0.000) | 0.5912 | 4.0 |
aHash30 | 0.6650 (0.011) | 0.5907 (0.005) | 0.5019 (0.000) | 0.5859 | 5.3 |
LSH32 | 0.5975 (0.013) | 0.5487 (0.000) | 0.4804 (0.000) | 0.5422 | 8.7 |
LBP86 | 0.6469 (0.650) | 0.5969 (0.008) | 0.5160 | 0.5866 | 4.0 |
ITQ21 | 0.5948 (0.011) | 0.5627 (0.010) | 0.4855 (0.000) | 0.5477 | 8.3 |
DPSH87 | 0.3625 (0.000) | 0.4729 (0.000) | 0.4669 (0.000) | 0.4341 | 13.3 |
CSQ28 | 0.4273 (0.000) | 0.4818 (0.000) | 0.4663 (0.000) | 0.4585 | 13.0 |
DTSH27 | 0.5557 (0.000) | 0.5097 (0.000) | 0.4720 (0.000) | 0.5125 | 10.0 |
DSH23 | 0.4973 (0.000) | 0.5015 (0.000) | 0.4688 (0.000) | 0.4892 | 11.0 |
DHN26 | 0.2708 (0.000) | 0.4440 (0.000) | 0.4683 (0.000) | 0.3944 | 13.7 |
Quantization29 | 0.2708 (0.000) | 0.4419 (0.000) | 0.4681 (0.000) | 0.3936 | 14.0 |
DFT-Hash | 0.7217 (0.000) | 0.6034 | 0.4983 (0.000) | 0.6078 | 3.0 |
CGA-dHash | 0.7070 (-) | 0.6536 (-) | 0.5719 (-) | 0.6441 | 1.3 |
CGA-DFT | 0.6386 (-) | 0.5791 (-) | 0.4942 (-) | 0.5706 | 7.0 |
Retrieval performance on TCGA
We evaluated the effectiveness of the proposed CGA in optimizing the ordering of features extracted from two pre-trained deep neural networks, KimiaNet and DenseNet-121, using the TCGA histopathology dataset. CGA functions as an evolutionary strategy for discovering the most informative arrangement of features, improving the quality of the binary representations used for image retrieval and classification. Its performance is compared with traditional binary encoding schemes, unsupervised hashing techniques, and deep supervised hashing models. The evaluation considers the F1-score, precision at top-k retrieval (Precision@k), and mean average precision (mAP), across both complete and reduced feature sets. In the following paragraphs, each TCGA sub-dataset's barcoding retrieval performance is analyzed individually by feature extraction type.
Retrieval Performance on TCGA Extracted Features by KimiaNet In the first experimental setting, we evaluate all methods on the KimiaNet-Pulmonary dataset using full feature vectors. As shown in Table 2, our proposed CGA-dHash method outperforms all baselines by achieving the highest Precision@k (0.8743 ± 0.01) and mAP (0.8620 ± 0.00), while also ranking second in F1-score (0.8861 ± 0.01), just behind DFT-Hash (0.8954 ± 0.00). CGA-dHash attains the highest overall average score (0.8741) and the best average rank (1.3), indicating a consistent advantage across all three metrics. CGA-DFT also demonstrates robust performance, placing among the top methods with strong scores across the board (F1-score 0.8722 ± 0.01, Precision@k 0.8522 ± 0.01, and mAP 0.7963 ± 0.01), yielding an overall average of 0.8402 and an average rank of 4.3. These results confirm that our CGA-based optimization is highly effective in refining feature permutations to enhance semantic consistency within neighborhood retrievals, leading to notable improvements in both image retrieval precision and classification quality. Additional results on the remaining TCGA sub-datasets are presented in Supplementary Tables 8 to 14.
Retrieval Performance on TCGA Extracted Features by KimiaNet with Fine Selection To assess CGA's robustness when working with a smaller subset of features selected by Asilian et al.76, we apply feature selection to reduce the dimensionality of KimiaNet's Endocrine representations. As presented in Table 3, CGA-dHash consistently outperforms all baseline methods, achieving the best results in F1-score (0.9935 ± 0.01), Precision@k (0.9898 ± 0.01), and mAP (0.9798 ± 0.01). It also achieves the highest average score (0.9877) and the top average rank (1.0), demonstrating strong and stable performance under feature-restricted settings. In contrast, CGA-DFT shows moderate results, with an F1-score of 0.8776 ± 0.04, Precision@k of 0.8628 ± 0.04, and mAP of 0.7969 ± 0.05, resulting in an average score of 0.8458 and an average rank of 12.7. Notably, while some traditional methods, such as ITQ21 and DFT-Hash, also maintain relatively strong scores, several deep supervised hashing methods, including those with contrastive or quantization-based objectives, deteriorate significantly in this setting. This contrast highlights the ability of CGA to remain effective and adaptable regardless of the number of features, whereas many neural-based hashing methods are more sensitive to dimensionality reduction. Additional results on the remaining TCGA sub-datasets are presented in Supplementary Tables 1 to 7.
Retrieval Performance on TCGA Extracted Features by DenseNet-121 The analysis of DenseNet-121 extracted features presented in Table 4 provides compelling evidence for the effectiveness of the proposed CGA methods across different deep neural networks. Specifically, CGA-dHash achieves the best results across all three evaluation metrics: F1-score (0.7070 ± 0.03), Precision@k (0.6536 ± 0.01), and mAP (0.5719 ± 0.01). This leads to the highest average score (0.6441) and the lowest average rank (1.3), indicating robust and consistent performance across different retrieval quality indicators. In contrast, CGA-DFT performs moderately, with an average of 0.5706 and an average rank of 7.0, showing lower adaptability to DenseNet-121 derived features compared to CGA-dHash. Interestingly, most traditional and deep hashing methods, including Quantization29, DSH23, and pairwise-based approaches, exhibit considerably weaker performance in this setting. These findings suggest that DenseNet-121 features pose a greater challenge to conventional hashing strategies. However, CGA-dHash demonstrates superior adaptability, effectively identifying more informative feature permutations for retrieval and classification. Additional results on the remaining TCGA sub-datasets are presented in Supplementary Tables 15 to 21.
Across all experiments, CGA delivers superior performance compared to existing approaches. Unlike deep hashing methods, which do not directly align their learning objectives with binary encoding or retrieval metrics, CGA explicitly optimizes the arrangement of features to maximize similarity-based rankings. This leads to improved structure in the binary space and better alignment with downstream evaluation objectives. Moreover, CGA operates independently of the network architecture and is equally effective on both full and reduced feature representations. These characteristics make it a practical and efficient solution for enhancing compact binary representations, particularly in retrieval tasks where interpretability, memory efficiency, and precision are essential.

Table 5. Performance comparison of different methods on the CIFAR-10 dataset (15 runs). Results show mean values with ± standard deviation in subscripts (standard deviations rounded to 2 decimal places). The best results are shown in bold and the second best are underlined. Ranks are shown for each metric and their average. The Avg column shows the mean of the three metrics.
Method | F1-score | Precision@k | mAP | Avg | Avg Rank |
|---|---|---|---|---|---|
dHash25 | 0.9311 (0.000) | 0.9305 (0.000) | 0.9240 (0.000) | 0.9285 | 7.7 |
MinMax22 | 0.9361 | 0.9316 (0.000) | 0.9235 (0.000) | 0.9304 | 5.7 |
aHash30 | 0.9362 (0.008) | 0.9353 (0.001) | 0.9489 (0.000) | 0.9401 | 1.0 |
LSH32 | 0.9293 (0.000) | 0.9269 (0.000) | 0.9205 (0.000) | 0.9256 | 9.0 |
LBP86 | 0.9322 (0.006) | 0.9323 (0.024) | 0.9440 (0.453) | 0.9362 | 5.7 |
ITQ21 | 0.9339 (0.311) | – | – | 0.9384 | 2.7 |
DPSH87 | 0.9094 (0.000) | 0.9042 (0.000) | 0.8524 (0.000) | 0.8887 | 13.0 |
CSQ28 | 0.9129 (0.000) | 0.9089 (0.000) | 0.8623 (0.000) | 0.8947 | 11.7 |
DTSH27 | 0.9148 (0.000) | 0.9113 (0.000) | 0.8686 (0.000) | 0.8982 | 10.3 |
DSH23 | 0.9153 (0.000) | 0.9100 (0.000) | 0.8607 (0.000) | 0.8953 | 11.0 |
DHN26 | 0.8963 (0.000) | 0.8878 (0.000) | 0.8189 (0.000) | 0.8677 | 14.0 |
Quantization29 | 0.3321 (0.000) | 0.3955 (0.000) | 0.3675 (0.000) | 0.3650 | 15.0 |
DFT-Hash | 0.9330 (0.017) | 0.9325 (0.009) | 0.9442 (0.561) | 0.9366 | 4.7 |
CGA-dHash | 0.9337 (-) | 0.9318 (-) | 0.9274 (-) | 0.9310 | 5.7 |
CGA-DFT | 0.9345 (-) | 0.9337 (-) | 0.9444 (-) | 0.9375 | 3.0 |
Table 6. Performance comparison of different methods on the CIFAR-100 dataset (15 runs). Results show mean values with ± standard deviation in subscripts (standard deviations rounded to 2 decimal places). The best results are shown in bold and the second best are underlined. Ranks are shown for each metric and their average. The Avg column shows the mean of the three metrics.
Method | F1-score | Precision@k | mAP | Avg | Avg Rank |
|---|---|---|---|---|---|
dHash25 | 0.7529 (0.000) | 0.6781 (0.000) | 0.5447 (0.000) | 0.6586 | 6.0 |
MinMax22 | 0.6580 (0.000) | 0.5575 (0.000) | 0.4423 (0.000) | 0.5526 | 8.7 |
aHash30 | 0.7613 (0.000) | 0.7202 (0.000) | 0.6069 (0.000) | 0.6962 | 4.0 |
LSH32 | 0.6462 (0.000) | 0.5860 (0.000) | 0.4556 (0.000) | 0.5626 | 8.3 |
LBP86 | 0.7504 (0.000) | 0.7359 (0.000) | 0.6843 (0.000) | 0.7235 | 3.7 |
ITQ21 | 0.7199 (0.000) | 0.7034 (0.000) | 0.6682 (0.000) | 0.6972 | 5.3 |
DPSH87 | 0.4962 (0.000) | 0.3921 (0.000) | 0.2514 (0.000) | 0.3799 | 12.7 |
CSQ28 | 0.5105 (0.000) | 0.4064 (0.000) | 0.2627 (0.000) | 0.3932 | 11.0 |
DTSH27 | 0.5168 (0.000) | 0.4154 (0.000) | 0.2732 (0.000) | 0.4018 | 10.0 |
DSH23 | 0.4955 (0.000) | 0.3972 (0.000) | 0.2566 (0.000) | 0.3831 | 12.3 |
DHN26 | 0.4809 (0.000) | 0.3760 (0.000) | 0.2377 (0.000) | 0.3648 | 14.0 |
Quantization29 | 0.0013 (0.000) | 0.0110 (0.000) | 0.0124 (0.000) | 0.0082 | 15.0 |
DFT-Hash | 0.7671 (0.819) | 0.7535 (0.942) | 0.7380 (0.396) | 0.7529 | 1.0 |
CGA-dHash | 0.7496 (-) | 0.6784 (-) | 0.5466 (-) | 0.6582 | 6.0 |
CGA-DFT | – | – | – | 0.7526 | 2.0 |
Retrieval performance on CIFAR-10/100
To evaluate the generalizability of the proposed CGA, we extend our experiments to two widely used natural image benchmarks: CIFAR-10 and CIFAR-100. These datasets include object classes with diverse visual characteristics and serve as a standard for image retrieval and classification performance. Each method is evaluated over 15 independent runs, and results are reported using three standard metrics: F1-score, Precision at top-k retrieval (Precision@k), and mean average precision (mAP).
Retrieval Performance on CIFAR-10 The results in Table 5 indicate that CGA demonstrates strong performance across all metrics, particularly in terms of F1-score (0.9337) and mAP (0.9274), which positions it competitively among the leading methods. The aHash30 method achieves the top performance on all three individual metrics (F1-score: 0.9362, Precision@k: 0.9353, and mAP: 0.9489), consequently obtaining the lowest average rank (1.0). However, CGA outperforms many other traditional hashing and optimization methods, such as DPSH87, DSH23, and DHN26, which shows CGA's capacity to maintain robustness and stability in performance. Despite not surpassing aHash30's peak scores, CGA's consistently high performance emphasizes its potential as a reliable and effective method for CIFAR-10 retrieval and classification tasks.
Retrieval Performance on CIFAR-100 The analysis of the more challenging CIFAR-100 dataset (Table 6) reveals distinct insights into the methods' capability to manage higher complexity due to more classes and finer-grained categorization. CGA attains a high F1-score (0.7496), indicating a strong ability to distinguish among a larger number of classes. In terms of overall average performance (Avg = 0.6582), CGA remains competitive but trails the leading methods DFT-Hash (0.7529 Avg) and LBP86 (0.7235 Avg). While DFT-Hash and LBP86 lead in Precision@k and mAP, CGA still demonstrates commendable robustness, substantially outperforming other hashing approaches (e.g., DPSH87, DSH23, DHN26, and Quantization29), which degrade significantly in performance. Therefore, CGA showcases considerable potential and adaptability in handling datasets characterized by higher complexity and class diversity.
Cross CIFAR-10/100 Datasets Observations Across both natural image benchmarks, CGA demonstrates competitive performance and consistently outperforms a number of conventional and supervised hashing models. Its ability to discover meaningful feature orderings that preserve local similarity and neighborhood structure makes it a valuable tool in binary encoding tasks. Unlike neural-based hashing models, which often fail to generalize when trained on limited or complex distributions, CGA directly targets evaluation metrics and remains adaptable to various feature distributions and backbone architectures. These results confirm that the ordering of features plays a critical role in the performance of binary representations and that optimization via combinatorial search can lead to measurable improvements in both retrieval and classification tasks.Table 7
Table 7. Performance comparison of different methods on the Fashion-MNIST dataset (15 runs). Results show mean values with ± standard deviation in subscripts (standard deviations rounded to 2 decimal places). The best results are shown in bold and the second best are underlined. Ranks are shown for each metric and their average. The Avg column shows the mean of the three metrics.
Method | F1-score | Precision@k | mAP | Avg | Avg Rank |
|---|---|---|---|---|---|
dHash25 | 0.8568 (0.008) | 0.8025 (0.000) | 0.6209 (0.142) | 0.7601 | 3.7 |
MinMax22 | 0.7387 (0.000) | 0.5935 (0.000) | 0.2941 (0.000) | 0.5421 | 8.7 |
aHash30 | 0.8589 (0.075) | 0.7916 (0.000) | 0.6034 (0.000) | 0.7513 | 5.0 |
LSH32 | 0.7044 (0.000) | 0.5908 (0.000) | 0.3568 (0.000) | 0.5507 | 8.7 |
LBP86 | 0.8512 (0.000) | 0.7681 (0.000) | 0.5536 (0.000) | 0.7243 | 6.7 |
ITQ21 | 0.8332 (0.000) | 0.7798 (0.000) | 0.6530 (0.000) | 0.7553 | 4.7 |
DPSH87 | 0.5550 (0.000) | 0.4384 (0.000) | 0.2510 (0.000) | 0.4148 | 12.3 |
CSQ28 | 0.5650 (0.000) | 0.4444 (0.000) | 0.2513 (0.000) | 0.4203 | 11.3 |
DTSH27 | 0.6046 (0.000) | 0.4961 (0.000) | 0.3223 (0.000) | 0.4743 | 9.7 |
DSH23 | 0.5338 (0.000) | 0.4129 (0.000) | 0.2356 (0.000) | 0.3941 | 14.0 |
DHN26 | 0.5446 (0.000) | 0.4370 (0.000) | 0.2554 (0.000) | 0.4123 | 12.3 |
Quantization29 | 0.0917 (0.000) | 0.1599 (0.000) | 0.1525 (0.000) | 0.1347 | 15.0 |
DFT-Hash | 0.8640 (0.365) | 0.8015 (0.000) | 0.6173 (0.000) | 0.7609 | 3.3 |
CGA-dHash | 0.8623 (-) | 0.8078 (-) | 0.6217 (-) | 0.7639 | 1.7 |
CGA-DFT | 0.8609 (-) | 0.8029 (-) | 0.6182 (-) | 0.7607 | 3.0 |
Retrieval performance on Fashion-MNIST
To further assess the adaptability of CGA across different visual domains, we evaluate its performance on the Fashion-MNIST dataset. This dataset, although grayscale and lower in resolution, provides a challenging classification and retrieval task due to subtle inter-class differences in clothing categories. As in previous experiments, we report F1-score, Precision at top-k (Precision@k), and mean average precision (mAP), averaging over 15 independent runs.
As presented in Table 7, CGA-dHash achieves the best overall performance. Specifically, it obtains the highest Precision@k (0.8078) and mAP (0.6217), together with a strong F1-score (0.8623). These values result in the top average score (0.7639) and the lowest average rank (1.7), confirming the method’s superiority in balancing classification accuracy and retrieval quality. In comparison, DFT-Hash attains the highest F1-score (0.8640), while the aHash30 encoding approach achieves an F1-score of 0.8589 and a Precision@k of 0.7916. While the LBP86 and ITQ21 methods perform competitively in terms of Precision@k, they fall short in mAP, indicating lower retrieval consistency across the feature space. Supervised deep hashing methods consistently underperform, with models such as Quantization29 and DSH23 producing extremely low scores across all metrics. This is especially notable in mAP, where even the strongest deep hashing variant, DTSH27, reaches only 0.3223. These results further emphasize the limitations of learning-based hashing methods when applied to datasets with low inter-class variability and limited input modalities. The strong performance of CGA on Fashion-MNIST demonstrates its flexibility across both natural and synthetic visual domains. Unlike neural-based binary encoding approaches, which often fail to generalize in low-complexity settings, CGA directly optimizes for evaluation-driven objectives by discovering informative permutations in feature space. Its ability to outperform supervised methods without requiring training or labeled pairs underscores its effectiveness as a lightweight and data-efficient encoding solution. These findings further support the conclusion that feature ordering plays a vital role in retrieval-based performance and that combinatorial optimization strategies such as CGA provide a robust alternative to deep binary representation learning.
Table 8. Performance comparison of different methods on the COVID-19 dataset (15 runs). Results show mean values with ± standard deviation in subscripts (standard deviations rounded to 2 decimal places). The best results are shown in bold and the second best are underlined. Ranks are shown for each metric and their average. The Avg column shows the mean of the three metrics.
Method | F1-score | Precision@k | mAP | Avg | Avg Rank |
|---|---|---|---|---|---|
dHash25 | 0.8584 (0.029) | 0.8109 (0.001) | 0.5886 (0.000) | 0.7527 | 7.0 |
MinMax22 | 0.7995 (0.000) | 0.7318 (0.000) | 0.4506 (0.000) | 0.6607 | 12.7 |
aHash30 | 0.8700 (0.000) | 0.8313 (0.000) | 0.6395 (0.000) | 0.7803 | 1.3 |
LSH32 | 0.8276 (0.000) | 0.7753 (0.000) | 0.5628 (0.000) | 0.7219 | 8.7 |
LBP86 | 0.8620 (0.614) | 0.8241 (0.000) | 0.6521 (0.000) | 0.7794 | 1.7 |
ITQ21 | 0.8550 (0.005) | 0.8221 (0.000) | 0.6391 (0.000) | 0.7721 | 4.3 |
DPSH87 | 0.8012 (0.000) | 0.7440 (0.000) | 0.6284 (0.001) | 0.7245 | 8.7 |
CSQ28 | 0.6264 (0.000) | 0.5681 (0.000) | 0.4713 (0.000) | 0.5552 | 13.7 |
DTSH27 | 0.8031 (0.000) | 0.7399 (0.000) | 0.5313 (0.000) | 0.6915 | 11.0 |
DSH23 | 0.3112 (0.000) | 0.3992 (0.000) | 0.3467 (0.000) | 0.3524 | 15.0 |
DHN26 | 0.8235 (0.000) | 0.7742 (0.000) | 0.6305 (0.001) | 0.7427 | 7.3 |
Quantization29 | 0.7297 (0.000) | 0.6648 (0.000) | 0.5481 (0.000) | 0.6475 | 12.3 |
DFT-Hash | 0.8599 (0.255) | 0.8194 (0.000) | 0.6086 (0.010) | 0.7626 | 5.0 |
CGA-dHash | 0.8613 (-) | 0.8150 (-) | 0.6122 (-) | 0.7628 | 4.7 |
CGA-DFT | 0.8551 (-) | 0.8147 (-) | 0.6062 (-) | 0.7587 | 6.7 |
Retrieval performance on COVID-19
To investigate the effectiveness of CGA on real-world medical data with challenging feature distributions, we conduct experiments on the COVID-19 dataset. This dataset introduces additional complexity due to class imbalance and heterogeneity in imaging conditions. As shown in Table 8, the aHash30 encoding approach yields the best average performance, achieving the highest ranking among all methods. It is followed closely by LBP86 and ITQ21, which also show strong results in both retrieval and classification tasks. dHash25 performs surprisingly well given its minimalistic design, reflecting the potential advantage of feature magnitude trends even in high-noise clinical datasets. CGA produces a competitive average score of 0.7628, placing it within the top-performing group with an average rank of 4.7. While it does not outperform the aHash30 method in this setting, CGA maintains robust and consistent performance across all runs. Compared to many deep supervised hashing models, including Quantization29, CSQ28, and DSH23, which show severe performance degradation, CGA offers a much more stable and reliable solution. The ability of CGA to remain effective on a noisy and imbalanced dataset underscores its potential in clinical applications. Rather than relying on supervised signal propagation or class-specific anchors, CGA leverages the intrinsic structure of the feature space through permutation optimization. This allows it to preserve neighborhood relations without requiring class labels or pairwise training.
Overall, the results on the COVID-19 dataset support the broader observation that feature ordering has a non-trivial influence on binary representation quality. CGA remains a promising and model-independent approach for optimizing such encoding methods across both synthetic and clinical imaging domains.
Fig. 3 [Images not available. See PDF.]
Convergence plots for CGA-dHash barcoding optimization on TCGA dataset features extracted by KimiaNet and DenseNet-121.
Convergence analysis of CGA
To further investigate the optimization dynamics of the proposed CGA method on dHash25, we provide convergence plots on six TCGA sub-datasets derived from KimiaNet and DenseNet-121 feature representations, as shown in Figure 3. These plots illustrate the evolution of the training fitness score (used as the objective in the evolutionary search) alongside three evaluation metrics on the test set: mAP, Precision@k, and F1-score. Convergence plots for the remaining TCGA sub-datasets (KimiaNet- and DenseNet-121-extracted features, as well as selected features) and for the other datasets are shown in Supplementary Figures 1 to 4.
In Figure 3a, corresponding to KimiaNet features on TCGA brain tumor samples, CGA-dHash demonstrates a clear upward trend across all test metrics. Both Precision@k and F1-score steadily improve throughout the iterations, reflecting the model’s increasing ability to preserve neighborhood relationships in the binary space. The training fitness score also rises consistently, indicating successful convergence of the evolutionary search process. The alignment between training fitness and test metrics shows that the learned permutations generalize effectively to unseen data. In contrast, Figure 3b for the Endocrine subset shows relatively flat convergence curves for all test metrics, despite a consistently high training fitness score. This indicates a potential saturation effect or low variance in the Endocrine representations, where CGA-dHash is already near-optimal at initialization. Nonetheless, the performance remains stable, suggesting robustness of the evolutionary search without overfitting. For the Gastrointestinal data in Figure 3c, CGA-dHash shows steady improvement in the training fitness score, with corresponding but modest gains in the test metrics. Precision@k and F1-score exhibit mild but consistent upward trends, confirming that even in lower-performing settings, the optimization process gradually improves representation quality.
Figure 3d, Figure 3e, and Figure 3f present convergence on DenseNet-121-extracted features from the same tissue types. In Figure 3d, the Brain tumor subset again reflects strong convergence behavior with significant improvements in all test metrics, especially F1-score, which stabilizes above 0.8 after 60 iterations. Figure 3e for the Endocrine data shows near-linear improvement in all test scores, demonstrating that CGA-dHash efficiently identifies better permutations over time. Finally, in Figure 3f, while test metrics for Gastrointestinal samples exhibit modest increases, the training curve remains consistently upward, mirroring the behavior observed with KimiaNet features.
Overall, these convergence plots demonstrate that CGA-dHash consistently improves retrieval performance over iterations. The strongest convergence is observed on subsets such as the Brain tissue datasets and the DenseNet-121-extracted Endocrine features, while on representations that are already strong at initialization, such as the KimiaNet-extracted Endocrine features, CGA-dHash maintains stable performance. These findings provide further evidence that CGA-dHash’s optimization process is both effective and resilient across medical image domains and feature extraction backbones.
Table 9. Performance comparison of CGA-dHash using different crossover (CR) and mutation (MR) rates on three DenseNet121-based datasets. Best results are shown in bold and second best are underlined.
Dataset | Method | F1-score | Precision@k | mAP | Avg | Avg Rank |
|---|---|---|---|---|---|---|
DenseNet121-Brain | CGA-dHash | 0.8052 | 0.7427 | 0.6439 | 0.7306 | 4.0 |
 | CGA-dHash | 0.8352 | 0.7478 | 0.6633 | 0.7488 | 3.0 |
 | CGA-dHash | 0.8460 | 0.7595 | 0.6684 | 0.7580 | 1.0 |
 | CGA-dHash | 0.8403 | 0.7505 | 0.6676 | 0.7528 | 2.0 |
DenseNet121-Endocrine | CGA-dHash | 0.8533 | 0.8403 | 0.7382 | 0.8106 | 4.0 |
 | CGA-dHash | 0.8792 | 0.8614 | 0.7751 | 0.8386 | 2.3 |
 | CGA-dHash | 0.8798 | 0.8631 | 0.7721 | 0.8383 | 2.0 |
 | CGA-dHash | 0.8647 | 0.8672 | 0.7802 | 0.8374 | 1.7 |
DenseNet121-Gastrointestinal | CGA-dHash | 0.5495 | 0.4982 | 0.4118 | 0.4865 | 3.7 |
 | CGA-dHash | 0.5482 | 0.5127 | 0.4257 | 0.4956 | 2.7 |
 | CGA-dHash | 0.5545 | 0.5050 | 0.4213 | 0.4936 | 2.3 |
 | CGA-dHash | 0.5545 | 0.5155 | 0.4295 | 0.4998 | 1.3 |
Parameter sensitivity analysis
To investigate how different hyperparameter settings influence the performance of CGA-dHash, we evaluate combinations of the crossover probability rate (CR) and mutation probability rate (MR) on three datasets: DenseNet-121-Brain, DenseNet-121-Endocrine, and DenseNet-121-Gastrointestinal. The results, shown in Table 9, report F1-score, Precision@k, and mAP, together with the average score and average rank. The best results for each metric are highlighted in bold, with the second-best results underlined.
Impact of Crossover Rate (CR) A consistent pattern emerges across all datasets: increasing the crossover rate from 0.1 to 0.9 improves performance on most metrics, and the high-crossover configurations significantly outperform the others. For example, in the Brain dataset, the high-crossover, low-mutation setting achieves the highest average score (0.7580) and the best average rank (1.0). A similar trend is observed in the Endocrine dataset, where the same setting yields strong performance (Avg = 0.8383, Rank = 2.0), closely following the top-ranked high-mutation configuration. This indicates that a higher crossover rate enhances the algorithm’s ability to explore and recombine promising solutions effectively.
Impact of Mutation Rate (MR) The effect of the mutation rate is more dataset-dependent. In the Brain dataset, a lower mutation rate is more effective when paired with high crossover. However, for the Gastrointestinal dataset, where performance margins are smaller and the data is likely noisier, the highest mutation rate yields the best overall performance (Avg = 0.4998, Rank = 1.3). This suggests that mutation contributes significantly to maintaining diversity and avoiding premature convergence in more complex or less structured datasets.
Cross-Dataset Observations The high-crossover, low-mutation configuration consistently performs among the top two across all datasets, establishing it as a strong general-purpose setting. However, the high-crossover, high-mutation configuration achieves the best average rank on the Endocrine and Gastrointestinal datasets, highlighting the potential benefit of higher mutation rates in certain domains. Notably, settings with low crossover result in the poorest performance in all cases, reaffirming the importance of effective recombination in CGA.
Based on the empirical evidence, we recommend using a high crossover rate together with a relatively high mutation rate across all datasets to improve convergence.
Table 10. Top-performing methods on each dataset based on the average score across F1-score, Precision@k, and mAP. Best results are shown in bold, second-best are underlined.
Dataset | Best Method | Avg | 2nd Best Method | Avg |
|---|---|---|---|---|
KimiaNet-Brain | CGA-dHash | 0.8991 | Quantization29 | 0.8937 |
KimiaNet-Endocrine | CGA-dHash | 0.9796 | ITQ21 | 0.9750 |
KimiaNet-Gastrointestinal | CGA-DFT | 0.7277 | CGA-dHash | 0.7263 |
KimiaNet-Gynecologic | Quantization29 | 0.9598 | LBP86 | 0.9571 |
KimiaNet-Liver | DPSH87 | 0.9066 | DHN26 | 0.9010 |
KimiaNet-Melanocytic | CGA-DFT | 0.9365 | CGA-dHash | 0.9357 |
KimiaNet-Pulmonary | CGA-dHash | 0.8741 | LBP86 | 0.8487 |
KimiaNet-Urinary | CGA-dHash | 0.9395 | DFT-Hash | 0.9363 |
KimiaNet-Brain (w/selected features) | DHN26 | 0.8549 | CGA-dHash | 0.8488 |
KimiaNet-Endocrine (w/selected features) | CGA-dHash | 0.9877 | Quantization29 | 0.9663 |
KimiaNet-Gastrointestinal (w/selected features) | DPSH87 | 0.6377 | CGA-dHash | 0.6184 |
KimiaNet-Gynecologic (w/selected features) | aHash30 | 0.9136 | CGA-dHash | 0.9072 |
KimiaNet-Liver (w/selected features) | Quantization29 | 0.8997 | CGA-dHash | 0.8952 |
KimiaNet-Melanocytic (w/selected features) | CGA-DFT | 0.9364 | DFT-Hash | 0.9355 |
KimiaNet-Pulmonary (w/selected features) | DPSH87 | 0.7727 | MinMax22 | 0.7709 |
KimiaNet-Urinary (w/selected features) | ITQ21 | 0.8845 | aHash30 | 0.8792 |
DenseNet121-Brain | aHash30 | 0.7712 | CGA-dHash | 0.7638 |
DenseNet121-Endocrine | CGA-dHash | 0.8396 | dHash25 | 0.7895 |
DenseNet121-Gastrointestinal | CGA-dHash | 0.5002 | MinMax22 | 0.4906 |
DenseNet121-Gynecologic | aHash30 | 0.6979 | CGA-dHash | 0.6705 |
DenseNet121-Liver | CGA-dHash | 0.7917 | aHash30 | 0.7765 |
DenseNet121-Skin | CGA-dHash | 0.8881 | DFT-Hash | 0.8698 |
DenseNet121-Pulmonary | CGA-dHash | 0.6441 | DFT-Hash | 0.6078 |
DenseNet121-Urinary | aHash30 | 0.6686 | CGA-dHash | 0.6655 |
CIFAR-10 | aHash30 | 0.9401 | ITQ21 | 0.9384 |
CIFAR-100 | DFT-Hash | 0.7529 | CGA-DFT | 0.7526 |
Covid19 | aHash30 | 0.7803 | LBP86 | 0.7794 |
Fashion-MNIST | CGA-dHash | 0.7639 | DFT-Hash | 0.7609 |
Cross-dataset performance summary
The results summarized in Table 10 highlight the superior performance of our proposed method, CGA, including its variants CGA-dHash and CGA-DFT, across a diverse set of 28 datasets. CGA, which can be applied on top of both the dHash25 and DFT-Hash binarization schemes, achieves the highest average score, computed as a composite of the F1-score, Precision@k, and mAP metrics, on 14 datasets, representing 50% of the evaluated cases. This includes datasets such as KimiaNet-Brain (0.8991), KimiaNet-Endocrine (0.9796), DenseNet-121-Liver (0.7917), and Fashion-MNIST (0.7639), among others. Notably, CGA-DFT, the variant of CGA applied to DFT-Hash, secures the top position on 3 datasets, including KimiaNet-Gastrointestinal (0.7277) and KimiaNet-Melanocytic (0.9365). In total, our proposed method outperforms all competing methods in 14 instances, demonstrating its effectiveness and robustness.
Furthermore, CGA-dHash or CGA-DFT ranks among the top two methods on 22 out of 28 datasets (approximately 79%), reinforcing their consistently high performance. On datasets where they do not claim the top spot, they frequently secure the second position, as seen in KimiaNet-Brain with selected features (0.8488) and DenseNet-121-Gynecologic (0.6705). Remarkably, on two datasets, KimiaNet-Gastrointestinal and KimiaNet-Melanocytic, our methods occupy both the first and second ranks, with average scores differing by less than 0.0015, underscoring the complementary strengths of CGA-dHash and CGA-DFT.
Compared to other methods such as Quantization29, aHash30, and DPSH87, which occasionally achieve top ranks, CGA surpasses them in frequency and consistency. For instance, aHash30 is the best method on 6 datasets and Quantization29 on 2, yet neither matches the 14 top rankings of CGA-dHash and CGA-DFT combined. This dominance is particularly evident across the medical imaging datasets (e.g., the KimiaNet and DenseNet-121 series) and extends to general datasets like Fashion-MNIST, illustrating the versatility of our approach across different domains and feature representations.
In conclusion, the proposed CGA method, including its variants, consistently surpasses competing methods, securing the highest performance in half of the evaluated datasets and maintaining a top ranking in the vast majority. These results establish CGA as a leading technique for the tasks assessed, validated by its robust and adaptable performance.
Concluding remarks
Generating barcodes from high-dimensional deep embeddings offers an efficient and effective means of reducing computational and memory demands for large-scale applications. In image retrieval, where matching a query against millions of images is required, representing images as compact binary codes (barcodes) can be highly advantageous. Despite its importance, deriving optimal barcodes that enhance retrieval performance remains a challenging problem.
In this paper, we proposed a novel framework for optimizing the ordering of extracted deep features to improve the quality of the generated barcodes. Our method formulates feature ordering as a large-scale combinatorial optimization problem and solves it using an evolutionary approach. We evaluated our framework on features extracted by DenseNet, a widely used pre-trained deep network, as well as KimiaNet, a specialized network trained on histopathology images, across datasets from different domains.
Experimental results demonstrate that the original feature order produced by deep networks is often suboptimal for downstream tasks such as classification and retrieval. Although performance variations exist across datasets and settings, our findings consistently show that optimizing feature order improves classification F1-scores and retrieval accuracy on average. These results suggest that the proposed scheme can act as an effective post-processing step for enhancing the performance of any pre-trained or fine-tuned deep model.
For future work, we plan to investigate the joint optimization of feature selection and feature ordering for order-sensitive hashing methods within a unified framework. Additionally, we aim to explore alternative combinatorial optimization strategies beyond the proposed combinatorial genetic algorithm, particularly in conjunction with the suggested joint optimization, as we observe further potential for performance improvements.
Author contributions
Rasa Khosrowshahli contributed to software implementation, experimental design, analysis, writing of the original draft, and editing. Farnaz Kheiri collaborated on analysis, writing of the original draft, and editing. Azam Asilian Bidgoli contributed to analysis, reviewing and editing the manuscript. H.R. Tizhoosh provided expertise in medical imaging and barcoding methodologies and contributed to reviewing and editing. Masoud Makrehchi supervised the research. Shahryar Rahnamayan supervised the research and contributed to the conceptualization of the research framework, analysis, review and editing support. All authors contributed to the interpretation of the findings and critically reviewed the manuscript for intellectual content.
Data availability
The datasets used and analyzed during the current study are based upon medical data generated by the TCGA Research Network, publicly available at “https://www.cancer.gov/tcga”, and the COVID-19 Radiography Database compiled by researchers at Qatar University, publicly available at “https://www.kaggle.com/datasets/tawsifurrahman/covid19-radiography-database”. The non-medical experiments are based on the CIFAR datasets, publicly available at “https://www.cs.toronto.edu/~kriz/cifar.html”, and the Fashion-MNIST dataset created from Zalando article images, publicly available at “https://www.kaggle.com/datasets/zalando-research/fashionmnist”.
Code availability
The source code and associated scripts used in this study are publicly available on GitHub at https://github.com/rkhosrowshahi/DeepFeatureBarcodingCGA88.
Declarations
Competing interests
The authors declare no competing interests.
Supplementary Information
The online version contains supplementary material available at https://doi.org/10.1038/s41598-025-14576-x.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
1. Bercovich, E; Javitt, MC. Medical imaging: from roentgen to the digital revolution, and beyond. Rambam Maimonides medical journal; 2018; 9,
2. Manna, A; Dewan, D; Sheet, D. Structured hashing with deep learning for modality, organ, and disease content sensitive medical image retrieval. Scientific Reports; 2025; 15,
3. Tizhoosh, H. R. Foundation models and information retrieval in digital pathology, in: Artificial Intelligence in Pathology, Elsevier, pp. 211–232 (2025).
4. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning, Nature 521 (7553), 436–444 (2015).
5. Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. & Bengio, Y. Generative adversarial nets, Advances in neural information processing systems 27 (2014).
6. Bengio, Y., Courville, A. & Vincent, P. Representation learning: A review and new perspectives. arxiv 2012, arXiv preprint arXiv:1206.5538 (2012).
7. Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., et al. Learning transferable visual models from natural language supervision, in: International conference on machine learning, PmLR, pp. 8748–8763 (2021).
8. Tizhoosh, HR; Diamandis, P; Campbell, CJ; Safarpoor, A; Kalra, S; Maleki, D; Riasatian, A; Babaie, M. Searching images for consensus: can ai remove observer variability in pathology?. The American journal of pathology; 2021; 191,
9. Chanda, A. Barcode technology and its application in libraries, Library Philosophy and Practice (e-journal) 3619 (2019).
10. Alizadeh, SM; Helfroush, MS; Müller, H. A novel siamese deep hashing model for histopathology image retrieval. Expert Systems with Applications; 2023; 225, 120169.
11. Rasool, A., Qu, Q., Jiang, Q., Wang, Y. A strategy-based optimization algorithm to design codes for dna data storage system, in: International Conference on Algorithms and Architectures for Parallel Processing, Springer, pp. 284–299 (2021).
12. Wudhikarn, R; Charoenkwan, P; Malang, K. Deep learning in barcode recognition: A systematic literature review. IEEE Access; 2022; 10, pp. 8049-8072.
13. Kamnardsiri, T; Charoenkwan, P; Malang, C; Wudhikarn, R. 1d barcode detection: Novel benchmark datasets and comprehensive comparison of deep convolutional neural network approaches. Sensors; 2022; 22,
14. Riplinger, L; Piera-Jiménez, J; Dooling, JP. Patient identification techniques-approaches, implications, and findings. Yearbook of medical informatics; 2020; 29,
15. Suresh, C., Chandrakiran, C., Prashanth, K., Sagar, K. V. & Priyanka, K. “mobile medical card”–an android application for medical data maintenance, in: 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA), IEEE, pp. 143–149 (2020).
16. Tizhoosh, H. R. & Pantanowitz, L. On image search in histopathology, Journal of Pathology Informatics, 100375 (2024).
17. Lahr, I., Alfasly, S., Nejat, P., Khan, J., Kottom, L., Kumbhar, V., Alsaafin, A., Shafique, A., Hemati, S., Alabtah, G., Comfere, N., Murphree, D., Mangold, A., Yasir, S., Meroueh, C., Boardman, L., Shah, V. H., Garcia, J. J. & Tizhoosh, H. Analysis and validation of image search engines in histopathology, IEEE Reviews in Biomedical Engineering 1–19 https://doi.org/10.1109/RBME.2024.3425769 (2024).
18. Kalra, S; Tizhoosh, HR; Shah, S; Choi, C; Damaskinos, S; Safarpoor, A; Shafiei, S; Babaie, M; Diamandis, P; Campbell, CJ et al. Pan-cancer diagnostic consensus through searching archival histopathology images using artificial intelligence. NPJ digital medicine; 2020; 3,
19. Rouzegar, H., Rahnamayan, S., Bidgoli, A. A. & Makrehchi, M. Enhancing content-based histopathology image retrieval using qr code representation, in: 2023 IEEE Symposium Series on Computational Intelligence (SSCI), IEEE, pp. 1120–1125 (2023).
20. Chang-Yeon, J. Face detection using lbp features. Final Project Report; 2008; 77, pp. 1-4.
21. Gong, Y; Lazebnik, S; Gordo, A; Perronnin, F. Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval. IEEE transactions on pattern analysis and machine intelligence; 2012; 35,
22. Tizhoosh, H. R., Zhu, S., Lo, H., Chaudhari, V. & Mehdi, T. Minmax radon barcodes for medical image retrieval, in: Advances in Visual Computing: 12th International Symposium, ISVC 2016, Las Vegas, NV, USA, December 12-14, 2016, Proceedings, Part I 12, Springer, pp. 617–627 (2016).
23. Liu, H., Wang, R., Shan, S. & Chen, X. Deep supervised hashing for fast image retrieval, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2064–2072 (2016).
24. Cao, Z., Long, M., Wang, J. & Yu, P. S. Hashnet: Deep learning to hash by continuation, in: Proceedings of the IEEE international conference on computer vision, pp. 5608–5617 (2017).
25. Sankhe, N. B., Singh, S. & Pundkar, S. Removal of duplicate data using secure hash and difference hash algorithm, International Research Journal of Engineering and Technology (2021).
26. Zhu, H., Long, M., Wang, J. & Cao, Y. Deep hashing network for efficient similarity retrieval, in: Proceedings of the AAAI conference on Artificial Intelligence, Vol. 30, (2016).
27. Wang, X., Shi, Y. & Kitani, K. M. Deep supervised hashing with triplet labels, in: Computer Vision–ACCV 2016: 13th Asian Conference on Computer Vision, Taipei, Taiwan, November 20-24, 2016, Revised Selected Papers, Part I 13, Springer, pp. 70–84 (2017).
28. Yuan, L., Wang, T., Zhang, X., Tay, F. E., Jie, Z., Liu, W. & Feng, J. Central similarity quantization for efficient image and video retrieval, in: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 3083–3092 (2020).
29. Manna, A., Sista, R. & Sheet, D. Deep neural hashing for content-based medical image retrieval: A survey (2024).
30. Haviana, SFC; Kurniadi, D et al. Average hashing for perceptual image similarity in mobile phone application. Journal of Telematics and Informatics; 2016; 4,
31. Weiss, Y., Torralba, A. & Fergus, R. Spectral hashing, Advances in neural information processing systems 21 (2008).
32. Gionis, A., Indyk, P., Motwani, R. et al., Similarity search in high dimensions via hashing, in: Vldb, Vol. 99, pp. 518–529 (1999).
33. Chen, H; Zou, Z; Liu, Y; Zhu, X. Deep class-guided hashing for multi-label cross-modal retrieval. Applied Sciences; 2025; 15,
34. Sayeed, S., Min, P. P. & Ong, T. S. Deep supervised hashing for gait retrieval, F1000Research 10 (2022) (1038).
35. Lin, J., Li, Z. & Tang, J. Discriminative deep hashing for scalable face image retrieval., in: IJCAI, pp. 2266–2272 (2017).
36. Tiwari, R. G., Misra, A. & Ujjwal, N. Image embedding and classification using pre-trained deep learning architectures, in: 2022 8th International Conference on Signal Processing and Communication (ICSC), IEEE, pp. 125–130 (2022).
37. Berriche, A., Adjal, M. Z. & Baghdadi, R. Leveraging high-resolution features for improved deep hashing-based image retrieval, in: European Conference on Information Retrieval, Springer, pp. 440–453 (2025).
38. Huang, H; Cheng, Q; Shao, Z; Huang, X; Shao, L. Dmch: A deep metric and category-level semantic hashing network for retrieval in remote sensing. Remote Sensing; 2023; 16,
39. Barz, B. & Denzler, J. Hierarchy-based image embeddings for semantic image retrieval, in: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE, pp. 638–647 (2019).
40. Xiao, X; Cao, S; Wang, L; Cheng, S; Yuan, E. Deep hashing image retrieval based on hybrid neural network and optimized metric learning. Knowledge-Based Systems; 2024; 284, 111336.
41. Belalia, A; Belloulata, K; Redaoui, A. Enhanced image retrieval using multiscale deep feature fusion in supervised hashing. Journal of Imaging; 2025; 11,
42. Redaoui, A; Belalia, A; Belloulata, K. Deep supervised hashing by fusing multiscale deep features for image retrieval. Information; 2024; 15,
43. He, C. & Wei, H. Hybridhash: Hybrid convolutional and self-attention deep hashing for image retrieval, in: Proceedings of the 2024 International Conference on Multimedia Retrieval, pp. 824–832 (2024).
44. Berriche, A., Adjal, M. Z. & Baghdadi, R. Sign-symmetry learning rules are robust fine-tuners, arXiv preprint arXiv:2502.05925 (2025).
45. Singh, A; Dev, M; Singh, BK; Kumar, A; Kolhe, ML. A review of content-based image retrieval based on hybrid feature extraction techniques. Advances in Data and Information Sciences: Proceedings of ICDIS; 2022; 2022, pp. 303-313.
46. Chen, Y-Y; Chi, K-Y; Hua, K-L. Design of image barcodes for future mobile advertising. EURASIP Journal on image and video processing; 2017; 2017,
47. Guo, L; Wang, T; Wu, Z; Wang, J; Wang, M; Cui, Z; Ji, S; Cai, J; Xu, C; Chen, X. Portable food-freshness prediction platform based on colorimetric barcode combinatorics and deep convolutional neural networks. Advanced Materials; 2020; 32,
48. Zhang, P; Li, C; Wang, C. Viscode: Embedding information in visualization images using encoder-decoder network. IEEE Transactions on Visualization and Computer Graphics; 2020; 27,
49. Kalra, S; Tizhoosh, HR; Choi, C; Shah, S; Diamandis, P; Campbell, CJ; Pantanowitz, L. Yottixel-an image search engine for large archives of histopathology whole slide images. Medical Image Analysis; 2020; 65, [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/32623275]101757.
50. Zhu, S. & Tizhoosh, H. R. Radon features and barcodes for medical image retrieval via svm, in: 2016 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 5065–5071 (2016).
51. Babaie, M., Kashani, H., Kumar, M. D. & Tizhoosh, H. R. A new local radon descriptor for content-based image search, in: Artificial Intelligence in Medicine: 18th International Conference on Artificial Intelligence in Medicine, AIME 2020, Minneapolis, MN, USA, August 25–28, 2020, Proceedings 18, Springer, pp. 463–472 (2020).
52. Tizhoosh, H. R. & Rahnamayan, S. Evolutionary projection selection for radon barcodes, in: 2016 IEEE Congress on Evolutionary Computation (CEC), IEEE, pp. 1–8 (2016).
53. Angadi, S. & Purad, H. C. Exploiting radon features for image retrieval, in: Recent Trends in Image Processing and Pattern Recognition: Third International Conference, RTIP2R 2020, Aurangabad, India, January 3–4, 2020, Revised Selected Papers, Part II 3, Springer, pp. 141–151 (2021).
54. Zhao, W. et al. IEEE International Geoscience and Remote Sensing Symposium (IGARSS), IEEE, pp. 1513–1516 (2013).
55. Öztürk, Ş. Stacked auto-encoder based tagging with deep features for content-based medical image retrieval. Expert Systems with Applications; 2020; 161, 113693.
56. Karthik, K; Kamath, SS. A deep neural network model for content-based medical image retrieval with multi-view classification. The Visual Computer; 2021; 37,
57. Choi, Y-J; Beakes, G; Glockling, S; Kruse, J; Nam, B; Nigrelli, L; Ploch, S; Shin, H-D; Shivas, RG; Telle, S et al. Towards a universal barcode of oomycetes-a comparison of the cox1 and cox2 loci. Molecular ecology resources; 2015; 15,
58. Napier, G; Campino, S; Merid, Y; Abebe, M; Woldeamanuel, Y; Aseffa, A; Hibberd, ML; Phelan, J; Clark, TG. Robust barcoding and identification of mycobacterium tuberculosis lineages for epidemiological and clinical studies. Genome medicine; 2020; 12, pp. 1-10.
59. Eminaga, O; Ge, TJ; Shkolyar, E; Laurie, MA; Lee, TJ; Hockman, L; Jia, X; Xing, L; Liao, JC. An efficient framework for video documentation of bladder lesions for cystoscopy: A proof-of-concept study. Journal of Medical Systems; 2022; 46,
60. Ghatak, S. & Bhattacharjee, D. Video indexing through human face, in: Proceedings of International Conference on Communication, Circuits, and Systems: IC3S 2020, Springer, pp. 99–107 (2021).
61. Ghatak, S; Battacharjee, D. Video indexing through human face images using lgfa and window technique. Multimedia Tools and Applications; 2022; 81,
62. Ben-Artzi, G., Werman, M. & Peleg, S. Event retrieval using motion barcodes, in: 2015 IEEE International Conference on Image Processing (ICIP), IEEE, pp. 2621–2625 (2015).
63. Pirlo, G., Chimienti, M., Dassisti, M., Impedovo, D. & Galiano, A. Layout-based document-retrieval system by radon transform using dynamic time warping, in: Image Analysis and Processing–ICIAP 2013: 17th International Conference, Naples, Italy, September 9-13, 2013. Proceedings, Part I 17, Springer, pp. 61–70 (2013).
64. Sharif, A., Jia, J., Zhang, J. & Zhai, G. Joint barcode and text orientation detection model for unmanned retail system, in: 2020 IEEE International Symposium on Circuits and Systems (ISCAS), IEEE, pp. 1–5 (2020).
65. Li, K; Wang, F; Yang, L; Liu, R. Deep feature screening: Feature selection for ultra high-dimensional data via deep neural networks. Neurocomputing; 2023; 538, 126186.
66. Gyawali, P. K., Liu, X., Zou, J. & He, Z. Ensembling improves stability and power of feature selection for deep learning models, in: Machine Learning in Computational Biology, PMLR, pp. 33–45 (2022).
67. Quinlan, JR. Induction of decision trees. Machine learning; 1986; 1, pp. 81-106.
68. Zhang, Z. & Zhou, Z.-H. Unbiased gradient boosting decision tree with unbiased feature importance, in: Proceedings of the 32nd International Joint Conference on Artificial Intelligence (IJCAI), (2023).
69. Good, J. H., Kovach, T., Miller, K. & Dubrawski, A. Feature learning for interpretable, performant decision trees, in: Advances in Neural Information Processing Systems (NeurIPS), (2023).
70. Maji, S; Bose, S. Cbir using features derived by deep learning. ACM/IMS Transactions on Data Science (TDS); 2021; 2,
71. Gutman, DA; Cobb, J; Somanna, D; Park, Y; Wang, F; Kurc, T; Saltz, JH; Brat, DJ; Cooper, LA; Kong, J. Cancer digital slide archive: an informatics resource to support integrated in silico analysis of tcga pathology data. Journal of the American Medical Informatics Association; 2013; 20,
72. Tomczak, K; Czerwińska, P; Wiznerowicz, M. Review the cancer genome atlas (tcga): an immeasurable source of knowledge. Contemporary Oncology/Współczesna Onkologia; 2015; 2015,
73. Riasatian, A., Babaie, M., Maleki, D., Kalra, S., Valipour, M., Hemati, S., Zaveri, M., Safarpoor, A., Shafiei, S., Afshari, M., Rasoolijaberi, M., Sikaroudi, M., Adnan, M., Shah, S., Choi, C., Damaskinos, S., Campbell, C. J., Diamandis, P., Pantanowitz, L., Kashani, H., Ghodsi, A. & Tizhoosh, H. R. Fine-tuning and training of densenet for histopathology image representation using tcga diagnostic slides (2021). arXiv:2101.07903.
74. Bidgoli, AA; Rahnamayan, S; Dehkharghanian, T; Riasatian, A; Kalra, S; Zaveri, M; Campbell, CJ; Parwani, A; Pantanowitz, L; Tizhoosh, H. Evolutionary deep feature selection for compact representation of gigapixel images in digital pathology. Artificial Intelligence in Medicine; 2022; 132, [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/36207081]102368.
75. Rasool, A; Tao, R; Kamyab, M; Hayat, S. Gawa-a feature selection method for hybrid sentiment classification. IEEE Access; 2020; 8, pp. 191850-191861.
76. Bidgoli, AA; Rahnamayan, S; Dehkharghanian, T; Riasatian, A; Tizhoosh, HR. Evolutionary computation in action: Hyperdimensional deep embedding spaces of gigapixel pathology images. IEEE Transactions on Evolutionary Computation; 2023; 27,
77. Dong, E; Du, H; Gardner, L. An interactive web-based dashboard to track covid-19 in real time. The Lancet infectious diseases; 2020; 20,
78. Rasool, A., Jiang, Q., Qu, Q., Kamyab, M. & Huang, M. Hsmc: hybrid sentiment method for correlation to analyze covid-19 tweets, in: Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery: Proceedings of the ICNC-FSKD 2021 17, Springer, pp. 991–999 (2022).
79. Chowdhury, M. E., Rahman, T., Khandakar, A., Mazhar, R., Kadir, M. A., Mahbub, Z. B., Islam, K. R., Khan, M. S., Iqbal, A., Al Emadi, N. et al. Can ai help in screening viral and covid-19 pneumonia?, IEEE Access 8, 132665–132676 (2020).
80. Rahman, T., Khandakar, A., Qiblawey, Y., Tahir, A., Kiranyaz, S., Kashem, S. B. A., Islam, M. T., Al Maadeed, S., Zughaier, S. M., Khan, M. S. et al., Exploring the effect of image enhancement techniques on covid-19 detection using chest x-ray images, Computers in biology and medicine 132 104319 (2021).
81. Xiao, H., Rasul, K. & Vollgraf, R. Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms, arXiv preprint arXiv:1708.07747 (2017).
82. Malhi, US; Zhou, J; Yan, C; Rasool, A; Siddeeq, S; Du, M. Unsupervised deep embedded clustering for high-dimensional visual features of fashion images. Applied Sciences; 2023; 13,
83. Tan, M. & Le, Q. Efficientnet: Rethinking model scaling for convolutional neural networks, in: International conference on machine learning, PMLR, pp. 6105–6114 (2019).
84. Krizhevsky, A., Hinton, G. et al., Learning multiple layers of features from tiny images, Thesis (2009).
85. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K. Q. Densely connected convolutional networks, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4700–4708 (2017).
86. Lin, J.-H., Lazarow, J., Yang, A., Hong, D., Gupta, R. & Tu, Z. Local binary pattern networks, in: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 825–834 (2020).
87. Chen, W., Chen, X., Zhang, J. & Huang, K. Beyond triplet loss: a deep quadruplet network for person re-identification, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 403–412 (2017).
88. Khosrowshahli, R., Kheiri, F., Bidgoli, A. A., Tizhoosh, H., Makrehchi, M. & Rahnamayan, S. Deep feature barcoding with combinatorial genetic algorithm (CGA) (2025). https://doi.org/10.24433/CO.0941013.v1.
© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the "License").