Subject Category:
Computer science and artificial intelligence
Subject Areas:
bioengineering, computer vision, image processing
Keywords:
medical imaging, multimodal image fusion, salient feature extraction
Abstract:
With the rapid development of medical imaging methods, multimodal medical image fusion techniques have caught the interest of researchers. The aim is to preserve information from diverse sensors using various models to generate a single informative image. The main challenge is to strike a trade-off between the spatial and spectral qualities of the resulting fused image and the computing efficiency. This article proposes a fast and reliable method for medical image fusion based on a multilevel Guided edge-preserving filtering (MLGEPF) decomposition rule. First, each multimodal medical image is divided into three sublayer categories using an MLGEPF decomposition scheme: small-scale component, large-scale component and background component. Second, two fusion strategies, a pulse-coupled neural network based on the structure tensor and a maximum-based rule, are applied to combine the three types of layers according to the layers' various properties. Finally, the three types of fused sublayers are combined to create the fused image. A total of 40 pairs of brain images from four separate categories of medical conditions were tested in experiments. The image pairs cover various case studies, including magnetic resonance imaging (MRI), TITc, single-photon emission computed tomography (SPECT) and positron emission tomography (PET). We include qualitative analysis to demonstrate that the visual contrast between the structure and the surrounding tissue is increased by our proposed method. To further strengthen the visual comparison, we asked a group of observers to compare our method's outputs with those of other methods and score them. Overall, our proposed fusion scheme increased the visual contrast and received positive subjective reviews. Moreover, objective assessment indicators for each category of medical conditions are also included. Our method achieves high evaluation outcomes on the feature mutual information (FMI), sum of correlation of differences (SCD), Qabf and Qy indexes. This implies that our fusion algorithm preserves information better and transfers structural and visual content more effectively.
1. Introduction
Modern imaging devices offer diverse medical images for clinical screening, owing to ongoing advancements in imaging systems. Various types of medical images are visually examined by a qualified clinician [1]. However, each imaging modality reflects different information about human organs and diseased tissue. For example, three imaging methods, magnetic resonance imaging (MRI), single-photon emission computed tomography (SPECT) and positron emission tomography (PET), are often employed to visualize abnormalities in the brain. These techniques measure distinct facets of the epileptic process such as structure, metabolism and perfusion. Each technique has its own characteristics, advantages and limitations.
While MRI depicts the brain's physical anatomical contrast, SPECT demonstrates how it functions. SPECT is proven to reliably evaluate blood flow. MRI does not provide any information on the brain's function. On the other hand, while PET and SPECT are comparable imaging methods, PET is more expensive. Areas of the brain that are healthy, overactive or underactive can all be seen on SPECT and PET scans. Nevertheless, PET and SPECT have a low spatial resolution.
Commonly, a single-modality image lacks adequate data for clinical diagnosis, so multimodal medical images can aid medical professionals in developing a more suitable treatment strategy [2]. In some cases, a radiologist needs to bring together details from different modalities without sacrificing the original image's attributes [3]. Additionally, collecting different images from the same patient takes extra space, processing power and time [4]. Recently, as medical imaging continued to advance, multimodal medical image fusion improved remarkably.
Integrating multiple medical images from different imaging procedures is known as medical image fusion, which is mainly done to create one image containing a large amount of information. Multimodality image fusion combines two or more multimodal input images to generate a more comprehensible and detailed image [5]. Two approaches are commonly used to combine medical images from various modalities. Hardware device upgrades are one strategy; however, this method is both difficult and costly. Image processing is another low-cost and straightforward method [6].
Image fusion is a useful approach to improve computer vision processing since it seeks to combine details from several input images acquired by multimodal imaging equipment [7,8]. Therefore, throughout the process, fusion algorithms should provide strong contrast and integrate the necessary information without generating any artefacts [4]. The aim is to support decision-making in medical diagnosis, so the image fusion process draws on a variety of methodologies and research fields, spanning image processing, pattern recognition and computer vision [9-14].
In recent years, many image fusion methods for multimodal input images have been published. The aim is to improve the quality, detail preservation and computational efficiency of the fused image. However, due to their limited ability to process edges and textures, current algorithms often lead to edge artefacts [15]. Properly integrating information from diverse modalities requires handling differences in resolution, dynamic range and imaging characteristics, which often involves computationally demanding operations and complicated algorithms. On the other hand, optimizing the computational resources is essential to ensuring real-time or close to real-time performance.
Image fusion methods generally involve three main steps: image decomposition, subimage fusion and image reconstruction. The decomposition process, which is the initial stage, divides the source images into several subimages. Techniques based on the Laplacian redecomposition (LRD) framework have been designed to merge multimodal medical images [7]; this method uses a decomposition approach based on a Laplacian decision graph to obtain sub-band images. In [16], a multimodal medical image is divided into seven layers using a Gaussian edge-preserving filter and weighted mean curvature filtering. An adaptive decomposition approach is presented in [17] to extract high- and low-frequency components of an image and produce smoothing and texture sublayers based on Fourier spectrum analysis. The non-subsampled Shearlet transform is another decomposition scheme used to acquire low- and high-frequency sub-bands [14]. A two-layer decomposition approach based on joint bilateral filtering has been introduced that splits the input image into an energy layer containing substantial pixel intensities and a structure layer [18]. The non-subsampled contourlet transform (NSCT) can also be used as a decomposition scheme to decompose the input images into high-pass and low-pass sub-bands, generating the detail and structural components of the input image [19].
Inspired by these state-of-the-art decomposition methods, sublayers containing detail and background information are extracted. To produce an accurate fused image, fusion rules should be carefully applied to the components from the source images. In [20], a pulse-coupled neural network (PCNN) is driven by spatial frequency in the NSCT domain, and coefficients in that domain are chosen as the coefficients for the fused image. A parameter-adaptive PCNN model is used to merge high-frequency bands in [21], in which all the PCNN parameters can be predicted. The majority of machine learning techniques function at the feature and decision levels [22]. In [23], the authors use deep learning to address the fusion problem: they develop a direct mapping between the original images and a focus map by applying a CNN trained on high-quality images and their blurred variants. An image fusion scheme using a convolutional neural network is also presented in [24]. First, a weight map is created for the input image by the CNN model. A Gaussian pyramid is applied to the weight map to decompose the image, and a contrast pyramid is then used to fuse related image components. In the meantime, a local similarity technique in adaptive fusion mode investigates the information of the output images. Ultimately, contrast pyramid restoration yields the fused image.
Fuzzy-based fusion techniques also enhance the precision of target recognition for clinical diagnosis [25]. That work constructs an image fusion technique that combines the benefits of fuzzy entropy and the NSCT for multimodal medical images; finally, in accordance with the decomposition process, the inverse decomposition operation is applied to construct the fused image. An image fusion method based on sparse representation (SR) is reported in [26]. There, SR is applied to the texture components, while an energy-based fusion strategy is employed for the cartoon components to uphold the geometric structure information from the input images. However, reducing computing time remains a significant challenge for SR-based fusion approaches.
To summarize, various types of medical images are visually examined by physicians to improve diagnostic accuracy. Each technique has its own characteristics, advantages and limitations. However, collecting different images from the same patient takes extra space, processing power and time. To overcome these issues, multimodal medical image fusion is used. For medical image applications, image fusion is mainly done to compensate for the shortcomings of each imaging modality and to generate one final image with a large amount of information. It is important to have methods that perform strongly in increasing the visual contrast of the final image. The resulting images are expected to show less colour distortion and include more structural information. On the other hand, the importance of computational cost efficiency cannot be neglected. Computational efficiency matters for processing large datasets and complex images in a timely manner, so that diagnostic procedures are not delayed. As a result, there should be a balance between high-quality multimodal medical image fusion and the computational efficiency of the algorithm.
In the field of medical image fusion, research papers often lean towards either achieving exceptional visual contrast or optimizing processing speed. It is challenging to find a balance between these two objectives. In this work, we have developed a novel algorithm designed to offer outstanding visual contrast, ensuring that fused medical images are not only visually appealing but also facilitate accurate diagnosis. At the same time, we aimed for a relatively time-efficient algorithm in terms of time cost. This makes our approach applicable to near real-time medical scenarios.
In this article, we deliver a novel image fusion strategy for brain images. The method uses multilevel Guided edge-preserving filtering (MLGEPF) to decompose the source image into seven sublayers. MLGEPF is implemented by applying a Gaussian filter and a modified version of the Guided filter with an iteratively updated guidance image. We change the guidance image from the input image itself to the difference between the input image and its gradient magnitude, so that the small fluctuations of the input image can be largely smoothed. This maintains clear edge information while smoothing complex small areas in an image. In this decomposition approach, the input image is broken down into small-scale component (SC), large-scale component (LC) and background component (BC) layers using an edge-preserving Guided filter.
Then, to create three different fused layers, the SC and LC layers are combined using a novel gradient-domain PCNN based on the structure tensor, while the BC layers are combined using a maximum-based fusion approach. In our novel PCNN-based fusion approach, we use the image's gradient magnitude as the structure tensor, which is applied as the input to the PCNN network. This lets us leverage the strength and orientation of edges and corners in the images. This approach enables the PCNN to highlight important features, leading to a more effective fusion of MRI and PET/SPECT images. The three fused layers are then combined to reconstruct the fused image.
The suggested method and other novel approaches are examined both qualitatively and quantitatively to assess the fusion results. Our proposed technique exhibits better performance compared with current fusion methods in qualitative and quantitative evaluation. In our experiments, a total of 40 sets of medical images from four categories of medical conditions are tested. This article is structured as follows: an in-depth discussion of the suggested decomposition and fusion rule is given in §2. The results and discussions of experiments are demonstrated in §3. The conclusion is covered in §4.
2. Proposed method
This work proposes an edge-preserving filtering fusion technique that draws inspiration from PCNN fusion and image decomposition schemes. Our multimodal image fusion approach is suggested and described in this section. A visual representation of the suggested model is provided in figure 1. The algorithm consists of three components: image decomposition using MLGEPF, image fusion and image reconstruction.
It is worth mentioning that the model starts with a hue, saturation, value (HSV) colour space transform on both input images. An inverse HSV transform is then performed to create the final fused image after image pair fusion. The first step decomposes the input image pair into seven informative layers through the guidance-filter edge-preserving scheme. Next, the BC layers are fused by a maximum-based fusion scheme, and a structure tensor (ST)-PCNN fusion approach is performed on the SC and LC layers. After fusion, linear addition is used to create the final fused image.
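To make this pipeline concrete, the following sketch outlines the flow just described. It is not the authors' released code: the helper functions passed in (decompose, fuse_detail and fuse_background stand in for the MLGEPF, ST-PCNN and maximum rules detailed in §2.2-2.3), the collapse of the seven sublayers into SC/LC/BC groups, and the choice to fuse the MRI greyscale image against the value channel of the PET/SPECT image are assumptions made for illustration only.

```python
import numpy as np
from skimage import color

def fuse_pair(mri_gray, pet_rgb, decompose, fuse_detail, fuse_background):
    """Hedged sketch of the fusion pipeline described in Section 2.
    mri_gray: 2-D float array in [0, 1]; pet_rgb: H x W x 3 float array in [0, 1].
    decompose/fuse_* are placeholders for MLGEPF and the ST-PCNN / maximum rules."""
    # 1. Move the functional (PET/SPECT) image to HSV and fuse only its value channel.
    pet_hsv = color.rgb2hsv(pet_rgb)
    value = pet_hsv[..., 2]

    # 2. Decompose each input into small-scale (SC), large-scale (LC) and background (BC) layers.
    sc_a, lc_a, bc_a = decompose(mri_gray)
    sc_b, lc_b, bc_b = decompose(value)

    # 3. Fuse layer by layer: detail layers with ST-PCNN, background with the maximum rule.
    sc_f = fuse_detail(sc_a, sc_b)
    lc_f = fuse_detail(lc_a, lc_b)
    bc_f = fuse_background(bc_a, bc_b)

    # 4. Reconstruct by linear addition, then invert the HSV transform.
    pet_hsv[..., 2] = np.clip(sc_f + lc_f + bc_f, 0.0, 1.0)
    return color.hsv2rgb(pet_hsv)
```

Because only the value channel is replaced, the inverse HSV transform recombines the fused result with the original hue and saturation, which is what preserves the functional image's colour.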
2.1. Image enhancement
Medical imaging devices, including MRI and X-ray machines, can introduce distortions in the captured images. The variations between organs and tissues in different images may be quite low, resulting in a poor visual effect when viewed by the human eye, because human eyes can only respond to brightness within a particular range [4,27]. Image enhancement is essential in the field of image processing because it optimizes image quality by highlighting useful information and eliminating redundant information [28]. Therefore, altering the contrast of the source images minimizes these negative effects.
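The section does not specify which enhancement operator is used, so the snippet below is only an assumed illustration: a percentile-based linear contrast stretch, one of the simplest ways to widen a narrow intensity range before fusion.

```python
import numpy as np

def stretch_contrast(img, low_pct=1.0, high_pct=99.0):
    """Linearly rescale intensities between two percentiles to [0, 1].
    This is an assumed pre-processing example, not the paper's exact operator."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi <= lo:                      # flat image: nothing to stretch
        return np.zeros_like(img, dtype=float)
    return np.clip((img.astype(float) - lo) / (hi - lo), 0.0, 1.0)
```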
2.2. Image decomposition
The structure diagram of MLGEPF decomposition is displayed in figure 2. It is worth mentioning that the number of decomposition layers must be carefully chosen with the goal of minimizing redundancy. The number of decomposition layers should be chosen after considering the trade-offs between information retention, computational complexity, noise sensitivity and fusion method requirements. Having fewer layers reduces the overall quality: a limited number of layers risks overlooking subtle details such as edges. This is mainly because the image decomposition relies on pixel intensity and local information. For example, if the image is decomposed into only two components, such as large- and small-scale components, subtle edges might be overlooked during the process. In these situations, the algorithm divides the information based on pixel intensity, and since the model might not capture subtle variations in the image with a limited number of layers, there is a higher chance of missing features. On the other hand, too many layers introduce computational and duplication issues [22], potentially saturating the output with information; misleading information may even be generated during the decomposition process. Figure 3 depicts the main procedures of the proposed method and the output at each level.
The Gaussian blur feature is obtained by smoothing an image using Gaussian filtering to reduce the amount of detail and the noise level. The Gaussian filter serves as a non-uniform low-pass filter similar to the mean filter, but with a different kernel. The two-dimensional Gaussian filter is used when working with images; it is simply the product of two one-dimensional Gaussian functions.
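For reference, the standard two-dimensional Gaussian kernel and its separability into two one-dimensional factors are:

$$
G_\sigma(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right)
= g_\sigma(x)\, g_\sigma(y), \qquad
g_\sigma(t) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{t^{2}}{2\sigma^{2}}\right).
$$

This separability is why the two-dimensional filter can be applied as two successive one-dimensional convolutions.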
A conventional approach in image processing is to smooth the image while keeping the structures and edges. Edge-aware filters are popular in visual processing. Different methodologies are used to construct edge-aware filters; however, the goal is to maintain only high-contrast edges [29]. Edge-preserving filters include the bilateral filter, the Guided filter, geodesic filters and weighted median filters. The primary goal of the guidance filter is to efficiently eliminate detailed texture and noise while preserving the edges as much as possible.
The filtering output determined by the Guided filter takes into account the structures of the guidance image, which can be the input image or a different image [30]. Beyond smoothing, the Guided filter has a unique property: it is capable of transferring the guidance image's features to the filtering output, which is useful in applications such as dehazing [30]. In the primary framework of the Guided image filter, the input image is smoothed using the input image itself as the guidance image to approach the edge-preservation effect [30]. Figure 4 illustrates examples of the proposed Guided filter applied to the input images. In this layout, we compare the images smoothed by the original Guided filter (the image itself as the guidance image) and by our proposed decomposition method (the difference between the input image and its gradient magnitude as the guidance image). It can be seen that our method retrieves more significant features than the original Guided filter. A structural similarity index (SSIM) comparison of these two categories is reported in figure 5. As can be seen, images extracted using the gradient magnitude and the Guided filter have slightly higher SSIM, which means more similarity to the reference image.
Figure 6 displays the subimages extracted by the proposed MLGEPF. Since there are three decomposition layers, the background information is gradually separated from the small-scale components.
In this article, the MLGEPF is implemented by applying a Gaussian filter and a Guided filter. The MLGEPF diagram is presented in figure 2. The image components are formed from the filtered outputs of the ith Gaussian filter and the ith Guided filter in the block diagram. The three types of layers are represented as SC(i), LC(i) and BC.
In this work, a new adaptive decomposition strategy is presented through MLGEPF to decompose each input image of the pair into seven layers. At each level, the gradient magnitude of the input image is first determined, and the image also passes through the Guided filter. The filtered image is then used as a fresh guidance image in the next iteration for filtering the input image.
The guidance image is therefore iteratively updated. In other words, we change the guidance image from the input image itself to the difference between the input image and its gradient magnitude, so that the small fluctuations of the input image can be largely smoothed. Accordingly, the salient features extracted by our Guided filter (see figure 4) are far more numerous than those extracted by the original Guided filter.
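As an illustration of this decomposition step, the sketch below combines the classical box-filter Guided filter of He et al. [30] with the iterative guidance update described above (guidance = current image minus its gradient magnitude). The radius, regularization constant and number of levels are assumed values, and the reduction to generic detail layers plus one background layer simplifies the seven-layer MLGEPF structure.

```python
import numpy as np
from scipy.ndimage import uniform_filter, sobel

def guided_filter(guide, src, radius=8, eps=1e-3):
    """Classical guided filter (He et al.); 'guide' steers the smoothing of 'src'."""
    mean = lambda x: uniform_filter(x, size=2 * radius + 1)
    mean_g, mean_s = mean(guide), mean(src)
    cov_gs = mean(guide * src) - mean_g * mean_s
    var_g = mean(guide * guide) - mean_g * mean_g
    a = cov_gs / (var_g + eps)
    b = mean_s - a * mean_g
    return mean(a) * guide + mean(b)

def gradient_magnitude(img):
    gx, gy = sobel(img, axis=1), sobel(img, axis=0)
    return np.hypot(gx, gy)

def mlgepf_like_decompose(img, levels=3, radius=8, eps=1e-3):
    """Hedged sketch of the multilevel decomposition: at each level the guidance image is
    re-derived as the current smooth layer minus its gradient magnitude, the difference
    between successive smoothings is kept as a detail layer, and the final smooth image
    is returned as the background component."""
    details, current = [], img.astype(float)
    for _ in range(levels):
        guide = current - gradient_magnitude(current)   # iteratively updated guidance image
        smooth = guided_filter(guide, current, radius, eps)
        details.append(current - smooth)                 # small/large-scale detail layer
        current = smooth
    return details, current                              # detail layers + background component
```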
2.3. Image fusion
Using the decomposition technique described above, sublayers containing rich detail and background information are generated. To produce an accurate fused image, fusion rules should be carefully applied to the components from the source images. A novel gradient-domain PCNN based on the structure tensor is applied to the SC and LC layers, and a maximum-based fusion strategy is applied to the BC layer.
2.3.1. Pulse-coupled neural network
PCNNs have been widely employed in image processing techniques such as image segmentation and pattern recognition. In the image fusion process, a PCNN is a single-layered two-dimensional array of linked neurons. The network's number of neurons is equal to the total number of pixels in each input image, so the pixels in each image have a one-to-one correspondence with the neurons. The grey value of the corresponding pixel is employed as the stimulus of each neuron [31]. PCNN has made a significant impact on image fusion since it does not require training [16]. It can also effectively quantify the activity level of each pixel. This involves modelling how neurons respond to input stimuli: the PCNN processes pixel activity based on local and global interactions in each iteration. The network's pulse synchronization mechanism enhances its ability to capture significant features of the input images, which contributes to the effectiveness of PCNN for image fusion applications [21]. PCNN also enables quick and effective computations, an important feature especially for medical image fusion tasks [32].
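For context, a widely used simplified discrete PCNN updates each neuron (i, j) at iteration n as below; this is the standard textbook form, not necessarily the exact parameterization of the model given later in equations (2.1)-(2.6).

$$
\begin{aligned}
F_{ij}(n) &= S_{ij},\\
L_{ij}(n) &= e^{-\alpha_L}\,L_{ij}(n-1) + V_L \sum_{k,l} W_{ijkl}\,Y_{kl}(n-1),\\
U_{ij}(n) &= F_{ij}(n)\,\bigl(1 + \beta\,L_{ij}(n)\bigr),\\
Y_{ij}(n) &= \begin{cases} 1, & U_{ij}(n) > \theta_{ij}(n-1),\\ 0, & \text{otherwise},\end{cases}\\
\theta_{ij}(n) &= e^{-\alpha_\theta}\,\theta_{ij}(n-1) + V_\theta\,Y_{ij}(n).
\end{aligned}
$$

Here S_ij is the external stimulus (a pixel or saliency value), β the linking strength, W the local synaptic weights and θ_ij the dynamic threshold; a neuron 'fires' (Y_ij = 1) when its internal activity exceeds the threshold.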
We apply the PCNN to improve the spatial correlation of the comparable layers. Before the fusion is performed, the structure tensor salient (STS) operator is also applied to the layers.
2.3.2. Structure tensor
MRI images contain strong edges and rich structural features, while PET/SPECT images contain more colour information. MRI-SPECT and MRI-PET fused images should retain both structure and colour information. To prevent an 'unnatural' fusion result, it is necessary to extract the main features of the layers before the fusion operation.
By extracting the salient information, we make sure that the network pays attention to the most relevant and diagnostically significant details in the input images. For this reason, the salient information for the PCNN input image should be correctly detected.
The structure tensor is valuable because it provides a reliable representation of local patterns. Here, we use the image's gradient magnitude as the structural tensor. The gradient magnitude represents the rate of change of pixel intensities in different orientations. We take advantage of this to leverage the strength and orientation of edges and corners in the images. This approach enables the PCNN to highlight important features, leading to a more effective fusion of MRI and PET/SPECT images. It enhances the network's ability to focus on salient details, which is crucial in medical imaging applications for accurate analysis and diagnosis.
Since an STS can identify an image's gradient information efficiently, it makes sense to use it to choose structures as the fusion process input. In figure 7, the input images' salient information functions as the joint strength (β_MRI and β_PET). Y is the neuron's output, which plays an important role in the neuron's input in each iteration.
A structure tensor is defined as partial derivative information in mathematics. It often represents gradient or edge and corner information in computer vision and image processing, and has a stronger representation of local patterns than the directional derivative due to its coherence measure [33-35]. In other words, a structural tensor uses an image's gradient magnitude to identify the coordinates of the edges and corners [36].
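In its classical form, the structure tensor is the (optionally smoothed) outer product of the image gradient, and the gradient magnitude used here as the saliency measure follows from the same derivatives:

$$
J_\rho(I) = K_\rho * \bigl(\nabla I\,\nabla I^{\top}\bigr)
= K_\rho * \begin{pmatrix} I_x^{2} & I_x I_y\\ I_x I_y & I_y^{2} \end{pmatrix},
\qquad \lvert \nabla I \rvert = \sqrt{I_x^{2} + I_y^{2}},
$$

where I_x and I_y are the partial derivatives and K_ρ is a Gaussian smoothing kernel. The eigenvalues of J_ρ encode the strength and orientation of local edges and corners, which is the property exploited when the saliency map drives the PCNN.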
The ST-PCNN fusion strategy is summarized in equations (2.1)-(2.6). F_ij(n) holds the pixel value of the MRI image and the corresponding PET image. The input images' salient information functions as the joint strength β_MRI and β_PET. Y is the neuron's output. L_ij(n) is the linking parameter. θ_ij is the threshold of the step function. SC_Fused(x, y) and LC_Fused(x, y) are the final fused SC and LC layers generated by the ST-PCNN scheme.
[Equations (2.1)-(2.6), specifying the ST-PCNN fusion strategy, are omitted in this reproduction of the article.]
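Because the formulae are not reproduced here, the following sketch shows one plausible realization of the ST-PCNN fusion step under the simplified model given in §2.3.1: the layer values feed the network, the gradient-magnitude saliency acts as the linking strength β, and each pixel of the fused layer is taken from the source whose neuron fires more often. The parameter values and the firing-count decision rule are assumptions for illustration, not the paper's exact equations.

```python
import numpy as np
from scipy.ndimage import sobel, convolve

def saliency(layer):
    """Gradient magnitude used as the structure-tensor saliency (Section 2.3.2)."""
    return np.hypot(sobel(layer, axis=1), sobel(layer, axis=0))

def pcnn_fire_counts(stim, beta, iters=30, alpha_l=0.7, v_l=1.0,
                     alpha_t=0.2, v_t=20.0):
    """Run a simplified PCNN and count how often each neuron fires.
    The stimulus is rescaled to [0, 1] so the assumed parameters behave sensibly."""
    s = np.abs(stim)
    s = s / (s.max() + 1e-12)
    w = np.array([[0.5, 1.0, 0.5], [1.0, 0.0, 1.0], [0.5, 1.0, 0.5]])
    link = np.zeros_like(s)
    theta = np.full_like(s, 0.5)                  # initial dynamic threshold (assumed)
    fire = np.zeros_like(s)
    counts = np.zeros_like(s)
    for _ in range(iters):
        link = np.exp(-alpha_l) * link + v_l * convolve(fire, w, mode="nearest")
        u = s * (1.0 + beta * link)               # internal activity
        fire = (u > theta).astype(float)          # step-function output Y
        theta = np.exp(-alpha_t) * theta + v_t * fire
        counts += fire
    return counts

def st_pcnn_fuse(layer_a, layer_b):
    """Assumed decision rule: per pixel, keep the layer whose PCNN (fed by the layer values
    and linked through its normalized gradient-magnitude saliency) fires more often."""
    def norm(x):
        return x / (x.max() + 1e-12)
    counts_a = pcnn_fire_counts(layer_a, beta=norm(saliency(layer_a)))
    counts_b = pcnn_fire_counts(layer_b, beta=norm(saliency(layer_b)))
    return np.where(counts_a >= counts_b, layer_a, layer_b)
```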
2.3.3. Fusion of background component
The background component of an image is the closest equivalent of the source image. Since image fusion seeks to merge as much valuable information as possible [18], the commonly used fusion rule for the background layer is the maximum fusion method [37], given in equation (2.7).
[Equation (2.7), the maximum fusion rule for the background component, is omitted in this reproduction of the article.]
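Equation (2.7) is not reproduced above; for the maximum rule named here, the usual per-pixel form is

$$
BC^{F}(x, y) = \max\bigl(BC^{A}(x, y),\; BC^{B}(x, y)\bigr),
$$

where BC^A and BC^B denote the background components of the two source images (generic labels used only for this illustration; the paper's own notation may differ).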
3. Experimental results and analysis
3.1. Dataset
For this work, 40 pairs of multimodal images are selected for the evaluation of the proposed methodology. The image pairs cover various case studies, including MRI, TITc, SPECT and PET. Each imaging modality reflects different information about human organs and diseased tissue. All test images used in this article were gathered from the Whole Brain Atlas, an online database comprising multimodal nervous system images. This image dataset was created by K. Johnson and J. Becker of Harvard Medical School [38].
3.2. Analysis
The suggested algorithm has been tested with various input image pairs in order to analyse its performance and accuracy. We compared the performance of our proposed framework with the following methods: multi-level morphological gradient (MLMG)-PCNN [16], joint bilateral filter (JBF) [18], local extreme map Guided filter [39], LRD [7], adaptive co-occurrence filter (ACOF) [40], multiple dictionaries and truncated Huber filtering (MDHU) [41], fast guided filtering (FGF) [42] and a fusion method based on phase congruency and local Laplacian energy [43]. In our experiments, subjective and objective evaluations are performed to assess the method's efficiency. Qualitative analysis evaluates the performance of the fusion process by visually comparing the output image with the input images. Quantitative analysis, by contrast, examines the reconstructed images using mathematical modelling [44].
To further enhance the visual comparison between the proposed method and other algorithms, we asked a group of observers to compare our method's outputs with other methods and score them.
3.3. Qualitative analysis
One of the most effective ways to distinguish the performance of medical image fusion methods is to evaluate the fused image directly using the human visual system. The nine image fusion techniques are compared in this subsection using the qualitative method, which compares the fusion outcomes visually. We present various examples that highlight the nine image fusion methods, and from these results we may intuitively draw some apparent visual judgements.
Figure 8 demonstrates the two MRI and SPECT input images from the metastatic bronchogenic carcinoma image dataset. Since MRI depicts the brain's physical anatomical contrast and details, the goal of image fusion is to maintain strong contrast while preserving more structural information. Contrast reduction can be seen in MLMG-PCNN and JBF. On the other hand, the SPECT image's colours carry valuable information for the physician. Colour distortion took place in the LRD output image. The MDHU fused output image has low contrast and lacks sharpness, which suggests that the fusion procedure did not significantly improve the clarity of details.
As can be seen, more information is preserved in the fused output image by using our suggested strategy. Our method outperforms other popular methods in subjective visual perception. The resulting images show less colour distortion and include more structural information.
Figure 9 shows comparison examples of the nine image fusion methods on the two MRI and PET input images from the glioma image dataset. Our proposed method has significantly better visual sharpness than ACOF, MDHU and FGF. Overall, our proposed method increases the visual contrast between the structure and the surrounding tissue.
Figure 10 shows the comparison of fused images on the TITc and SPECT inputs, demonstrating images of neoplastic disease. Our algorithm's resulting images show less colour distortion and include more structural information.
In figure 11, the LRD and FGF outputs exhibit certain colour changes that hold significant meaning in SPECT images. In SPECT imaging, colour can convey important information about the distribution of radiotracers or specific aspects of the brain tissue. This observation emphasizes the importance of preserving the originality of colours and structures in the context of PET/SPECT images. Our method effectively conveys details from the MRI and colours from the SPECT input images compared with other methods. To further strengthen the visual comparison between the proposed method and other algorithms, we asked a group of observers to compare our method's outputs with those of other methods and score them. The observers were two radiologists from the Royal University Hospital in Saskatoon, both experienced in MRI and PET/SPECT image reading. The radiologists were asked to rate a series of 90 images in four categories independently and pick out which output image they preferred to use for diagnosis. Table 1 presents the scores given by the human observers. The average scores indicate that the observers opted to employ our proposed outputs during the diagnostic assessment in two categories. As can be seen, each category of images receives a different subjective quality evaluation, primarily because of the different visual characteristics of each. Overall, our proposed fusion scheme increased the visual contrast and received positive subjective reviews.
3.4. Quantitative analysis
The merits of image fusion techniques are not determined by visual analysis alone; numerous researchers have offered various quality criteria for qualitative and quantitative studies in order to judge the quality of the output fused image. The objective evaluation results of the fused images are shown in tables 2-5. We examine how well colour and spatial features are preserved. Among the objective assessment indicators are:
(1) Entropy, which measures the texture information of the fused image.
(2) The sum of correlation of differences (SCD). This quantifies the extent of information collected in the output image [45].
(3) Qabf, which evaluates the fusion result by measuring how much edge information is transferred from the source images to the fused image, associating visual information with the edge information at each pixel [46].
(4) Piella's metric for measuring the degree of relevant information in the input images that is present in the fused image [47].
(5) Qy, a structure-based indicator, analyses how well the structural information of the original images is retained [48].
(6) The SSIM for measuring the fused image's quality, based on the calculation of brightness, contrast and structure terms [49].
(7) The amount of information conveyed from the original images to the output fused image is described as feature mutual information (FMI) [50].
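As a small illustration of the first and sixth indicators (the fusion-specific metrics SCD, Qabf, Piella's metric, Qy and FMI follow their cited definitions), entropy and SSIM of a fused image can be computed as below; treating one source image as the SSIM reference is an assumption made only for this example.

```python
import numpy as np
from skimage.metrics import structural_similarity
from skimage.util import img_as_ubyte

def entropy(img):
    """Shannon entropy (bits per pixel) of the 8-bit grey-level histogram."""
    hist, _ = np.histogram(img_as_ubyte(img), bins=256, range=(0, 256))
    p = hist[hist > 0] / hist.sum()
    return float(-np.sum(p * np.log2(p)))

def ssim_to_source(fused, source):
    """SSIM between the fused image and one source image, both scaled to [0, 1]."""
    return structural_similarity(source, fused, data_range=1.0)
```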
Each table shows objective assessment indicators for a separate category of medical conditions. It is worth mentioning that various image categories exhibit different characteristics. That is why we report the average value of 40 pairs of images.
In the comparison tables, each row shows the metric values for the different methods. The two highest values for each set of data are in bold. Considering the results from metastatic bronchogenic carcinoma, glioma disease (MRI-PET) and glioma (TITc-SPECT), the proposed method reports good entropy, which means a higher number of bits is needed to encode the image's information.
Specifically, our method achieves a highly noticeable evaluation outcome on FMI, SCD, Qabf and Qy indexes in most categories. This implies that our fusion algorithm not only has better performance in information preservation but also facilitates efficient structural and visual transferring.
3.5. Time consumption comparison
It is clear from the objective examination that our method's fusion results are generally better than the other eight medical image fusion methods. Preserving structural information and conveying a greater amount of information is evident when comparing the evaluation outcomes on FMI, SCD, Qabf and Qy indexes. Our method yields strong performance on other metrics as well.
A metric for evaluating an algorithm's computation cost is the execution time. Table 6 displays the average time cost of each method on the Whole Brain image dataset. In the experiments, all programs were evaluated on the same computation platform, MATLAB R2022a, on a PC with a 2.30 GHz Intel® Xeon CPU and 16 GB RAM.
It is evident that, when compared with the other methods, the computation cost of our method is within the intermediate range, and the proposed method is relatively time-efficient in terms of time cost.
4. Conclusion
In this work, we introduce a new fusion approach for better visualization of medical images. It includes three main steps: image decomposition, image fusion and fused image reconstruction. For image decomposition, MLGEPF is proposed, which takes advantage of a Gaussian filter and a gradient-based Guided filter to extract image sublayers. For the fusion step, a novel gradient-domain PCNN based on the structure tensor is used to preserve both the colour information and the structure information. One of the attractive features of this algorithm is that it iteratively updates the guidance image in the decomposition stage, so that the small fluctuations of the input image can be largely smoothed. Accordingly, the salient features extracted by our Guided filter are far more numerous than those extracted by the original Guided filter.
The suggested method and other approaches are examined qualitatively and quantitatively to assess the fusion results. The suggested algorithm outperforms the other algorithms in both subjective and objective assessments. The comparison results show that the proposed method is the best in three metrics, FMI, Qy and Qabf. Our method yields strong performance on the other metrics as well. Moreover, the algorithm is relatively time-efficient in terms of time cost, meaning that the proposed algorithm is applicable to near real-time scenarios.
Ethics. This work did not require ethical approval from a human subject or animal welfare committee.
Data accessibility. The data that support the research results of this study are fully available [51]. The dataset underlying the results presented in this article is available in [38].
Declaration of AI use. We have not used AI-assisted technologies in creating this article.
Authors' contributions. S.M.: conceptualization, data curation, methodology, writing-original draft; M.E.: investigation, software, writing-review and editing; K.A.W.: supervision, validation; K.E.L.: supervision, validation.
All authors gave final approval for publication and agreed to be held accountable for the work performed therein.
Conflict of interest declaration. We declare we have no competing interests.
Funding. The New Frontiers in Research Fund Explorations (NFRFE).
Acknowledgements. We are grateful to Dr. Paul Babyn, Dr. Hamid Dabirzadeh and Dr. Peter Szkup from the Royal University Hospital for their constructive support of our work.
Cite this article: Moghtaderi S, Einlou M, Wahid KA, Lukong KE. 2024 Advancing multimodal medical image fusion: an adaptive image decomposition approach based on multilevel Guided filtering. R. Soc. Open Sci. 11: rsos.231762.
Received: 20 November 2023
Accepted: 25 February 2024
© 2024 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.
References
1. Mohan G, Subashini MM. 2018 MRI based medical image analysis: survey on brain tumor grade classification. Biomed. Signal Process. Control 39, 139-161. (doi:10.1016/j.bspc.2017.07.007)
2. Noble JA, Navab N, Becher H. 2011 Ultrasonic image analysis and image-guided interventions. Interface Focus 1, 673-685. (doi:10.1098/rsfs.2011.0025)
3. Palanisami D, Mohan N, Ganeshkumar L. 2022 A new approach of multi-modal medical image fusion using intuitionistic fuzzy set. Biomed. Signal Process. Control, 103762. (doi:10.1016/j.bspc.2022.103762)
4. Sufyan A, Imran M, Shah SA, Shahwani H, Wadood AA. 2022 A novel multimodality anatomical image fusion method based on contrast and structure extraction. Int. J. Imaging Syst. Tech. 32,324-342. (doi:10.1002/ima.22649)
5. Diwakar M, Singh P, Shankar A. 2021 Multi-modal medical image fusion framework using co-occurrence filter and local extrema in NSST domain. Biomed. Signal Process. Control 68, 102788. (doi:10.1016/j.bspc.2021.102788)
6. Chen J, Zhang L, Lu L, Li Q, Hu M, Yang X. 2021 A novel medical image fusion method based on Rolling Guidance Filtering. Internet of Things 14, 100172. (doi:10.1016/j.iot.2020.100172)
7. Li X, Guo X, Han P, Wang X, Li H, Luo T. 2020 Laplacian redecomposition for multimodal medical image fusion. IEEE Trans. Instrum. Meas. 69, 6880-6890. (doi:10.1109/TIM.2020.2975405)
8. Yang Y, Que Y, Huang S, Lin P. 2017 Multiple visual features measurement with gradient domain guided filtering for multisensor image fusion. IEEE Trans. Instrum. Meas. 66,691-703. (doi:10.1109/TIM.2017.2658098)
9. Chouhan V, Singh SK, Khamparia A, Gupta D, Tiwari P, Moreira C, Damaševičius R, de Albuquerque VHC. 2020 A novel transfer learning based approach for pneumonia detection in chest X-ray images. Appl. Sci. 10, 559. (doi:10.3390/app10020559)
10. Du J, Li W, Lu K, Xiao B. 2016 An overview of multi-modal medical image fusion. Neurocomputing 215, 3-20. (doi:10.1016/j.neucom.2015.07.160)
11. Jaiswal AK, Tiwari P, Kumar S, Gupta D, Khanna A, Rodrigues J. 2019 Identifying pneumonia in chest X-rays: a deep learning approach. Measurement 145, 511-518. (doi:10.1016/j.measurement.2019.05.076)
12. Kumar Mallick P, Ryu SH, Satapathy SK, Mishra S, Nguyen GN, Tiwari P. 2019 Brain MRI image classification for cancer detection using deep wavelet autoencoder-based deep neural network. IEEE Access 7,46278-46287. (doi:10.1109/ACCESS.2019.2902252)
13. Tiwari P, Qian J, Li Q, Wang B, Gupta D, Khanna A, Rodrigues J, de Albuquerque VHC. 2018 Detection of subtype blood cells using deep learning. Cogn. Syst. Res. 52,1036-1044. (doi:10.1016/j.cogsys.2018.08.022)
14. Tan W, Tiwari P, Pandey HM, Moreira C, Jaiswal AK. 2020 Multimodal medical image fusion algorithm in the era of big data. Neural Comput. Applic. 1-21. (doi:10.1007/s00521-020-05173-2)
15. Feng Y, Wu J, Hu X, Zhang W, Wang G, Zhou X, Zhang X. 2023 Medical image fusion using bilateral texture filtering. Biomed. Signal Process. Control, 105004. (doi:10.1016/j.bspc.2023.105004)
16. Tan W, Thitøn W, Xiang P, Zhou H. 2021 Multi-modal brain image fusion based on multi-level edge-preserving filtering. Biomed. Signal Process. Control 64,102280. (doi:10.1016/j.bspc.2020.102280)
17. Wang J, Li X, Zhang Y, Zhang X. 2018 Adaptive decomposition method for multi-modal medical image fusion. IET Image Process. 12,1403- 1412. (doi:10.1049/iet-ipr.2017.1067)
18. Li X, Zhou F, Tan H, Zhang W, Zhao C. 2021 Multimodal medical image fusion based on joint bilateral filter and local gradient energy. Inf. Sei. 569,302-325. (doi:10.1016/j.ins.2021.04.052)
19. Zhu Z, Zheng M, Qi G, Wang D, Xiang Y. 2019 A phase congruency and local Laplacian energy based multi-modality medical image fusion method in NSCT domain. IEEE Access 7, 20811-20824. (doi:10.1109/ACCESS.2019.2898111)
20. Qu XB, Yan JW, Xiao HZ, Zhu ZQ. 2008 Image fusion algorithm based on spatial frequency-motivated pulse coupled neural networks in nonsubsampled contourlet transform domain. Acta Automatica Sinica 34, 1508-1514. (doi:10.1016/S1874-1029(08)60174-3)
21. Yin M, Liu X, Liu Y, Chen X. 2019 Medical image fusion with parameter-adaptive pulse coupled neural network in nonsubsampled shearlet transform domain. IEEE Trans. Instrum. Meas. 68, 49-64. (doi:10.1109/TIM.2018.2838778)
22. Hermessi H, Mourali 0, Zagrouba E. 2021 Multimodal medical image fusion review: theoretical background and recent advances. Signal Process. 183,108036. (doi:10.1016/j.sigpro.2021.108036)
23. Liu Y, Chen X, Peng H, Wang Z. 2017 Multi-focus image fusion with a deep convolutional neural network. Inf. Fusion 36,191-207. (doi:10.1016/ j.inffus.2016.12.001)
24. Wang K, Zheng M, Wei H, Qi G, Li Y. 2020 Multi-modality medical image fusion using convolutional neural network and contrast pyramid. Sensors 20,2169. (doi:10.3390/s20082169)
25. Li W, Lin Q, Wang K, Cai K. 2021 Improving medical image fusion method using fuzzy entropy and nonsubsampling contourlet transform. Int. J. Imaging Syst. Tech. 31, 204-214. (doi:10.1002/ima.22476)
26. Qiu T, Wen C, Xie K, Wen F, Sheng G, Tang X. 2019 Efficient medical image enhancement based on CNN-FBB model. IET Image Process. 13, 1736-1744. (doi:10.1049/iet-ipr.2018.6380)
27. Zhu Z, Yin H, Chai Y, Li Y, Qi G. 2018 A novel multi-modality image fusion method based on image decomposition and sparse representation. Inf. Sei. 432,516-529. (doi:10.1016/j.ins.2O17.09.010)
28. Qi Y, Yang Z, Sun W, Lou M, Lian J, Zhao W, Deng X, Ma Y. 2022 A comprehensive overview of image enhancement techniques. Arch. Computat. Methods Eng. 29, 583-607. (doi:10.1007/s11831-021-09587-6)
29. Zhang Q, Shen X, Xu L, Jia J. 2014 Rolling guidance filter. In European Conf. on Computer Vision, pp. 815-830. Springer. (doi:10.1007/978-3-319-10578-9)
30. He K, Sun J, Tang X. 2012 Guided image filtering. IEEE Trans. Pattern Anal. Mach. Intell. 35,1397-1409.
31. Wang Z, Ma Y. 2008 Medical image fusion using m-PCNN. Inf. Fusion 9,176-185. (doi:10.1016/j.inffus.2007.04.003)
32. Jiang L, Zhang D, Che L. 2021 Texture analysis-based multi-focus image fusion using a modified Pulse-Coupled Neural Network (PCNN). Signal Process. Image Commun. 91, 116068. (doi:10.1016/j.image.2020.116068)
33. He Z, Chen X, Sun L. 2015 Saliency mapping enhanced by structure tensor. Comput. Intell. Neurosci. 2015,875735. (doi:10.1155/2015/875735)
34. Brox T, Weickert J, Burgeth B, Mrázek P. 2006 Nonlinear structure tensors. Image Vision Comput. 24, 41-55. (doi:10.1016/j.imavis.2005.09.010)
35. Köthe U. 2003 Edge and junction detection with an improved structure tensor. In Joint Pattern Recognition Symp., pp. 25-32. Springer.
36. Baghaie A, Yu Z. 2015 Structure tensor based image interpolation method. AEU - Int. J. Electron. Commun. 69, 515-522. (doi:10.1016/j.aeue. 2014.10.022)
37. Tan W, Zhou H, Song J, Li H, Yu Y, Du J. 2019 Infrared and visible image perceptive fusion through multi-level Gaussian curvature filtering image decomposition. Appl. Opt. 58, 3064-3073. (doi:10.1364/AO.58.003064)
38. Johnson KA, Becker JA. Whole brain atlas. See http://www.med.harvard.edu/AANLIB/
39. Zhang Y, Xiang W, Zhang S, Shen J, Wei R, Bai X, Zhang L, Zhang Q. 2022 Local extreme map guided multi-modal brain image fusion. Front. Neurosci. 16,1866. (doi:10.3389/fnins.2022.1055451)
40. Zhu R, Li X, Huang S, Zhang X. 2022 Multimodal medical image fusion using adaptive co-occurrence filter-based decomposition optimization model. Bioinformatics 38,818-826. (doi:10.1093/bioinformatics/btab721)
41. Jie Y, Li X, Tan H, Zhou F, Wang G. 2024 Multi-modal medical image fusion via multi-dictionary and truncated Huber filtering. Biomed. Signal Process. Control, 105671. (doi:10.1016/j.bspc.2023.105671)
42. Jie Y, Li X, Wang M, Zhou F, Tan H. 2023 Medical image fusion based on extended difference-of-Gaussians and edge-preserving. Expert Syst. Appl. 227, 120301. (doi:10.1016/j.eswa.2023.120301)
43. Zhu Z, Zheng M, Qi G, Wang D, Xiang Y. 2019 A phase congruency and local Laplacian energy based multi-modality medical image fusion method in NSCT domain. IEEE Access 7, 20811-20824. (doi:10.1109/ACCESS.2019.2898111)
44. Jagalingam P, Hegde AV. 2015 A review of quality metrics for fused image. Aquat. Procedia 4, 133-142. (doi:10.1016/j.aqpro.2015.02.019)
45. Aslantas V, Bendes E. 2015 A new image quality metric for image fusion: the sum of the correlations of differences. AEU - Int. J. Electron. Commun. 69, 1890-1896. (doi:10.1016/j.aeue.2015.09.004)
46. Xydeas CS, Petrović V. 2000 Objective image fusion performance measure. Electron. Lett. 36, 308. (doi:10.1049/el:20000267)
47. Piella G, Heijmans H. 2003 A new quality metric for image fusion. In Proc. 2003 Int. Conf. on Image Processing (Cat. No. 03CH37429), vol. 3, Barcelona, Spain, pp. III-173. IEEE.
48. Li S, Hong R, Wu X. 2008 A novel similarity based quality metric for image fusion. In Int. Conf. on Audio, Language and Image Processing, pp. 167-172. IEEE.
49. Wang Z, Bovik AC, Sheikh HR, Simoncelli EP. 2004 Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13, 600-612. (doi:10.1109/TIP.2003.819861)
50. Haghighat MBA, Aghagolzadeh A, Seyedarabi H. 2011 A non-reference image fusion metric based on mutual information of image features. Comput. Elect. Eng. 37,744-756. (doi:10.1016/j.compeleceng.2011.07.012)
51. Cloud SM. 2024 Endoscopic-image-enhancement-wavelet-transform-and-guided-filter-decomposition-based-fusion-approach. See https:// github.com/S-M-Cloud/Advancing-multimodal-medical-image-fusion
Details
1 Department of Electrical and Computer Engineering, University of Saskatchewan, Saskatoon, Saskatchewan S7N 5A9, Canada
2 Department of Biochemistry, Microbiology and Immunology, University of Saskatchewan, Saskatoon, Saskatchewan S7N 5E5, Canada