Abstract
Underwater images provide substantial information for many tasks in marine science and coastal engineering. At the same time, correcting severe underwater image degradation, such as wavelength-dependent color distortion and decreased contrast, is essential for practical applications. Although deep learning-based underwater image enhancement methods have been developed extensively, the construction of a large-scale underwater image dataset remains an open issue. Collection of real data is currently hindered by high cost and the difficulty of measurement, while the alternatively employed synthetic underwater images, based on simplified physical models or generative adversarial networks, may deviate from real data. To reduce the domain gap between real and synthetic underwater images, we generate underwater images based on a physically revised underwater image formation model. By reformulating the model as Monte Carlo integration, as in statistical physics, we avoid variable multiplication and make the calculation numerically feasible. The constructed dataset is shown to include diverse degradation and to be closer to real images. Subsequently, underwater image color correction is tackled via exemplar-based style transfer to cope with diverse color casts. Finally, a simply designed image sharpening algorithm combining the discrete wavelet transform and the Laplacian pyramid is proposed to improve visibility. The proposed scheme mostly achieves superior or competitive performance compared with other recent methods.
Introduction
Underwater optical images play an indispensable role in sensing ocean environments. As underwater images give high-resolution color information at low cost, they are widely utilized in many tasks, such as monitoring marine environments in ocean science or developing ocean resources in coastal engineering [51]. On the other hand, depending on physical conditions like water types or lighting conditions, real underwater images seriously suffer from image degradation [39, 53]. Specifically, blueish, yellowish, and greenish color tones caused by wavelength-selective color distortion, as well as blur and decreased contrast induced by the underwater physical process, worsen visibility and limit the availability of underwater images [3]. Accordingly, underwater image enhancement across various degrees and kinds of degradation is a challenging and important task in underwater image processing [5].
Similar to other vision tasks, constructing a large-scale training dataset with ground truth is a nontrivial problem in developing deep learning-based underwater image enhancement techniques. One reason is that obtaining a sufficient number of clear and degraded image pairs in real underwater conditions via two different sensing devices is costly, as well as inherently difficult in turbid water areas.
As an alternative to real images, synthetic datasets generated with simplified physical models [17, 34, 58] or generative adversarial networks (GANs) [16, 26, 27] are often employed in underwater imaging.
Fig. 1 [Images not available. See PDF.]
Examples of underwater images synthesized with our proposed scheme. An input land image (upper 1st column) is transformed depending on the depth map (lower 1st column), resulting in six synthesized underwater images (2nd column to 7th column) with different lighting conditions and camera sensors. Synthetic images of water type I (1st row) and water type 3C (2nd row), classified by Jerlov [28], are shown
When artificial underwater images are generated based on a physical model, the complex underwater physical process is usually simplified in modeling. Detailed descriptions of the mathematical modeling are found in Sect. 3. Li et al. [34] synthesized underwater images by assigning constant attenuation coefficients to the RGB-D indoor image dataset [46], mathematically described as Eq. 3. The attenuation coefficients are defined per RGB channel and respectively set to visually match each of the 10 water types classified by Jerlov [28]. While ground truth images are beneficially given by the indoor images, the dependency of the attenuation coefficients on physical parameters is simplified. Fu et al. [17] consider wavelength-dependent scattering and attenuation simultaneously to better reflect real environments. As discussed in [1, 3, 5], however, the physical models employed for synthesizing the above datasets are simplified and incorrect in that they only consider the effect of the optical properties of water and neglect other physical parameters like lighting conditions or camera sensors. Also, each attenuation coefficient is set to a constant, although it actually depends on the distance z, as shown in Fig. 4. This means previously synthesized underwater images deviate from real images, and hence their effectiveness may be limited.
Thereafter, a revised underwater image formation model based on the physical process, defined as Eq. 4, was proposed in [1]. Compared to previous models [17, 34, 58], dependencies on physical parameters like water types, lighting conditions, camera sensors, and reflectance are explicitly considered. One point to note is that we experimentally observe that the naively calculated attenuation coefficient in Eq. 4 numerically diverges for turbid water types and large distances z. Detailed discussions are found in Sect. 3.2.3.
Fig. 2 [Images not available. See PDF.]
An overview of the presented underwater image color correction scheme. First, the content and style of clear and degraded image pairs are encoded. Then the style information is swapped, followed by the GDWCT [14] and decoder modules. The same procedure is repeated once more, returning to each original input image to take the cycle-consistency loss. Activation function, normalization, and convolution operations are denoted as act, norm, and conv, respectively
In this study, inspired by statistical physics, we reformulate the attenuation coefficients $\beta_c^D(z)$ and $\beta_c^B(z)$ in Eq. 4 as Monte Carlo integration, which avoids variable multiplication and enables the numerical calculation. The calculated attenuation coefficients are then assigned to the RGB-D indoor image dataset [46]; thus we succeed in synthesizing underwater images based on the physically revised underwater image formation model [1]. As shown in Fig. 3, the Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) of the proposed dataset are distributed more widely than those of the previous dataset based on a simplified physical model [34]. This means a wide variety of degradation is included in the proposed dataset, which is practically useful. The reality of the constructed dataset is also suggested in Table 2.
Subsequent to constructing the proposed dataset, underwater image enhancement is tackled via a color correction phase and an image sharpening phase, respectively shown in Figs. 2 and 5. As color bias can be seen as a kind of image style, style transfer is efficaciously integrated into underwater image enhancement. Specifically, adaptive instance normalization (AdaIN) [24] in style transfer, which employs the mean and variance of features, has been shown to correct color casts [12, 19]. However, as style information is also embedded in higher-order statistics like covariance [14, 21], AdaIN may be insufficient in underwater imaging. Motivated by the above, this study proposes an underwater color correction method via exemplar-based style transfer based on a GAN, as shown in Fig. 2. Compared to previous studies, the Group-wise Deep Whitening-and-Coloring Transformation (GDWCT) [14] is incorporated into the framework of cycle-consistency [59] to restore wavelength-selective color casts. The model consists of content and style encoder modules, GDWCT modules, and decoder modules. The cycle-consistency loss is measured between an original input image and a reconstructed image to stabilize training.
Finally, we present a quite simple image sharpening algorithm, Sharpening with LAplacian pyramid on Wavelet decomposition (SLAW), to enhance blurred underwater images. As shown in Fig. 5, SLAW recursively adds high-frequency signals from the Laplacian pyramid to each component decomposed with the discrete wavelet transform. Employed as pre-processing, SLAW improves the visibility of blurred underwater images. The overall underwater enhancement scheme mostly attains state-of-the-art or competitive performance in terms of UIQM [40] and UCIQE [54]. Our main contributions are the following:
Reformulating the physically revised underwater image formation model as Monte Carlo integration, as in statistical physics, which avoids variable multiplication and enables the computation of attenuation coefficients at large distances z and for turbid water types.
Constructing a large-scale synthetic underwater image dataset based on the formulation. The dataset is shown to include a wide variety of degradation and to be more realistic compared with a previous approach.
Proposing a simple image sharpening algorithm combining the discrete wavelet transform and the Laplacian pyramid, which is suitable for pre-processing blurred underwater images.
Presenting an underwater image color correction method based on exemplar-based style transfer, achieving state-of-the-art or competitive performance compared with previous mainstream methods.
Related work
Construction of synthetic image dataset
Other than physical models, underwater image datasets have been constructed based on GANs [16, 26, 27] or human ranking [35, 43]. In GAN-based approaches, clear and distorted underwater images are respectively collected. Then, mappings from clear to distorted and from distorted to clear underwater images are trained by minimizing a cycle-consistency loss [59]. Note that paired images are not required in training. The trained mapping is employed to add strong noise to clear underwater images, mimicking real underwater image degradation; thus clear and degraded underwater image pairs are obtained [16, 26, 27]. The validity of the mapping depends on the distribution of the prepared dataset and on adversarial learning. Meanwhile, covering the whole range of wavelength-selective, spatially variant real underwater image degradation is inherently difficult. [35, 43] construct clear and degraded image pairs based on human scores of results recovered via conventional algorithms, to match human perception. Though this approach advantageously tends to reflect human perception, the size of such datasets is currently limited to at most a few thousand images, as it laboriously requires human judgment one by one [35, 43]. Measurable success has been achieved in employing these datasets to train deep learning models, yet deviation from real data remains a challenging problem [5]. Meanwhile, recently proposed multimodal collaborative learning guided by an autoencoder [10] efficiently supplements incomplete data in medical image synthesis. By incorporating information given by the autoencoder as a teacher model, the synthesis network generates medical images of a target modality from embedded results of the remaining modalities.
Fig. 3 [Images not available. See PDF.]
Results of the constructed dataset. Histograms of PSNR (middle) and SSIM (right) in comparison with [34], and average colors of the proposed dataset (left). Average colors of different water types (I, IA, IB, II, III, 1C, 3C, 5C, 7C, 9C from left to right) classified by [28], lighting conditions (CIE D50, CIE D65, and Illuminant-A from top to bottom), and camera sensors (Nikon D90 and Canon 60D) are shown, respectively
Methods for underwater image enhancement
Previous underwater image enhancement methods are generally classified into supervised and unsupervised approaches. As for convolutional neural network (CNN) methods, [34] proposed UWCNN, a lightweight densely connected model in which each layer directly takes the input image. Inspired by [56], [26] incorporated dense skip connections concatenating the input and output of each layer to capture hierarchical features. Among GAN-based models, [27] proposed an encoder-decoder network equipped with skip connections. Recently, Vision Transformer (ViT)-based models with the attention mechanism have been proposed to deal with spatially variant, wavelength-selective underwater image degradation [43]. To alleviate the domain gap between training data and test data, style transfer has been effectively integrated. [19] proposed a probabilistic network named PUIE-Net to cope with biased labels; embedded features are converted to image statistics by random sampling from a multivariate Gaussian distribution [19]. To enable domain adaptation, features of content and style are separately extracted in UIESS [12], followed by decoding with swapped features via AdaIN. Also, traditional signal processing methods such as white balance, gamma correction, and histogram equalization [35], or color information beyond the RGB color space [33], further boost underwater image enhancement.
On the other hand, many unsupervised methods assume a physical model and correct underwater color distortion by estimating model parameters [2] or imposing white balance, which often accompanies estimation of the veiling light or average color [4, 36]. Based on the underwater physical process [1], Sea-thru [2] utilized RGB-D images and solved an optimization problem on distance. While showing excellent results, the method requires usually unknown physical parameters like depth information, which limits practical applications. Also, the estimation of veiling light generally needs expensive computation. [60] successfully introduced hyper-Laplacian reflectance priors reflecting gradient information based on the Retinex variational model [18]. Apart from underwater images, the scheme of the Multi-modal Gated Mixture [11] might be a promising option for correcting various color distortions. Toward integrating visible and infrared images, which are complementary in lighting conditions, [11] fuses a weighted sum of the outputs of several expert networks to achieve dynamic fusion.
Scheme of constructing the dataset
In Sect. 3.1, the physical modeling of underwater images is explained. In Sect. 3.2, the mathematical formulation described as Monte Carlo integration follows. Sections 3.3 and 3.4 discuss the detailed settings and results of the proposed dataset.
Fig. 4 [Images not available. See PDF.]
Results of the attenuation coefficients computed by the proposed scheme with different lighting conditions. Results of $\beta_c^D(z)$ (left) and $\beta_c^B(z)$ (middle) for water type I, and $\beta_c^B(z)$ for water type 9C (right). Each color represents the corresponding color channel. The attenuation coefficients are distinguished and strongly depend on the distance z, whereas they are often simplified to be the same and constant in previous studies [5, 34]
Physical modeling of the generation process of underwater images
An underwater image I(x) is often described by an additive model composed of the direct signal D(x), the backscattering component B(x), and the forward scattering component F(x), where x represents the position in the image [45]. D(x) directly reaches the sensor carrying object information, while B(x) arrives at the sensor via scattering from suspended particles and includes no signal information. F(x) is the component that indirectly reaches the sensor from objects underwater, giving I(x) a blurred appearance. Here, F(x) is often omitted in modeling for brevity [3]. In this study, the blurred appearance of I(x) is enhanced by image sharpening rather than modeling, as described in Sect. 5. Hence, I(x) is modeled as Eq. 1 [1].
$$ I(x) = D(x) + B(x) \tag{1} $$
Considering the characteristics of wavelength-dependent attenuation in water, I(x) is further modeled as Eq. 2, where $\lambda$ represents the wavelength [13]. Here, $t_\lambda(x)$ works as the ratio of the original signal $J_\lambda(x)$ that survives propagation.

$$ I_\lambda(x) = J_\lambda(x)\, t_\lambda(x) + B_\lambda \bigl(1 - t_\lambda(x)\bigr) \tag{2} $$
Due to the similarity to an atmospheric scattering model on land, described by Koschmieder's model, $t_\lambda(x)$ is simplified to Eq. 3 [5, 32], where c and $\beta_c$ are the color channels and attenuation coefficients, respectively. Each component of Eq. 3 exponentially attenuates with the distance z from objects to the sensor depending on the respective color channel, which corresponds to wavelength-selective signal attenuation in water.

$$ I_c(x) = J_c(x)\, e^{-\beta_c z} + B_c^\infty \bigl(1 - e^{-\beta_c z}\bigr) \tag{3} $$
In underwater imaging, each $\beta_c$ in Eq. 3 is often set from the viewpoint of wavelength attenuation to reflect the underwater physical process [13, 17, 34, 58]. More recently, a revised underwater image formation model based on the physical process was proposed [1]. Its mathematical formulation is defined as Eq. 4.
$$ I_c = J_c\, e^{-\beta_c^D(z)\, z} + B_c^\infty \bigl(1 - e^{-\beta_c^B(z)\, z}\bigr) \tag{4} $$
Here, the attenuation coefficient $\beta_c^D(z)$ of the direct component and $\beta_c^B(z)$ of the backscattering component are clearly distinguished [1]. $B_c^\infty$ represents the veiling light of the c channel, later defined in Eq. 14. The major difference from previous models is that physical parameters are explicitly modeled, such as the effects of the illumination irradiance $E(d, \lambda)$ and the spectrum response $S_c(\lambda)$ of a sensor, besides the optical properties of water and the distance z. It is worth noting that the above physical parameters are often simplified or ignored in previous studies [1, 5, 17, 34].

Reformulation and computation of attenuation coefficients
Derivation of attenuation coefficients
First, we briefly review the revised underwater image formation model [1] described in Eq. 4. Detailed discussions and the derivation process are found in [1, 3].
Following the Beer-Lambert law and assuming a spatially homogeneous water volume [3, 44], the direct signal at distance z and wavelength $\lambda$ is described as follows:

$$ D(z, \lambda) = E(d, \lambda)\, \rho(\lambda)\, e^{-\beta(\lambda)\, z} \tag{5} $$
In practical settings, the direct signal is often measured with publicly available RGB cameras, which is formulated as:

$$ D_c(z) = \kappa \int_{\lambda_1}^{\lambda_2} S_c(\lambda)\, \rho(\lambda)\, E(d, \lambda)\, e^{-\beta(\lambda)\, z}\, d\lambda \tag{6} $$
The illumination irradiance $E(d, \lambda)$ is attenuated by the term $e^{-\beta(\lambda) z}$ while propagating underwater, followed by the terms of the reflectance spectrum $\rho(\lambda)$ of objects and the spectrum response $S_c(\lambda)$ of a sensor. $\beta(\lambda)$ and $\kappa$ are the beam attenuation coefficient, mainly governed by water constituents, and a scaling constant of the imaging process such as exposure, respectively. $\lambda_1$ and $\lambda_2$ stand for the measured spectrum range. As $\kappa$ is eliminated in the ratio of the integrals of Eq. 6, we can simplify Eq. 5 to the following equation.

$$ D_c(z) = D_c(0)\, e^{-\beta_c^D(z)\, z} \tag{7} $$
Therefore, the attenuation coefficient $\beta_c^D(z)$ is obtained by transforming Eqs. 6 and 7.

$$ \beta_c^D(z) = \frac{1}{z} \ln \frac{D_c(0)}{D_c(z)} \tag{8} $$

$$ \beta_c^D(z) = -\frac{1}{z} \ln \frac{\displaystyle\int_{\lambda_1}^{\lambda_2} S_c(\lambda)\, \rho(\lambda)\, E(d, \lambda)\, e^{-\beta(\lambda)\, z}\, d\lambda}{\displaystyle\int_{\lambda_1}^{\lambda_2} S_c(\lambda)\, \rho(\lambda)\, E(d, \lambda)\, d\lambda} \tag{9} $$
Subsequently, the derivation of the attenuation coefficient $\beta_c^B(z)$ of backscattering follows. As the second term of Eq. 4 is a function of the distance z, we can rewrite the backscattering component $B_c(z)$:

$$ B_c(z) = B_c^\infty \bigl(1 - e^{-\beta_c^B(z)\, z}\bigr) \tag{10} $$
To derive $\beta_c^B(z)$, we consider its physical process. The infinitesimal backscattering signal dB received within a small range dz is given by:

$$ dB(z, \lambda) = b(\lambda)\, E(d, \lambda)\, e^{-\beta(\lambda)\, z}\, dz \tag{11} $$
Here, $e^{-\beta(\lambda) z}$ represents the exponential decay based on the Beer-Lambert law, and $b(\lambda)$ represents the beam scattering coefficient characterizing water constituents. By integrating both sides of Eq. 11, we can obtain $B(z, \lambda)$, where $B^\infty(\lambda) = b(\lambda)\, E(d, \lambda) / \beta(\lambda)$.

$$ B(z, \lambda) = B^\infty(\lambda)\bigl(1 - e^{-\beta(\lambda)\, z}\bigr) \tag{12} $$
$B^\infty(\lambda)$ represents the spectrum of the veiling light, corresponding to the ambient light in water. Similar to Eq. 6, the captured RGB response of the backscattering is:

$$ B_c(z) = \kappa \int_{\lambda_1}^{\lambda_2} S_c(\lambda)\, B^\infty(\lambda)\bigl(1 - e^{-\beta(\lambda)\, z}\bigr)\, d\lambda \tag{13} $$
Note that the veiling light $B_c^\infty$ captured by the sensor is obtained by setting $z \to \infty$ in Eq. 13.

$$ B_c^\infty = \kappa \int_{\lambda_1}^{\lambda_2} S_c(\lambda)\, B^\infty(\lambda)\, d\lambda \tag{14} $$
Therefore, by employing $B_c^\infty$ and transforming Eqs. 10 and 13, $\beta_c^B(z)$ is expressed as the following equations.

$$ \beta_c^B(z) = -\frac{1}{z} \ln \Bigl(1 - \frac{B_c(z)}{B_c^\infty}\Bigr) \tag{15} $$

$$ \beta_c^B(z) = -\frac{1}{z} \ln \frac{\displaystyle\int_{\lambda_1}^{\lambda_2} S_c(\lambda)\, B^\infty(\lambda)\, e^{-\beta(\lambda)\, z}\, d\lambda}{\displaystyle\int_{\lambda_1}^{\lambda_2} S_c(\lambda)\, B^\infty(\lambda)\, d\lambda} \tag{16} $$
Reformulation of attenuation coefficients with Monte Carlo integration
Subsequently, we reformulate $\beta_c^D(z)$ in Eq. 9 and $\beta_c^B(z)$ in Eq. 16 from the viewpoint of Monte Carlo integration. $\beta_c^D(z)$ and $\beta_c^B(z)$ are respectively expressed as expectation calculations over the densities $p(\lambda)$ and $q(\lambda)$, given by:

$$ \beta_c^D(z) = -\frac{1}{z} \ln \frac{\displaystyle\int_{\lambda_1}^{\lambda_2} S_c(\lambda)\, \rho(\lambda)\, E(d, \lambda)\, e^{-\beta(\lambda)\, z}\, d\lambda}{\displaystyle\int_{\lambda_1}^{\lambda_2} S_c(\lambda)\, \rho(\lambda)\, E(d, \lambda)\, d\lambda} \tag{17} $$

$$ p(\lambda) = \frac{S_c(\lambda)\, \rho(\lambda)\, E(d, \lambda)}{Z_D} \tag{18} $$

$$ \beta_c^D(z) = -\frac{1}{z} \ln \mathbb{E}_{p(\lambda)}\bigl[e^{-\beta(\lambda)\, z}\bigr] \tag{19} $$

$$ \beta_c^B(z) = -\frac{1}{z} \ln \frac{\displaystyle\int_{\lambda_1}^{\lambda_2} S_c(\lambda)\, B^\infty(\lambda)\, e^{-\beta(\lambda)\, z}\, d\lambda}{\displaystyle\int_{\lambda_1}^{\lambda_2} S_c(\lambda)\, B^\infty(\lambda)\, d\lambda} \tag{20} $$

$$ q(\lambda) = \frac{S_c(\lambda)\, B^\infty(\lambda)}{Z_B} \tag{21} $$

$$ \beta_c^B(z) = -\frac{1}{z} \ln \mathbb{E}_{q(\lambda)}\bigl[e^{-\beta(\lambda)\, z}\bigr] \tag{22} $$
where $Z_D = \int_{\lambda_1}^{\lambda_2} S_c(\lambda)\, \rho(\lambda)\, E(d, \lambda)\, d\lambda$ and $Z_B = \int_{\lambda_1}^{\lambda_2} S_c(\lambda)\, B^\infty(\lambda)\, d\lambda$. Here, we rewrite Eqs. 17 and 20 from Eqs. 9 and 16 for clarity. By interpreting the denominator in Eq. 17 as the normalization constant $Z_D$, independent of $\lambda$, in Eq. 18, we can obtain $\beta_c^D(z)$ within the framework of an expected-value computation, as in Eq. 19. $\beta_c^B(z)$ can be obtained in the same way, as in Eq. 22. The expected values formulated in Eqs. 19 and 22 are computed within the framework of Monte Carlo integration by using Markov chain Monte Carlo (MCMC). In calculating the attenuation coefficient $\beta_c^D(z)$, for instance, the wavelength $\lambda$ is repeatedly sampled from $p(\lambda)$ N times. Each sampled $\lambda$ is then assigned to the integrand $e^{-\beta(\lambda) z}$ in Eq. 19, followed by simple averaging over the N samples. Similarly, $\beta_c^B(z)$ is computed by employing $q(\lambda)$ and Eq. 22 instead of $p(\lambda)$ and Eq. 19. The calculation accuracy of the above formulation improves on the order of $O(1/\sqrt{N})$ with the sample size N [9]. This study uses the Metropolis-Hastings algorithm [22] as the MCMC method. We draw 1,300,000 samples; the first 300,000 samples are discarded as burn-in, and the remaining 1,000,000 samples are employed. Namely, we compute the Monte Carlo integration of $\beta_c^D(z)$ and $\beta_c^B(z)$ at an accuracy of about 0.001.
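To make the procedure concrete, the following is a minimal NumPy sketch of the Metropolis-Hastings estimation of $\beta_c^D(z)$ in Eq. 19. The function names, the random-walk proposal, and the assumption that the spectra $S_c(\lambda)$, $\rho(\lambda)$, $E(d, \lambda)$, and $\beta(\lambda)$ are available as vectorized callables are ours for illustration, not the exact implementation used in this study.

```python
import numpy as np

def metropolis_hastings(log_p, n_samples, burn_in, x0, step, lo, hi, rng):
    """Random-walk Metropolis-Hastings sampling of a scalar variable from an
    unnormalized log-density; the normalization constant Z_D cancels out."""
    samples = np.empty(n_samples)
    x, lp = x0, log_p(x0)
    for i in range(burn_in + n_samples):
        prop = x + rng.normal(0.0, step)
        if lo <= prop <= hi:           # stay inside the measured spectrum range
            lp_prop = log_p(prop)
            if np.log(rng.random()) < lp_prop - lp:
                x, lp = prop, lp_prop  # accept; otherwise keep the current x
        if i >= burn_in:
            samples[i - burn_in] = x
    return samples

def beta_D(z, S_c, rho, E, beta, lam1=400.0, lam2=700.0, seed=0):
    """Eq. 19: beta_c^D(z) = -ln E_p[exp(-beta(lam) * z)] / z, where
    p(lam) is proportional to S_c(lam) * rho(lam) * E(lam)."""
    rng = np.random.default_rng(seed)
    log_p = lambda lam: np.log(S_c(lam) * rho(lam) * E(lam))
    lam = metropolis_hastings(log_p, 1_000_000, 300_000,
                              0.5 * (lam1 + lam2), 10.0, lam1, lam2, rng)
    # Average the sampled exponentials directly; no product of small factors
    return -np.log(np.mean(np.exp(-beta(lam) * z))) / z
```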
Advantages of proposed formulation
By assigning the related variables to Eqs. 19 and 22, we can obtain $\beta_c^D(z)$ and $\beta_c^B(z)$, as described in Sect. 3.3. In preliminary experiments, we empirically observed that the naive computation of $\beta_c^B(z)$ numerically diverges, especially for turbid water types and large z. Concretely speaking, the beam attenuation coefficient $\beta(\lambda)$ is nearly 2.8 in water type 9C, while it is about 0.02 in water type I, as shown in Fig. 4. At z = 100, $e^{-\beta(\lambda) z}$ becomes approximately 2.49E-122 and 0.13, respectively. As the exponential term of Eq. 20 is nearly zero in the former case, the remaining variable multiplication underflows to zero, so that $B_c(z)$ and $B_c^\infty$ become numerically identical. Accordingly, the argument of the logarithmic term in Eq. 15 becomes zero and $\beta_c^B(z)$ diverges. On the other hand, computation in the framework of Monte Carlo integration, which simply averages the sampled values, does not diverge, since it avoids the variable multiplication of $S_c(\lambda)\, B^\infty(\lambda)$ and $e^{-\beta(\lambda) z}$. It is noteworthy that the computation of $\beta_c^D(z)$ in this setting does not numerically diverge via the usual integration, as the relevant distance z is usually smaller than that for $\beta_c^B(z)$. Compared with previous research solving an optimization problem concerning the distance z [2], the proposed scheme defined as Monte Carlo integration does not require RGB-D images in computing $\beta_c^D(z)$ and $\beta_c^B(z)$. Also, our scheme may beneficially enable the computation of attenuation coefficients in settings where other spectrum data are employed, as well as at large z.
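The failure mode can be reproduced in a few lines. Below is a contrived float64 demonstration (the weight magnitudes are chosen only to force underflow and do not correspond to actual spectra): the naive ratio of integrals collapses to the logarithm of zero, while averaging the sampled exponentials stays finite.

```python
import numpy as np

rng = np.random.default_rng(0)
z = 100.0
beta = rng.uniform(2.6, 3.0, size=1_000_000)     # turbid water, e.g. type 9C
w = rng.uniform(1e-250, 1e-240, size=beta.size)  # stand-in integrand weights

# Naive quadrature: each product w * exp(-beta*z) underflows to 0.0 in
# float64, so the ratio is 0 and the logarithm diverges.
naive = -np.log((w * np.exp(-beta * z)).sum() / w.sum()) / z   # -> inf

# Monte Carlo form (Eq. 22): exp(-beta*z) ~ 1e-122 is still representable,
# so the sample mean and its logarithm remain finite.
mc = -np.log(np.mean(np.exp(-beta * z))) / z                   # -> about 2.6
print(naive, mc)
```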
Table 1. Standard deviations of PSNR and SSIM per 10 water types. The proposed dataset contains more diverse degradation than the previous dataset [34]
| Metric | Method | I | IA | IB | II | III | 1C | 3C | 5C | 7C | 9C |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PSNR | Proposed | 8.165 | 6.807 | 5.63 | 6.333 | 4.332 | 6.515 | 4.367 | 4.486 | 3.562 | 3.647 |
| PSNR | Li et al. [34] | 2.292 | 2.345 | 2.41 | 2.657 | 2.49 | 2.41 | 2.072 | 2.449 | 2.882 | 2.82 |
| SSIM | Proposed | 0.193 | 0.201 | 0.195 | 0.219 | 0.187 | 0.188 | 0.188 | 0.200 | 0.175 | 0.175 |
| SSIM | Li et al. [34] | 0.095 | 0.097 | 0.101 | 0.110 | 0.113 | 0.109 | 0.106 | 0.093 | 0.095 | 0.093 |
Process of constructing the proposed dataset
Following Eqs. 19 and 22, we compute $\beta_c^D(z)$ and $\beta_c^B(z)$. Specifically, the attenuation coefficients are calculated for each of the ten water types classified by Jerlov [28]. We utilize the spectrum data of the beam scattering coefficient $b(\lambda)$ and the beam attenuation coefficient $\beta(\lambda)$, governed by water constituents, per the ten water types from [48]. The spectrum responses $S_c(\lambda)$ are adopted from the Nikon D90 and Canon 60D [29], and the reflectance $\rho(\lambda)$ of objects from the Macbeth chart [41]. To mimic natural lighting, the spectra of CIE D50, CIE D65, and Illuminant-A from [25] are employed for the illumination irradiance $E(d, \lambda)$. As in [1], the reflectance of the Macbeth chart is averaged over all color patches, which hardly affects the final result [1]. In total, we synthesize 60 kinds of attenuation coefficients.
We can then synthesize realistic underwater images by assigning the computed $\beta_c^D(z)$ and $\beta_c^B(z)$ to the underwater image formation model described in Eq. 4 [1]. To obtain depth information, we employ the 1449 indoor RGB-D images of the NYU Depth Dataset V2 [46], as in [34]; thus 86,940 underwater images are generated in total. It is noteworthy that we can further obtain more images by employing additional data. Examples of the synthesized images, containing various colors and degrees of degradation, are illustrated in Fig. 1.
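For reference, the rendering step itself reduces to a few array operations. The sketch below applies Eq. 4 per pixel; the callable interfaces for the distance-dependent coefficients and the veiling light of Eq. 14 are illustrative assumptions.

```python
import numpy as np

def synthesize_underwater(J, depth, beta_D, beta_B, B_inf):
    """Render an underwater image via Eq. 4 from a clean RGB image J in
    [0, 1] (H, W, 3) and a depth map in meters (H, W). beta_D and beta_B
    are callables returning per-channel attenuation coefficients at the
    given distances, and B_inf is the RGB veiling light from Eq. 14."""
    z = depth[..., None]                          # broadcast over RGB channels
    direct = J * np.exp(-beta_D(z) * z)           # attenuated direct signal
    backscatter = B_inf * (1.0 - np.exp(-beta_B(z) * z))
    return np.clip(direct + backscatter, 0.0, 1.0)
```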
Results of calculated attenuation coefficients and constructed dataset
The 1st to 3rd columns of Fig. 4 show the computed results of $\beta_c^D(z)$ and $\beta_c^B(z)$ for water type I, and $\beta_c^B(z)$ for water type 9C, classified by Jerlov [28]. Results for the different light spectrum data of CIE D50, CIE D65, and Illuminant-A are plotted in Fig. 4. Each color represents the corresponding color channel, and the y-axis represents the distance z. As discussed in [1–3], both attenuation coefficients are distinguished and prominently depend on the distance z, whereas [34] incorrectly assumed them to be the same and constant in z. Reflecting the water properties, the red channel attenuates fast and yields a blueish color tone in water type I, while a greenish or yellowish appearance arises in water type 9C because the blue channel attenuates more.
To quantitatively compare the proposed dataset with the previous dataset based on a simplified physical model [34], PSNR and SSIM are computed. Figure 3 illustrates the histograms of PSNR (middle) and SSIM (right). Compared to [34] (blue), ours (red) ranges more widely, indicating that the degradation is more diverse. Also, the standard deviations per the 10 water types of the proposed dataset are consistently higher than those of [34], as shown in Table 1. A wide variety of degradation in the constructed dataset should be practically valuable for training supervised methods.
Subsequently, we compute the classification score employed in style transfer tasks [50] to quantitatively evaluate the reality of the dataset. We fine-tune a pre-trained VGG16 network [47] to distinguish real underwater images from synthesized ones, and the percentage of images classified as "real" underwater images is evaluated. A higher score implies more realistic underwater images. Real and synthesized underwater images are respectively taken from [35] and [26]. In evaluating the classification score, 90 real images not utilized in training are treated as a baseline. The results are shown in Table 2. Owing to taking the complex underwater physical process into account, the score of our dataset is about 10 points higher than that of [34], which suggests that the proposed dataset is more realistic. All real underwater images are classified as real. It is noteworthy that, to our knowledge, no other synthetic underwater image dataset synthesized from the NYU Depth Dataset V2 is publicly available except [34]; hence [17, 58] are not compared in this study.
Table 2. Classification score for the proposed dataset and a previous dataset [34]
| Dataset | Classification score |
|---|---|
| Real (baseline) | 1.0000 |
| Ours | 0.9978 |
| Li et al. [34] | 0.8830 |
Fig. 5 [Images not available. See PDF.]
An overview of the proposed SLAW. An input image is recursively sharpened with the Laplacian module. The low-frequency LL component of the wavelet transform is passed to deeper layers. Being applied to each level of the wavelet transform, multiscale image features are enhanced. One example parameter setting is shown
Scheme of underwater image color correction via exemplar-based style transfer
Subsequent to constructing the proposed dataset, we present an underwater image color correction method illustrated in Fig. 2. Details of the model and loss function are described in this section.
Setting
In this research, we consider underwater image color correction under the scheme of exemplar-based style transfer. In style transfer, the style of one image is replaced with the style of another image without changing the original content [14, 55].
Let $x_l$ be a sample of land images and $x_u$ a sample of underwater images, where $X_l$ and $X_u$ represent the domains of land and underwater images, respectively. As we consider a supervised setting, an underwater image $x_u$ and the corresponding land image $x_l$ are given. Here, the content and style of $x_l$ and $x_u$ are respectively denoted as $c_l = E_l^c(x_l)$, $s_l = E_l^s(x_l)$, $c_u = E_u^c(x_u)$, $s_u = E_u^s(x_u)$, where $E^c$ and $E^s$ are the content and style encoders for the land and underwater image domains, respectively. We assume $c_l$ and $c_u$ are common; thus the objective is obtaining a style-transferred image $\hat{x}_l = \mathrm{Dec}_l(c_u, s_l)$, where $\mathrm{Dec}_l$ denotes the decoder reconstructing land images.
Table 3. Results of sharpness [20] and RMS-contrast [42] for images processed with SLAW. Sharpness and RMS-contrast are improved with an increased number of pyramid levels L or times of wavelet transform W
| Setting | PIPAL [30] Sharpness | PIPAL [30] RMS-Contrast | UIE-890 [35] Sharpness | UIE-890 [35] RMS-Contrast | Challenging-60 [35] Sharpness | Challenging-60 [35] RMS-Contrast | Original Sharpness | Original RMS-Contrast |
|---|---|---|---|---|---|---|---|---|
| RAW | 1.063 | 0.076 | 0.449 | 0.067 | 0.245 | 0.049 | 0.116 | 0.0833 |
| SLAW-1310 | 1.372 | 0.0781 | 0.58 | 0.067 | 0.296 | 0.0494 | 0.14 | 0.0833 |
| SLAW-1311 | 2.034 | 0.084 | 0.926 | 0.0685 | 0.46 | 0.0498 | 0.236 | 0.0834 |
| SLAW-1312 | 2.464 | 0.0908 | 1.261 | 0.0708 | 0.661 | 0.0508 | 0.339 | 0.0837 |
| SLAW-2310 | 1.576 | 0.0803 | 0.68 | 0.0675 | 0.357 | 0.0496 | 0.169 | 0.0833 |
| SLAW-2311 | 2.259 | 0.0891 | 1.097 | 0.0701 | 0.579 | 0.0506 | 0.291 | 0.0836 |
| SLAW-2312 | 2.699 | 0.0987 | 1.514 | 0.0742 | 0.846 | 0.0528 | 0.441 | 0.0843 |
| SLAW-3310 | 1.632 | 0.0826 | 0.718 | 0.0682 | 0.387 | 0.0499 | 0.182 | 0.0834 |
| SLAW-3311 | 2.263 | 0.0937 | 1.137 | 0.0719 | 0.624 | 0.0516 | 0.314 | 0.084 |
| SLAW-3312 | 2.649 | 0.1046 | 1.541 | 0.0775 | 0.899 | 0.0551 | 0.478 | 0.0851 |
Underwater image color correction model
We interpret the diverse color casts of underwater images as a kind of style. The presented process is illustrated in Fig. 2. Underwater color correction is implemented based on exemplar-based style transfer incorporating the cycle-consistency loss [59] and the Group-wise Deep Whitening-and-Coloring Transformation (GDWCT) [14]. GDWCT is a modified version of the Whitening and Coloring Transform (WCT) [55]. WCT effectively normalizes input features by performing a whitening transform and a coloring transform: the covariance matrix of the content feature becomes an identity matrix in whitening, and the covariance matrix of the whitened content feature is shifted to match that of the style in coloring. Compared to WCT, the efficiently implemented GDWCT avoids the singular value decomposition (SVD), and is hence numerically stable, by splitting the channels into several groups and imposing a regularization term that encourages each group-wise covariance matrix to approach the identity.
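To make the whitening and coloring steps concrete, the following is a minimal PyTorch sketch of the plain (non-grouped) WCT on a single feature map; it uses an explicit eigendecomposition via SVD, which is exactly the costly, numerically delicate step that GDWCT's group-wise regularization is designed to avoid.

```python
import torch

def wct(content, style, eps=1e-5):
    """Whitening-and-coloring transform on (C, H, W) feature maps."""
    C, H, W = content.shape
    fc = content.reshape(C, -1)
    fs = style.reshape(C, -1)
    fc = fc - fc.mean(dim=1, keepdim=True)
    mu_s = fs.mean(dim=1, keepdim=True)
    fs = fs - mu_s
    eye = eps * torch.eye(C)
    # Whitening: make the covariance of the content feature the identity
    cov_c = fc @ fc.t() / (fc.shape[1] - 1) + eye
    Uc, Sc, _ = torch.linalg.svd(cov_c)
    whitened = Uc @ torch.diag(Sc.rsqrt()) @ Uc.t() @ fc
    # Coloring: shift the whitened covariance to match that of the style
    cov_s = fs @ fs.t() / (fs.shape[1] - 1) + eye
    Us, Ss, _ = torch.linalg.svd(cov_s)
    colored = Us @ torch.diag(Ss.sqrt()) @ Us.t() @ whitened + mu_s
    return colored.reshape(C, H, W)
```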
To obtain better results, we consider a supervised setting where image pairs are given. In the presented scheme, after encoding the style and content features of both the underwater and corresponding land images, the style features are swapped, followed by the GDWCT [14] and decoding phases; thus reconstructed images are obtained. This procedure is repeated one more time, and the original underwater and land images are respectively reconstructed. The obtained intermediate images and encoded features are utilized in the loss function for training.
The detailed procedures are described as follows. The overall network consists of four modules: the content encoder, the style encoder, the GDWCT block, and the decoder block. In the content encoder, a convolution, normalization, and activation block is repeated three times, and a subsequent residual block with a skip connection is repeated eight times to encode content features. Note that the stride of the convolution of the second and third blocks is set to two; thus the input resolution decreases. In the style encoder, a convolution, normalization, and activation block is repeated five times, followed by adaptive average pooling to summarize the output information. Except for the first block, the stride of the convolution is set to two for down-sampling. After encoding with the content and style encoders, the GDWCT and decoder modules follow. The adopted GDWCT transfers the respective styles. In the decoder, an up-sampling, convolution, and normalization block is repeated twice, and the output is obtained after a final convolution and tanh activation. As for the discriminator, a multiscale discriminator is utilized [52].
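A skeleton of the two encoders described above might look as follows in PyTorch; the channel widths, normalization choice, and style dimension are illustrative assumptions, since the text does not specify them.

```python
import torch.nn as nn

class ContentEncoder(nn.Module):
    """Three conv-norm-act blocks (stride 2 on the 2nd and 3rd),
    followed by eight residual blocks with skip connections."""
    def __init__(self, ch=64):
        super().__init__()
        layers, in_ch = [], 3
        for i, stride in enumerate([1, 2, 2]):
            out_ch = ch * (2 ** i)
            layers += [nn.Conv2d(in_ch, out_ch, 3, stride, 1),
                       nn.InstanceNorm2d(out_ch), nn.ReLU(inplace=True)]
            in_ch = out_ch
        self.down = nn.Sequential(*layers)
        self.res = nn.ModuleList([
            nn.Sequential(nn.Conv2d(in_ch, in_ch, 3, 1, 1),
                          nn.InstanceNorm2d(in_ch), nn.ReLU(inplace=True),
                          nn.Conv2d(in_ch, in_ch, 3, 1, 1),
                          nn.InstanceNorm2d(in_ch))
            for _ in range(8)])

    def forward(self, x):
        x = self.down(x)
        for block in self.res:
            x = x + block(x)          # residual (skip) connection
        return x

class StyleEncoder(nn.Module):
    """Five conv blocks (stride 2 except the first) + adaptive avg pooling."""
    def __init__(self, ch=64, style_dim=128):
        super().__init__()
        layers, in_ch = [], 3
        for i, stride in enumerate([1, 2, 2, 2, 2]):
            out_ch = min(ch * (2 ** i), 256)
            layers += [nn.Conv2d(in_ch, out_ch, 3, stride, 1),
                       nn.InstanceNorm2d(out_ch), nn.ReLU(inplace=True)]
            in_ch = out_ch
        layers += [nn.AdaptiveAvgPool2d(1), nn.Conv2d(in_ch, style_dim, 1)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)
```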
Loss function
Pixel-wise losses, encoded feature losses, and an adversarial loss are adopted to train the model. As for the pixel-wise loss functions, the cycle-consistency loss and the identity loss are employed to improve learning stability [14, 59], described as follows:

$$ \mathcal{L}_{cyc} = \bigl\lVert x_{u \to l \to u} - x_u \bigr\rVert_1 + \bigl\lVert x_{l \to u \to l} - x_l \bigr\rVert_1 \tag{23} $$

$$ \mathcal{L}_{id} = \bigl\lVert x_{u \to u} - x_u \bigr\rVert_1 + \bigl\lVert x_{l \to l} - x_l \bigr\rVert_1 \tag{24} $$
where $x_{u \to l \to u}$ denotes an originally underwater image converted to the land style and back to the underwater style in order, and the same holds for the other subscripts. The encoded feature losses, preserving the encoded style and content, are taken between the features before and after the transformation, represented as:

$$ \mathcal{L}_{style} = \bigl\lVert s_{u \to l} - s_l \bigr\rVert_1 + \bigl\lVert s_{l \to u} - s_u \bigr\rVert_1 \tag{25} $$

$$ \mathcal{L}_{content} = \bigl\lVert c_{u \to l} - c_u \bigr\rVert_1 + \bigl\lVert c_{l \to u} - c_l \bigr\rVert_1 \tag{26} $$
For the adversarial loss term, LS-GAN [38] is employed; thus the whole loss function is expressed as follows:

$$ \mathcal{L} = \lambda_{cyc}\, \mathcal{L}_{cyc} + \lambda_{id}\, \mathcal{L}_{id} + \lambda_{s}\, \mathcal{L}_{style} + \lambda_{c}\, \mathcal{L}_{content} + \mathcal{L}_{adv}^{D} + \mathcal{L}_{adv}^{G} \tag{27} $$

where $\mathcal{L}_{adv}^{D}$ and $\mathcal{L}_{adv}^{G}$ are the adversarial losses for the discriminator and the generator, respectively.
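In code, these terms reduce to a handful of tensor operations. Below is a hedged PyTorch sketch; the least-squares targets follow LS-GAN [38], the argument pairing mirrors Eqs. 23 and 24, and the weighting factors are omitted for brevity.

```python
import torch
import torch.nn.functional as F

def lsgan_d_loss(d_real, d_fake):
    """LS-GAN discriminator loss [38]: real outputs -> 1, fake outputs -> 0."""
    return F.mse_loss(d_real, torch.ones_like(d_real)) + \
           F.mse_loss(d_fake, torch.zeros_like(d_fake))

def lsgan_g_loss(d_fake):
    """LS-GAN generator loss: push discriminator outputs on fakes toward 1."""
    return F.mse_loss(d_fake, torch.ones_like(d_fake))

def pixel_losses(x_u, x_l, x_ulu, x_lul, x_uu, x_ll):
    """Cycle-consistency (Eq. 23) and identity (Eq. 24) terms as L1 norms."""
    cyc = F.l1_loss(x_ulu, x_u) + F.l1_loss(x_lul, x_l)
    idt = F.l1_loss(x_uu, x_u) + F.l1_loss(x_ll, x_l)
    return cyc, idt
```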
Sharpening with LAplacian pyramid on Wavelet decomposition (SLAW)
Other than color casts, contrast degradation caused by backscattering or forward scattering substantially reduces the visibility of underwater images, as shown in the 1st column of Fig. 7. To improve the sharpness of blurred underwater images, we introduce a fairly simple unsupervised image sharpening algorithm, Sharpening with LAplacian pyramid on Wavelet decomposition (SLAW). Figure 5 illustrates an overview of SLAW, which combines the discrete wavelet transform (DWT) and the Laplacian pyramid. The DWT and the Laplacian pyramid are briefly introduced, followed by the detailed algorithm and experiments.
Preliminary
The DWT, especially with the Haar wavelet, has recently been incorporated into deep learning [55]. The Haar wavelet divides an image into subbands reflecting frequency, using the four kernels $LL^T$, $LH^T$, $HL^T$, and $HH^T$, where $L = \frac{1}{\sqrt{2}}[1, 1]^T$ and $H = \frac{1}{\sqrt{2}}[1, -1]^T$. The LL kernel captures low-frequency signals, while high-frequency signals are captured by LH, HL, and HH. The inverse discrete wavelet transform (IDWT) is the mirror operation of the DWT, reconstructing signals from the structurally organized components with minimal noise amplification [55].
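As a reference point, one level of the 2D Haar DWT and its inverse can be written directly with array slicing; this sketch uses the averaging normalization (1/2 per axis) rather than the orthonormal 1/√2 convention, which only changes the subband scaling.

```python
import numpy as np

def haar_dwt2(img):
    """One level of the 2D Haar DWT on a (H, W) array with even sides,
    returning the LL, LH, HL, HH subbands (each H/2 x W/2)."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # low-pass over rows
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # high-pass over rows
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, LH, HL, HH

def haar_idwt2(LL, LH, HL, HH):
    """Inverse of haar_dwt2: exactly reconstructs the input image."""
    a = np.empty((LL.shape[0], LL.shape[1] * 2))
    d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = LL + LH, LL - LH
    d[:, 0::2], d[:, 1::2] = HL + HH, HL - HH
    out = np.empty((a.shape[0] * 2, a.shape[1]))
    out[0::2, :], out[1::2, :] = a + d, a - d
    return out
```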
The Laplacian pyramid is a hierarchical image representation composed of band-pass images reflecting resolution and frequency [7, 8]. Lower pyramid levels capture high-frequency signals, while higher pyramid levels capture low-frequency signals. The Laplacian pyramid is simply implemented by recursively subtracting adjacent elements of the Gaussian pyramid, which is obtained with Gaussian filtering and down-sampling. It is worth noting that the Laplacian pyramid depends on three parameters: the kernel size K and standard deviation G of the Gaussian kernel, and the pyramid level L.
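The construction fits in a few lines; the sketch below is a same-resolution variant (blurring without down-sampling) kept deliberately simple, and the kernel size K is folded into SciPy's truncation of the Gaussian.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def laplacian_pyramid(img, levels, sigma):
    """Band-pass decomposition by recursively subtracting Gaussian-blurred
    copies. Returns `levels` band-pass images plus the low-frequency top;
    summing all returned components reconstructs the input exactly."""
    pyramid, current = [], img.astype(np.float64)
    for _ in range(levels):
        blurred = gaussian_filter(current, sigma)
        pyramid.append(current - blurred)   # band-pass layer
        current = blurred
    pyramid.append(current)                 # low-frequency top component
    return pyramid
```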
Algorithm of SLAW
The presented SLAW sharpens an image by simply adding high-frequency signals, obtained by removing the top component of the Laplacian pyramid, to each component decomposed with the DWT. Applied to the multiscale image representation decomposed with the DWT, SLAW efficiently sharpens image features at various scales.
Fig. 6 [Images not available. See PDF.]
Comparison of the times of wavelet transform W. W is set to 0 to 3 from the 1st column to the 4th column. The other parameters are set to L = 3, K = 3, and G = 1. An input (5th column) is sharpened with SLAW-3313 (4th column)
Fig. 7 [Images not available. See PDF.]
Comparison of pyramid level L. Input image (1st column), SLAW-3313 (2nd column), SLAW-4313 (3rd column), and SLAW-5313 (4th column)
Fig. 8 [Images not available. See PDF.]
Comparison of different Gaussian kernels. Input image (1st column), SLAW-1312 (2nd column), SLAW-1552 (3rd column), SLAW-1992 (4th column)
The algorithm of SLAW is described in Algorithm 1. Here, the pyramid level L, the kernel size K and standard deviation G of the Gaussian kernel, and the times of wavelet transform W are given beforehand. Each parameter setting of SLAW is denoted as SLAW-LKGW in this order. An input image is first sharpened by adding a sharpened image without the top component of the Laplacian pyramid. Depending on the times of DWT, the sharpened image is then divided with the DWT using the Haar wavelet; thus a multiscale image representation is obtained, composed of high-frequency and low-frequency components. Each component is further decomposed with the Laplacian pyramid, and the corresponding high-frequency signals are extracted by once again removing the top component of the Laplacian pyramid. Sharpened outputs are obtained by respectively adding these signals back to the original components. The sharpened low-frequency component is repeatedly decomposed with the DWT and the Laplacian pyramid several times, followed by reconstruction with the inverse discrete wavelet transform, resulting in a sharpened output image.
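Under those definitions, a compact sketch of SLAW for a single-channel image reads as follows; it reuses the haar_dwt2/haar_idwt2 and Gaussian-blur helpers sketched above, and the exact ordering of sharpening and recursion is our reading of Algorithm 1 rather than the reference implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpen_without_top(img, L, sigma):
    """Add back every band-pass layer of the Laplacian pyramid, i.e. the
    image plus its detail (image minus the low-frequency top component)."""
    top = img.astype(np.float64)
    for _ in range(L):
        top = gaussian_filter(top, sigma)
    return img + (img - top)

def slaw(img, L=2, sigma=1.0, W=1):
    """SLAW sketch: sharpen, decompose with the Haar DWT, sharpen each
    high-frequency subband, recurse on the LL band W times, reconstruct."""
    out = sharpen_without_top(img, L, sigma)
    if W == 0:
        return out
    LL, LH, HL, HH = haar_dwt2(out)              # even-sized input assumed
    LH, HL, HH = (sharpen_without_top(s, L, sigma) for s in (LH, HL, HH))
    LL = slaw(LL, L, sigma, W - 1)               # LL is passed to deeper layers
    return haar_idwt2(LL, LH, HL, HH)
```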
While simply implemented by combining the Laplacian pyramid and the DWT, the key point of SLAW is that high-frequency signals are recursively added to the multiscale image representation; thus powerful image sharpening is achieved. As shown in Figs. 6 and 7, a severely blurred underwater image is sharpened with increased W (Fig. 6) or Laplacian pyramid level L (Fig. 7). Among unsupervised underwater image enhancement methods, [4] divides an input image with the Laplacian pyramid and estimates the coefficients of each pyramid component to perform multiscale fusion. With regard to the DWT in image restoration tasks, the DWT is performed twice in underwater image enhancement [57] or only once in dehazing [15]. In contrast to deep learning methods directly incorporating the Laplacian pyramid or DWT into network architectures [49, 55], the sharpness of SLAW is flexibly set depending on input images or target tasks, which is practically useful, and SLAW also serves as a useful preprocessing step.
Results of SLAW
In this section, we qualitatively and quantitatively evaluate the effectiveness of SLAW on land and underwater image datasets. The PIPAL dataset [30], mainly utilized in image restoration tasks, is evaluated for land images, while UIE-890, Challenging-60 [35], and an original dataset taken in Okinawa, Japan, are utilized for underwater images. As quantitative metrics, sharpness, measuring the magnitude of the image gradient [20], and RMS-contrast [42] are computed. As shown in Table 3, sharpness and RMS-contrast are consistently improved under various parameter settings of SLAW on both the land and underwater image datasets.
The results of the ablation study on SLAW are shown in Figs. 6, 7, and 8, respectively. The times of discrete wavelet transform W is compared in Fig. 6, where W is respectively set to 0, 1, 2, and 3 from left to right. Note that high-frequency components are added once even when W = 0. The other parameters are set to L = 3, K = 3, and G = 1; the settings are denoted as SLAW-3310, SLAW-3311, SLAW-3312, and SLAW-3313. The sharpness and contrast of both raw and color-corrected images (5th column of Fig. 6) improve as W increases. As stated above, the essence of SLAW is that image sharpening is recursively performed on the decomposed wavelet signals, enabling the enhancement of multiscale features. To be specific, the output of SLAW-3310 (1st column of Fig. 6), where image sharpening is performed only once at the original resolution, is still blurred, while SLAW-3313 (4th column of Fig. 6) is clear, even for a severely degraded input image. Similarly, Fig. 7 illustrates the results of different pyramid levels L. The higher the pyramid level L, the wider the added frequency bands, resulting in higher sharpness and contrast. Also, a larger kernel size K or standard deviation G enhances an image more, because more signals are preserved in the higher pyramid levels, as shown in Fig. 8. The degree of enhancement depends on the original sharpness. Unpleasing halos sometimes become prominent (4th column of Fig. 8) because a spatially invariant Gaussian kernel is employed for constructing the Laplacian pyramid [6]. We empirically set K = 3 and G = 1.
Table 4. Results of UIQM [40] and UCIQE [54] compared to previous methods. Ours preprocessed with SLAW-2 mostly achieves superior performance on several real underwater image datasets. The 1st, 2nd, and 3rd scores are marked in red, blue, and green, respectively
| Dataset | Metric | UWCNN | FUnIE-GAN | Water-Net | U-Transformer | All-in-one | PUIE-Net | UIESS | Semi-UIR | Ours | +SLAW-1 | +SLAW-2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| UIE-890 | UIQM | 2.848 | 3.045 | 2.993 | 3.057 | 2.748 | 3.008 | 2.982 | 2.897 | 3.066 | 3.101 | 3.125 |
| UIE-890 | UCIQE | 3.709 | 6.393 | 5.478 | 5.255 | 5.369 | 4.759 | 5.120 | 6.139 | 6.262 | 6.237 | 6.106 |
| Challenging-60 | UIQM | 2.386 | 2.878 | 2.609 | 2.724 | 2.679 | 2.748 | 2.471 | 2.678 | 2.804 | 2.86 | 2.936 |
| Challenging-60 | UCIQE | 3.203 | 5.338 | 4.554 | 4.231 | 4.69 | 3.804 | 4.302 | 4.855 | 5.49 | 5.43 | 5.258 |
| Original | UIQM | 2.558 | 2.58 | 3.031 | 2.664 | 2.9 | 2.518 | 2.955 | 2.521 | 2.591 | 2.959 | 3.07 |
| Original | UCIQE | 4.127 | 6.336 | 6.057 | 5.313 | 2.998 | 4.004 | 4.252 | 3.574 | 6.861 | 6.926 | 6.963 |
Table 5. Results of UIQM [40] and UCIQE [54] for outputs preprocessed with different settings of SLAW. UIQM is improved with an increased number of pyramid levels or times of wavelet transform, while UCIQE depends on the initial sharpness
| Dataset | Metric | Ours | +SLAW-2310 | +SLAW-2311 | +SLAW-2312 | +SLAW-2313 | +SLAW-3310 | +SLAW-3311 | +SLAW-3312 | +SLAW-3313 |
|---|---|---|---|---|---|---|---|---|---|---|
| UIE-890 | UIQM | 3.066 | 3.101 | 3.125 | 3.13 | 3.147 | 3.115 | 3.131 | 3.147 | 3.189 |
| UIE-890 | UCIQE | 6.262 | 6.237 | 6.106 | 5.889 | 5.699 | 6.16 | 5.984 | 5.786 | 5.684 |
| Challenging-60 | UIQM | 2.804 | 2.86 | 2.936 | 3.048 | 3.146 | 2.891 | 3.0 | 3.104 | 3.194 |
| Challenging-60 | UCIQE | 5.49 | 5.43 | 5.258 | 5.136 | 5.042 | 5.355 | 5.193 | 5.118 | 5.092 |
| Original | UIQM | 2.591 | 2.651 | 2.738 | 2.868 | 2.995 | 2.701 | 2.838 | 2.959 | 3.07 |
| Original | UCIQE | 6.861 | 6.87 | 6.909 | 6.948 | 6.963 | 6.859 | 6.878 | 6.926 | 6.963 |
Experiment
Fig. 9 [Images not available. See PDF.]
Recovered results of previous methods. The 1st row shows input images; the 2nd and 3rd rows respectively show the results of the presented method and the presented method preprocessed with SLAW-2312. The 4th to 11th rows respectively show the results of UWCNN [34], FUnIE-GAN [27], Water-Net [35], U-Transformer [43], All-in-one [37], PUIE-Net (MP) [19], UIESS [12], and Semi-UIR [23]
Experimental setting
We train our method on the constructed underwater image dataset by learning the mapping from synthesized images to clear images. The number of training epochs is set to 100,000, and the Adam optimizer [31] with a learning rate of 0.0001 is adopted. The style image required for generating an output is fixed throughout the tests. The presented model is implemented in PyTorch and runs on a GeForce RTX 2080 Ti GPU. In evaluating our model, the real underwater image datasets UIE-890 [35], Challenging-60 [35], and an Original dataset taken in Okinawa, Japan, are utilized. As no ground truth exists for real images, the generally employed non-reference metrics UIQM [40] and UCIQE [54] are evaluated, described as

$$ \mathrm{UIQM} = c_1 \times \mathrm{UICM} + c_2 \times \mathrm{UISM} + c_3 \times \mathrm{UIConM}, $$

$$ \mathrm{UCIQE} = c_1 \times \sigma_{c} + c_2 \times \mathrm{con}_{l} + c_3 \times \mu_{s}, $$

where $c_1$, $c_2$, and $c_3$ are weighting coefficients, each set depending on the water quality. UIQM evaluates images based on the colorfulness UICM, sharpness UISM, and contrast UIConM, weighting the clarity of an image. Meanwhile, UCIQE is defined as a linear combination of the standard deviation of chroma $\sigma_c$, the contrast of luminance $\mathrm{con}_l$, and the average saturation $\mu_s$ in the CIELab space.
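For reference, a compact UCIQE implementation might look as follows; the coefficients are the values published in [54], while details such as the percentile-based luminance contrast and the CIELab saturation approximation vary across public implementations and are assumptions here.

```python
import numpy as np
from skimage import color

def uciqe(rgb, c1=0.4680, c2=0.2745, c3=0.2576):
    """UCIQE [54]: weighted sum of chroma standard deviation, luminance
    contrast, and mean saturation, computed in the CIELab space."""
    lab = color.rgb2lab(rgb)                     # rgb in [0, 1], shape (H, W, 3)
    L, a, b = lab[..., 0], lab[..., 1], lab[..., 2]
    chroma = np.sqrt(a ** 2 + b ** 2)
    sigma_c = chroma.std() / 100.0               # chroma standard deviation
    top, bottom = np.percentile(L, 99), np.percentile(L, 1)
    con_l = (top - bottom) / 100.0               # luminance contrast
    sat = chroma / np.maximum(np.sqrt(chroma ** 2 + L ** 2), 1e-6)
    return c1 * sigma_c + c2 * con_l + c3 * sat.mean()
```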
Fig. 10 [Images not available. See PDF.]
Depending on the image sharpness, the visibility of a severely blurred underwater image (1st column) is improved with SLAW (2nd column), while SLAW sometimes over-sharpens (4th column) an already sharp input image (3rd column)
Recovered results and discussions of real underwater image datasets
In this experiment, the publicly available state-of-the-art deep learning-based methods UWCNN [34], FUnIE-GAN [27], Water-Net [35], U-Transformer [43], All-in-one [37], PUIE-Net (MP) [19], UIESS [12], and Semi-UIR [23] are evaluated. The codes and parameters provided by the authors are directly employed for a fair comparison. Qualitative and quantitative results are respectively shown in Fig. 9 and Table 4. As SLAW can be used as pre-processing, results preprocessed with SLAW-2312 are also shown.
As shown in Fig. 9, the presented model (2nd row) corrects various degrees and colors of degradation. With SLAW-2312 preprocessing (3rd row), the visibility of blurred underwater images is improved. Compared with other methods trained on a synthetic image dataset, UWCNN (4th row) [34] hardly improves the overall visibility, and FUnIE-GAN (5th row) [27] adds unpleasing artifacts to the heavily degraded inputs (1st, 2nd, and 5th columns). Though Water-Net (6th row) [35], fusing images preprocessed with white balance, gamma correction, and histogram equalization, performs relatively better, the yellowish image is not sufficiently recovered (6th column). The ViT-based U-Transformer (7th row) [43] also fails to recover a yellowish image (6th column) and sometimes adds non-negligible artifacts (5th column). All-in-one (8th row) [37] introduces serious color bias. Color correction is insufficient in PUIE-Net (MP) [19] (9th row) and UIESS [12] (10th row), which employ AdaIN. Thanks to semi-supervised learning based on the mean-teacher framework, Semi-UIR [23] (11th row) restores all images better, whereas a color cast still exists (4th column). Overall, the above supervised models suffer from domain shift owing to the difficulty of covering all real underwater image degradation. Besides, the recovered results look a little blurred. In terms of UIQM and UCIQE, the proposed scheme outperforms or is competitive with other state-of-the-art methods, as shown in Table 4. Specifically, SLAW-2 ranks 1st in 4 out of 6 scores. Note that SLAW-1 and SLAW-2 represent SLAW-2310 and SLAW-2311 on the UIE-890 and Challenging-60 datasets, and SLAW-3312 and SLAW-3313 on the Original dataset, respectively. Also refer to Table 5, which lists other results preprocessed with SLAW.
Limitations of SLAW
The major drawback of SLAW is that the strength of sharpening is fixed by its pre-defined hyperparameters. As shown in Fig. 10, although the simply designed SLAW effectively improves the sharpness and contrast of blurred underwater images (2nd column), an already sharp underwater image (3rd column) is sometimes over-enhanced with SLAW-2312, because SLAW does not work adaptively. In terms of quantitative scores, as UIQM tends to weight overall sharpness more than UCIQE, all settings of SLAW improve the UIQM scores, while UCIQE decreases in some settings of SLAW, except on the Original dataset, which largely contains blurred images. However, compared with other supervised methods often suffering from the domain gap between training data and test data, the restored results of SLAW can be flexibly adjusted by hand depending on the input image sharpness. As the degradation of underwater images is quite diverse, this characteristic of SLAW is practically useful in underwater imaging, especially for enhancing severely blurred images, which are less represented in training data. We recommend SLAW-2310 and SLAW-3310 as initial settings for clear and blurred underwater images, respectively, to obtain stable results.
Conclusion
This study presents a synthetic underwater image dataset based on a physically revised underwater image formation model. The attenuation coefficients of the model are reformulated as Monte Carlo integration, thereby avoiding variable multiplication and enabling the numerical calculation for turbid water types and large distances. The PSNR and SSIM of the constructed dataset are distributed more widely, meaning that the degradation is more diverse than in the previous dataset. Also, the synthesized images are shown to be closer to real images, which may help mitigate the problem of domain shift. The diverse color distortion of underwater images is restored by incorporating the scheme of style transfer, the cycle-consistency loss, and GDWCT. Besides, sharpness is enhanced via the presented SLAW, which recursively sharpens a multiscale image representation. The proposed underwater image enhancement method achieves favorable performance compared with recent deep learning-based methods.
Data Availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Akkaynak, D., Treibitz, T.: A revised underwater image formation model. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
2. Akkaynak, D., Treibitz, T.: Sea-thru: a method for removing water from underwater images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
3. Akkaynak, D., Treibitz, T., Shlesinger, T., Loya, Y., Tamir, R., Iluz, D.: What is the space of attenuation coefficients in underwater computer vision? In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
4. Ancuti, CO; Ancuti, C; De Vleeschouwer, C; Bekaert, P. Color balance and fusion for underwater image enhancement. IEEE Transact. Image Process.; 2018; 27,
5. Anwar, S; Li, C. Diving deeper into underwater image enhancement: a survey. Signal Process. Image Commun.; 2020; 89, [DOI: https://dx.doi.org/10.1016/j.image.2020.115978] 115978.
6. Aubry, M; Paris, S; Hasinoff, SW; Kautz, J; Durand, F. Fast local Laplacian filters: theory and applications. ACM Transact. Graph. (TOG); 2014; 33,
7. Bojanowski, P., Joulin, A., Lopez-Pas, D., Szlam, A.: Optimizing the latent space of generative networks. In: J. Dy, A. Krause (eds.) Proceedings of the 35th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 80, pp. 600–609. PMLR (2018). https://proceedings.mlr.press/v80/bojanowski18a.html
8. Burt, P.J., Adelson, E.H.: The Laplacian pyramid as a compact image code. In: Readings in Computer Vision, pp. 671–679. Elsevier (1987)
9. Caflisch, RE. Monte Carlo and quasi-monte Carlo methods. Acta Numer.; 1998; 7, pp. 1-49.
10. Cao, B., Bi, Z., Hu, Q., Zhang, H., Wang, N., Gao, X., Shen, D.: Autoencoder-driven multimodal collaborative learning for medical image synthesis. International J. Comput. Vis. pp. 1–20 (2023)
11. Cao, B., Sun, Y., Zhu, P., Hu, Q.: Multi-modal gated mixture of local-to-global experts for dynamic image fusion. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 23555–23564 (2023)
12. Chen, YW; Pei, SC. Domain adaptation for underwater image enhancement via content and style separation. IEEE Access; 2022; 10, pp. 90523-90534. [DOI: https://dx.doi.org/10.1109/ACCESS.2022.3201555]
13. Chiang, JY; Chen, YC. Underwater image enhancement by wavelength compensation and dehazing. IEEE Transact. Image Process.; 2011; 21,
14. Cho, W., Choi, S., Park, D.K., Shin, I., Choo, J.: Image-to-image translation via group-wise deep whitening-and-coloring transformation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10639–10647 (2019)
15. Demirel, H; Anbarjafari, G. Image resolution enhancement by using discrete and stationary wavelet decomposition. IEEE Transact. Image Process.; 2010; 20,
16. Fabbri, C., Islam, M.J., Sattar, J.: Enhancing underwater imagery using generative adversarial networks. In: 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 7159–7165 (2018). https://doi.org/10.1109/ICRA.2018.8460552
17. Fu, X; Ding, X; Liang, Z; Wang, Y. Jointly adversarial networks for wavelength compensation and dehazing of underwater images. Multimed. Tools Appl.; 2023; 82,
18. Fu, X., Zhuang, P., Huang, Y., Liao, Y., Zhang, X., Ding, X.: A retinex-based enhancing approach for single underwater image. In: 2014 IEEE International Conference on Image Processing (ICIP), pp. 4572–4576 (2014). https://doi.org/10.1109/ICIP.2014.7025927
19. Fu, Z., Wang, W., Huang, Y., Ding, X., Ma, K.K.: Uncertainty inspired underwater image enhancement. In: European Conference on Computer Vision, pp. 465–482. Springer (2022)
20. Gao, W., Zhang, X., Yang, L., Liu, H.: An improved sobel edge detection. In: 2010 3rd International Conference on Computer Science and Information Technology, vol. 5, pp. 67–71. IEEE (2010)
21. Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2414–2423 (2016)
22. Hastings, WK. Monte Carlo sampling methods using Markov chains and their applications. Biometrika; 1970; 57,
23. Huang, S., Wang, K., Liu, H., Chen, J., Li, Y.: Contrastive semi-supervised learning for underwater image restoration via reliable bank. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 18145–18155 (2023)
24. Huang, X., Belongie, S.: Arbitrary style transfer in real-time with adaptive instance normalization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1501–1510 (2017)
25. Hunt, R.W.G., Pointer, M.R.: Measuring Colour. Wiley (2011)
26. Islam, M.J., Luo, P., Sattar, J.: Simultaneous enhancement and super-resolution of underwater imagery for improved visual perception. In: Robotics: Science and Systems (RSS). Corvalis, Oregon, USA (2020). https://doi.org/10.15607/RSS.2020.XVI.018
27. Islam, MJ; Xia, Y; Sattar, J. Fast underwater image enhancement for improved visual perception. IEEE Robot. Autom. Lett.; 2020; 5,
28. Jerlov, N.G.: Marine Optics. Elsevier (1976)
29. Jiang, J., Liu, D., Gu, J., Süsstrunk, S.: What is the space of spectral sensitivity functions for digital color cameras? In: 2013 IEEE Workshop on Applications of Computer Vision (WACV), pp. 168–179 (2013). https://doi.org/10.1109/WACV.2013.6475015
30. Jinjin, G., Haoming, C., Haoyu, C., Xiaoxing, Y., Ren, J.S., Chao, D.: Pipal: a large-scale image quality assessment dataset for perceptual image restoration. In: European Conference on Computer Vision, pp. 633–651. Springer (2020)
31. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
32. Koschmieder, H.: Theorie der horizontalen sichtweite. Beitrage zur Physik der freien Atmosphare pp. 33–53 (1924)
33. Li, C; Anwar, S; Hou, J; Cong, R; Guo, C; Ren, W. Underwater image enhancement via medium transmission-guided multi-color space embedding. IEEE Transact. Image Process.; 2021; 30, pp. 4985-5000. [DOI: https://dx.doi.org/10.1109/TIP.2021.3076367]
34. Li, C; Anwar, S; Porikli, F. Underwater scene prior inspired deep underwater image and video enhancement. Pattern Recogn.; 2020; 98, [DOI: https://dx.doi.org/10.1016/j.patcog.2019.107038]
35. Li, C; Guo, C; Ren, W; Cong, R; Hou, J; Kwong, S; Tao, D. An underwater image enhancement benchmark dataset and beyond. IEEE Transact. Image Process.; 2020; 29, pp. 4376-4389. [DOI: https://dx.doi.org/10.1109/TIP.2019.2955241]
36. Liu, K; Liang, Y. Enhancement of underwater optical images based on background light estimation and improved adaptive transmission fusion. Opt. Express; 2021; 29,
37. M Uplavikar, P., Wu, Z., Wang, Z.: All-in-one underwater image enhancement using domain-adversarial learning. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops (2019)
38. Mao, X., Li, Q., Xie, H., Lau, R.Y., Wang, Z., Paul Smolley, S.: Least squares generative adversarial networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2794–2802 (2017)
39. Ouyang, T; Zhang, Y; Zhao, H; Cui, Z; Yang, Y; Xu, Y. A multi-color and multistage collaborative network guided by refined transmission prior for underwater image enhancement. Vis. Comput.; 2024; [DOI: https://dx.doi.org/10.1007/s00371-023-03215-z]
40. Panetta, K; Gao, C; Agaian, S. Human-visual-system-inspired underwater image quality measures. IEEE J. Ocean. Eng.; 2015; 41,
41. Pascale, D.: Rgb Coordinates of the macbeth colorchecker. The BabelColor Company 6 (2006)
42. Peli, E. Contrast in complex images. JOSA A; 1990; 7,
43. Peng, L., Zhu, C., Bian, L.: U-shape transformer for underwater image enhancement. IEEE Transactions on Image Processing (2023)
44. Schechner, YY; Karpel, N. Recovery of underwater visibility and structure by polarization analysis. IEEE J. Ocean. Eng.; 2005; 30,
45. Schettini, R; Corchs, S. Underwater image processing: state of the art of restoration and image enhancement methods. EURASIP J. Adv. Signal Process.; 2010; 2010, pp. 1-14. [DOI: https://dx.doi.org/10.1155/2010/746052]
46. Silberman, N; Hoiem, D; Kohli, P; Fergus, R. Fitzgibbon, A; Lazebnik, S; Perona, P; Sato, Y; Schmid, C. Indoor segmentation and support inference from RGBD images. Computer Vision - ECCV 2012; 2012; Berlin Heidelberg, Berlin, Heidelberg, Springer: pp. 746-760. [DOI: https://dx.doi.org/10.1007/978-3-642-33715-4_54]
47. Simonyan, K., Zisserman, A.: Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv preprint arXiv:1409.1556 (2014)
48. Solonenko, MG; Mobley, CD. Inherent optical properties of Jerlov water types. Appl. Opt.; 2015; 54,
49. Takao, S.: Zero-shot image enhancement with renovated laplacian pyramid. In: European Conference on Computer Vision Workshops, pp. 721–737. Springer (2022)
50. Wang, W., Dang, Z., Hu, Y., Fua, P., Salzmann, M.: Robust differentiable SVD. IEEE transactions on pattern analysis and machine intelligence (2021)
51. Wang, Y; Song, W; Fortino, G; Qi, L; Zhang, W; Liotta, A. An experimental-based review of image enhancement and image restoration methods for underwater imaging. IEEE Access; 2019; 7, pp. 140233-140251. [DOI: https://dx.doi.org/10.1109/ACCESS.2019.2932130]
52. Wu, Y., He, K.: Group normalization. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 3–19 (2018)
53. Xue, Q; Hu, H; Bai, Y; Cheng, R; Wang, P; Song, N. Underwater image enhancement algorithm based on color correction and contrast enhancement. Vis. Comput.; 2023; [DOI: https://dx.doi.org/10.1007/s00371-023-03117-0]
54. Yang, M; Sowmya, A. An underwater color image quality evaluation metric. IEEE Transact. Image Process.; 2015; 24,
55. Yoo, J., Uh, Y., Chun, S., Kang, B., Ha, J.W.: Photorealistic style transfer via wavelet transforms. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)
56. Zhang, Y., Tian, Y., Kong, Y., Zhong, B., Fu, Y.: Residual dense network for image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2472–2481 (2018)
57. Zhou, J; Wei, X; Shi, J; Chu, W; Lin, Y. Underwater image enhancement via two-level wavelet decomposition maximum brightness color restoration and edge refinement histogram stretching. Opt. Express; 2022; 30,
58. Zhou, Y., Yan, K.: Domain Adaptive Adversarial Learning Based on Physics Model Feedback for Underwater Image Enhancement. arXiv preprint arXiv:2002.09315 (2020)
59. Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2223–2232 (2017)
60. Zhuang, P; Wu, J; Porikli, F; Li, C. Underwater image enhancement with hyper-Laplacian reflectance priors. IEEE Transact. Image Process.; 2022; 31, pp. 5442-5455. [DOI: https://dx.doi.org/10.1109/TIP.2022.3196546]