Eliminating flickering from digital images captured by cameras equipped with a rolling shutter is of paramount importance in computer vision applications. The ripple effect observed in an individual image is a consequence of the non-synchronized exposure of the rolling shutters used in CMOS sensor-based cameras. To date, only a limited number of studies have focused on mitigating flickering in single images, and flicker removal is more feasible when prior knowledge, such as camera specifications or matching images, is available. To address these problems, we present an unsupervised framework, Super-Resolution Generative Adversarial Networks and Partition-Based Adaptive Filtering Technique (SRGAN-PBAFT), trained end to end on unpaired images for deflickering of a single image. Flicker artifacts, which are commonly caused by dynamic lighting conditions and sensor noise, can severely reduce an image’s visual quality and authenticity. The SRGAN enhances image resolution, while the Partition-Based Adaptive Filtering Technique detects and mitigates flicker distortions. Combining the strengths of deep learning and adaptive filtering yields a potent approach for restoring image integrity. Experimental results show that the proposed SRGAN-PBAFT method is effective, with major improvements in visual quality and flicker aberration reduction compared to existing methods.
1. Introduction
Applications like object recognition and medical image analysis can be improved by using the abundance of detailed information in high-resolution images [1]. The ability of neural networks to learn the mapping relationships between high-resolution and low-resolution images is essential for super-resolution reconstruction. As deep learning has progressed, model parameters have gradually risen and network topologies have grown increasingly complicated [2]. The power grid quasi-periodically alters the luminance of AC-powered light sources, so rolling shutter camera images captured under artificial lighting frequently contain flickers [3]. More specifically, during exposure, quasi-periodic sinusoidal signals of varying intensity disturb the pixels captured in the picture. Worse yet, flicker can interfere with downstream tasks such as position estimation, object detection, and indoor scene recognition.
To tackle this problem, the majority of current approaches utilize prior data about the lighting system [4] or depend on imaging device settings, such as the inter-row delay of the rolling shutter [5], to create digital filters aimed at mitigating flickering. However, because this background data is usually unavailable, they cannot be used in the real world. Another line of work analyzes collections of images rather than a single frame. For example, combining images taken at different exposure times is one approach to developing a flicker suppression algorithm for images with a wide dynamic range [6]; this technique works particularly well for mitigating flicker in short-exposure images. It is important to note that this method cannot eliminate flicker from a single image because it requires numerous images of the same scene, and images with complex backgrounds significantly degrade its performance, limiting its utility.
Data-driven methods have recently demonstrated impressive results in various image enhancement tasks. Supervised learning-based approaches to flicker removal are unfeasible because flickering and flicker-free image pairs are difficult to obtain in real-world scenarios, so unsupervised techniques provide an alternative. Since the generation and removal of flicker form a cyclic mapping between two domains, we leverage the SRGAN’s ability to transfer information between source and target domains to develop a single-image deflickering technique. However, the SRGAN generator introduces distortions in color and structure that must be addressed, as it cannot eliminate flickers without considering their patterns. We have tailored the network topology to these objectives and introduced new constraint functions to regularize network training. Additionally, we employ well-trained discriminators in a unified system to detect flicker. According to our experiments, our deflickering algorithm outperforms common non-learning approaches and the original SRGAN when learning from synthetic and real images [7].
The theoretical basis of PBAFT explains its algorithmic structure and design reasoning. We analyze this method on multiple datasets and compare it to existing flicker reduction approaches. PBAFT is relevant to satellite photography, medical imaging, and cinematography, where it helps restore image quality. This article explains the mechanism of PBAFT and how to incorporate it into image processing pipelines for creating compelling and authentic visual content.
* SRGAN is used to enhance image resolution, while the PBAFT filtering technique detects and reduces flicker distortions.
* Combining these two methods leverages the strengths of deep learning and adaptive filtering to offer a powerful approach for restoring image integrity.
* Experimental validation reveals that the proposed SRGAN-PBAFT method is effective, showing great improvements in visual quality and reduction of flicker abnormalities compared to other existing methods.
* To upgrade multimedia content by simultaneously tackling resolution enhancement and flicker artifact reduction.
The rest of the paper is organized as follows: Section 2 presents a synopsis of the relevant literature, Section 3 describes the suggested system, Section 4 presents and discusses the experimental results, and Section 5 gives the conclusion and future directions.
2. Literature survey
Section 2 surveys the literature on existing systems. Nowisz et al. [8] offer a method for removing flicker from frame streams exceeding 200 frames per second that works as an online filter. The bulk of solutions in the literature emphasize efficacy and accuracy over speed of operation; conversely, their method is optimized for speed while maintaining enough accuracy to be used before computing differential frames to identify motion in streams. The method is flexible and performs well with various flickering light sources and when lighting conditions change. Their trial results demonstrate how well the CPU and GPU implementation monitors items of interest in early applications to a fast badminton acquisition system.
Lin et al. [9] describe DeflickerCycleGAN, an unsupervised architecture trained on unpaired images for end-to-end single-image deflickering. They carefully design two new losses, a flicker loss and a gradient loss, to decrease the possibility of color distortion and edge blurring, and include a cycle-consistency loss to ensure that image content similarity is retained. Furthermore, they present a technique that employs the knowledge of two pre-trained Markovian discriminators in an ensemble to detect flicker artifacts in images. Extensive trials on synthetic and real datasets demonstrate that DeflickerCycleGAN exhibits strong generalization capability, achieves high accuracy in flicker detection, outperforming a well-trained ResNet50 classifier, and effectively reduces flicker within a single image.
Gao et al. [10] developed a two-stage detection technique that greatly improved detection speed for fluctuating targets. However, the robustness of hypothesis testing using the K distribution is compromised by time-varying clutter. Additionally, the dynamic programming-based track-before-detect method encounters challenges in multi-target detection due to the absence of prior knowledge, necessitating higher processing overhead. Moreover, the target’s flickering and blanking conditions disrupt the target state, so the cumulative integration quantity of the previous technique must be increased, which reduces performance in flickering multi-target detection. They introduce a method for low signal-to-clutter ratio (SCR) flickering multi-target detection whose appeal lies in its straightforward architecture and minimal data requirements.
Zhang et al. [11] propose the Biological Memory Model-based Multi-Target Joint Detection (BMM-DP-MJD) technique, an innovative multi-frame joint detection method that locates elusive, intermittent objects on rivers and seas. They address multi-target identification problems under unexpected target states with a pre-filtering operator, and BMM-DP-MJD’s memory weight-based integration ensures accurate identification of flickering dynamic targets. In simulations with a low signal-to-clutter ratio (SCR) of 3 to 8 dB, the method improved detection accuracy for flashing targets. Experimental results show that this approach accurately locates ships and micro buoys in river environments, enhancing marine navigation radar target detection.
Chudasama et al. [12] presented two unique and effective computational algorithms for single-image super-resolution that reduce the component count to improve performance. While some hyperparameters were necessary, decreasing the model’s parameters might shorten the time needed to tune them. To reduce the time spent making changes, the technique took a different tack, employing optimization algorithms to find a variety of model hyperparameter combinations on their own. There are a few general model hyperparameter search approaches provided.
Kopania et al. [13] describe their system’s architectural evolution. The shuttlecock was initially tracked in three dimensions by cameras above and around the lines; an enhanced version used cameras throughout the court instead of three-dimensional reconstruction. The study compares the system’s competition results to linesmen’s decisions. In badminton events, the technology matches the world’s largest commercial products in accuracy and processing speed, and its design and algorithms make installation faster and easier, making the system more pervasive, robust, adaptive, and adjustable to sporting facility needs.
Shekhar et al. [14] demonstrate that per-frame stylized videos can be made temporally coherent regardless of the frame stylization. They reduce the gradient-domain difference to retain similarity to the per-frame processed output. Their interactive consistency management provides faster and more accurate optical-flow computation on the incoming video stream for stabilization than previous methods, using lightweight PWC-Net-based flow networks for fast optical-flow inference. Real-time HD frame rates are achieved using GPU-based optimization, and a user study shows that the temporally consistent output beats alternative methods.
Si et al. [15] introduced the IA multi-target filter, which simultaneously approximates the multi-target state and the clutter distribution while characterizing unidentified clutter as a gamma distribution. However, it still operates on the basis of a known clutter distribution, a limitation in practice. The SRBE-PF-TBD technique is notable for its versatility and durability, since it works well even without prior clutter information and consistently yields strong detection performance in a variety of target identification conditions. Unfortunately, owing to its high processing cost and its theoretical basis in particle filtering, this approach is unsuitable for real-world deployments.
Raihan et al. [16] compared underwater image restoration methods. Earlier approaches used hardware such as polarizers, sensors, and lasers to capture images of the same subject and then ran them through an algorithm to sharpen them; such hardware approaches involve complex setups and are more expensive. For improving underwater images with low visibility, computer vision algorithms and optical models have been developed, with dark channel prior (DCP), histogram equalization, and color correction being popular choices. The literature also shows that optical image restoration works, and that wavelength correction combined with artificial light detection and exclusion performs well.
2.1. Limitations of existing systems
* Existing methods are developed to treat only specific types of flickers, which leads to insufficient detection or eradication; a more general approach is needed.
* Detecting and removing flickers in complex scenarios, where different portions of the image exhibit varying flicker characteristics, can be challenging.
* Digital images are often susceptible to noise, which might be misinterpreted as flickers, resulting in false positive detection.
* Flickers can occur at varied rates and phases across frames because of the lack of temporal synchronization.
* Due to dynamic lighting conditions, camera movements, and object interactions, flicker characteristics can change over time.
3. Proposed system
Section 3 describes the proposed method, an unsupervised SRGAN-PBAFT system trained on unpaired images to perform deflickering of a single image from start to finish. Flicker artifacts are caused by several factors, including dynamic illumination conditions and sensor noise, and can reduce an image’s visual quality and trustworthiness. The proposed work enhances image resolution and mitigates flicker distortions effectively using SRGAN-PBAFT, combining the two algorithms to produce a powerful solution for restoring image integrity. The block diagram illustrating the SRGAN-PBAFT method is displayed in Fig 1.
[Figure omitted. See PDF.]
3.1. Datasets
Synthetic hazy images are created from haze-free images and the corresponding depth maps downloaded from the NYU2 Depth dataset, as explained in [17]. In this procedure, the transmission map of each image, represented as t(x, y), is computed from the scattering coefficient (β) and the depth information d(x, y). The atmospheric scattering model then combines the ambient light (A), the transmission map t(x, y), and the haze-free image to generate the hazy image. This paper assumes a globally uniform ambient light (A). We select the scattering coefficient β from the set {0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6} and set the ambient light A to [a, a, a], with ‘a’ ranging between [0.7, 1.0]. For this purpose, 1,000 haze-free images are selected at random from the NYU2 Depth dataset. Fig 2 shows some examples of the images found in the NYU2 Depth dataset. Hazy images are generated using the atmospheric light A and a randomly sampled scattering coefficient β, resulting in ten training images per haze-free image and 10,000 training images in total. A 300-image indoor synthetic test dataset is created from the imagery and depth maps of the Middlebury stereo dataset, as shown in Fig 3 [18]. Additionally, 500 outdoor synthetic hazy images taken from the SOTS dataset [19] are used as an external synthetic test dataset (Fig 4). None of these test images are used during the training phase. This research ensures ethical AI development by using responsibly sourced data, minimizing biases, and preventing misuse in deceptive image manipulation, and it prioritizes transparency, privacy protection, and energy-efficient model training to promote fairness and sustainability in digital image processing.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
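As an illustration of this synthesis procedure, the following NumPy sketch applies the standard atmospheric scattering model I = J·t + A·(1 − t) with t(x, y) = exp(−β·d(x, y)); the function and array names are illustrative assumptions, not the authors' code.

```python
import numpy as np

def synthesize_hazy(clear, depth, beta, a):
    """Create one synthetic hazy image from a haze-free image and its depth map.

    clear : HxWx3 float array in [0, 1], haze-free image J
    depth : HxW float array, scene depth d(x, y)
    beta  : scattering coefficient, sampled from {0.4, 0.6, ..., 1.6}
    a     : scalar ambient light level in [0.7, 1.0]; A = [a, a, a]
    """
    t = np.exp(-beta * depth)            # transmission map t(x, y) = exp(-beta * d(x, y))
    t = t[..., None]                     # broadcast over the three color channels
    A = np.full(3, a)                    # globally uniform ambient light
    hazy = clear * t + A * (1.0 - t)     # atmospheric scattering model
    return np.clip(hazy, 0.0, 1.0)

# Ten hazy training images per haze-free image, as in the dataset description.
rng = np.random.default_rng(0)
betas = [0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6]

def make_training_set(clear, depth, n=10):
    return [synthesize_hazy(clear, depth, rng.choice(betas), rng.uniform(0.7, 1.0))
            for _ in range(n)]
```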
3.2. Pre-processing
Several essential techniques can be employed to preprocess digital color images to detect and remove flickers. Image scaling and color space conversion may first be used to ensure uniformity and convenience of processing. Histogram equalization can enhance image contrast, aiding in the identification of flickers. Noise reduction techniques, such as Gaussian or median filtering, can help mitigate minor fluctuations. Additionally, frame differencing and motion detection algorithms can be implemented to identify flickering regions within the image. These preprocessing steps lay the foundation for accurate flicker detection and removal, improving the overall quality and stability of digital color images.
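A possible realization of these preprocessing steps, assuming OpenCV and 8-bit BGR frames, is sketched below; the parameter choices (3 × 3 median kernel, difference threshold of 25) are illustrative rather than values taken from the paper.

```python
import cv2

def preprocess(bgr, size=(320, 240)):
    """Resize, equalize luminance, and denoise a color frame before flicker analysis."""
    img = cv2.resize(bgr, size)
    ycrcb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)       # convert color space
    ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])    # histogram equalization on luminance
    img = cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)
    return cv2.medianBlur(img, 3)                        # median filtering against minor noise

def flicker_mask(prev, curr, thresh=25):
    """Frame differencing: flag pixels whose intensity jumps between consecutive frames."""
    diff = cv2.absdiff(cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY),
                       cv2.cvtColor(curr, cv2.COLOR_BGR2GRAY))
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    return mask
```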
3.3. Modelling of image flickers
Images corrupted by sinusoidal-like stripes are referred to as flickering images [20]. Because of this sinusoidal-like pattern, flicker can be represented as follows:
X_c(a, b) = Y_c(a, b) [1 + A_c sin(2π (2 f_e / f_r) a + φ_0)]  (1)
The color channel is indicated by ‘c’, and the image’s pixel coordinates are signified by (a, b). X signifies a flickering image, while Y denotes the image without flicker. The flicker in channel c has an intensity of A_c and an initial phase φ_0. Fig 5 shows that flicker signals can have different intensities in each RGB channel because the light source’s spectrum affects each channel’s flicker intensity. In a rolling shutter camera, the electric network frequency (ENF), usually 50 Hz or 60 Hz, is represented as f_e, while the row sampling frequency is indicated as f_r. Notably, the flicker signal frequency is double the ENF because of the power consumption characteristics of lighting fixtures. The first row of image X is captured at the same time as the power grid’s first phase begins. The goal of image deflickering is to replace a flickering image X with a steady image Y.
[Figure omitted. See PDF.]
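For intuition, the following NumPy sketch synthesizes a flickering image from a clean one according to Eq. (1); the per-channel amplitudes and the row sampling frequency used here are assumed values, not measured camera parameters.

```python
import numpy as np

def add_rolling_shutter_flicker(clean, amp=(0.12, 0.10, 0.08),
                                f_enf=50.0, f_row=24_000.0, phase=0.0):
    """Apply the sinusoidal flicker model of Eq. (1) row by row.

    clean : HxWx3 float array in [0, 1], flicker-free image Y
    amp   : per-channel flicker intensity A_c (illustrative values)
    f_enf : electric network frequency f_e in Hz (50 or 60)
    f_row : row sampling frequency f_r in Hz (assumed value)
    """
    rows = np.arange(clean.shape[0])
    # flicker frequency is twice the ENF; the sinusoid is sampled once per image row
    stripe = np.sin(2.0 * np.pi * (2.0 * f_enf / f_row) * rows + phase)
    gain = 1.0 + np.asarray(amp)[None, None, :] * stripe[:, None, None]
    return np.clip(clean * gain, 0.0, 1.0)
```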
3.4. Super resolution generative adversarial networks
From a low-resolution input image I^LR, a super-resolved image I^SR is obtained; I^LR is the low-resolution equivalent of its high-resolution counterpart I^HR.
The high-resolution images I^HR are accessible exclusively during the training phase. The low-resolution training inputs are generated by applying a Gaussian filter to I^HR followed by downsampling with a downsampling factor r. For an image with C color channels, I^LR is described by a real-valued tensor of size W × H × C, while I^HR and I^SR are of size rW × rH × C.
The objective of this paper is to train a generating function G that estimates, for a given low-resolution input image, the corresponding high-resolution image. To accomplish this, a generator network is trained as a feed-forward CNN G_θG parameterized by θ_G, which is obtained by optimizing a loss function l^SR specific to super-resolution. For training images I^HR_n, n = 1, …, N, with corresponding I^LR_n, we solve:
θ̂_G = arg min_θG (1/N) Σ_{n=1..N} l^SR(G_θG(I^LR_n), I^HR_n)  (2)
Specifically, the perceptual loss in this study is computed as a weighted sum of several loss components, each corresponding to a desired property of the recovered image.
3.5. Adversarial network architecture
To tackle the adversarial min-max problem, this study defines a discriminator network D_θD and optimizes it in an alternating manner together with G_θG:
min_θG max_θD E_{I^HR∼p_train(I^HR)} [log D_θD(I^HR)] + E_{I^LR∼p_G(I^LR)} [log(1 − D_θD(G_θG(I^LR)))]  (3)
A differentiable discriminator D is trained to distinguish between real and super-resolved images, while the generative model G is trained to fool D. With this approach, the generator learns to produce outputs that are highly realistic and therefore difficult for D to classify, which in turn encourages solutions that lie in the subspace, or manifold, of natural images and have better perceptual properties than those obtained with pixel-wise error metrics such as the mean squared error (MSE). As Fig 6 shows, our very deep generator network G comprises B identical residual blocks. A discriminator network is trained to differentiate generated SR data from actual HR images. In accordance with the design objectives, the LeakyReLU activation function (with α set to 0.2) is employed and max-pooling is avoided throughout the network. The discriminator network is trained to solve the maximization problem described in Equation (3). Following the 512 feature maps, two dense layers and a final sigmoid activation function are applied to yield a probability for classification.
[Figure omitted. See PDF.]
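A compact PyTorch sketch of the building blocks described above is given below; the channel width of 64, the discriminator head sizes, and the use of PyTorch instead of the paper's MatConvNet implementation are assumptions following the original SRGAN design [7].

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """One of the B identical residual blocks in the generator G."""
    def __init__(self, ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.PReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))

    def forward(self, x):
        return x + self.body(x)          # identity skip connection

def discriminator_head(features=512):
    """Dense layers plus sigmoid applied after the 512 feature maps (Sec. 3.5)."""
    return nn.Sequential(
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(features, 1024), nn.LeakyReLU(0.2),   # LeakyReLU with alpha = 0.2
        nn.Linear(1024, 1), nn.Sigmoid())                # probability that input is real HR
```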
3.6. Perceptual loss function
The definition of the perceptual loss function is critical to the performance of the generator network. Although the mean squared error is widely used for this purpose, we build on previous research [21,22] and design a loss function that assesses solutions according to perceptually meaningful features. The perceptual loss is the weighted sum of a content loss and an adversarial loss component:
l^SR = l^SR_X + 10⁻³ · l^SR_Gen  (4)
The possible choices for the content loss l^SR_X and the adversarial loss l^SR_Gen are described below.
3.6.1. Content loss.
The following formula is used to determine the Mean Squared Error (MSE) loss per pixel:
l^SR_MSE = (1 / (r²WH)) Σ_{x=1..rW} Σ_{y=1..rH} (I^HR_{x,y} − G_θG(I^LR)_{x,y})²  (5)
This pixel-wise optimization objective is the most widely used and forms the basis for many state-of-the-art approaches. However, solutions to MSE optimization problems often lack high-frequency content, even at very high PSNRs, leading to perceptually unsatisfying results with overly smooth textures.
Here, W and H denote the dimensions of the respective feature maps of the network.
3.6.2. Adversarial loss.
In addition to the content loss described above, this paper adds the generative component of our GAN to the perceptual loss. Using the discriminator probabilities D_θD(G_θG(I^LR)) across all training samples, the generative loss l^SR_Gen is defined as
l^SR_Gen = Σ_{n=1..N} −log D_θD(G_θG(I^LR_n))  (6)
Here, D_θD(G_θG(I^LR)) represents the probability that the reconstructed image G_θG(I^LR) is a genuine HR image. For better gradient behaviour, −log D_θD(G_θG(I^LR)) is minimized rather than log[1 − D_θD(G_θG(I^LR))].
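Putting Eqs. (4)–(6) together, a minimal PyTorch sketch of the perceptual loss is shown below; the 10⁻³ adversarial weight follows the original SRGAN [7] and is an assumption here, as is averaging (rather than summing) over the batch.

```python
import torch

def perceptual_loss(sr, hr, d_sr, adv_weight=1e-3):
    """Weighted sum of content loss (Eq. 5) and adversarial loss (Eq. 6).

    sr   : generator output G(I_LR)
    hr   : ground-truth high-resolution image I_HR
    d_sr : discriminator probabilities D(G(I_LR)) for the batch
    adv_weight : relative weight of the adversarial term (assumed, following SRGAN)
    """
    content = torch.mean((hr - sr) ** 2)               # pixel-wise MSE content loss
    adversarial = torch.mean(-torch.log(d_sr + 1e-8))  # -log D(G(I_LR)), averaged over the batch
    return content + adv_weight * adversarial
```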
3.7. Partition-based adaptive filtering technique (PBAFT)
In various contexts, from entertainment to medical imaging, digital color images are essential. However, they frequently experience flickers, which can be caused by several factors such as sensor noise, compression errors, or abrupt changes in lighting. Flickers are abrupt, erratic changes in pixel values that give the appearance that an image is unstable and inconsistent. With its effective approach to flicker detection and eradication, the Partition-Based Adaptive Filtering Technique represents a significant advancement in image processing [23]. This approach effectively locates and minimizes flicker-induced disruptions by applying adaptive filtering inside different image segments.
Consider an unknown linear system.
k(m) = w_o^T r(m) + v(m)  (7)
where w_o signifies the unknown system vector; r(m) = [r(m), r(m − 1), …, r(m − L + 1)]^T is the input vector, with r(m) the input signal at instant m; k(m) is the corresponding desired output signal; and v(m) is the observation noise. Next, the error signal is defined as
e(m) = k(m) − w^T(m) r(m)  (8)
The estimate of w_o at instant m is given by w(m). The PBAFT algorithm’s updating formula can be written as
(9)
where σ is the step-size parameter and α denotes the fractional order. The PBAFT degenerates to the conventional LMS algorithm when α = 1.
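Since the recursion reduces to the conventional LMS algorithm when α = 1, a minimal NumPy sketch of that baseline special case is given below; the fractional-order weighting of the full PBAFT update is deliberately omitted, and the function name is illustrative.

```python
import numpy as np

def lms_update(w, r_vec, k, sigma):
    """One LMS weight update, the alpha = 1 special case of the PBAFT recursion.

    w     : current filter coefficients w(m), length L
    r_vec : input vector r(m) = [r(m), ..., r(m-L+1)]
    k     : desired output k(m)
    sigma : step-size parameter
    """
    e = k - w @ r_vec              # error signal, Eq. (8)
    return w + sigma * e * r_vec, e
```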
We examine the PBAFT algorithm’s transient and steady-state performance in terms of the mean-square error. To simplify the following derivation, we define a diagonal matrix:
(10)
As a result, the updating formula in (9) can be reformulated as
(11)
The weight error vector is then defined as
w̃(m) = w_o − w(m)  (12)
The updating formula for the weight error vector is obtained by substituting equation (12) into equation (11).
(13)
Using (12), the error in (8) can be rewritten as
e(m) = e_a(m) + v(m)  (14)
where e_a(m) = w̃^T(m) r(m) stands for the noise-free excess error. Substituting (14) into (13), we get
(15)
Where
(16)
The remaining term is defined similarly. We use the following two well-known assumptions to aid the analysis.
Assumption 1: The input signal r(m) is a temporally independent random process with autocorrelation matrix R = E[r(m) r^T(m)].
Assumption 2: The noise v(m) is independent of r(m) and consists of zero-mean i.i.d. random variables with variance σ_v².
3.7.1. Mean square transient behavior.
Applying Assumptions 1 and 2 and taking the weighted Euclidean norm expectation of (15), we obtain
(17)
where ‖·‖ denotes a weighted Euclidean norm and Σ denotes an arbitrary symmetric semi-definite weighting matrix, which can be vectorized using the Kronecker product property.
(18)
where θ = vec(Σ) and
(19)
For a sufficiently small step size σ, the approximation in (19) is valid. The second term on the right-hand side of (17) can therefore be further expressed as
(20)
Where
(21)
Thus, it is possible to rewrite (17) as
(22)
Using (22) as an iteration starting with m = 0, we get
(23)
The excess mean-square error (EMSE) is a popular performance metric. The theoretical EMSE learning curve can be constructed by selecting an appropriate weighting matrix Σ.
(24)
Likewise, by setting the weighting matrix appropriately, the mean-square deviation learning curve may be calculated.
3.7.2. Mean square steady-state behavior.
Using (22) and assuming that the algorithm has reached a steady-state condition, the limit as m → ∞ may be computed.
Thus, the theoretical steady-state EMSE can be obtained by inserting the corresponding weighting vector θ.
We also assume that the elements of the input signal r(m) are identically distributed with variance σ_r². The steady-state EMSE can then be written as
(25)
The two expectations in (25) can be derived explicitly for a zero-mean Gaussian input signal using the formulas
(26)
(27)
where Γ(·) denotes the Gamma function. Consequently, the steady-state EMSE can be expressed as
(28)
To further simplify the steady-state EMSE in (25), r(m) can instead be drawn from a zero-mean uniform distribution.
(29)
Partition-based adaptive filtering technique algorithm:
Initialization:
Define the input signal r[m] and desired signal k[m].
Choose the filter length L and partition size P.
Set the filter coefficients w[k] to small random values or zeros.
Set the step-size parameter σ (learning rate).
Partition the Data:
Divide the input signal r[m] and desired signal k[m] into non-overlapping partitions of size P.
For instance, if the input signal has M samples, you will have M/P partitions.
Adaptive Filtering Loop
For each partition (i = 1 to M/P):
Extract the current partition of the input signal r[i] and desired signal k[i].
Perform the following steps within the partition
Step 1: Filtering Operation
Apply the filter defined by coefficients w[k] to the current partition of the input signal to generate an estimate y[i].
Compute the error signal e[i] = k[i] - y[i].
Step 2: Update Filter Coefficients:
Update the filter coefficients w[k] using an adaptive algorithm such as the LMS (Least Mean Squares) algorithm or NLMS (Normalized LMS):
w[k] = w[k] + σ * e[i] * r[i-k], for k = 0, 1, 2,..., L-1.
Step 3: Repeat until Convergence or Fixed Number of Iterations:
Repeat the adaptive filtering loop for each partition for a predetermined number of iterations or until the convergence requirements are satisfied (such as when the error becomes sufficiently reduced).
Output
The modified filter that minimizes the error across all partitions is represented by the final filter coefficients, w[k].
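The pseudocode above translates directly into a short NumPy routine. The sketch below assumes an LMS coefficient update and zero-padded tap vectors at partition boundaries; it is illustrative rather than the authors' implementation.

```python
import numpy as np

def pbaft_lms(r, k, L=32, P=256, sigma=0.01, n_iters=1):
    """Partition-based adaptive filtering with an LMS update, following the pseudocode.

    r       : 1-D input signal r[m]
    k       : 1-D desired signal k[m], same length as r
    L       : filter length
    P       : partition size
    sigma   : step size (learning rate)
    n_iters : passes over each partition
    """
    w = np.zeros(L)                       # filter coefficients, initialized to zeros
    y = np.zeros(len(r))                  # filter output
    e = np.zeros(len(r))                  # error signal
    for i in range(len(r) // P):          # loop over non-overlapping partitions
        lo, hi = i * P, (i + 1) * P
        for _ in range(n_iters):
            for m in range(lo, hi):
                # tap-input vector [r[m], r[m-1], ..., r[m-L+1]], zero-padded at the start
                taps = r[max(0, m - L + 1):m + 1][::-1]
                taps = np.pad(taps, (0, L - len(taps)))
                y[m] = w @ taps                     # Step 1: filtering operation
                e[m] = k[m] - y[m]                  # error e[i] = k[i] - y[i]
                w = w + sigma * e[m] * taps         # Step 2: LMS coefficient update
    return w, y, e
```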
4. Results and discussion
Section 4 discusses the experimental settings and performance evaluation and compares the proposed system with existing systems such as GAN (Generative Adversarial Network) [24], ConvLSTM (Convolutional Long Short-Term Memory) [25], CAM-FRN (class attention map-based flare removal network) [26], ADE (Adaptive Differential Equalization) [27], and Li-Fi (light fidelity) [28]. Fig 7 shows images after flicker removal on the NYU2 Depth dataset, while Fig 8 shows images after flicker removal on the Middlebury stereo dataset.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
4.1. Experiment settings
We use an NVIDIA TITAN X GPU to train the networks. The MatConvNet toolkit is used to implement the suggested approach [29]. Every training image is scaled to 320 × 240 pixels. The batch size, weight decay, and momentum are set to 10, 0.001, and 0.9, respectively. The ambient light module and the transmission map module have starting learning rates of 10⁻³ and 10⁻⁶, respectively, and both learning rates drop by a factor of 10 after 20 epochs. Training ends after 80 epochs. The parameters are initialized with λt = 1, λA = 10², and λP = 5 × 10⁻⁴. The kernel size (h × w) of the convolutional layer in the atmospheric light module is 15 × 15. The discriminator and generator networks are updated alternately, as in a standard GAN.
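As a hedged illustration of the schedule above, the following snippet configures SGD optimizers with the reported momentum, weight decay, and step-wise learning-rate drops; the two module arguments are placeholders, and PyTorch is used here as a stand-in for the paper's MatConvNet implementation.

```python
import torch

def build_optimizers(ambient_light_module, transmission_map_module):
    """Optimizers and schedules per Sec. 4.1 (batch size 10, momentum 0.9, weight decay 0.001)."""
    opt_a = torch.optim.SGD(ambient_light_module.parameters(), lr=1e-3,
                            momentum=0.9, weight_decay=1e-3)
    opt_t = torch.optim.SGD(transmission_map_module.parameters(), lr=1e-6,
                            momentum=0.9, weight_decay=1e-3)
    # both learning rates drop by a factor of 10 after 20 epochs; training stops at 80 epochs
    sched_a = torch.optim.lr_scheduler.StepLR(opt_a, step_size=20, gamma=0.1)
    sched_t = torch.optim.lr_scheduler.StepLR(opt_t, step_size=20, gamma=0.1)
    return (opt_a, sched_a), (opt_t, sched_t)
```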
4.2. Performance metrics
The following metrics are used to quantitatively assess image dehazing methods on synthetic images: Mean Squared Error (MSE), Precision, Recall, Accuracy, Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity (SSIM). Real-world images, which lack ground-truth references, are evaluated subjectively and visually. The classification-based measures build on the outcomes of binary classification: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). TP denotes correctly detected positive elements, TN correctly recognized negative elements, and FP and FN incorrectly classified elements. These performance metrics are described in the following paragraphs.
Precision: Precision is quantified as the proportion of positive predictions that are correct, as shown in Equation (30).
Precision = TP / (TP + FP)  (30)
Recall: Recall, also known as the true positive rate, evaluates how often the classifier correctly identifies the positive class, as expressed in Equation (31).
Recall = TP / (TP + FN)  (31)
Accuracy: The total number of true positives and true negatives is divided by the number of all cases to determine the accuracy of the prediction, as in Equation (32).
Accuracy = (TP + TN) / (TP + TN + FP + FN)  (32)
Mean square error (MSE): The mean square error is a commonly used metric in statistics and machine learning to determine the average squared difference between expected and actual data. The MSE formula is as follows:
MSE = (1/n) Σ_{i=1..n} (y_i − ŷ_i)²  (33)
The number of data points in the dataset is indicated by n.
The actual (observed) value for the ith data point is represented by y_i.
The predicted value for the ith data point is denoted by ŷ_i.
Σ_{i=1..n} signifies the sum over all data points from i = 1 to n.
Peak-signal-to-noise ratio (PSNR): It is the peak error measured and calculated as
(34)
P_i is the likelihood that a pixel in image F has intensity i.
Structure similarity index measure (SSIM): SSIM compares the brightness, contrast, and structure of the enhanced patches at locations x and y with the original image patches to determine how similar they are.
SSIM(x, y) = [(2 μ_x μ_y + C1)(2 σ_xy + C2)] / [(μ_x² + μ_y² + C1)(σ_x² + σ_y² + C2)]  (35)
where σ_x and σ_y are the standard deviations of the pixels in patches x and y, and μ_x and μ_y are the corresponding mean values. The covariance of patches x and y is σ_xy, and the small constants C1 = (k1·L)² and C2 = (k2·L)² prevent instability when the denominator is near zero. L is the dynamic range of the pixel values, with k1 = 0.01 and k2 = 0.03. The greater the SSIM value, the less distortion there is, and the better the enhancement.
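For reproducibility, these metrics can be computed with standard tooling. The sketch below assumes NumPy, scikit-image (skimage.metrics), and 8-bit RGB inputs; it is illustrative rather than the authors' evaluation code.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def mse(ref, out):
    """Mean squared error between reference and enhanced images (Eq. 33)."""
    return np.mean((ref.astype(float) - out.astype(float)) ** 2)

def evaluate(ref, out):
    """MSE, PSNR (Eq. 34), and SSIM (Eq. 35) for one 8-bit RGB image pair."""
    return {
        "MSE": mse(ref, out),
        "PSNR": peak_signal_noise_ratio(ref, out, data_range=255),
        "SSIM": structural_similarity(ref, out, data_range=255, channel_axis=-1),
    }
```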
4.3. Precision analysis
In Fig 9 and Table 1, the precision of the SRGAN-PBAFT methodology is contrasted with other commonly applied techniques. The improved precision performance of the deep learning method is illustrated in the graph, where the SRGAN-PBAFT model’s precision for 100 data is 94.99%, while the ADE, ConvLSTM, CAM-FRN, GAN, and Li-Fi models have respective precisions of 77.19%, 88.19%, 82.19%, 93.55%, and 91.99%. Similarly, the suggested SRGAN-PBAFT model has a precision of 96.88% under 600 data, compared to 81.99%, 91.52%, 87.77%, 95.19%, and 93.15% for the ADE, ConvLSTM, CAM-FRN, GAN, and Li-Fi models, correspondingly.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
4.4. Recall analysis
The recall of the SRGAN-PBAFT approach is compared to other frequently used techniques in Fig 10 and Table 2. The improved recall performance of the DL approach is depicted in the graph, where the SRGAN-PBAFT model’s recall for 100 data is 92.19%, while the ADE, ConvLSTM, CAM-FRN, GAN, and Li-Fi models have recalls of 66.19%, 78.19%, 72.87%, 88.18%, and 83.98%, respectively. Similarly, the suggested SRGAN-PBAFT model has a recall of 96.13% under 600 data, compared to 71.25%, 81.66%, 77.13%, 91.99%, and 87.99% for the ADE, ConvLSTM, CAM-FRN, GAN, and Li-Fi models, correspondingly.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
4.5. Accuracy analysis
In Fig 11 and Table 3, the accuracy of the SRGAN-PBAFT strategy is compared with other commonly utilized techniques. The graph shows how the DL method has an enhanced accuracy performance. For instance, the accuracy with 100 data for the SRGAN-PBAFT model is 94.19%, compared to 70.19%, 82.88%, 76.66%, 92.19%, and 88.19% for the ADE, ConvLSTM, CAM-FRN, GAN, and Li-Fi models respectively. Similarly, the suggested SRGAN-PBAFT model achieves an accuracy of 97.99% with 600 data, compared to 75.78%, 87.77%, 81.23%, 96.99%, and 91.35% for the ADE, ConvLSTM, CAM-FRN, GAN, and Li-Fi models, respectively.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
4.6. MSE analysis
The MSE analysis of the SRGAN-PBAFT methodology with other techniques is presented in Fig 12 and Table 4. The graph illustrates how the DL method has an improved performance while reducing MSE. In contrast, the MSE values for the ADE, ConvLSTM, CAM-FRN, GAN, and Li-Fi models are 45.12%, 41.22%, 37.98%, 31.88%, and 27.19%, respectively, whereas the SRGAN-PBAFT model has an MSE of 21.87% with 100 data. The MSE value for the SRGAN-PBAFT model is 26.19% with 600 data, compared to 48.11%, 44.99%, 40.12%, 36.33%, and 30.18% for the ADE, ConvLSTM, CAM-FRN, GAN, and Li-Fi models, respectively.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
4.7. PSNR analysis
The PSNR of the SRGAN-PBAFT approach is compared to other frequently used techniques in Fig 13 and Table 5. The graph shows how the DL method enhances PSNR performance. For instance, the PSNR for the SRGAN-PBAFT model is 24.19%, with 100 data compared to 38.19%, 33.18%, 44.16%, 30.18%, and 49.19% for the ADE, ConvLSTM, CAM-FRN, GAN, and Li-Fi models, respectively. Similarly, the suggested SRGAN-PBAFT model has a PSNR of 29.56% under 600 data, compared to 43.56%, 37.98%, 48.19%, 32.87%, and 53.19% for the ADE, ConvLSTM, CAM-FRN, GAN, and Li-Fi models, respectively.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
4.8. SSIM analysis
In Fig 14 and Table 6, the SSIM of the SRGAN-PBAFT strategy is compared with other commonly used methods. The graph illustrates the improved SSIM performance of the DL approach. For instance, the SSIM with 100 data for the SRGAN-PBAFT model is 0.936%, compared to 0.214%, 0.412%, 0.612%, 0.729%, and 0.891% for the ADE, ConvLSTM, CAM-FRN, GAN, and Li-Fi models, respectively. Similarly, the suggested SRGAN-PBAFT model has an SSIM of 0.987% under 600 data, compared to 0.119%, 0.598%, 0.723%, 0.866%, and 0.915% for the ADE, ConvLSTM, CAM-FRN, GAN, and Li-Fi models, respectively.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
4.9. Ablation study
Every module in the proposed model is essential. In this section, we compare the proposed SRGAN-PBAFT with current models such as ADE, ConvLSTM, CAM-FRN, GAN, and Li-Fi through a series of ablation tests on the NYU2 Depth, Middlebury stereo, and SOTS datasets. Adding the various components of the proposed SRGAN-PBAFT model step by step justifies its design and helps study the performance enhancement.
4.10. Influence of the PBAFT
The accuracy analysis of the proposed SRGAN-PBAFT approach compared to other existing methods is shown in Fig 15 and Table 7. The PBAFT has a substantial impact on detecting and removing flickers from digital color images. By splitting images into smaller pieces, PBAFT efficiently targets flickering regions, increasing its precision. It customizes its approach to each segment’s specific features using adaptive filtering, ensuring excellent flicker removal while preserving visual details and color fidelity. PBAFT has low computing overhead, making it suitable for real-time applications. The SRGAN combined with the PBAFT model achieved a superior accuracy of 97.99% on our input data, whereas the existing MLP, CNN, SOM, DBN, and GAN models achieved accuracies of 94.19%, 96.99%, 94.13%, 95.99%, and 95.18%, respectively.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
4.11. Influence of the K-fold cross-validation
The 10-fold cross-validation of the SRGAN-PBAFT approach is discussed in Table 8. Incorporating 10-fold cross-validation makes the SRGAN-PBAFT-based flicker detection and removal method much more dependable and robust. The dataset is split into 10 subsets; in each fold, one subset is set aside for validation while the other nine are used for training. Using this 10-fold cross-validation procedure, the suggested SRGAN-PBAFT model achieved a performance of 97.99% on our input data.
[Figure omitted. See PDF.]
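A minimal sketch of such a 10-fold protocol is shown below, assuming scikit-learn's KFold and NumPy arrays; train_fn and eval_fn are placeholders for the SRGAN-PBAFT training and evaluation steps.

```python
import numpy as np
from sklearn.model_selection import KFold

def ten_fold_accuracy(samples, labels, train_fn, eval_fn):
    """Average accuracy over 10 folds: nine subsets train the model, one validates it.

    train_fn(X, y) returns a fitted model; eval_fn(model, X, y) returns an accuracy.
    Both callables are placeholders for the actual training and evaluation routines.
    """
    scores = []
    for train_idx, val_idx in KFold(n_splits=10, shuffle=True, random_state=0).split(samples):
        model = train_fn(samples[train_idx], labels[train_idx])
        scores.append(eval_fn(model, samples[val_idx], labels[val_idx]))
    return float(np.mean(scores))
```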
5. Conclusion
The combination of SRGAN and PBAFT has produced an effective method for detecting and removing flickers in digital color images. SRGAN improves the sharpness and clarity of the images by addressing the problem of image resolution, producing high-resolution outputs from low-resolution inputs with a significant enhancement in image quality. Moreover, by training the generator network to produce realistic textures and details, generative adversarial networks offer a reliable way to enhance the perceived quality of images. The study’s findings indicate a good prospect of improving the visual quality of images with flickering artifacts. Our method offers a favorable solution to improve the quality of digital color images and deliver a more immersive and aesthetically attractive experience to users as technology continues to advance. Future research could examine further developments and extensions of the suggested method, improving its performance and addressing more difficult problems in digital image enhancement.
References
1. Orejón-Sánchez RD, Hermoso-Orzáez MJ, Gago-Calderón A. LED lighting installations in professional stadiums: energy efficiency, visual comfort, and requirements of 4K TV broadcast. Sustainability. 2020;12(18):7684.
2. Ahn H-A, Hong S-K, Kwon O-K. A highly accurate current LED lamp driver with removal of low-frequency flicker using average current control method. IEEE Trans Power Electron. 2018;33(10):8741–53.
3. Castro I, Vazquez A, Arias M, Lamar DG, Hernando MM, Sebastian J. A review on flicker-free AC–DC LED drivers for single-phase and three-phase AC power grids. IEEE Trans Power Electron. 2019;34(10):10035–57.
4. Raducanu BC, Zaliasl S, Stanzione S, van Liempd C, Quintero AV, De Smet H, et al. An artificial iris ASIC with high voltage liquid crystal driver, 10-nA light range detector and 40-nA blink detector for LCD flicker removal. IEEE Solid-State Circuits Lett. 2020;3:506–9.
5. Sheinin M, Schechner YY, Kutulakos KN. Rolling shutter imaging on the electric grid. 2018 IEEE International Conference on Computational Photography (ICCP). 2018:1–12.
6. Lin X, Li Y, Zhu J, Zeng H. DeflickerCycleGAN: learning to detect and remove flickers in a single image. IEEE Trans Image Process. 2023. doi:10.1109/TIP.2022.3231748. pmid:37018244
7. Ledig C, Theis L, Huszar F, Caballero J, Cunningham A, Acosta A, et al. Photo-realistic single image super-resolution using a generative adversarial network. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017:105–14.
8. Nowisz J, Kopania M, Przelaskowski A. Realtime flicker removal for fast video streaming and detection of moving objects. Multimed Tools Appl. 2021;80(10):14941–60.
9. Lin X, Li Y, Zhu J, Zeng H. DeflickerCycleGAN: learning to detect and remove flickers in a single image. IEEE Trans Image Process. 2023. doi:10.1109/TIP.2022.3231748. pmid:37018244
10. Gao J, Du J, Wang W. Radar detection of fluctuating targets under heavy-tailed clutter using track-before-detect. Sensors (Basel). 2018;18(7):2241. pmid:30002277
11. Zhang Q, Huo W, Pei J, Zhang Y, Yang J, Huang Y. A novel flickering multi-target joint detection method based on a biological memory model. Remote Sensing. 2021;14(1):39.
12. Chudasama V, Upla K. Computationally efficient progressive approach for single-image super-resolution using generative adversarial network. J Electron Imag. 2021;30(02).
13. Kopania M, Nowisz J, Przelaskowski A. Automatic shuttlecock fall detection system in or out of a court in badminton games: challenges, problems, and solutions from a practical point of view. Sensors (Basel). 2022;22(21):8098. pmid:36365797
14. Shekhar S, Reimann M, Hilscher M, Semmo A, Döllner J, Trapp M. Interactive control over temporal consistency while stylizing video streams. Comput Graph Forum. 2023;42(4).
15. Si W, Zhu H, Qu Z. Robust Poisson multi-Bernoulli filter with unknown clutter rate. IEEE Access. 2019;7:117871–82.
16. Raihan A J, Abas PE, C. De Silva L. Review of underwater image restoration algorithms. IET Image Processing. 2019;13(10):1587–96.
17. https://www.kaggle.com/datasets/soumikrakshit/nyu-dehazing
18. https://www.kaggle.com/datasets/minhanhtruong/middleburystereodataset
19. https://www.kaggle.com/datasets/balraj98/synthetic-objective-testing-set-sots-reside
20. Wong C, Hajj-Ahmad A, Wu M. Invisible geo-location signature in a single image. Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2018:1987–91.
21. Johnson J, Alahi A, Fei-Fei L. Perceptual losses for real-time style transfer and super-resolution. In: Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part II. Springer International Publishing; 2016. pp. 694–711. https://doi.org/10.1007/978-3-319-46475-6_43
22. Bruna J, Sprechmann P, LeCun Y. Super-resolution with deep convolutional sufficient statistics. arXiv preprint. 2015.
23. Li L, Pu Y-F, Xie X. Performance analysis of fractional-order adaptive filtering algorithm and its improvement. IEEE Signal Process Lett. 2022;29:1853–7.
24. Lin X, Li Y, Zhu J, Zeng H. DeflickerCycleGAN: learning to detect and remove flickers in a single image. IEEE Trans Image Process. 2023. doi:10.1109/TIP.2022.3231748. pmid:37018244
25. Lai WS, Huang JB, Wang O, Shechtman E, Yumer E, Yang MH. Learning blind video temporal consistency. In: Proceedings of the European Conference on Computer Vision (ECCV). 2018:170–185.
26. Kang SJ, Ryu KB, Jeong MS, Jeong SI, Park KR. CAM-FRN: class attention map-based flare removal network in frontal-viewing camera images of vehicles. Mathematics. 2023;11(17):3644.
27. Won Y-Y, Yoon SM, Seo D. Ambient LED light noise reduction using adaptive differential equalization in Li-Fi wireless link. Sensors (Basel). 2021;21(4):1060. pmid:33557179
28. Won Y-Y, Kang J. Imperceptible flicker noise reduction using pseudo-flicker weight functionalized derivative equalization in light-fidelity transmission link. Sensors (Basel). 2022;22(22):8857. pmid:36433454
29. Vedaldi A, Lenc K. MatConvNet: convolutional neural networks for MATLAB. In: Proceedings of the 23rd ACM International Conference on Multimedia. 2015.
Citation: Shanmugaraja T, Karthikeyan N, Karthik S, Bharathi B (2025) A super resolution generative adversarial networks and partition-based adaptive filtering technique for detect and remove flickers in digital color images. PLoS One 20(5): e0317758. https://doi.org/10.1371/journal.pone.0317758
About the Authors:
Thangavel Shanmugaraja
Roles: Conceptualization, Investigation, Software, Writing – original draft
E-mail: [email protected]
Affiliation: Department of Electronics and Communication Engineering, KPR Institute of Engineering and Technology, Coimbatore, Tamil Nadu, India
ORCID: https://orcid.org/0009-0004-0366-458X
Natesapillai Karthikeyan
Roles: Data curation, Methodology, Supervision, Writing – review & editing
Affiliation: Department of Computer Science and Engineering, SNS College of Technology, Coimbatore, Tamil Nadu, India
Subburathinam Karthik
Roles: Formal analysis, Project administration, Resources
Affiliation: Department of Computer Science and Engineering, SNS College of Technology, Coimbatore, Tamil Nadu, India
Balamurugan Bharathi
Roles: Funding acquisition, Resources, Validation
Affiliation: Department of Computer Science and Engineering, Sri Ranganathar Institute of Engineering and Technology, Coimbatore, Tamil Nadu, India
© 2025 Shanmugaraja et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.