1 Introduction
Image enhancement is the process of improving an image’s quality and information content [
45]. The goal is to increase visual differences among image features, making it more suitable for various applications, such as increasing the brightness of dark images for better viewing. Common image enhancement techniques include sharpening, smoothing, contrast enhancement, and noise reduction [
4,
10,
33,
41]. Contrast enhancement, in particular, is applied to images or videos to increase their dynamic range [
15].
Digital image processing is widely used across various applications, but uncontrollable factors during image acquisition can lead to low-quality images [
1,
51]. These low-quality images are often unsuitable for human observation and machine perception, making image enhancement a crucial task in computer vision, machine vision, and pattern recognition [
19,
21,
54].
Metaheuristic techniques have been applied to several real-world computational problems across various domains [
3,
20,
23,
24,
25]. Similarly, image enhancement has long been treated as a suitable problem for metaheuristics [
17,
31,
32,
53]. Moreover, the integration of fuzzy systems with metaheuristic algorithms has been an active area of research within the Computational Intelligence community [
8,
11]. Many studies have approached image enhancement as an optimization problem, focusing on modifying the image quality fitness function [
10,
34,
53], combining several metaheuristic algorithms [
17,
53], optimizing parameters [
39] and avoiding local optima [
9]. However, one aspect that has received less attention in image enhancement problems is the adaptation of fuzzy systems. Joshi and Kumar [
22] have demonstrated that the variation in the input membership function in a fuzzy system can positively impact image enhancement performance.
This paper addresses the image contrast enhancement problem by using stochastic optimization within a fuzzy logic system. Our primary contribution is the design of a metaheuristic-inspired framework that leverages a fuzzy logic system. The core component of the fuzzy logic system is a set of
input membership functions that are used to describe an
intensity transformation function. We implement genetic operators that
mutate the fuzzy logic system, thereby enhancing the original image to achieve optimal contrast enhancement. The fuzzy system in our approach is based on simple contrast enhancement rules, outlined in [
13], and it operates independently of the input image. Our fuzzy image enhancement technique begins with a basic fuzzy rule set and a set of input membership functions. Using a metaheuristics framework, we evolve these fuzzy sets. The transformation function defined by the evolved fuzzy sets is then applied to the value channel of the HSV color space to produce the final enhanced image [
40]. This approach converts the problem into an optimization challenge. Instead of directly generating an image, we aim to create a suitable mapping between input and output color values. This task is inherently difficult because, even with only 8-bit color (256 color values), the number of possible mappings is enormous. Exhaustive search through this vast solution space is infeasible, and there is no clear knowledge on how to improve or generate a solution. Therefore, we leverage a metaheuristics framework to address this challenge [
29]. Additionally, we empirically demonstrate the effectiveness of our proposed metaheuristic-based techniques in enhancing image contrast. Quantitative comparisons and a comprehensive survey reveal that the genetic algorithm technique, in particular, significantly improves the visual quality of images on par with existing image enhancement techniques.
The paper is structured as follows: Section
2 reviews the literature of image enhancement, Section
3 presents the background to understand the methodology of the paper, Section
4 formally presents our problem statement, Section
5 presents our methodology for enhancing image contrast, Section
6 empirically evaluates our proposed methodology for image enhancement. Finally, we conclude our work with some directions for future work in Section
7.
2 Literature Review
Several techniques are used for contrast manipulation, including gamma transformation [
18,
45] and histogram equalization (HE) [
46,
47]. HE, however, has limitations, such as failing to preserve image brightness [
42]. Bi-histogram Equalization [
35] addresses this issue. Another drawback of HE is the potential loss of image information [
55]. Techniques like gamma transform and log transform [
45] offer lower computational complexity. Tarawneh et al. [
49] applied gamma transformation for contrast enhancement by automatically applying different gamma corrections to multiple pixel sets. Huang et al. [
18] applied gamma correction with
weighting distribution. However, these techniques often struggle in complex illumination settings [
34], requiring parameter adjustments to be effective. Bhandari et al. [
6] combined sub-histograms with
discrete cosine transform for contrast enhancement.
Fuzzy logic and metaheuristic techniques have been previously applied to image enhancement problems [
8,
11,
26]. A fundamental approach involves using a fuzzy logic-based system from [
13], which utilizes only three rules and employs trapezoidal and triangular input fuzzy sets. In addition, Joshi and Kumar [
22] proposed a more complex method with a seven-rule set and
Gaussian fuzzy sets.
To evaluate the fitness of an enhanced image, Munteanu and Rosa [
32] proposed a novel objective function and applied an evolutionary algorithm to search for optimal parameters in a continuous transform function. This objective function has also been utilized in various optimization techniques, including artificial bee colony optimization [
10], the cuckoo search algorithm [
5], and the firefly algorithm for enhancing UAV-captured images [
39].
Asamoah et al. [
2] used several metrics to evaluate image enhancement techniques, including Mean Square Error (MSE), Peak Signal-to-Noise Ratio (PSNR), Root Mean Square Error (RMSE), and Image Quality Index (IQI). PSNR is commonly used to assess the visual quality of images. That work also reported that, among multiple variants, histogram equalization (HE) performed best on all 8 compared metrics. Inspired by this, we also present a PSNR comparison with the baselines.
4 Problem Statement
Our objective is to manipulate image contrast to enhance the sharpness and make image features more visually distinguishable. While image enhancement is inherently subjective, we have employed a fitness function from image enhancement literature to quantify this enhancement [
5,
7,
9,
10,
32]. This fitness function evaluates the quality of an enhanced image. We use a transformation function to enhance the image, which is optimized using a metaheuristic approach. Thus, the input to our process is an image, and the output is an enhanced version of that image.
For simplicity, we present a formal definition of the problem in the context of a
grayscale image. Let
I =
f(
x,
y) represent a grayscale image, where
x and
y denote the pixel positions. The image
I has dimensions
M ×
N, thus 0 ≤
x <
M and 0 ≤
y <
N. Assume there is a fuzzy logic-based transformation function
\(\mathsf {T}(I)\) that modifies the gray value of each pixel, producing an enhanced image
Ie. More formally, \(I_e = \mathsf {T}(I)\).
Thus, our task is to identify a suitable transformation function that visually enhances the resulting image
Ie.
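Because an 8-bit grayscale image has only 256 gray levels, any pixel-wise transformation \(\mathsf {T}\) can be stored as a 256-entry lookup table. There are 256^256 (roughly 3.2 × 10^616) such tables, which illustrates why exhaustive search is infeasible. The following is a minimal sketch, assuming the image is a list of rows of integers in [0, 255]; the helper names `apply_transform` and `stretch` are illustrative and do not come from our implementation.

```python
def apply_transform(image, lut):
    """Apply a pixel-wise gray-level mapping T stored as a 256-entry table."""
    if len(lut) != 256:
        raise ValueError("lookup table must have one entry per gray level")
    return [[lut[p] for p in row] for row in image]


def stretch(lo, hi):
    """Hypothetical example of T: linear stretch of [lo, hi] onto [0, 255]."""
    table = []
    for g in range(256):
        t = (g - lo) / (hi - lo)
        # Clamp to [0, 1] so gray levels outside [lo, hi] saturate.
        table.append(round(255 * min(1.0, max(0.0, t))))
    return table


image = [[60, 80], [100, 120]]  # low-contrast 2x2 example image
enhanced = apply_transform(image, stretch(60, 120))
print(enhanced)
```

In this view, searching for a good \(\mathsf {T}\) amounts to searching for a good table, which the fuzzy system parameterizes with far fewer values than 256 free entries.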
5 Methodology
In this section, we present our metaheuristic-based techniques for image contrast enhancement. All metaheuristic techniques share several common features [
12]. Each technique begins with an initial set of
individuals, often referred to as the
population. To introduce diversity into the population, special operations, known as
mutations, are applied to generate new populations. In each iteration, the fittest individuals are selected based on a mathematical function known as the
fitness function. This process of generating a new population is repeated for a fixed number of iterations or a fixed time frame, and the fittest individual across all iterations is taken as the best solution found. While this approach does not guarantee an optimal solution, it is effective at discovering
good solutions within a relatively short time frame. Finally, most metaheuristic techniques differ in how the initial population is generated and how subsequent populations are manipulated.
At a high level, we employed both Hill Climbing and Genetic Algorithm-based techniques with problem-specific mutation and crossover operations. We now discuss each of the subroutines of our proposed metaheuristic techniques.
5.1 Population Representation
Every metaheuristic technique begins with an initial population, usually randomly generated, and the representation of this population is specific to the problem being addressed. In our problem, each individual represents an input membership function, as illustrated in Figure
1. The initial population thus contains information about a set of input membership functions. Each input membership function comprises three mathematical functions, each of which is represented by a tuple of three values, regardless of whether the function type is trapezoidal or triangular. The first two values of the tuple determine the shape of the function, while the third value is used in defuzzification.
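To make the representation concrete, the following sketch shows one way an individual could be turned into an intensity transformation. It rests on several assumptions that are not fixed by the description above: the three sets are taken to be a left-shoulder trapezoid ("dark"), a triangle ("gray"), and a right-shoulder trapezoid ("bright"); the two shape values are interpreted as breakpoints (or center and half-width for the triangle); and the third value is treated as an output singleton combined by a weighted average, in the style of the three-rule system of [13]. All function names are hypothetical.

```python
def shoulder_left(g, a, b):
    """1 up to a, then falls linearly to 0 at b (assumed 'dark' set)."""
    if g <= a:
        return 1.0
    if g >= b:
        return 0.0
    return (b - g) / (b - a)


def shoulder_right(g, a, b):
    """0 up to a, then rises linearly to 1 at b (assumed 'bright' set)."""
    return 1.0 - shoulder_left(g, a, b)


def triangle(g, center, half_width):
    """Peak 1 at center, 0 beyond center +/- half_width (assumed 'gray' set)."""
    return max(0.0, 1.0 - abs(g - center) / half_width)


# One individual: (shape value 1, shape value 2, defuzzification value) per set.
individual = [
    (30.0, 127.0, 0.0),     # dark:   breakpoints 30 and 127, output 0
    (127.0, 97.0, 127.0),   # gray:   center 127, half-width 97, output 127
    (127.0, 225.0, 255.0),  # bright: breakpoints 127 and 225, output 255
]


def to_lookup_table(ind):
    """Build the 256-entry gray-level mapping defined by an individual."""
    (da, db, dv), (gc, gw, gv), (ba, bb, bv) = ind
    table = []
    for g in range(256):
        mu = (shoulder_left(g, da, db),
              triangle(g, gc, gw),
              shoulder_right(g, ba, bb))
        total = sum(mu)
        # Weighted average of the output singletons; if no rule fires,
        # leave the gray level unchanged.
        v = (mu[0] * dv + mu[1] * gv + mu[2] * bv) / total if total > 0 else g
        table.append(int(round(min(255.0, max(0.0, v)))))
    return table


lut = to_lookup_table(individual)
```

With these example values, gray levels below 30 map to 0, mid-gray stays at 127, and levels above 224 map to 255, so the histogram is stretched toward both ends.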
5.2 Fitness Assessment
Following the literature of image enhancement [
32], we calculate an individual’s (
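I) quality/fitness using the following fitness function:
\[\mathsf {F}(I) = \log (\log (\mathsf {E}(I_s))) \cdot \frac{\mathsf {ne}(I_s)}{M \times N} \cdot \mathsf {H}(I) \qquad (3)\]
where
\(I_s\) = image obtained by applying the Sobel filter to \(I\)
\(\mathsf {E}(I)\) = sum of the intensities of image \(I\)
\(\mathsf {ne}(I)\) = number of edge pixels in \(I\)
\(\mathsf {H}(I)\) = entropy of image \(I\)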
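The quantities defined above can be computed with a few lines of code. The sketch below assumes the fitness takes the form used by Munteanu and Rosa [32], that is, log(log(E(Is))) multiplied by the fraction of edge pixels and by the entropy H(I). The edge threshold of 100 on the Sobel magnitude is an arbitrary illustrative choice, and the pure-Python loops are only suitable for small images.

```python
import math


def sobel_magnitude(img):
    """Gradient magnitude from the 3x3 Sobel kernels (border pixels stay 0)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = math.hypot(gx, gy)
    return out


def entropy(img):
    """Shannon entropy (in bits) of the gray-level histogram."""
    counts = {}
    for row in img:
        for p in row:
            counts[p] = counts.get(p, 0) + 1
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def fitness(img, edge_threshold=100.0):
    """Assumed form: log(log(E(Is))) * ne(Is) / (M * N) * H(I)."""
    edges = sobel_magnitude(img)
    e_sum = sum(sum(row) for row in edges)  # E(Is)
    # ne(Is): pixels whose Sobel magnitude exceeds the assumed threshold.
    n_edges = sum(v > edge_threshold for row in edges for v in row)
    if e_sum <= math.e:
        # log(log(.)) is undefined or negative here; treat as worst case.
        return 0.0
    m, n = len(img), len(img[0])
    return math.log(math.log(e_sum)) * n_edges / (m * n) * entropy(img)


low = [[100] * 3 + [110] * 3 for _ in range(6)]   # weak vertical edge
high = [[0] * 3 + [255] * 3 for _ in range(6)]    # strong vertical edge
print(fitness(low), fitness(high))
```

Under these assumptions, the image with the stronger edge receives the higher fitness, which matches the intent of rewarding sharper, higher-contrast results.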
5.3 Mutation
When generating a new individual, only the shapes of the functions (i.e., the width of the triangle and the slope of the trapezoid’s oblique side) are tweaked. Our implementation uses three hyperparameters, namely, \(\mathsf {ChangeProb}\), \(\mathsf {MutateMu}\), and \(\mathsf {MutateSigma}\). \(\mathsf {ChangeProb}\) is a probability: during each iteration, each membership function is tweaked with probability \(\mathsf {ChangeProb}\). Whenever a membership function is tweaked, it is changed by an amount δ sampled from a Gaussian distribution with mean \(\mathsf {MutateMu}\) and variance \(\mathsf {MutateSigma}\).
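The mutation step can be sketched as follows. This is an illustration under stated assumptions rather than our exact operator: each individual is a list of `(shape1, shape2, defuzz)` tuples, an independent δ is drawn for each of the two shape values, and no clamping to the valid gray-level range is applied. Since \(\mathsf {MutateSigma}\) is described as a variance, its square root is passed to `random.gauss`, which expects a standard deviation.

```python
import math
import random

CHANGE_PROB = 0.5
MUTATE_MU = 3.0
MUTATE_SIGMA = 2.0  # treated as a variance, following the text


def mutate(individual, rng=random):
    """Return a mutated copy; only the two shape values of a tuple may change."""
    std = math.sqrt(MUTATE_SIGMA)  # variance -> standard deviation
    child = []
    for a, b, out in individual:
        if rng.random() < CHANGE_PROB:
            # Assumption: one independent delta per shape value.
            a += rng.gauss(MUTATE_MU, std)
            b += rng.gauss(MUTATE_MU, std)
        child.append((a, b, out))  # defuzzification value is left untouched
    return child


parent = [(30.0, 127.0, 0.0), (127.0, 97.0, 127.0), (127.0, 225.0, 255.0)]
print(mutate(parent, random.Random(0)))
```

Passing a seeded `random.Random` instance makes runs reproducible, which is convenient when comparing variants.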
5.4 Crossover
The crossover operation combines multiple individuals (typically two) to produce offspring. There are three classical methods of performing crossover (one-point, two-point, and uniform), and our implementation uses uniform crossover. In the context of our representation, the crossover operation iterates through all the membership functions and swaps individual functions between the two parents based on the outcome of a coin toss with probability p. For the crossover in the Genetic Algorithm, both individuals must be of the same size.
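Uniform crossover over this representation can be sketched in a few lines. The function name and the default p = 0.5 are illustrative assumptions; individuals are again lists of membership-function tuples.

```python
import random


def uniform_crossover(parent_a, parent_b, p=0.5, rng=random):
    """Swap membership functions position by position with probability p."""
    if len(parent_a) != len(parent_b):
        raise ValueError("both individuals must have the same size")
    child_a, child_b = list(parent_a), list(parent_b)
    for i in range(len(child_a)):
        if rng.random() < p:  # coin toss for this membership function
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b


a = [(30.0, 127.0, 0.0), (127.0, 97.0, 127.0), (127.0, 225.0, 255.0)]
b = [(40.0, 110.0, 0.0), (120.0, 80.0, 127.0), (140.0, 230.0, 255.0)]
print(uniform_crossover(a, b, rng=random.Random(0)))
```

Every position in the offspring holds a membership function from one of the two parents, so no new shapes are created by crossover alone; new shapes come only from mutation.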
5.5 Hill Climbing: Details
The Hill Climbing technique stochastically generates new individuals and stores the fittest individual as the candidate solution. The best solution is evaluated using the fitness function, as shown in Equation
3. In each generation, we generate a certain number of neighbor individuals (set to 10) from a fixed individual, following the mutation operations outlined in Section
5.3.
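The Hill Climbing loop can be sketched generically as follows. The sketch assumes that neighbors are generated from the current best individual and that the search stops after a wall-clock budget; the `max_iters` argument is a hypothetical addition that bounds short demonstrations. The toy fitness and mutation in the usage lines are placeholders, not our image-based operators.

```python
import random
import time


def hill_climb(initial, fitness, mutate, n_neighbors=10,
               time_budget_s=600.0, max_iters=None):
    """Keep the fittest individual seen; propose n_neighbors mutants per step."""
    best, best_fit = initial, fitness(initial)
    deadline = time.monotonic() + time_budget_s
    iters = 0
    while time.monotonic() < deadline and (max_iters is None or iters < max_iters):
        neighbors = [mutate(best) for _ in range(n_neighbors)]
        scored = [(fitness(n), n) for n in neighbors]
        cand_fit, cand = max(scored, key=lambda t: t[0])
        if cand_fit > best_fit:  # accept only strict improvements
            best, best_fit = cand, cand_fit
        iters += 1
    return best, best_fit


# Toy usage: maximize -(x - 5)^2 over real numbers.
rng = random.Random(0)
best, best_fit = hill_climb(
    0.0,
    fitness=lambda x: -(x - 5.0) ** 2,
    mutate=lambda x: x + rng.gauss(0.0, 1.0),
    time_budget_s=5.0,
    max_iters=200,
)
print(best, best_fit)
```

Because only improvements are accepted, the returned fitness never falls below that of the initial individual, but the search can still stall in a local optimum.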
5.6 Genetic Algorithm: Details
Unlike Hill Climbing, the Genetic Algorithm (GA) starts with a number (say,
\(\mathsf {PopSize}\)) of individuals. We use the same evaluation function shown in Equation
3 for fitness evaluation in our GA approach. To introduce variation, the GA selects individuals from the old population and tweaks them to breed a new population of children. To retain traits of both populations, it joins the parent and children populations to form the next generation (Section
5.4). In our implementation, we have fixed
\(\mathsf {PopSize}\) equal to 30. Genetic algorithm variants differ in how the parent and child populations are joined. In our genetic algorithm procedure, we have experimented with both (
P,
P) and (
P +
P) evolution strategies, where
P is the
\(\mathsf {PopSize}\).
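The difference between the two strategies lies only in the pool from which survivors are drawn. The sketch below is one possible GA loop, not our DEAP-based implementation: binary tournament selection is an assumption (the selection operator is not fixed above), and the toy fitness, mutation, and crossover functions are placeholders.

```python
import random


def evolve(population, fitness, mutate, crossover, generations,
           plus=False, rng=random):
    """plus=False: (P, P), children replace parents; plus=True: (P + P)."""
    size = len(population)
    best = max(population, key=fitness)
    for _ in range(generations):
        children = []
        while len(children) < size:
            # Assumed selection scheme: binary tournament for each parent.
            a = max(rng.sample(population, 2), key=fitness)
            b = max(rng.sample(population, 2), key=fitness)
            children.extend(mutate(c) for c in crossover(a, b))
        children = children[:size]
        pool = population + children if plus else children
        population = sorted(pool, key=fitness, reverse=True)[:size]
        best = max(population + [best], key=fitness)  # best over all generations
    return best, population


rng = random.Random(0)


def toy_fitness(ind):
    return -sum((x - 1.0) ** 2 for x in ind)


def toy_mutate(ind):
    return [x + rng.gauss(0.0, 0.1) for x in ind]


def toy_crossover(a, b):
    mask = [rng.random() < 0.5 for _ in a]
    return ([y if m else x for x, y, m in zip(a, b, mask)],
            [x if m else y for x, y, m in zip(a, b, mask)])


start = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(30)]
best_comma, pop_comma = evolve(start, toy_fitness, toy_mutate, toy_crossover, 40, rng=rng)
best_plus, pop_plus = evolve(start, toy_fitness, toy_mutate, toy_crossover, 40, plus=True, rng=rng)
print(toy_fitness(best_comma), toy_fitness(best_plus))
```

In the (P + P) case the best individual always survives, whereas in the (P, P) case it can be lost between generations, which is why the sketch tracks the best individual separately.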
Hyperparameter Settings. We run each of the variants for 10 minutes and select the best solution found within that time frame. The parameters are set as follows: \(\mathsf {ChangeProb} = 0.5\), \(\mathsf {MutateMu} = 3\), and \(\mathsf {MutateSigma} = 2\).
6 Experimental Results
In this section, we evaluate our proposed approach and present the results of applying our methods to various images. Our experiments were conducted on several images from [
10] and additional images (color, grayscale, and text) sourced from the Internet. Building on previous works in image enhancement [
5,
10,
39] and to conduct a comprehensive evaluation, we experimented with a total of 18 images. We convert color images to the HSV color space and change the contrast by manipulating the value (brightness) channel without affecting the color. However, we observed that our proposed method performs particularly well on grayscale images. Therefore, we present the results specifically for grayscale images (13 images). Our experimental study reveals that the genetic algorithm with (
P,
P) strategy outperforms the genetic algorithm with (
P +
P) strategy. Thus, we present the results of genetic algorithm with (
P,
P) strategy. In the remaining analysis, we consider the following two techniques:
(1) (HC) Hill Climbing (Subsection 5.5)
(2) (GA) Genetic Algorithm with (P, P) strategy (Subsection 5.6)
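For color images, manipulating only the value channel of the HSV representation can be sketched with the standard-library `colorsys` module. This is an illustration that assumes 8-bit RGB pixels stored as tuples and a 256-entry gray-level table `lut`; the function names are hypothetical, and a practical implementation would use a vectorized image library instead of per-pixel loops.

```python
import colorsys


def enhance_rgb_pixel(rgb, lut):
    """Map only the V channel of one 8-bit RGB pixel through a 256-entry table."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    v_new = lut[round(v * 255)] / 255.0  # hue and saturation are untouched
    r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v_new)
    return tuple(round(c * 255) for c in (r2, g2, b2))


def enhance_rgb_image(image, lut):
    """Apply the value-channel mapping to every pixel of an RGB image."""
    return [[enhance_rgb_pixel(px, lut) for px in row] for row in image]


brighten = [min(255, 2 * g) for g in range(256)]  # toy mapping for illustration
print(enhance_rgb_pixel((50, 100, 25), brighten))
```

Since hue and saturation are preserved, the ratios between the red, green, and blue components stay the same and only the brightness changes.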
6.1 Experimental Setup and Environment
To evaluate the efficacy of our approach, we implemented a prototype in Python using the DEAP evolutionary computation framework. For the fuzzy logic component, we utilized the scikit-fuzzy toolbox. All variants of our algorithms were executed on a system with an Intel(R) Core(TM) i3 CPU M370 @ 2.40GHz processor, 6GB RAM, running Ubuntu 18.04, and Python version 3.6.8.
6.2 Results
Due to space limitations, we are unable to include all the enhanced images in the paper. We first present the results by pointing out changes in the features of four images. Figures
5f,
3f,
4f,
6f show output images generated by our technique against the reference input images. As an outcome of our technique, the enhanced images are more natural looking in the case of Figures
3f,
4f. In Figures
5f,
4f and
6f we can see that the edges are sharper (notice the house shadow and roof tiles, the cloud at the top of the jet plane, and the mountain body). Overall, the histogram is expanded, and the contrast between the black and white portions has increased. It is also visually noticeable that GA performs better than HC. We provide support for this claim in Section
6.3. A downside of our method is that some black portions (e.g., Figures
4e,
4f) are unnecessarily darkened. This can be beneficial in the case of overexposed images. For example, both methods in Figures
3 and
6 succeed in increasing the contrast between the person’s face and the background, and between the letters and the page.
For quantitative comparisons, we compute the Michelson contrast [
30,
37], PSNR [
2], and SSIM [
52] of each enhanced image in Table
1, produced by the Hill Climbing (HC) and Genetic Algorithm (GA) variants of the proposed method. We compare the same metrics against images enhanced by three existing techniques: histogram equalization (HE),
dual channel prior-based nighttime low illumination image enhancement (Dual) [
43], and adaptive gamma correction with weighting distribution (AGCWD) [
18].
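Two of these metrics have simple closed forms. The sketch below assumes that Michelson contrast is computed directly on 8-bit gray values as (Imax - Imin) / (Imax + Imin) and that PSNR uses a peak value of 255, i.e., 10 · log10(255² / MSE). SSIM is omitted because it requires local windowed statistics.

```python
import math


def michelson_contrast(img):
    """(Imax - Imin) / (Imax + Imin) over all pixels; 0 for an all-black image."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    return 0.0 if hi + lo == 0 else (hi - lo) / (hi + lo)


def psnr(reference, processed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(reference, processed)
             for a, b in zip(row_a, row_b)]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


original = [[50, 150], [100, 120]]
enhanced = [[20, 200], [90, 130]]
print(michelson_contrast(original), michelson_contrast(enhanced))
print(psnr(original, enhanced))
```

Note that Michelson contrast depends only on the two extreme pixel values, so any method that pushes a few pixels toward 0 and 255 scores highly regardless of how the rest of the image looks.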
A higher contrast value indicates a larger difference between the darkest and brightest pixels. The table demonstrates that our proposed methods (especially GA) improve contrast over the original image in most cases. HE, Dual, and AGCWD achieve higher contrast metric values. Visual inspection reveals that these three methods increase contrast by darkening dark regions and brightening bright regions. In all presented images, Dual and AGCWD have the opposite effect to our method; that is, they overexpose the already bright areas of the images. This explains why HE, AGCWD, and Dual do well on the Michelson contrast metric. Another observation is that, of the five methods, HE introduces noticeable artifacts (grainy features in Figures
3b,
4b,
6b,
11b,
10b). This explains the lower SSIM and PSNR values attained by HE on most images compared to the other methods. Our method consistently achieved SSIM values near 1 (the worst is > 0.7) compared to the other three. This indicates that our method introduces fewer artifacts (grainy features, emphasized boundaries, whitened areas, etc.).
No single metric captures every aspect of the improvement achieved by a method, and the visual quality of an image is a subjective matter. For a more complete, subjective assessment of visual quality, we conducted a survey on the images enhanced by our method, which we present in the next section.
6.3 Survey
6.3.1 Survey Preparation.
We prepared an anonymous survey to obtain a quantitative opinion on our enhanced images. In the survey, we placed 26 enhanced images (produced by our technique) side by side with their originals. Each participant was asked to mark the enhanced image on a scale of 1 to 9. The original image served as the reference with a mark of 5; therefore, a mark > 5 means the enhanced image is of better quality, while a mark < 5 means the image was degraded. A mark of 5 means the processed image is visually the same as the original (a subjective judgment).
6.3.2 Survey Response & Observations.
We received a total of 67 responses. We discarded responses with suspicious patterns, such as all-equal marks or very high or very low marks, which removed two responses. Across the remaining 65 responses, GA achieved a mean score of 5.45, higher than HC's mean score of 4.77. Moreover, a mean score greater than 5 indicates that GA improves on the original image.
From Figure
2, we can see the mean scores of all enhanced images given by each participant. Each data point in Figure
2(a) indicates the mean of scores given by one participant. Figure
2(b) shows the mean (triangle inside each box, green in color picture), and median (horizontal line inside each box, orange in color picture) of the responses. The red points are outliers according to the 1.5 × IQR rule [
38]. From Figure
2(a), GA shows a better mean score than HC (mean over participants). From Figure
2(b), we can see that HC has a higher chance of degrading the image (dense below score 5), although GA has achieved the image with the lowest score according to one participant.
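The 1.5 × IQR rule used to flag outliers can be sketched as follows. The scores in the usage lines are made-up illustrative values, not survey data, and the "inclusive" quantile method is an assumption; plotting libraries may use a slightly different quartile definition. `statistics.quantiles` requires Python 3.8 or later.

```python
import statistics


def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lo or v > hi]


# Hypothetical per-participant mean scores, for illustration only.
scores = [4.8, 5.0, 5.1, 5.2, 5.3, 5.4, 5.5, 9.0]
print(iqr_outliers(scores))
```

Only points outside the whisker range are reported, so a single extreme participant does not shift the box itself.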
7 Conclusion
In this work, we applied metaheuristics to develop an effective image-specific fuzzy logic-based transformation function. We utilized a quality function from previous studies to guide our enhancements. Recognizing that image quality is subjective, we conducted a survey to gather subjective opinions on the visual quality of the enhancements and reported the results. The assessment revealed that one variant of our approach demonstrates visual improvement on average.
For future work, we plan to conduct additional experiments by incorporating more image processing operations, such as
gamma transformation, into the processing pipeline. One limitation of our current approach, based on visual observation, is that it tends to darken the image. Integrating other processing operations with automatic parameter tuning via metaheuristics holds promise for further improvement.