1 Introduction

The segmentation process is considered as a critical step within an image processing system due to its effects on the subsequent image analysis steps. In segmentation, image pixels are clustered into different regions based on their intensity levels. The ultimate goal of image segmentation is to increase the interpretability or to extract relevant information within the images for human observers, or to produce superior input for further computerized digital image processing systems. Image segmentation is used in several applications such as object tracking, searching regions of interest, surveillance, medical imaging and many more [1,2,3,4].

As mentioned earlier, the primary objective of image segmentation is to divide the pixels into classes based on their intensity levels. A myriad of schemes have been proposed over the years for image segmentation. Among those, thresholding-based segmentation algorithms are noted to be simple and easy to implement. At the same time, segmentation based on image thresholding is reported to be highly effective. This method uses the gray-level histogram derived from the image for selecting the thresholds in order to separate out the classes. If only one threshold value is selected, it is called bi-level thresholding. Bi-level thresholding is primarily used to detect an object from its background. On the other hand, when an image is divided into several classes by selecting more than one threshold, it is known as multi-level thresholding. It is worth highlighting here that bi-level thresholding is easier to implement than multi-level thresholding. In the case of multi-level thresholding, the addition of each new threshold value in the searching approach leads to an increase in complexity along with a decrease in accuracy [5].

Thresholding-based segmentation methods are divided into parametric and nonparametric approaches [6, 7]. Parametric approaches exploit the probability density function (PDF) for defining each of the classes. At the same time, the computational cost is high in the case of parametric approaches. Nonparametric techniques, on the other hand, exploit variance between the classes, the entropy and the error rate for effectively segmenting the images [8,9,10]. Such methods are generally known for their accuracy and robustness [11].

In recent years, entropy-based thresholding approaches have been reported to be highly effective in segmenting various kinds of images such as color, medical and satellite images. Among those, thresholding based on information entropy theory is an active subject of research. Entropy-based image segmentation has attracted the attention of numerous researchers [12,13,14,15,16,17,18,19,20] and is regarded as one of the prominent global thresholding methods. Its strong theoretical background and enhanced performance make entropy-based thresholding enormously popular in theoretical research as well as in various applications in image processing [21,22,23,24,25,26,27]. Entropy is defined as the mean information produced by a stochastic source of data. In the entropy-based segmentation approach, the entropies of the two regions of the image histogram, the object and the background, are summed. The main assumption here is that the higher the value of the summed entropy, the better the separation between the object and background.

During the last few years, newer methods employing the difference in entropies have been developed to separate the objects from the background. Some of those are Renyi entropy [28], Shannon entropy [29], Tsallis entropy [30], cross-entropy [31] and fuzzy entropy [32]. It is well known that the information existing in the pixels of an image has either the additive or non-additive property. The entropy-based segmentation approaches exploit this fact. A maximum entropy method was proposed based on non-extensive Tsallis entropy [30]. Tsallis entropy is the generalization of Shannon entropy. The pseudo-additivity property of Tsallis entropy with entropic parameter q can handle non-extensive information for statistically independent subsystems. Sahoo et al. [28] proposed a method for thresholding based on Renyi’s entropy. Renyi’s entropy can handle the additive property using the tunable entropic parameter α [28, 30, 33]. However, Renyi’s and Tsallis entropies cannot handle the additive and non-additive information simultaneously.

On the other hand, Masi [33] has introduced a new entropic measure, which is based on the analysis of thermodynamic entropies that utilizes the complete probability distribution for image segmentation. This method incorporates an entropic parameter r, which, when defined as r = 1, reduces the Masi’s entropy to the Shannon entropy, i.e., the Boltzmann–Gibbs entropy. Fundamentally, Masi entropy combines the additivity of Renyi entropy and the non-extensivity of Tsallis entropy. The main argument that differentiates Renyi and Tsallis entropy from Masi entropy is the concordant parameter r. Unlike probability functions of Renyi’s and Tsallis entropy, where each state-probability is raised to the power of their entropic parameters α and q, respectively, in the case of Masi, the entire probability function is raised to the power r [21, 33]. The parameter r represents a measure of the degree of extensivity or non-extensivity that might be existent in the system. The entropy-based thresholding methodology developed on the entropic measure was employed for gray-level image segmentation in [21, 33]. All the entropies are, directly or indirectly, some kinds of generalization of the well-established Shannon entropy [34, 35]. The entropic parameter gives the flexibility to easily achieve the desired result. However, the image segmentation approach based on Masi entropy does not give satisfactory results, especially in the case of color images. Motivated by this fact, we have exploited the concept of energy curve within the Masi entropy framework.

Enhanced results can be obtained by employing the energy curve of an image instead of its histogram for segmentation. The histogram utilizes only the intensity of the pixels, whereas the energy curve also uses the spatial content of the image. In energy-curve-based segmentation, valleys and peaks are smoother than those obtained with histogram-based segmentation. At the same time, the image characteristics are better preserved in the former. In the energy curve, each mode represents an object, and a valley exists between two adjacent modes. For thresholding any image, the optimal thresholds are located at the centers of the valley regions of the energy curve. Therefore, the energy-curve-based method yields more accurate threshold values than the histogram-based segmentation approach. This, in turn, leads to improved results. For further improvement in the quality of the segmented images, the concept of fusion based on local contrast can be employed. Interestingly, the fusion-based Masi energy curve approach outperforms energy-based Masi entropy.

Motivated by the aforementioned facts, in this paper, a local-contrast-based fusion method [36] for segmentation is coupled with energy-based Masi entropy along with cuttlefish algorithm to get better thresholding results. Local contrast depends upon the difference in the intensity value within a small local space within an image. Besides, there are other expressions for local contrast as well, such as the one used in the logarithmic image processing [37]. In the case of image enhancement, the outcomes of local-contrast-based fusion method are much better than each of the individual enhancement methods. The complete details of image fusion based on the local contrast for image enhancement are described in [36, 38]. In this paper, the same concept is adopted and introduced for image segmentation, where a fusion of the original and segmented images is coupled in order to enhance the quality of the segmented image.

The fusion-based Masi energy curve technique is observed to be highly effective. However, the approach is time-consuming, and selecting suitable thresholds is computationally complex. In order to overcome these drawbacks, we have coupled the fusion-based Masi energy curve method with the meta-heuristic cuttlefish algorithm during segmentation. The cuttlefish optimization algorithm (CFA) is an emergent meta-heuristic optimization algorithm [39, 40]. The CFA was proposed after observing, mimicking and modeling the camouflage feature of cuttlefish. The effectiveness and suitability of this algorithm were demonstrated by applying it to the traveling salesperson problem [41] and intrusion detection systems [42]. In order to assess the efficacy of CFA, the effectiveness of several other nature-inspired algorithms is also explored in this paper. Those are the lightning search algorithm (LSA) [43], inspired by the natural phenomenon of lightning, the sine cosine algorithm (SCA) [44, 45], the dragonfly algorithm (DA) [46] and the selfish herd optimizer (SHO) [47].

Many methods have been introduced recently to determine the optimal threshold levels for image segmentation. In 2019, a novel fuzzy type II set-based image thresholding using evolutionary algorithms was proposed [48]. In 2020, many new multi-level thresholding approaches were proposed in the literature, including PSO image thresholding on images compressed via fuzzy transforms [49], an efficient krill herd algorithm for color image multi-level thresholding [50], improved emperor penguin optimization-based multi-level thresholding [51], a symbiotic organisms search algorithm for multi-level thresholding of images [52] and a multimodal particle swarm optimization-based image thresholding [53]. A modified hybrid bat algorithm with genetic crossover-based segmentation approach [54] has also been proposed recently. In 2018, a novel deviation-analysis-based texture segmentation of mammographic images was proposed [55]. In the literature, the fusion concept has been exploited in different image processing applications such as fusion-based image denoising [56, 57]. In recent years, fuzzy and filtering approaches have played an important role in many image processing applications [58,59,60]. In 2020, a new fuzzy c-means algorithm- and region salient color-based image thresholding approach was proposed [61]. In this paper, an optimization algorithm is exploited to obtain the optimum thresholds, whereas the fusion concept is adopted to further improve the segmentation.

The rest of the paper is organized as follows. Section 2 provides a brief discussion of previous thresholding methods. Section 3 gives a general explanation of the Masi entropy and energy-curve-based thresholding model with a summary of the LSA, SHO, SCA, and DA algorithms. Section 4 briefly explains the basic theory of CFA and the proposed fusion-energy-based multi-level thresholding scheme. Section 5 discusses the experimental setup, results and comparisons. Finally, the conclusions are drawn in Sect. 6.

2 Motivation and contributions

As a result of decades of study and research, numerous thresholding methods have been proposed. Among the established approaches, image thresholding (TH) is one of the simplest, most robust and most effective means of image segmentation. As discussed earlier, the main aim of the segmentation is to divide the pixels into classes based on their intensity levels. Thresholding-based color image segmentation lacks accuracy in partitioning the ambiguous regions due to the presence of dense features and small abrupt changes within the images. However, the Masi energy curve model provides a suitable solution for finding thresholds in multi-level thresholding in spite of its high computational cost. Nevertheless, multi-level image segmentation performed through conventional techniques poses serious problems since the execution of those methods requires high processing time in order to obtain the optimal solution. This problem can be resolved to a large extent by using meta-heuristic techniques to search for appropriate threshold values without incurring extra computational cost.

In the last few decades, several new techniques have been proposed for multi-level thresholding-based segmentation [1, 2, 13, 15, 16, 21]. Using the cuttlefish algorithm [39] to optimize the objective function of the Masi energy curve model reduces this cost partially and gives improved results. The demand for improving the quality of the segmented images is high. At the same time, this is a very challenging task. For example, if an image is dark, one is required to perform image enhancement followed by segmentation. Unfortunately, there is no technique that performs both of those tasks simultaneously. This fact inspired us to develop a simple yet effective technique that produces an enhanced segmented image. In order to enhance the quality of the segmented images, a novel fusion-based multi-level color segmentation method is introduced. The use of a local-contrast-based fusion technique for multi-level thresholding is proposed for color image segmentation for the first time in this work. Moreover, in order to reduce the increased complexity due to the inclusion of the fusion criterion and Masi entropy, the proposed method makes effective use of the cuttlefish algorithm. Furthermore, the efficacy of the proposed approach is compared with that of several other existing dominant approaches.

3 Overview of objective functions and optimization algorithms

In this section, we describe the Masi’s thresholding method in detail along with other well-known objective functions used in image segmentation. This is followed by discussions on some of the most commonly used optimization criteria reported in the literature.

3.1 Masi’s thresholding method

The pixels of a gray scale or colored image are classified into regions or sets based on their intensity levels (L). This process of arranging the pixels is referred to as thresholding. In order to classify the pixels of a grayscale test image into two regions, the pixels are selected using the following criterion:

$$ \begin{array}{*{20}l} {C_{0} \leftarrow i} \hfill & {{\text{if}}\,\,0 \le i < {\text{th}}} \hfill \\ {C_{1} \leftarrow i} \hfill & {{\text{if}}\,\,{\text{th}} \le i \le L - 1} \hfill \\ \end{array} $$
(1)

where i represents the intensity values of the grayscale image with L being the number of intensity levels, th refers to the optimum threshold value, and C represents the classes into which the pixels of the test image need to be classified. Similarly, when the pixels need to be classified into several classes, the criterion given in Eq. (1) can be easily extended to incorporate multiple thresholds as follows:

$$ \begin{array}{*{20}l} {C_{0} \leftarrow i} \hfill & {{\text{if}}\,\,0 \le i < {\text{th}}_{1} } \hfill \\ {C_{1} \leftarrow i} \hfill & {{\text{if}}\,\,{\text{th}}_{1} \le i < {\text{th}}_{2} } \hfill \\ {C_{2} \leftarrow i} \hfill & {{\text{if}}\,\,{\text{th}}_{2} \le i < {\text{th}}_{3} } \hfill \\ {\,\,\vdots } \hfill & {} \hfill \\ {C_{n} \leftarrow i} \hfill & {{\text{if}}\,\,{\text{th}}_{n} \le i \le L - 1} \hfill \\ \end{array} $$
(2)

where \( {\text{th}}_{1} ,{\text{th}}_{2} ,{\text{th}}_{3} , \ldots ,{\text{th}}_{n} \) represent multiple thresholds. Segmenting pixels into their respective classes is done using Eqs. (1) and (2) for bi-level and multi-level thresholding, respectively. In the case of thresholding-based image segmentation, the most challenging task is to choose optimal threshold values that can properly classify the different regions within the image using either bi-level or multi-level thresholding algorithms.
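The classification rule of Eq. (2) can be illustrated with a minimal Python sketch. The function name `assign_classes` and the example thresholds are hypothetical and chosen only for illustration; NumPy is assumed to be available.

```python
import numpy as np

def assign_classes(image, thresholds):
    """Map each pixel to a class index 0..n for n thresholds.

    Intensity i is placed in class k when th_k <= i < th_(k+1),
    i.e. the half-open intervals written in Eq. (2).
    """
    thresholds = np.sort(np.asarray(thresholds))
    # With right=False, np.digitize returns k such that bins[k-1] <= x < bins[k].
    return np.digitize(image, thresholds, right=False)

# Hypothetical 2 x 3 test image and two thresholds (three classes).
image = np.array([[0, 50, 99], [100, 180, 255]], dtype=np.uint8)
labels = assign_classes(image, [100, 200])
```

With a single threshold the same call reproduces the bi-level rule of Eq. (1).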

Let I denote a test image with L gray levels, with G = {0, 1, …, L − 1} denoting the set of intensity values of the image. Further, let the dimensions of the image be M × N. If \( n_{i} \) is the number of pixels with gray-level intensity i and the total number of pixels in the image is M × N, then the probability of gray level i is estimated as follows:

$$ h_{i} = \frac{{n_{i} }}{M \times N},\quad {\text{where}}\quad h_{i} \ge 0\quad {\text{and}}\quad \sum\limits_{i = 0}^{L - 1} {h_{i} } = 1 $$
(3)

The complete probability distribution H of gray levels is the set of probabilities of the individual gray levels, \( H = \{ h_{0} ,h_{1} ,h_{2} , \ldots ,h_{L - 1} \} \). When the pixels in the image are to be separated into two classes [bi-level segmentation given by Eq. (1)], class \( C_{0} \) consists of the set of pixels with intensity levels {0, 1, …, th}, while the pixels with intensity values {th + 1, th + 2, …, L − 1} belong to class \( C_{1} \). In the case of bi-level segmentation, \( C_{0} \) and \( C_{1} \) generally correspond to the background class and the object (or foreground) class, or vice versa. Similarly, when the pixels in the image are divided into more than two classes [multi-level segmentation given by Eq. (2)], the classes \( C_{0} ,C_{1} ,C_{2} , \ldots ,C_{n} \) represent the sets of pixels with intensity values \( \{ 0,1, \ldots ,{\text{th}}_{1} \} \), \( \{ {\text{th}}_{1} + 1, \ldots ,{\text{th}}_{2} \} \), \( \{ {\text{th}}_{2} + 1, \ldots ,{\text{th}}_{3} \} \), …, \( \{ {\text{th}}_{n} + 1, \ldots ,L - 1\} \). Each of these classes corresponds to a different object class, and one of the classes represents the background. The probabilities of the classes defined for bi-level and multi-level thresholding are given by the following equations:

For bi-level thresholding,

$$ w_{0} = \sum\limits_{i = 0}^{\text{th}} {h_{i} } ,\quad w_{1} = \sum\limits_{{i = {\text{th}} + 1}}^{L - 1} {h_{i} } $$
(4)

and for multi-level thresholding,

$$ w_{0} = \sum\limits_{i = 0}^{{{\text{th}}_{1} }} {h_{i} } ,\quad w_{1} = \sum\limits_{{i = {\text{th}}_{1} + 1}}^{{{\text{th}}_{2} }} {h_{i} } ,\quad w_{2} = \sum\limits_{{i = {\text{th}}_{2} + 1}}^{{{\text{th}}_{3} }} {h_{i} } ,\quad \ldots ,\quad w_{n} = \sum\limits_{{i = {\text{th}}_{n} + 1}}^{L - 1} {h_{i} } $$
(5)

The gray-level probabilities within each class are then normalized by the corresponding class probability. Consequently, a new set of distributions is obtained, which can be expressed for bi-level and multi-level thresholding using Eqs. (6) and (7):

$$ {\text{DC}}_{0} : \frac{{h_{0} }}{{w_{0} }},\frac{{h_{1} }}{{w_{0} }}, \ldots ,\frac{{h_{\text{th}} }}{{w_{0} }},\quad {\text{DC}}_{1} : \frac{{h_{{{\text{th}} + 1}} }}{{w_{1} }}, \frac{{h_{{{\text{th}} + 2}} }}{{w_{1} }}, \ldots , \frac{{h_{L - 1} }}{{w_{1} }} $$
(6)
$$ {\text{DC}}_{0} : \frac{{h_{0} }}{{w_{0} }},\frac{{h_{1} }}{{w_{0} }}, \ldots ,\frac{{h_{{{\text{th}}_{1} }} }}{{w_{0} }},\quad {\text{DC}}_{1} : \frac{{h_{{{\text{th}}_{1} + 1}} }}{{w_{1} }},\frac{{h_{{{\text{th}}_{1} + 2}} }}{{w_{1} }}, \ldots ,\frac{{h_{{{\text{th}}_{2} }} }}{{w_{1} }},\quad \ldots ,\quad {\text{DC}}_{n} : \frac{{h_{{{\text{th}}_{n} + 1}} }}{{w_{n} }},\frac{{h_{{{\text{th}}_{n} + 2}} }}{{w_{n} }}, \ldots ,\frac{{h_{L - 1} }}{{w_{n} }} $$
(7)

where \( {\text{DC}}_{0} ,{\text{DC}}_{1} , \ldots ,{\text{DC}}_{n} \) are the new normalized probability distributions of the classes.
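As a rough illustration of Eqs. (3) to (7), the following Python sketch estimates the gray-level probabilities and the normalized class distributions. The helper names are hypothetical, and the class boundaries assume that class k ends at its upper threshold and the next class starts one level above it, as in the summation limits of Eq. (4).

```python
import numpy as np

def gray_level_probabilities(image, levels=256):
    """Eq. (3): h_i = n_i / (M * N) for i = 0..levels-1."""
    counts = np.bincount(image.ravel(), minlength=levels)
    return counts / image.size

def class_distributions(h, thresholds):
    """Class probabilities w_k and normalized distributions DC_k.

    Assumption: class 0 covers 0..th_1, class k covers th_k+1..th_(k+1),
    and the last class covers th_n+1..L-1.
    """
    edges = [0] + [t + 1 for t in sorted(thresholds)] + [len(h)]
    weights, dists = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = h[lo:hi].sum()
        weights.append(w)
        # An empty class has no meaningful distribution; return zeros for it.
        dists.append(h[lo:hi] / w if w > 0 else np.zeros(hi - lo))
    return np.array(weights), dists

# Hypothetical 4-level test image.
img = np.array([[0, 0, 1, 2], [2, 2, 3, 3]], dtype=np.uint8)
h = gray_level_probabilities(img, levels=4)
w, dc = class_distributions(h, [1])
```

Each normalized distribution sums to one, which is what allows an entropy to be computed separately for every class.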

Based on the above discussion of thresholding-based segmentation, Kapur [14] proposed a thresholding method that maximizes the entropic value on the basis of Shannon entropy. The corresponding objective function, which is maximized over th, is given by Eq. (8):

$$ E\left( {I/{\text{th}}} \right) = E(C_{0} /{\text{th}}) + \, E\left( {C_{1} /{\text{th}}} \right) $$
(8)

where

$$ \begin{aligned} & E\left( {C_{0} /{\text{th}}} \right) = - \sum\limits_{i = 0}^{\text{th}} {\frac{{h_{i} }}{{w_{0} }}\log \frac{{h_{i} }}{{w_{0} }}} \\ & E\left( {C_{1} /{\text{th}}} \right) = - \sum\limits_{{i = {\text{th}} + 1}}^{L - 1} {\frac{{h_{i} }}{{w_{1} }}\log \frac{{h_{i} }}{{w_{1} }}} \\ \end{aligned} $$
(9)

where E(.) represents the entropy. Albuquerque [30] proposed a concept based on Tsallis entropy and the function for thresholding is now represented by Eqs. (10) and (11):

$$ E_{q} \left( {I/{\text{th}}} \right) = E_{q} \left( {C_{0} /{\text{th}}} \right) + E_{q} \left( {C_{1} /{\text{th}}} \right) + \left( {1 - q} \right) \, E_{q} \left( {C_{0} /{\text{th}}} \right) \, E_{q} \left( {C_{1} /{\text{th}}} \right) $$
(10)

where

$$ \begin{aligned} & E_{q} \left( {C_{0} /{\text{th}}} \right) = \frac{1}{1 - q}\left[ {\sum\limits_{i = 0}^{\text{th}} {\left( {\frac{{h_{i} }}{{w_{0} }}} \right)^{q} } - 1} \right] \\ & E_{q} \left( {C_{1} /{\text{th}}} \right) = \frac{1}{1 - q}\left[ {\sum\limits_{{i = {\text{th}} + 1}}^{L - 1} {\left( {\frac{{h_{i} }}{{w_{1} }}} \right)^{q} } - 1} \right] \\ \end{aligned} $$
(11)

The criterion function for the Renyi’s entropy [28] is given by Eqs. (12) and (13):

$$ E_{\alpha } (I/{\text{th}}) = E_{\alpha } (C_{0} /{\text{th}}) + E_{\alpha } (C_{1} /{\text{th}}) $$
(12)

where

$$ \begin{aligned} & E_{\alpha } (C_{0} /{\text{th}}) = \frac{1}{1 - \alpha }\log \left[ {\sum\limits_{i = 0}^{\text{th}} {\left( {\frac{{h_{i} }}{{w_{0} }}} \right)^{\alpha } } } \right] \\ & E_{\alpha } (C_{1} /{\text{th}}) = \frac{1}{1 - \alpha }\log \left[ {\sum\limits_{{i = {\text{th}} + 1}}^{L - 1} {\left( {\frac{{h_{i} }}{{w_{1} }}} \right)^{\alpha } } } \right] \\ \end{aligned} $$
(13)
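The three bi-level criteria introduced so far (Kapur, Tsallis and Renyi) can be compared side by side with a small Python sketch. All function names are hypothetical, the search is a plain exhaustive scan over th, and the Renyi entropy is written in its standard form with the prefactor 1/(1 − α); the default parameter value 0.8 is only an assumption for illustration.

```python
import numpy as np

def _split(h, th):
    """Normalized distributions of C0 = {0..th} and C1 = {th+1..L-1}."""
    w0, w1 = h[:th + 1].sum(), h[th + 1:].sum()
    if w0 <= 0 or w1 <= 0:
        return None  # one class is empty, so the criterion is undefined
    return h[:th + 1] / w0, h[th + 1:] / w1

def shannon(p):
    p = p[p > 0]  # treat 0 * log 0 as 0
    return -np.sum(p * np.log(p))

def tsallis(p, q):
    p = p[p > 0]
    return (np.sum(p ** q) - 1.0) / (1.0 - q)

def renyi(p, alpha):
    p = p[p > 0]
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

def criterion(h, th, kind="kapur", param=0.8):
    parts = _split(h, th)
    if parts is None:
        return -np.inf
    p0, p1 = parts
    if kind == "kapur":    # Eqs. (8)-(9)
        return shannon(p0) + shannon(p1)
    if kind == "tsallis":  # Eqs. (10)-(11), pseudo-additive combination
        e0, e1 = tsallis(p0, param), tsallis(p1, param)
        return e0 + e1 + (1.0 - param) * e0 * e1
    if kind == "renyi":    # Eqs. (12)-(13)
        return renyi(p0, param) + renyi(p1, param)
    raise ValueError(f"unknown criterion: {kind}")

def best_threshold(h, kind="kapur", param=0.8):
    """Exhaustive search for the threshold that maximizes the criterion."""
    scores = [criterion(h, th, kind, param) for th in range(len(h) - 1)]
    return int(np.argmax(scores))
```

For a flat four-level distribution, all three criteria select the middle split, which is the expected behavior when no gray level is favored.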

The concept based on Masi’s entropy [33] and the corresponding thresholding function is represented by Eqs. (14) and (15):

$$ E_{r} \left( {I/{\text{th}}} \right) = E_{r} \left( {C_{0} /{\text{th}}} \right) + E_{r} \left( {C_{1} /{\text{th}}} \right) $$
(14)

where

$$ \begin{aligned} & E_{r} \left( {C_{0} /{\text{th}}} \right) = \frac{1}{1 - r}\log \left[ {1 - (1 - r)\sum\limits_{i = 0}^{\text{th}} {\left( {\frac{{h_{i} }}{{w_{0} }}} \right)\log \left( {\frac{{h_{i} }}{{w_{0} }}} \right)} } \right] \\ & E_{r} \left( {C_{1} /{\text{th}}} \right) = \frac{1}{1 - r}\log \left[ {1 - (1 - r)\sum\limits_{{i = {\text{th}} + 1}}^{L - 1} {\left( {\frac{{h_{i} }}{{w_{1} }}} \right)\log \left( {\frac{{h_{i} }}{{w_{1} }}} \right)} } \right] \\ \end{aligned} $$
(15)
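A minimal Python sketch of the Masi criterion of Eqs. (14) and (15) is given below. The function names and the value r = 0.5 used in the example are assumptions made for illustration only, and the threshold is found by exhaustive search rather than by any of the optimizers discussed later.

```python
import numpy as np

def masi_entropy(p, r):
    """Masi entropy of a normalized distribution p, following Eq. (15).

    Assumptions: r != 1, and the argument of the outer logarithm is positive
    (always true for r < 1, because sum(p * log p) <= 0).
    """
    if r == 1.0:
        raise ValueError("r = 1 is the Shannon limit; use the Shannon entropy")
    p = p[p > 0]  # treat 0 * log 0 as 0
    plogp = np.sum(p * np.log(p))
    return np.log(1.0 - (1.0 - r) * plogp) / (1.0 - r)

def masi_objective(h, th, r):
    """Eq. (14): sum of the Masi entropies of C0 = {0..th} and C1 = {th+1..L-1}."""
    w0, w1 = h[:th + 1].sum(), h[th + 1:].sum()
    if w0 <= 0 or w1 <= 0:
        return -np.inf
    return masi_entropy(h[:th + 1] / w0, r) + masi_entropy(h[th + 1:] / w1, r)

def masi_threshold(h, r=0.5):
    """Gray level that maximizes the Masi objective."""
    scores = [masi_objective(h, th, r) for th in range(len(h) - 1)]
    return int(np.argmax(scores))
```

As r approaches 1, `masi_entropy` approaches the Shannon entropy numerically, which agrees with the reduction to Shannon entropy noted in the Introduction.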

The sum of the entropies of the two classes \( C_{0} \) and \( C_{1} \) is maximized, and the gray level at which the maximum is attained is taken as the optimal threshold th. In the case of multi-level image segmentation, each of the two segments produced by th is thresholded again. For the first segment, the criterion can be expressed as:

$$ E_{r} (I_{l} /{\text{th}}_{1l} ) = E_{r} (C_{0l} /{\text{th}}_{1l} ) + E_{r} (C_{1l} /{\text{th}}_{1l} ) $$
(16)

where

$$ \begin{aligned} & E_{r} (C_{0l} /{\text{th}}_{1l} ) = \frac{1}{1 - r}\log \left[ {1 - (1 - r)\sum\limits_{i = 0}^{{{\text{th}}_{1l} }} {\left( {\frac{{h_{i} }}{{w_{0l} }}} \right) { \log }\left( {\frac{{h_{i} }}{{w_{0l} }}} \right)} } \right] \\ & E_{r} (C_{1l} /{\text{th}}_{1l} ) \\& \quad = \frac{1}{1 - r}\log \left[ {1 - (1 - r)\sum\limits_{{i = {\text{th}}_{1l} + 1}}^{\text{th}} {\left( {\frac{{h_{i} }}{{w_{1l} }}} \right) { \log }\left( {\frac{{h_{i} }}{{w_{1l} }}} \right)} } \right] \\ \end{aligned} $$
(17)

After maximizing the entropy in the first segment, threshold \( {\text{th}}_{1l} \) is obtained. For the second segment of the image, the following equations are used:

$$ E_{r} (I_{r} /{\text{th}}_{1r} ) = E_{r} (C_{0r} /{\text{th}}_{1r} ) + E_{r} (C_{1r} /{\text{th}}_{1r} ) $$
(18)

where

$$ \begin{aligned} & E_{r} (C_{0r} /{\text{th}}_{1r} ) = \frac{1}{1 - r}\log \left[ {1 - (1 - r)\sum\limits_{{i = {\text{th}} + 1}}^{{{\text{th}}_{1r} }} {\left( {\frac{{h_{i} }}{{w_{0r} }}} \right) { \log }\left( {\frac{{h_{i} }}{{w_{0r} }}} \right)} } \right] \\ & E_{r} (C_{1r} /{\text{th}}_{1r} ) = \frac{1}{1 - r}\log \left[ {1 - (1 - r)\sum\limits_{{i = {\text{th}}_{1r} + 1}}^{L - 1} {\left( {\frac{{h_{i} }}{{w_{1r} }}} \right) { \log }\left( {\frac{{h_{i} }}{{w_{1r} }}} \right)} } \right] \\ \end{aligned} $$
(19)

After maximizing the entropy in the second segment, threshold \( {\text{th}}_{1r} \) is found. After completion of the above steps, three threshold values (th, \( {\text{th}}_{1l} \) and \( {\text{th}}_{1r} \)) are obtained. Thus, four new image segments are obtained: two from the first segment and two from the second segment. For each of the segments, an optimal threshold is determined using the Masi-based scheme. The process can be continued for any number of threshold values.
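The recursive scheme of Eqs. (16) to (19) can be sketched in Python as follows. This is only an illustrative reading of the procedure: the function names, the `depth` parameter and the exhaustive search inside each segment are assumptions, not details given in the text.

```python
import numpy as np

def masi_entropy(p, r):
    """Masi entropy of a normalized distribution p (r != 1 assumed)."""
    p = p[p > 0]
    return np.log(1.0 - (1.0 - r) * np.sum(p * np.log(p))) / (1.0 - r)

def best_split(h, lo, hi, r):
    """Best th in lo..hi-1 splitting gray levels lo..hi into lo..th and th+1..hi."""
    best_th, best_score = None, -np.inf
    for th in range(lo, hi):
        left, right = h[lo:th + 1], h[th + 1:hi + 1]
        w0, w1 = left.sum(), right.sum()
        if w0 <= 0 or w1 <= 0:
            continue  # skip splits that leave an empty class
        score = masi_entropy(left / w0, r) + masi_entropy(right / w1, r)
        if score > best_score:
            best_th, best_score = th, score
    return best_th

def recursive_thresholds(h, r=0.5, depth=2):
    """Split the full range first, then split each resulting segment again.

    depth=1 returns th; depth=2 returns [th_1l, th, th_1r] when all splits exist.
    """
    def split(lo, hi, d):
        if d == 0 or hi <= lo:
            return []
        th = best_split(h, lo, hi, r)
        if th is None:
            return []
        return split(lo, th, d - 1) + [th] + split(th + 1, hi, d - 1)

    return split(0, len(h) - 1, depth)
```

With `depth=2` the sketch yields the three thresholds and four segments described above; larger depths continue the same subdivision.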

3.2 Sine cosine algorithm

In the context of multi-level thresholding segmentation, the sine cosine algorithm (SCA) searches for the optimal position in the search space. In other words, the best solution denotes the configuration of threshold values that maximizes the objective function. In SCA, each candidate solution is represented as a vector of real values corresponding to the thresholds. The quality of each candidate solution is first evaluated using the energy-curve-based Masi entropy objective function. After this evaluation, the SCA updates the positions of the candidate solutions using the sine and cosine mathematical functions [44]. To summarize, the SCA involves three steps. The first step is to produce a random population and calculate the fitness of each candidate set of thresholds. The second step is to determine the global best solution (the target point), which forms the basis for updating the rest of the population. The final step is to repeat the update until the stopping criterion, a maximum number of iterations, is reached. The maximum is generally chosen as 100 iterations. The complete description of the algorithm is given in [44, 45].
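The three steps summarized above can be sketched in Python as shown below. The position update follows the commonly stated form of SCA, in which a candidate moves toward the best solution with a sine or cosine step whose amplitude decreases linearly over the iterations. The function name, the constant a = 2, the population size and the clipping of candidates to the bounds are assumptions for illustration and are not taken from the text.

```python
import numpy as np

def sca_maximize(objective, dim, lower, upper, pop_size=20, max_iter=100,
                 a=2.0, seed=0):
    """Minimal sine cosine search that maximizes `objective` over a box."""
    rng = np.random.default_rng(seed)
    # Step 1: random population and its fitness.
    pop = rng.uniform(lower, upper, size=(pop_size, dim))
    fitness = np.array([objective(x) for x in pop])
    # Step 2: global best (target point).
    best = pop[np.argmax(fitness)].copy()
    best_fit = fitness.max()
    # Step 3: iterate until the maximum number of iterations is reached.
    for t in range(max_iter):
        r1 = a - t * a / max_iter                      # shrinking step amplitude
        r2 = rng.uniform(0.0, 2.0 * np.pi, size=pop.shape)
        r3 = rng.uniform(0.0, 2.0, size=pop.shape)
        r4 = rng.uniform(0.0, 1.0, size=pop.shape)
        wave = np.where(r4 < 0.5, np.sin(r2), np.cos(r2))
        pop = np.clip(pop + r1 * wave * np.abs(r3 * best - pop), lower, upper)
        fitness = np.array([objective(x) for x in pop])
        if fitness.max() > best_fit:
            best_fit = fitness.max()
            best = pop[np.argmax(fitness)].copy()
    return best, best_fit
```

For thresholding, `objective` would typically round and sort the candidate vector into integer thresholds before evaluating the entropy criterion; that wrapper is left out here to keep the sketch short.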

3.3 Dragonfly algorithm

Mirjalili [46] proposed the dragonfly algorithm (DA) after observing the static and dynamic swarming behaviors of dragonflies. In optimization, there are two essential phases, i.e., exploration and exploitation. DA models these phases through the social behavior of dragonflies, which swarm both statically and dynamically in order to find food and to avoid enemies. Generally, the algorithm is based on the life cycle of dragonflies. The life cycle involves mainly two stages, i.e., nymph and adult. Dragonflies spend most of their lifespan in the nymph stage and transform into adults by undergoing metamorphosis. The exploration phase is formulated by observing the flying behavior of sub-swarms over various regions in a static swarm, whereas the exploitation phase is modeled by the unidirectional flying behavior of dragonflies in bigger swarms. The three important behaviors of swarms are separation, in which individuals keep apart from one another to avoid collisions; alignment, in which the velocity of an individual is matched with that of the others; and cohesion, in which individuals are attracted toward the center of the neighborhood. The stopping criterion is the maximum number of iterations, generally chosen as 100. The complete description of the algorithm is given in [46].

3.4 Selfish herd algorithm

The selfish herd algorithm is based on the selfish herd theory proposed by Hamilton [47]. According to this theory, at the time of predation, each individual within a herd of possible prey tries to enhance its own chance of survival by aggregating with other individuals of the same species, without regard for the other individuals’ chances of survival. This behavior, which is observed in some groups of organisms, forms the basis of an optimization algorithm called the selfish herd optimizer (SHO). It assumes that the complete search space is an open area where herds of animals interact. Two types of search agents are modeled in this algorithm: a group of prey living in aggregation (a selfish herd) and a group of predators that hunts for the prey within the same aggregation. Briefly, seven stages are involved in this algorithm: population initialization, survival value assignment, structuring of the selfish herd, herd movement operators, predator movement operators, the predation phase and the restoration phase. The complete description of the algorithm is given in [47].

3.5 Lightning search algorithm

The lightning search algorithm (LSA) [43] is an optimization algorithm inspired by the sinuous nature of lightning during a thunderstorm. This algorithm is derived from the mechanism of step leader propagation, and it uses the concept of fast particles called projectiles. The projectiles are analogous to the initial population. The main steps involved in LSA are as follows:

  1. Projectile Model: LSA has three different types of projectiles, namely transition, space and lead projectiles. The transition projectiles generate the first step leader population of solutions, the space projectiles are responsible for exploring and attempting to become the leader, and the lead projectiles engage in finding and exploiting the optimal solution.

  2. Forking Procedure: Forking is the principal property of a stepped leader, in which two or more simultaneous and symmetrical branches emerge. This can be realized in the following two ways:

    • Symmetrical channels are produced since the nuclei collision of a projectile is realized using the opposite number.

    • A channel is presumed to appear at the tip of the step leader. This is because the energy of the most unsuccessful leader is redistributed after several propagation trials.

The stopping criterion is the maximum number of iterations, generally taken as 100. The maximum value of fitness function gives the final threshold points.

4 Proposed algorithm

In this section, a novel and effective scheme to compute the optimal multi-level threshold values is presented with a fusion-based Masi energy curve model employing standard cuttlefish algorithm (CFA). The proposed fusion approach is simple and easy to implement for image segmentation. In the following subsections, we describe each of the component algorithms employed in the proposed approach for multi-level image segmentation.

4.1 Energy curve

The energy curve uses the spatial content of the image. The valleys and peaks in the energy curve are smoother than those obtained through histogram-based segmentation, which uses only the intensity of the pixels. For thresholding the image, the optimal thresholds are located at the centers of the valley regions of the energy curve. In the energy curve, each mode characterizes an object in the image, and a valley exists between two adjacent modes. As before, an image I with L gray levels is represented as a matrix \( I = \{ I_{xy} ,1 \le x \le M,1 \le y \le N\} \), with the dimension of the image being M × N. For calculating the spatial content of the image, we first calculate the spatial correlation between the neighboring pixels. Thus, for a given position (x, y), \( N_{xy}^{p} = \{ (x + u,y + v),(u,v) \in N^{p} \} \) is used as the neighborhood N of order p, where p denotes the configuration of the neighborhood [24, 25]. The second-order neighborhood system is expressed in spatial terms as (u, v) \( \in \) {(± 1, 0), (0, ± 1), (1, ± 1), (− 1, ± 1)} and is shown in Fig. 1.

Fig. 1 Spatial representation of neighborhood system \( N^{2} \)

The energy of the image I at gray intensity value l (0 ≤ l ≤ L) is calculated by creating a two-dimensional binary matrix for each intensity value, \( B_{l} = \{ b_{x,y} ,1 \le x \le M,1 \le y \le N\} \), where \( b_{x,y} = 1 \) if the intensity at position (x, y) is greater than l (\( I_{xy} > l \)), or else \( b_{x,y} = - 1 \). Let \( C = \{ c_{x,y} ,1 \le x \le M,1 \le y \le N\} \) be a constant matrix where \( c_{x,y} = 1 \), ∀ (x, y). The energy value \( E_{l} \) of the image I at gray intensity value l is computed as:

$$ E_{l} = - \sum\limits_{x = 1}^{M} {\sum\limits_{y = 1}^{N} {\sum\limits_{{rs \in N_{xy}^{2} }} {b_{xy} \cdot b_{rs} } } } + \sum\limits_{x = 1}^{M} {\sum\limits_{y = 1}^{N} {\sum\limits_{{rs \in N_{xy}^{2} }} {c_{xy} \cdot c_{rs} } } } $$
(20)

The second term on the right-hand side of Eq. (20) is a constant included to ensure a non-negative energy value, \( E_{l} \ge 0 \). Equation (20) also shows that, for a given image I at intensity value l, the energy is zero if the elements of the binary image \( B_{l} \) are all 1 or all − 1. This approach determines the energy associated with every intensity value of the image to generate a curve that takes the spatial content of the image into account.
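Equation (20) can be sketched in Python as follows. The function names are hypothetical, and one detail is an assumption: neighbors that fall outside the image border are simply ignored, since the text does not state how borders are handled.

```python
import numpy as np

# Offsets (u, v) of the second-order (8-connected) neighborhood.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def _pair_sum(b):
    """Sum of b[x, y] * b[x+u, y+v] over all pixels and in-image neighbors."""
    m, n = b.shape
    total = 0
    for u, v in OFFSETS:
        # `a` holds the pixels whose (u, v)-neighbor lies inside the image,
        # `s` holds those neighbors, aligned element by element with `a`.
        a = b[max(0, -u):m - max(0, u), max(0, -v):n - max(0, v)]
        s = b[max(0, u):m - max(0, -u), max(0, v):n - max(0, -v)]
        total += int(np.sum(a * s))
    return total

def energy_curve(image, levels=256):
    """Energy E_l for l = 0..levels-1, following Eq. (20)."""
    # Constant term: the same pair sum evaluated on the all-ones matrix C.
    const = _pair_sum(np.ones(image.shape, dtype=np.int64))
    energy = np.empty(levels, dtype=np.int64)
    for l in range(levels):
        b = np.where(image > l, 1, -1)  # binary matrix B_l
        energy[l] = const - _pair_sum(b)
    return energy
```

For a 2 × 2 image with values [[0, 0], [0, 3]] and four levels, the sketch gives a non-zero energy for l = 0, 1, 2 and zero energy for l = 3, where every entry of the binary matrix equals − 1.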

A pictorial interpretation of Eq. (20) is provided using an example. Figure 1 represents the energy curve of an image and also shows how it differs from the histogram of the same image. The matrix \( B_{l} \) has the same size as the image and consists of + 1 and − 1 entries. For quick reference, part of the entries of the matrix \( B_{l} \) is presented in the figure below.

4.1.1 Thresholding

As already mentioned, thresholding is one of the simplest ways to segment an image. Compared with other segmentation approaches, its ease comes from the simplicity of computing the threshold values (th) and applying them over the histogram until a stopping criterion is reached. In this work, thresholding is applied over the energy curve instead of the histogram, since the former also accounts for the spatial context of each pixel. The thresholded image is expressed as:

$$ I_{\text{sg}} (r,c) = \begin{cases} I_{G} (r,c) & {\text{if}}\quad I_{G} (r,c) \le {\text{thr}}_{1} \\ {\text{thr}}_{k - 1} & {\text{if}}\quad {\text{thr}}_{k - 1} < I_{G} (r,c) \le {\text{thr}}_{k} ,\quad k = 2,3, \ldots ,nt \\ I_{G} (r,c) & {\text{if}}\quad I_{G} (r,c) > {\text{thr}}_{nt} \end{cases} $$
(21)

where Isg (r, c) and IG (r, c) represent the gray values of the segmented image and the original image at the pixel position (r, c), respectively. As most applications require segmentation into two or more classes, the energy curve is grouped into nt + 1 classes using nt thresholds, where thrk is the k-th threshold value used for the segmentation process. The most difficult problem in thresholding-based image segmentation is obtaining the optimal thresholds that assure the best classification of pixels. In this paper, the concept of the energy curve is introduced into the Masi entropy framework by substituting the energy curve for the histogram in order to enhance the quality of segmentation.
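The mapping of Eq. (21) can be sketched in a few lines of Python. The function name and the use of NumPy are assumptions made for illustration; the rule itself follows the equation as stated, so pixels at or below the first threshold and above the last threshold keep their original gray value.

```python
import numpy as np


def apply_thresholds(gray, thresholds):
    """Apply the multi-level rule of Eq. (21) to a grayscale array."""
    gray = np.asarray(gray)
    thr = np.sort(np.asarray(thresholds))
    out = gray.copy()
    # Pixels in (thr[k-1], thr[k]] are replaced by the lower threshold thr[k-1].
    for k in range(1, len(thr)):
        mask = (gray > thr[k - 1]) & (gray <= thr[k])
        out[mask] = thr[k - 1]
    return out
```

For example, with thresholds 50, 100 and 150, the gray values 60 and 120 are mapped to 50 and 100, while 10 and 200 are left unchanged.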

4.2 Energy curve with Masi entropy method

Masi entropy, as used for image segmentation, is based entirely on the probability distribution of the gray levels, whereas the energy curve uses the spatial content of the image. Because Masi entropy operates on the histogram, the histogram can be directly replaced by the energy curve. The energy value Ex at gray level x, obtained from Eq. (20), generates a probability which is given by the following equation:

$$ {\text{PE}}_{x} = \frac{{E_{x} }}{M \times N}\quad {\text{where}},\quad E_{x} \ge \, 0\quad {\text{and}}\quad \mathop \sum \limits_{x = 1}^{M \times N} E_{x} = 1 $$
(22)

For bi-level thresholding,

$$ w_{0} = \sum\limits_{x = 0}^{\text{th}} {E_{x} } ,\quad w_{1} = \sum\limits_{{x = {\text{th}} + 1}}^{L - 1} {E_{x} } $$
(23)

For multi-level thresholding,

$$ w_{0} = \sum\limits_{x = 0}^{{{\text{th}}_{1} }} {E_{x} } ,\quad w_{1} = \sum\limits_{{x = {\text{th}}_{1} + 1}}^{{{\text{th}}_{2} }} {E_{x} } ,\quad w_{2} = \sum\limits_{{x = {\text{th}}_{2} + 1}}^{{{\text{th}}_{3} }} {E_{x} } ,\quad \ldots ,\quad w_{n} = \sum\limits_{{x = {\text{th}}_{n} + 1}}^{L - 1} {E_{x} } $$
(24)

When these energy-curve-based probability distributions are used in place of the histogram, Eq. (18) changes to Eq. (25) as follows:

$$ E_{r} (I_{r} /{\text{th}}_{1r} ) = E_{r} (C_{0r} /{\text{th}}_{1r} ) \, + E_{r} (C_{1r} /{\text{th}}_{1r} ) $$
(25)
$$ \begin{aligned} & E_{r} (C_{0r} /{\text{th}}_{1r} ) = \frac{1}{1 - r}\log \left[ {1 - (1 - r)\sum\limits_{{x = {\text{th}} + 1}}^{{{\text{th}}_{1r} }} {\left( {\frac{{E_{x} }}{{w_{0r} }}} \right) \log \left( {\frac{{E_{x} }}{{w_{0r} }}} \right)} } \right] \\ & E_{r} (C_{1r} /{\text{th}}_{1r} ) = \frac{1}{1 - r}\log \left[ {1 - (1 - r)\sum\limits_{{x = {\text{th}}_{1r} + 1}}^{L - 1} {\left( {\frac{{E_{x} }}{{w_{1r} }}} \right) \log \left( {\frac{{E_{x} }}{{w_{1r} }}} \right)} } \right] \\ \end{aligned} $$
(26)
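The bi-level objective built from Eqs. (23), (25) and (26) can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the energy curve is normalized by its own sum so that the class probabilities add up to one (an assumption about how Eq. (22) is meant to be applied), and the exhaustive search over th stands in for the optimizer described later.

```python
import numpy as np


def masi_entropy(p, r):
    """Masi entropy of a probability vector p, in the form used in Eq. (26)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # terms with zero probability contribute nothing
    # The argument of the outer logarithm must stay positive for this to be defined.
    return np.log(1.0 - (1.0 - r) * np.sum(p * np.log(p))) / (1.0 - r)


def bilevel_objective(energy, th, r=1.2):
    """Sum of the Masi entropies of the two classes split at threshold th."""
    energy = np.asarray(energy, dtype=float)
    pe = energy / energy.sum()  # assumption: normalize by the total energy
    w0, w1 = pe[:th + 1].sum(), pe[th + 1:].sum()
    if w0 == 0 or w1 == 0:
        return -np.inf  # an empty class cannot be a valid split
    return masi_entropy(pe[:th + 1] / w0, r) + masi_entropy(pe[th + 1:] / w1, r)


def best_threshold(energy, r=1.2):
    """Exhaustive search for the threshold that maximizes the objective."""
    scores = [bilevel_objective(energy, th, r) for th in range(len(energy) - 1)]
    return int(np.argmax(scores))
```

The default r = 1.2 is the value quoted for Masi's entropy in the steps of the proposed method. The exhaustive search is practical only for a single threshold, which is why a meta-heuristic is used for the multi-level case.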

Even though the energy-curve-based Masi entropy gives satisfactory results, its computational cost and time complexity are high. To overcome this issue, a meta-heuristic optimization algorithm called the cuttlefish algorithm (CFA) is used in this paper. The details of the cuttlefish algorithm are explained in the following subsection.

4.3 Cuttlefish algorithm

The CFA [39,40,41,42] is a nature-inspired meta-heuristic optimization algorithm. It is based on the camouflage ability of the cuttlefish, which conceals itself by matching its color to that of its surroundings. Three skin layers are responsible for this camouflage: the chromatophores, iridophores and leucophores. The CFA mimics the mechanism of these three layers to find an optimum solution. The global optimal solution is obtained by two main processes:

  • Reflection, which replicates the phenomenon of light reflection.

  • Visibility, which simulates the visibility of matching patterns.

The new solution (Pnew) is formulated using reflection and visibility and is given by

$$ P_{\text{new}} = V_{\text{Visibility}} + R_{\text{Reflection}} $$
(27)

The main steps involved in CFA are shown in Fig. 2.

Fig. 2 Flowchart of cuttlefish optimization algorithm

Similar to other meta-heuristic optimization algorithms, CFA starts by initializing the population with random solutions. The initial population P (cells) consists of N cells, and each cell has d points. The initial population is generated as

$$ P_{i,j} = {\text{Random}}*(U_{\text{L}} - L_{\text{L}} ) + L_{\text{L}} $$
(28)

where i = 1, 2…N indexes the cells and j = 1, 2…d indexes the points in each cell. Random is a function that generates a random number in the interval [0, 1]. The lower limit (LL) and the upper limit (UL) are selected based on the problem domain. For an eight-bit image, the lower limit is set to 1 and the upper limit to 256.
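A minimal Python sketch of the initialization in Eq. (28) is given below. The function name and the use of NumPy's random generator are assumptions for illustration; the default limits are the eight-bit values quoted above.

```python
import numpy as np


def init_population(n_cells, n_points, lower=1, upper=256, rng=None):
    """Random initial population following Eq. (28): one row per cell."""
    rng = np.random.default_rng() if rng is None else rng
    # rng.random draws uniformly from [0, 1), so every point lies in [lower, upper).
    return rng.random((n_cells, n_points)) * (upper - lower) + lower
```

Each row then holds one candidate set of d thresholds. Rounding and sorting the points before evaluating the fitness is a natural extra step for thresholding, but it is not specified in the text.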

The fitness function is evaluated for the initial population; the best solution is stored in Best, and the average of its points is stored in AVbest. The population is divided into four equal groups, namely G1, G2, G3 and G4. Each group works independently and shares only the best solution with the others. Out of the four groups, two groups (G1 and G4) are dedicated to local search, while the other two (G2 and G3) are dedicated to the global search to find an optimal threshold value. As part of the optimization, the population in each group is updated using the following principles.

Group1 new population: The new population in G1 is generated using Eq. (27). The visibility and reflection values for group1 are calculated as:

$$ R_{{{\text{Reflection}}_{j} }} = R*G1_{i,j} $$
(29)
$$ V_{{{\text{Visibility}}_{j} }} = V*({\text{Best}}_{j} - G1_{i,j} ) $$
(30)

where \( G1_{i,j} \) is the j-th point of the i-th cell in group1 and Bestj is the j-th point of the best solution. R and V represent the reflection degree and visibility degree, respectively, and are defined as

$$ R = R_{\text{rand}} ()*\left( {r1 - r2} \right) + r2 $$
(31)
$$ V = R_{\text{rand}} ()*\left( {v1 - v2} \right) + v2 $$
(32)

The values r1, r2, v1, and v2 are user defined values. In group1, V is set to 1 and the parameters are chosen as r1 = 2 and r2 = − 1 according to [39].

Group2 new population: Similar to group1, the new population in group2 is generated using Eq. (27). The difference is that the reflection is now computed as

$$ R_{{{\text{Reflection}}_{j} }} = R*{\text{Best}}_{j} $$
(33)

and visibility is calculated using Eq. (30). For this group, R value is taken as 1, while the V is computed using Eq. (32) with v1 = 1.5 and v2 = − 1.5 according to [39].

Group3 new population: The reflection is computed using Eq. (33), and visibility for group 3 is calculated as

$$ V_{{{\text{Visibility}}_{j} }} = V*({\text{Best}}_{j} - {\text{AVbest}}_{j} ) $$
(34)

and the population in G3 is updated using Eq. (27). In this group, the parameters v1, v2 are set to (1, − 1) and R value is chosen as 1, according to the conditions given in [39].

Group4 new population: For each cell, the new population in this group is generated using Eq. (28).

In each iteration, for each group, the fitness of the new solution is evaluated as Snew and compared with that of the current solution Si. If Snew is better than Si, then Si is replaced by Snew, and Snew is also compared with Sbest. If Snew is better than Sbest, then the algorithm replaces Sbest by Snew. The algorithm repeats the above process until the stopping criteria are satisfied. The final optimal solution is returned as Sbest.
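The four group updates and the greedy replacement rule can be put together in the following Python sketch. It is a simplified reading of Eqs. (27) to (34), not the reference implementation: the function name is hypothetical, R and V are drawn independently for every point, Eq. (34) is read as V·(Best − AVbest), AVbest is taken as the mean of the points of Best, new solutions are clipped to the search bounds, and the best solution is shared as soon as it improves. All of these choices are assumptions.

```python
import numpy as np


def cfa_maximize(fitness, dim, lower, upper, n_cells=20, n_iter=100, seed=0):
    """Simplified cuttlefish-style search that maximizes `fitness` over [lower, upper]^dim."""
    rng = np.random.default_rng(seed)
    draw = lambda hi, lo: rng.random(dim) * (hi - lo) + lo  # Eqs. (31)-(32)
    pop = rng.random((n_cells, dim)) * (upper - lower) + lower  # Eq. (28)
    fit = np.array([fitness(cell) for cell in pop], dtype=float)
    best, best_fit = pop[fit.argmax()].copy(), fit.max()
    groups = np.array_split(np.arange(n_cells), 4)  # G1 .. G4
    for _ in range(n_iter):
        av_best = best.mean()  # assumption: average of the points of Best
        for g, members in enumerate(groups):
            for i in members:
                if g == 0:    # G1: Eqs. (29)-(30), r1 = 2, r2 = -1, V = 1
                    new = draw(2.0, -1.0) * pop[i] + 1.0 * (best - pop[i])
                elif g == 1:  # G2: Eqs. (33) and (30), R = 1, v1 = 1.5, v2 = -1.5
                    new = 1.0 * best + draw(1.5, -1.5) * (best - pop[i])
                elif g == 2:  # G3: Eqs. (33)-(34), R = 1, v1 = 1, v2 = -1
                    new = 1.0 * best + draw(1.0, -1.0) * (best - av_best)
                else:         # G4: fresh random solution, Eq. (28)
                    new = rng.random(dim) * (upper - lower) + lower
                new = np.clip(new, lower, upper)  # assumption: keep points in range
                new_fit = fitness(new)
                if new_fit > fit[i]:  # greedy replacement of the current solution
                    pop[i], fit[i] = new, new_fit
                    if new_fit > best_fit:
                        best, best_fit = new.copy(), new_fit
    return best, best_fit
```

In the thresholding setting, `fitness` would be the Masi entropy objective evaluated on a candidate threshold vector, with `dim` equal to the number of thresholds.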

The cuttlefish algorithm is a meta-heuristic optimization algorithm inspired by the color-changing behavior of the cuttlefish, and it is used here to find the optimal thresholds. Light reflection and the visibility of matching patterns are the two processes that serve as its search mechanisms, which is the main reason for selecting CFA. The proficiency of CFA has been verified [39,40,41,42] against other well-known nature-inspired optimization techniques previously proposed in the literature, such as DA, LSA, SHO and SCA. Furthermore, the effectiveness of CFA is compared with DA, LSA, SHO and SCA for the image segmentation application. In CFA, the initial population is divided into four groups and a local best solution is calculated in each group. Based on these four best solutions, a new population is generated and processed to find the best solution. This way of generating solutions makes CFA more efficient than the other algorithms.

4.4 Fusion based on local contrast

The primary objective of image enhancement is to increase the perceptual quality of the image for human observers or to better resolve the information present within the image. At the same time, most of the enhancement methods do not completely resolve problems like loss of details and loss of local contrast. One of the solutions for these deficiencies is image fusion based on local contrast. In this case, a new image is formed by combining the good qualities of the original image and image obtained by applying enhancement method. Motivated by this fact, in this paper, the above stated image fusion method based on local contrast [38] is exploited for image segmentation. The outcome of fusion-based image segmentation is much better than that obtained through each of the individual segmentation methods.

Basically, local contrast depends upon the difference of the intensity value in a small local space within an image. Let Pi, i = 1, 2 be two gray-level images which are normalized. Then, the local contrast can be expressed as:

$$ M_{i} (p,q) = \hbox{max} (J_{i} (p,q)) - \hbox{min} (J_{i} (p,q)) $$
(35)

where Ji(p,q) denotes the 3 × 3 local image of Pi centered at the position (p,q), and min(.) and max(.) indicate the minimum and maximum gray values of the local image, respectively. The local contrast map is thus obtained from the difference of intensity values in each of the 3 × 3 local images. There are also other expressions for the local contrast, such as the one used in logarithmic image processing (LIP). The definition in Eq. (35) is consistent with the human point of view and the image formation styles (for further details please refer to [36, 38]).
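Equation (35) can be sketched with NumPy as shown below. The function name and the border handling are assumptions: the image is padded by replicating its edge pixels so that every position has a full 3 × 3 window, which the text does not specify.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_contrast(img):
    """Per-pixel max minus min over the 3 x 3 window centered at each pixel, Eq. (35)."""
    img = np.asarray(img, dtype=float)
    padded = np.pad(img, 1, mode="edge")  # assumption: replicate border pixels
    windows = sliding_window_view(padded, (3, 3))  # shape (M, N, 3, 3)
    return windows.max(axis=(2, 3)) - windows.min(axis=(2, 3))
```

A constant image yields a contrast map of zeros, and an isolated bright pixel raises the contrast of every window that contains it.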

The image fusion technique exploits the local contrast concept. For each point (p, q), let the difference between M2 and M1 be N = M2 − M1. The fusion weight function is then stated as:

$$ W_{f} = \frac{1}{{1 + \text{e}^{{ - p(\hat{N} - q)}} }} $$
(36)

where p and q are two constants of the smooth, increasing sigmoidal function Wf (distinct from the pixel coordinates (p, q) used above). The constant p controls the steepness of Wf around \( \hat{N} = q \), and q sets the position at which Wf takes its mid value of 0.5. In this paper, p = 0.5 and \( q = - \hbox{min} (N)/(\hbox{max} (N) - \hbox{min} (N)) \) are used. The function \( \hat{N} = (N - \hbox{min} (N))/(\hbox{max} (N) - \hbox{min} (N)) \) represents the normalized difference, so that q is the normalized position of N = 0. A sigmoidal rather than a linear fusion weight function is selected because the former better preserves the good qualities of the two images. Thus, the image obtained by fusion can be represented as:

$$ R = \hat{W}_{f} P_{2} + (1 - \hat{W}_{f} )P_{1} $$
(37)

where the normalized fusion weight function is given by \( \hat{W}_{f} = (W_{f} - \hbox{min} (W_{f} ))/(\hbox{max} (W_{f} ) - \hbox{min} (W_{f} )) \). The reason behind employing a normalized fusion weight function is that when the function \( \hat{W}_{f} \to 0 \), the image P1 will be more predominant than P2 in the fused image. In other words, the local contrast of M1 will be more dominant than that of M2 and vice versa when \( \hat{W}_{f} \to 1 \).
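The fusion rule of Eqs. (36) and (37) can be sketched as follows, given two normalized images and their local contrast maps. The function name is hypothetical, the normalized difference is computed as (N − min N)/(max N − min N), and the fallback to a plain average when the two contrast maps differ by a constant is an assumption added to avoid division by zero.

```python
import numpy as np


def fuse(p1, p2, m1, m2, p=0.5):
    """Fuse images p1 and p2 using their local contrast maps m1 and m2."""
    p1, p2 = np.asarray(p1, dtype=float), np.asarray(p2, dtype=float)
    n = np.asarray(m2, dtype=float) - np.asarray(m1, dtype=float)  # N = M2 - M1
    span = n.max() - n.min()
    if span == 0:
        return 0.5 * (p1 + p2)  # assumption: no contrast preference, so average
    n_hat = (n - n.min()) / span  # normalized difference
    q = -n.min() / span  # normalized position of N = 0
    w = 1.0 / (1.0 + np.exp(-p * (n_hat - q)))  # Eq. (36)
    w_hat = (w - w.min()) / (w.max() - w.min())  # normalized fusion weight
    return w_hat * p2 + (1.0 - w_hat) * p1  # Eq. (37)
```

Where the second image has the larger local contrast the weight approaches 1 and the fused pixel follows P2; where the first image dominates the weight approaches 0 and the fused pixel follows P1.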

To improve the segmentation quality, we propose in this paper a novel algorithm based on the local contrast. The local contrast computed using Eq. (35) is equal to the morphological gradient, that is, the difference between the dilation and the erosion of an image, and it measures the contrast intensity within a close neighborhood of the pixel. A close neighborhood is considered because statistical image features are spatially non-stationary and image distortion is also space-variant. Moreover, increasing the neighborhood size increases the computational complexity. Initially, the segmentation is carried out using the Masi entropy-based fitness function to obtain the thresholded image. From the thresholded image, we separate the value channel (in HSV color space) and fuse it with the value channel of the input image based on the similarity distance metric. In the HSV color space, the hue and saturation (H and S) carry the chromaticity information of the image. The fusion method is used to improve the visual quality of the thresholded regions.

4.4.1 Fusion for color image thresholding

The above fusion method is for grayscale images, and it can also be implemented for color images. For color image fusion, the first step is to convert red–green–blue (RGB) color image into hue-saturation-value (HSV) color space and only the value components should be processed in the fusion method. Therefore, the final image is a combination of the fused value component, and the hue and saturation components of the original image. Clearly, for color image segmentation, the original image is fused with the segmented image in place of an enhanced image.
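The recombination step for color images can be sketched without an explicit color-space conversion. For a fixed hue and saturation the RGB components are proportional to the value V = max(R, G, B), so replacing V while keeping H and S amounts to rescaling each RGB pixel. The function name is hypothetical, the images are assumed to be floating-point arrays in [0, 1], and black pixels (where hue and saturation are undefined) are mapped to gray; these are assumptions for illustration.

```python
import numpy as np


def replace_value_channel(rgb, new_v):
    """Return an RGB image with the hue and saturation of `rgb` and value `new_v`."""
    rgb = np.asarray(rgb, dtype=float)
    new_v = np.asarray(new_v, dtype=float)
    old_v = rgb.max(axis=-1)  # HSV value channel of the original image
    # Scale factor V_new / V_old, left at zero where the original pixel is black.
    scale = np.divide(new_v, old_v, out=np.zeros_like(old_v), where=old_v > 0)
    out = rgb * scale[..., None]
    black = old_v == 0
    out[black] = new_v[black][:, None]  # black pixels have S = 0, so they become gray
    return out
```

In the pipeline described above, `new_v` would be the fused value component, and `rgb` the original color image supplying the hue and saturation.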

4.5 Proposed fusion-Masi energy curve model

In this section, we describe the proposed novel local-contrast-based fusion method [36] for segmentation exploiting Masi energy curve model along with CFA. In Masi energy curve method [33], an image is divided into a finite number of small classes with the help of threshold values. Furthermore, these thresholds are calculated using the energy curve instead of the histogram. As the number of thresholds increases, the complexity of the problem also increases due to rising modality and the restrictions of the search space. The use of the CFA minimizes the said complexity. In order to further improve the quality of the segmented images, the concept of fusion based on local contrast has been used in this study. Consequently, the proposed fusion-based Masi energy curve approach dominates energy-based Masi entropy.

To enhance the quality of the segmented image, a fusion based on local contrast [36] was introduced in color image segmentation. In this process, fusion of the original image and the segmented image takes place to give new hybrid segmented image. Consequently, this new image will have characteristics from both original image and segmented image. Experimentally, it also is proven that the fidelity parameters of the new fusion image dominate the segmented image. At the same time, the precision in the details of the segmented images is increased. After obtaining the energy-curve-based Masi entropy with CFA algorithm, the concept of fusion based on local contrast has been added to the aforementioned approach to enhance the quality of the segmented image.

The idea of the energy curve was used for image segmentation in [24, 25]. For decades, the histogram has been the dominant and simplest option for thresholding-based image segmentation. However, the computation of the histogram does not include the spatial relationship among the surrounding pixels. In this paper, the authors aim to develop a novel concept for image thresholding that opens a new direction of research in color image segmentation. In a fundamental departure from the current practice of histogram-based image segmentation, the proposed approach uses the energy curve concept. It is worth mentioning that the energy curve utilizes the spatial contextual information of the neighborhood pixels, which is lacking in the general histogram. To achieve this, we retained the essential features of the histogram while also considering the spatial relationships among the pixels.

The steps involved in the proposed segmentation technique are described as follows:

Step 1 An image of dimension M × N that is to be segmented is taken as an input and saved into an image array I. The maximum gray-level value is calculated, along with the normalized gray-level probability distribution H = {h0, h1…hL−1} for the image. Next, the energy Ex is calculated using Eq. (20), while Eq. (1) is replaced by Eq. (22) since the energy curve is used instead of the histogram.

Step 2 The image is initially assumed to be completely homogeneous, and the threshold is set to the minimum possible value. The maximum entropy Emax is then calculated for the image. The entropic parameters q, α and r for Masi’s entropy are set to 0.8, 0.8 and 1.2, respectively, to obtain optimum results.

Step 3 Assuming the threshold to be th, the intensity level values of image I are divided into two classes, one being C0 and the other being C1 where C0 = {0, 1, 2…th} and C1 = {th + 1, th + 2…L − 1}. Next, the dimension of the population in CFA algorithm, number of thresholds, the maximum number of iterations and boundary points are initialized.

Step 4 Using Eq. (23), the prior probabilities w0 and w1 corresponding to C0 and C1 are calculated and the new set of probability distributions DC0 and DC1 are derived using Eq. (6).

Step 5 The entropy for Masi-based methods is maximized using Eqs. (8), (10), (12) and (14), respectively. While maximizing the entropy for each of the cases, the optimal threshold value th is assumed to lie within the domain G = {0, 1…L − 1}.

Step 6 The segmentation of image is carried out through thresholding by the optimum threshold value th.

Step 7 The obtained two new thresholds, th1l lying in G1l = {0, 1…th} and th1r in G1r = {th + 1, th + 2…L − 1} are the thresholds required for bi-level thresholding.

Step 8 The maximization of entropies for each obtained segment is continued until the desired level of thresholds is determined.

Step 9 The thresholds are determined in the following order: th, th1l, th1r, th2l, th2r, th3l, th3r, th4l, th4r, th5l, th5r, th6l, and th6r using the range of intensity values as {0, 1…L − 1}, {0, 1…th}, {th + 1, th + 2…L − 1}, {th1l + 1, th1l + 2…th}, {th + 1, th + 2…th1r}, {0, 1…th1l}, {th1r + 1, th1r + 2…L − 1}, {th2l + 1, th2l + 2…th}, {th + 1, th + 2…th2r}, {th1l + 1, th1l + 2…th2l}, {th2r + 1, th2r + 2…th1r}, {th3l + 1, th3l + 2…th1l}, {th1r + 1, th1r + 2…th3r}, respectively.

Step 10 The levels of thresholds are selected symmetrically to get optimal results. Next, we normalize the segmented and original images and calculate the local contrast using Eq. (35).

Step 11 Weight function is calculated using Eq. (36), and the final fused image is obtained using Eq. (37). The proposed segmentation algorithm is summarized pictorially in the flowchart shown in Fig. 3.
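The repeated splitting described in Steps 5 to 9 can be illustrated with a short recursive sketch in Python. It is a simplification under stated assumptions: the function name and the `score(lo, hi, th)` callback are hypothetical, every range is split to the same depth, and each split is found by exhaustive search rather than by CFA.

```python
def recursive_thresholds(score, lo, hi, depth):
    """Split the gray-level range [lo, hi] recursively, returning sorted thresholds.

    score(lo, hi, th) is the objective of splitting [lo, hi] into
    [lo, th] and [th + 1, hi]; larger values are better.
    """
    if depth == 0 or hi - lo < 1:
        return []  # nothing left to split
    # Best split of the current range, so that both classes are non-empty.
    th = max(range(lo, hi), key=lambda t: score(lo, hi, t))
    left = recursive_thresholds(score, lo, th, depth - 1)
    right = recursive_thresholds(score, th + 1, hi, depth - 1)
    return left + [th] + right
```

With an objective that prefers the midpoint of each range, two levels of recursion over the range 0 to 7 give the thresholds 1, 3 and 5. In the proposed method the objective would be the Masi entropy of the two classes computed from the energy curve.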

Fig. 3 Flowchart of proposed method

5 Experimental results and discussion

In this section, comprehensive experimental results including performance assessment tables and illustrative examples are presented to demonstrate the effectiveness of the proposed scheme over some of the dominant existing approaches.

5.1 Experimental setup

In this paper, simulations are carried out with 3, 5, 8 and 10 thresholds, and all of them have been executed using MATLAB R2017a on a Windows machine with an Intel® Core i7 CPU @ 3.6 GHz and 8 GB of RAM. The proposed fusion-based multi-level thresholding employing local contrast is evaluated against different algorithms, namely LSA, SHO, SCA and DA. It is also compared with the Masi energy curve method using the same search algorithms. Almost all the results of the proposed method show better segmentation quality and consistency. The number of iterations for all algorithms is set to 500. The image fidelity parameters mean error (ME), mean squared error (MSE), peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), feature similarity index measure (FSIM) and entropy are used as metrics for comparison in this paper. The formulae for all these parameters are listed in Table 1, where I and I` represent the original image and the segmented image, respectively. M and N denote the size (rows and columns) of the image.
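As a hedged illustration of two of these metrics, the sketch below computes MSE and PSNR using their standard definitions for eight-bit images. The formulae of Table 1 are not reproduced in this section, so the exact form used by the authors is assumed rather than quoted; the function names are hypothetical.

```python
import numpy as np


def mse(original, segmented):
    """Mean squared error between two images of equal size."""
    a = np.asarray(original, dtype=float)
    b = np.asarray(segmented, dtype=float)
    return float(np.mean((a - b) ** 2))


def psnr(original, segmented, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(original, segmented)
    return float("inf") if err == 0 else float(10.0 * np.log10(peak ** 2 / err))
```

A lower MSE and a higher PSNR indicate that the segmented image stays closer to the original.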

Table 1 Fidelity parameters considered to test the efficiency of proposed method with other algorithms

5.2 Image data set

In this paper, the proposed fusion-based Masi energy curve model is tested with a set of benchmark images from the Kodak dataset [66] and the Berkeley dataset [67]. This set contains ten complex color images; all images are in JPEG format with a size of 256 × 256 pixels and are shown in Fig. 4a–k. Color image processing involves complex problems, which is why most of the existing thresholding methods are evaluated on single-frame gray-level images. Color images contain highly dense information, which leads to uncertainty and inaccuracy in the segmentation task. Therefore, obtaining greater accuracy during the segmentation of such images is a challenging problem.

Fig. 4 a–k Original test images from the Kodak and Berkeley datasets [66, 67]

5.3 Performance evaluation and comparison

In this paper, the Masi energy curve and the fusion-based Masi energy curve are compared using the search algorithms LSA, SHO, SCA, DA and CFA. The segmented images obtained using the proposed approach are better than those obtained using the existing methods in all of the explored fidelity parameters, i.e., ME, MSE, PSNR, FSIM, SSIM and entropy. Table 1 lists the six assessment parameters studied in this paper. Table 2 gives the threshold values of the Masi energy curve model, and Table 3 reports the comparison of ME and entropy values of the Masi energy curve model with those of the proposed fusion method. The results of the fusion-based Masi energy curve model surpass those of the Masi energy curve model. Furthermore, the proposed method shows higher entropy. It is well known that higher entropy for an image indicates that more information is contained in the segmented image.

Table 2 Comparison of thresholds values obtained by using different methods for each sample images
Table 3 Comparison of entropy and ME computed by different algorithms using Masi entropy

The obtained experimental results suggest that the fusion-based segmentation method has effective features with better contrast. By comparing all these results, the quality of the proposed fusion-based segmented images is observed to be superior to those obtained through the previous individual segmented images. This new method exploits the results of the individual segmentation methods and enhances them with the help of fusion. Hence, the segmented outcomes are improved, and the contrast of the image is enhanced. With the help of fusion, accuracy in the details of the segmented images is also increased. Thus, the results of the proposed method, for each of the sample images, validate that the CFA algorithm is comprehensively superior to the other included schemes in terms of efficiency, solution quality and robustness as reported by the performance assessment indices in each case.

Table 4 shows the comparison of MSE and PSNR values, where the proposed fusion method dominates energy-curve-based Masi entropy method. The experimental results exhibit that fusion-based multi-level thresholding of color image is a feasible solution to the insufficiencies of existing image segmentation techniques. The inaccuracies like the loss of features, loss of local contrast and gray-level destruction and abrupt intensity changes can be successfully minimized by fusing a thresholded result with the original image. Moreover, the fusion-Masi energy curve with CFA approach can significantly decrease the number of function evaluations preserving the good search abilities of an optimization algorithm.

Table 4 Comparison of MSE and PSNR computed by different algorithms using Masi entropy

Out of all the values of SSIM and FSIM for segmented images shown in Table 5, the values obtained using Masi energy curve and fusion-based Masi energy curve are comparable to each other. The best values of SSIM and FSIM are obtained by the proposed fusion-Masi energy curve along with the CFA method, where it stands at the top ahead of energy-based Masi entropy for almost all the images. The difference in the entropic value of the segmented image from the original image gives the amount of information lost during the segmentation process. In image segmentation, preservation of information is very crucial which implies that the original and segmented entropic values should be nearly equal. Tables 2 and 3 show the comparison of loss of information incurred when the proposed methodology is implemented using energy-based Masi entropy and then coupled with fusion method. For most of the cases, a minimal loss is noted when the test images are segmented using the proposed approach.

Table 5 Comparison of SSIM and FSIM computed by different algorithms using Masi entropy

From the illustrations presented in Figs. 5, 6, 7, 8 and 9, the qualitative superiority of the proposed method over the other methods can be easily perceived. The segmentation outputs depend strongly on the choice of objective function. In the context of image segmentation, the objective function is used to determine the threshold values efficiently without an exhaustive search. Therefore, the results of the optimization algorithms vary with the objective function that is used. Figures 5, 6, 7, 8 and 9 also show the segmented results of each test image at 3-level, 5-level, 8-level and 10-level thresholding. From those images, it can be seen that the proposed method surpasses the energy-based approach exploiting Masi entropy. From Tables 3, 4 and 5, it can be clearly identified that the proposed segmentation technique outperforms the other methods. Based on the quantitative and qualitative assessment presented in this work, the proposed fusion-based Masi energy curve with CFA leads to better performance, as it offers more reliable and efficient thresholded results. A comparison of the CPU time consumed by the explored and proposed methods for each of the sample images is given in Table 6.

Fig. 5 Comparison of segmentation results obtained by using different methods for each sample images

Fig. 6 Comparison of segmentation results obtained by using different methods for each sample images

Fig. 7 Comparison of segmentation results obtained by using different methods for each sample images

Fig. 8 Comparison of segmentation results obtained by using different methods for each sample images

Fig. 9 Comparison of segmentation results obtained by using different methods for each sample images

Table 6 Comparison of CPU time obtained by using different methods for each sample images

Proposed Masi-Energy-Fusion using CFA (MASI-ENG-F-CFA) scheme has been assessed using benchmarked images, and performance is compared with well-known recently developed meta-heuristic optimization methods like SHO, SCA, DA and LSA. Figures 5, 6, 7, 8 and 9 show the superior visual quality and better contrast of the segmented images obtained using the proposed approach. All the benchmark images are well segmented using the best threshold values obtained through Masi-Energy-Fusion-CFA. The results presented in this paper demonstrate the superior search capability of CFA in addition to clear contrast rate due to the fusion process. The same has been experimentally validated visually as well as numerically using well-established evaluation parameters for image thresholding. The presented qualitative as well as quantitative experimental results, given in Tables 3, 4 and 5, display the superiority of proposed approach over SHO, SCA, DA and LSA in terms of quality and consistency. The best result in each case is highlighted through bold faces in Tables 3, 4 and 5. Due to the fusion process, better contrast is achieved in the thresholded images. This, in turn, can help in distinguishing edges and other objects after segmentation.

In this section, the performance assessment of Masi entropy for multi-level color image thresholding is given. The threshold levels and corresponding numerical assessment metrics using Masi entropy using each assessed algorithms are reported in Tables 2, 3, 4, 5 and 6 for m = 3, 5, 8, 10. The ME, MSE, PSNR, FSIM, SSIM and entropy values computed with the proposed (MASI-ENG-F-CFA) based method are listed in Tables 2, 3, 4, 5 and 6 and compared with the outcomes of SHO, SCA, DA and LSA, respectively, for fusion- and context-based scheme. The energy curve follows the properties of histogram, i.e., it also has the valleys and peaks. The proposed scheme is aimed to obtain segmentation for color images. Masi entropy is a well-known function for multi-level threshold selection, which considers the energy of each channel and performs the multi-level thresholding by maximizing the objective function. The final optimal threshold values are drawn by taking the average of the two threshold values. From Figs. 5, 6, 7, 8 and 9, it is clear that the proposed method (MASI-ENG-F-CFA) produces the superior segmentation results as compared to the SHO-, SCA-, DA-, and LSA-based segmentation, respectively, in almost every case. Moreover, the proposed CFA-based fusion approach offers faster segmentation outcome among energy-based SHO, SCA, DA, and LSA methods. Therefore, it may be summarized that the proposed CFA based approach indicates less CPU time in case of contextually fused multi-level thresholding results against SHO, SCA, DA, and LSA algorithms.

The search for optimal threshold points in multi-level color image segmentation can be regarded as a constrained optimization task. The appropriateness of the threshold values determines the accuracy of the image segmentation. Accordingly, the quality of multi-level thresholding-based segmentation depends on the performance of the meta-heuristic algorithm. Therefore, the presented CFA can play an important role in obtaining fast multi-level thresholding-based image segmentation.

6 Conclusions

In this paper, a new local-contrast-based fusion method for color image multi-level thresholding is presented. The proposed method is inspired by the success of image fusion in the context of image enhancement. Motivated by that, the fusion strategy is exploited for the first time in the domain of image segmentation, wherein the segmented result is fused with its original image in order to obtain superior results. The fusion-based multi-level thresholding approach is observed to preserve more information in the segmented images. Experimental outcomes indicate that fusion-based multi-level thresholding is simple and yields better solutions for color image segmentation than most of the dominant existing techniques. The effectiveness of the proposed approach is evaluated using the well-known metrics ME, entropy, MSE, PSNR, SSIM and FSIM. Quantitative results in terms of these metrics demonstrate that the fusion-based Masi energy curve with CFA produces high-quality segmented color images. In addition, edge detection with the proposed segmentation technique is more precise, which helps in extracting more information about the hidden objects in the original image. Moreover, the qualitative evaluation of the thresholded images shows well-delimited regions that are easier to discriminate than those produced by existing dominant approaches.