Elsevier

Information Sciences

Volume 608, August 2022, Pages 670-695

Nature inspired algorithms with randomized hypercomputational perspective

https://doi.org/10.1016/j.ins.2022.05.020

Highlights

  • Two closely related nature inspired algorithms are evaluated extensively on a function for which nothing can be proved.

  • Results show that the two algorithms behave almost identically.

  • The case of randomized hypercomputation is debunked.

  • Three theses related to nature inspired algorithms and hypercomputation are proposed.

  • Contrary to common belief, these theses are meant to be proved (or disproved), not merely stated as theses.

Abstract

The intention of this paper is to empirically compare two closely related (and considered novel, metaphor-based) nature inspired algorithms and then to use these comparisons, together with other ideas, to shed some light on randomized hypercomputation. The Bat algorithm (BA) and the Novel Bat algorithm (NBA) are two recently created nature inspired algorithms, each claimed to be novel because they use different randomizations, but how novel or similar they really are remains hazy, especially from a practical perspective. Therefore, here I compare both of them on a data dependent, unexplainable real world classification function. In particular, I create two classification machines, one derived from BA and another derived from NBA, using the weighted linear loss twin support vector machine, and compare them with other trailblazing classifiers on the real world UCI machine learning data sets. These machines are also compared especially with each other. The results show that the formulated machines perform better than the other classifiers in more than 80% of comparisons but are extraordinarily similar to each other, agreeing to as many as four decimal places in almost 100% of cases. Moreover, these optimization algorithms form a perfect example of partially random machines, which are claimed to be hypercomputational. However, here it is shown that unless one has access to an uncomputable input and uses it intelligently, one cannot hypercompute.

Introduction

The No-Free-Lunch theorems are foundational results in supervised machine learning and optimization theory. The No-Free-Lunch theorems for supervised machine learning (NFLM) [1], [2], [3], [4] essentially say that all supervised machine learning algorithms perform equally when averaged over all datasets, while the No-Free-Lunch theorems for optimization (NFLO) say that all optimization algorithms perform equally when averaged over all (objective) functions [1], [5]. These theorems come with a few conditions, which are generally satisfied with respect to the Turing machine model. Despite these two foundational results, experts remain motivated to produce new optimization and supervised machine learning algorithms for which it is not possible to prove any proposed property, e.g. the algorithm's generalization power [1], [2], [3], [4], [5].
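As an illustration of what NFLO asserts (a toy construction of mine, not taken from the cited works), the following sketch enumerates every objective function on a tiny finite domain and confirms that two fixed black-box search orders produce identical best-so-far statistics once performance is averaged over all functions; every name in it is illustrative.

```python
# Toy illustration of the NFLO averaging argument (illustrative only):
# enumerate every function f: {0,1,2,3} -> {0,1,2} and compare two fixed,
# non-repeating black-box search orders by their best-so-far histograms.
from itertools import product
from collections import Counter

domain = range(4)                      # search space X
codomain = range(3)                    # objective values Y
budget = 2                             # number of evaluations m

def best_found(order, f):
    """Best value found after `budget` evaluations following `order`."""
    return max(f[x] for x in order[:budget])

order_a = [0, 1, 2, 3]                 # algorithm A: left-to-right sweep
order_b = [3, 2, 1, 0]                 # algorithm B: right-to-left sweep

hist_a, hist_b = Counter(), Counter()
for values in product(codomain, repeat=len(domain)):   # all |Y|^|X| functions
    f = dict(zip(domain, values))
    hist_a[best_found(order_a, f)] += 1
    hist_b[best_found(order_b, f)] += 1

# Averaged over all functions, the two search orders are indistinguishable.
print(hist_a == hist_b)                # True
```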

Motivated by this observation, I first evaluate two of the most “celebrated” and supposedly “novel” optimization algorithms, the bat algorithm (BA) [6] and the novel bat algorithm (NBA) [7], extensively (and especially) against each other. The special function on which they are tested is a data dependent, unexplainable [44], binary classification function from supervised machine learning, the “weighted linear loss twin support vector machine” (WLTSVM) [8]. Moreover, it is interesting to note that the problem of classification in general is NP-complete. The role model of many learning algorithms, the “support vector machine” (SVM) [9], [10], decides the class of an unknown data point based on support vectors (with respect to the training data), which are used to create two parallel supporting hyperplanes, while the decision hyperplane lies midway between the two supporting hyperplanes. The class decision then depends on the proximity of the unknown data point to the supporting hyperplanes (or on which side of the decision hyperplane, with respect to the supporting hyperplanes, the data point lies). In WLTSVM, the condition that the two supporting hyperplanes must be parallel is relaxed.
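For orientation, here is a minimal sketch of the twin-SVM-style decision rule just described: a point is assigned to the class whose (nonparallel) hyperplane lies closer. The weights and biases below are illustrative placeholders, not parameters learned by the weighted linear loss training step of WLTSVM.

```python
import numpy as np

def twin_svm_predict(X, w1, b1, w2, b2):
    """Assign each row of X to the class whose (nonparallel) hyperplane
    w.x + b = 0 lies closer, as in twin-SVM-style classifiers."""
    d1 = np.abs(X @ w1 + b1) / np.linalg.norm(w1)   # distance to class +1 plane
    d2 = np.abs(X @ w2 + b2) / np.linalg.norm(w2)   # distance to class -1 plane
    return np.where(d1 <= d2, +1, -1)

# Illustrative planes and points (not learned from data here).
w1, b1 = np.array([1.0, -1.0]), 0.0
w2, b2 = np.array([1.0,  1.0]), -2.0
X = np.array([[0.5, 0.4], [1.5, 0.2]])
print(twin_svm_predict(X, w1, b1, w2, b2))
```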

Here, it is interesting to note that optimization experts use randomization to solve a wide variety of problems, e.g. scheduling [46], routing [45], instance selection [47] and information retrieval [48], simply because optimization is omnipresent. Still, there are many questions in the minds of experts that need to be addressed. Accordingly, WLTSVM is combined here with BA and NBA to produce the “weighted linear loss bat twin support vector machine” (WLBTSVM) and the “weighted linear loss novel-bat twin support vector machine” (WLBNTSVM), respectively; a hedged sketch of this parameter-tuning setup appears after the list below. These two are tested extensively on the real world UCI machine learning datasets. It is also worth noting that the parameter optimization problem is NP-complete [1]. In this regard, this paper answers the following:

  • (a)

    What can be said (theoretically and especially empirically) about NFLO-following (e.g. grid search) and NFLO-violating (e.g. BA/NBA) algorithms, for a function (here WLTSVM) that is completely “blind”, i.e. for which nothing is known, when they are compared collectively?

  • (b)

    What can be said (theoretically and especially empirically) about NFLO-violating algorithms, for a function that is completely “blind”, when they are compared among themselves?

  • (c)

    Given the set of all algorithms, can NFLO induce some structure on them, and in what sense?

  • (d)

    Are there any (provable) differences between the BA and NBA?

  • (e)

    If some condition(s) of NFLO are violated, then what can be projected for algorithms that violate the same condition(s)?

  • (f)

    Are all nature inspired algorithms the same?
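To make the construction of WLBTSVM/WLBNTSVM concrete, the sketch promised above shows, under assumed names, how a bat-algorithm-style search can wrap a black-box objective such as the cross-validation error of a WLTSVM-type classifier. The function `cv_error` is a stand-in for the actual WLTSVM training/validation routine, the loudness and pulse rate are kept fixed for brevity (the standard BA of [6] adapts them per bat), and nothing here is the paper’s actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def cv_error(params):
    """Placeholder objective: in the paper's setting this would be the
    cross-validation error of WLTSVM at the given hyperparameters."""
    c1, c2 = params
    return (np.log10(c1) - 1.0) ** 2 + (np.log10(c2) + 0.5) ** 2  # toy surrogate

def bat_search(objective, lo, hi, n_bats=10, n_iter=50,
               f_min=0.0, f_max=2.0, loudness=0.9, pulse_rate=0.5):
    """Simplified bat-algorithm search (fixed loudness/pulse rate) over a box."""
    dim = len(lo)
    x = rng.uniform(lo, hi, size=(n_bats, dim))      # bat positions
    v = np.zeros((n_bats, dim))                      # bat velocities
    fit = np.array([objective(p) for p in x])
    best = x[fit.argmin()].copy()

    for _ in range(n_iter):
        for i in range(n_bats):
            freq = f_min + (f_max - f_min) * rng.random()
            v[i] += (x[i] - best) * freq
            cand = np.clip(x[i] + v[i], lo, hi)
            if rng.random() > pulse_rate:             # local walk around the best bat
                cand = np.clip(best + 0.01 * rng.normal(size=dim) * (hi - lo), lo, hi)
            f_cand = objective(cand)
            if f_cand < fit[i] and rng.random() < loudness:
                x[i], fit[i] = cand, f_cand
            if f_cand < objective(best):
                best = cand.copy()
    return best, objective(best)

# Search, e.g., the two penalty parameters of a WLTSVM-type classifier.
best_params, best_err = bat_search(cv_error, lo=np.array([1e-3, 1e-3]),
                                   hi=np.array([1e3, 1e3]))
print(best_params, best_err)
```

The same wrapper applies unchanged to NBA: only the randomized update rules inside the loop differ, which is precisely the difference this paper puts to the test.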

Moreover, and apart from all this, BA and NBA are good examples of partially random machines (PRMs) [12] because of the randomization in them, and it has been postulated that such PRMs are hypercomputational in some cases. Here, hypercomputation is the extension of the concept of computation. Computation is defined by Turing Machines (TMs), which are nothing more than sets of well defined instructions. In this sense, hypercomputation is the study of machines that can go beyond the power of conventional TMs [12], [13], [14], [15], [16]. Many forms of hypercomputation have been proposed in the literature, e.g.:

  • (i)

    Relativistic hypercomputation [16], [17], [18], [19]

  • (ii)

    Biological hypercomputation [20]

  • (iii)

    Randomized hypercomputation [12]

  • (iv)

    Quantum hypercomputation [21], [22], [23] etc.

That said, it is worth pointing out that the role model of hypercomputation is Turing’s O-machine (O for Oracle), and the concept of hypercomputation has never been completely proved or refuted, although there are various arguments for and against it [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26]. The setting of this paper provides an excellent opportunity to gather some empirical evidence against randomized hypercomputation. Randomized hypercomputation may also be called evolutionary hypercomputation, as evolution is its best possible real world candidate. In this regard, this paper answers the following:

  • (a)

    How does randomization create an opportunity for hypercomputation? Is it harnessable? If not, what does it lack?

  • (b)

    Discrete physical systems are directed by the laws of nature; can they hypercompute?

  • (c)

    Can some Markov process hypercompute?

  • (d)

    Empirically what can be said about the Church-Turing thesis?

  • (e)

    What do the results of this paper say about all of these issues?

  • (f)

    Are nature inspired algorithms hypercomputational?

The plan of the presentation is as follows: Section 2 presents the background, from WLTSVM, BA, NBA, NFLO and NFLM to hypercomputation. Section 3 presents the construction of WLBTSVM and WLBNTSVM, while Section 4 (and APPENDIX-A) describes the results of the conducted experiments. Section 5 introduces the NFLO hierarchy and compares BA and NBA in light of the available results (and the theory available in APPENDIX-B). Section 6 then reaches a decision on randomized hypercomputation, while Section 7 concludes the paper with future work.

Section snippets

Background

Since this paper is written for experts, only the basic material is introduced here, avoiding familiar concepts such as SVM, starting with WLTSVM and continuing with BA, NBA, NFLO and NFLM. Computation and hypercomputation are also introduced, and all of this opens the door for conclusive deductions after the experiments. Then:

The binary classification problem asks one to find the class of a data point (patterns or objects), while providing one with two kinds (or classes) of objects for

Weighted Linear loss Bat Twin Support Vector Machine and Weighted Linear loss Novel Bat Twin Support Vector Machine

Now, it is time to create an unorthodox classification function that is data dependent, unexplainable and NP-complete (even NP-hard). This means that any algorithm is completely “blind” to it, i.e. nothing is known or provable about it. Such a function will expose the novelty claim of BA and NBA. Therefore, I create two classifiers, one with the help of BA, the “Weighted Linear loss Bat Twin Support Vector Machine (WLBTSVM)”, and another driven by NBA, the “Weighted Linear

Implementations and experiments

I describe the implementations and experimental details of WLBTSVM, WLBNTSVM and the compared classifiers here. For clarity, I break all of the details into two experimental sets, experiment set-I (E1) and experiment set-II (E2). E1 describes the details of the compared classifiers and the results of comparisons with WLBTSVM and WLBNTSVM in general, i.e. on generally used datasets. E2 describes the details of the compared classifiers and the results of comparisons with WLBTSVM and WLBNTSVM in particular, i.e. on

NFLO Hierarchy and BA vs. NBA

Extensive results can be inferred from the experiments performed in this paper, which can answer many questions an expert may have regarding the NFL theorems in general and BA and NBA in particular. I answer the most important of them below and also introduce the hierarchy of algorithms induced by NFLO. Sufficient material is provided (in APPENDIX A) for the reader to answer other NFL related questions as well. Then:

(a) From Table 2, Table 3, Table 4, Table 5 it can be seen clearly

Randomized hypercomputation and NIAs

Randomization is the cause and support of NIA1st and NIA2nd, and is also the basis of an extraordinarily different claim, namely that randomized (evolutionary) computation actually is randomized hypercomputation. This section reaches a decision about this claim, with an interesting thesis, NIA3rd, emerging as its decision. NIAs lack a mathematical theory but are bounded by physical laws (see Fig. 1), and therefore what follows depends heavily on the empirical results of this paper, laws of

Conclusions

This paper evaluates two closely related NIAs, BA and NBA, empirically on a machine learning function, WLTSVM. Results are recorded for the UCI datasets by performing standard 10-fold cross validation and comparisons with trailblazing classifiers. In one line, it can be concluded that:

With respect to randomization, there is no difference between BA and NBA on a completely blind (or “black box”) function.

Moreover, answering the questions raised at the end of Section 2.2.2: in its raw form, i.e.

Declaration of Competing Interest

The author declares that he has no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

I am highly thankful to my parents, to the journal’s editorial team for their efficiency, and to the anonymous reviewers whose vital comments pushed me to create foundational material. I find myself unable to adequately thank the grace of the supreme power who is beyond anyone’s comprehension. Every effort has been made to make this research error free, readable and enjoyable; any remaining mistakes are mine.

References (49)

  • B. Richhariya et al., A reduced universum twin support vector machine for class imbalance learning, Pattern Recogn. (2020)
  • B. Richhariya et al., A robust fuzzy least squares twin support vector machine for class imbalance learning, Appl. Soft Comput. (2018)
  • S. Chen et al., Global convergence analysis of the bat algorithm using a Markovian framework and dynamical system theory, Expert Syst. Appl. (2018)
  • M. Kordos et al., Fuzzy clustering decomposition of genetic algorithm-based instance selection for regression problems, Inf. Sci. (2022)
  • C. Cobos et al., Clustering of web search results based on the cuckoo search algorithm and balanced Bayesian information criterion, Inf. Sci. (2014)
  • A. Sharma, Stochastic nonparallel hyperplane support vector machine for binary classification problems and no-free-lunch theorems, Evol. Intel. (2022)
  • D.H. Wolpert, The supervised learning no-free-lunch theorems
  • D.H. Wolpert, The lack of a priori distinctions between learning algorithms, Neural Comput. (1996)
  • D.H. Wolpert, The existence of a priori distinctions between learning algorithms, Neural Comput. (1996)
  • D.H. Wolpert et al., No free lunch theorems for optimization, IEEE Trans. Evol. Comput. (1997)
  • X.S. Yang, A new metaheuristic bat-inspired algorithm
  • B. Schölkopf et al., Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond (2002)
  • S.S. Haykin, Neural Networks and Learning Machines (2009)
  • C.C. Aggarwal, Data Mining: The Textbook (2015)