Logarithmic simulated annealing for X-ray diagnosis
Introduction
Since the seminal paper by Asada et al. [5], there has been rapidly growing interest in new, unconventional types of medical knowledge-based systems that are designed as artificial neural networks and trained by examples ("positive" and "negative") related to a specific diagnostic problem. So far, research has concentrated on digital X-ray-based medical diagnosis [13], [15], [22], [33], [34], [35], although there are applications in other medical branches, for instance in electrocardiographic measurement and clinical laboratories; see [12], [24].
In [23], the detection of microcalcifications by neural networks was studied. After training on a total of almost 3000 examples, a classification rate of ≈88% was achieved on several hundred test examples.
The paper [30] introduces the assignment of fractal dimensions to tumour structures. The fractal dimensions are assigned to contours which have been extracted by commonly used filtering operations. In fact, these contours represent polygonal structures within a binary image. For example, the fractal dimensions D1=1.13 and D2=1.40 are assigned to the boundary and the interior, respectively, of a glioblastoma.
A high classification rate of nearly 98% is reported in [26], where the Wisconsin breast cancer diagnosis (WBCD) database of 683 cases is taken for learning and testing. The approach is based on feature extraction from image data and uses nine visually assessed characteristics for learning and testing. Among the characteristics are the uniformity of cell size, the uniformity of cell shape, and the clump thickness.
In the present paper, we utilize an extension of the Perceptron algorithm by a simulated annealing-based search strategy [11], [21] for the automated detection of focal liver tumours. The only input to the algorithm is the image data, without any preprocessing. Since focal liver tumour detection, unlike the detection of microcalcifications [16], [19], [23], [26], [28], is not part of screening procedures, a certain effort is required to collect the image material. To our knowledge, results on neural network applications to focal liver tumour detection are not available in the literature; therefore, we could not include comparisons to related, previous work in our paper.
During the last decade, research on the classical Perceptron algorithm has been revitalized by a number of papers; see, e.g. [6], [7], [9], [14], [17], [31]. Research on this type of classification algorithm has a long history and goes along with the effort to find fast and reliable algorithms that solve systems of linear inequalities w·xi>0, i=1,…,m. Agmon [2] proposed in 1954 a simple iteration procedure that starts with an arbitrary initial vector w0. When wk does not represent a solution of the system, then wk+1 is taken as the orthogonal projection of wk onto the farthest hyperplane corresponding to a violated linear inequality: wk+1=wk−((wk·xj)/‖xj‖²)xj, where xj violates the system and maximizes the distance |wk·xi|/‖xi‖ among the violated inequalities.
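The projection step of the relaxation method can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy system and the over-relaxation parameter λ=1.5 are our choices (λ=1 projects exactly onto the hyperplane, so λ>1 is needed to satisfy the strict inequality after the step).

```python
import numpy as np

def relaxation_step(w, X, lam=1.5):
    """One step of Agmon-style relaxation for the system  x_i . w > 0.

    Among the violated inequalities, pick the hyperplane farthest from w
    and move w by lam times the orthogonal projection onto it.
    Returns (next_w, solved_flag).
    """
    margins = X @ w                       # x_i . w for every inequality
    violated = np.where(margins <= 0)[0]
    if violated.size == 0:
        return w, True                    # w already solves the system
    # distance of w to the hyperplane x_i . w = 0 is |x_i . w| / ||x_i||
    dists = -margins[violated] / np.linalg.norm(X[violated], axis=1)
    j = violated[np.argmax(dists)]
    x = X[j]
    w_next = w - lam * (x @ w) / (x @ x) * x
    return w_next, False

# toy system: two inequalities in the plane, infeasible start
X = np.array([[1.0, 0.0], [0.0, 1.0]])
w = np.array([-1.0, -1.0])
for _ in range(100):
    w, done = relaxation_step(w, X)
    if done:
        break
```

With this data the iteration reaches a feasible vector within a few projections.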
Basically the same method is known as the classical Perceptron algorithm [29]. If the set of points can be separated by a linear function, the following convergence property can be proved for the Perceptron algorithm [25]: let S denote the set of positive and negative input vectors and let u be a unit-vector solution to the separation problem, i.e. u·x>0 for all x∈S+ and u·x<0 for all x∈S−. Then the Perceptron algorithm converges in at most 1/σ² iterations, where σ=min{|u·x|/‖x‖ : x∈Sη}, η∈{+,−}. The parameter σ has the interpretation of cos φ for the angle φ between u and the sample closest to the separating hyperplane, and the value of σ can be exponentially small in terms of the dimension n.
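The convergence bound can be observed directly. The sketch below is illustrative only: the separating direction u, the margin value σ=0.2, and the random toy data are our assumptions; the bound 1/σ² on the number of corrections is Novikoff's theorem as cited above.

```python
import numpy as np

def perceptron(X, y, max_iter=10_000):
    """Classical Perceptron on unit-normalised samples.

    y[i] is +1 or -1; returns (w, number_of_additive_corrections).
    """
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    w = np.zeros(Xn.shape[1])
    updates = 0
    for _ in range(max_iter):
        wrong = np.where(y * (Xn @ w) <= 0)[0]
        if wrong.size == 0:
            break                       # every sample strictly classified
        i = wrong[0]
        w = w + y[i] * Xn[i]            # additive correction toward sample i
        updates += 1
    return w, updates

# separable toy data with normalised margin sigma >= 0.2 w.r.t. u = (1, 0)
rng = np.random.default_rng(0)
u = np.array([1.0, 0.0])
X = rng.normal(size=(200, 2))
X = X[np.abs(X @ u) / np.linalg.norm(X, axis=1) >= 0.2]
y = np.where(X @ u > 0, 1, -1)
w, updates = perceptron(X, y)
# the convergence theorem guarantees at most 1/sigma^2 = 25 corrections
```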
But in general, the much simpler Perceptron algorithm performs well even if the sample set is not consistent with any weight vector of a linear threshold function (see, e.g. [19], [33]). When the sample set is linearly separable, Baum [6] proved that, under modest assumptions, the Perceptron algorithm is likely to find a highly accurate approximation of a solution vector in polynomial time.
Variants of the Perceptron algorithm on sample sets that are inconsistent with linear separation are presented in [7], [8], [9], [14]. For example, if the (average) inconsistency with linear separation is small relative to σ, then with high probability the Perceptron algorithm will achieve a good classification of samples in polynomial time [8], [9].
Our simulated annealing procedure employs a logarithmic cooling schedule c(k)=Γ/ln(k+2), i.e. the "temperature" decreases at each step k. With the modified Perceptron algorithm, we performed computational experiments on fragments of liver CT images. The fragments are of size 119×119 pixels with eight-bit grey levels. From 348 positive (with focal liver tumours) and 348 negative examples we calculated independently hypotheses of the type THF=w1x1+⋯+wnxn≥ϑ for n=14161. Then we performed tests on various sets of 50 positive and 50 negative examples, respectively, that were not presented to the algorithm in the learning phase. The tests were performed on threshold circuits of depth two and depth three, where in both cases the first layer consists of functions THF. For depth two with 11 functions THF we obtained ≈91% correct classification. For depth-three circuits with three subcircuits of depth two we achieved about 97% correct classification on the different sets of 50+50 test examples.
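The depth-two circuit structure can be sketched as follows. This is an illustrative assumption, not the paper's trained circuit: we take a majority vote as the output threshold gate and use toy sizes instead of 11 functions over n=14161 pixels.

```python
import numpy as np

def thf(w, theta, x):
    """Linear threshold function  THF = [ w1*x1 + ... + wn*xn >= theta ]."""
    return 1 if w @ x >= theta else 0

def depth_two_circuit(first_layer, x):
    """Depth-two threshold circuit over the first-layer THFs.

    first_layer is a list of (w, theta) pairs; the output gate here is a
    simple majority gate (itself a threshold function) -- an assumption
    made for illustration.
    """
    votes = sum(thf(w, theta, x) for w, theta in first_layer)
    return 1 if votes >= (len(first_layer) + 1) / 2 else 0

# toy example: 3 hypotheses acting on 4-pixel "images"
rng = np.random.default_rng(1)
layer = [(rng.normal(size=4), 0.0) for _ in range(3)]
out = depth_two_circuit(layer, rng.normal(size=4))
```

A depth-three circuit then combines several such depth-two subcircuits through one more threshold gate.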
The input to our algorithm was derived from the DICOM standard representation of CT images [20].
The choice of the crucial parameter Γ is based on estimates of the maximum escape depth from local minima of the associated energy landscape. These estimates of Γ were obtained by preliminary computational experiments on CT images. We have used this method before in [32], where logarithmic simulated annealing was applied to job shop scheduling.
Basic definitions
The simulated annealing-based extension comes into play when the number of misclassified examples for the new hypothesis is larger than that for the previous one. If this is the case, a random decision is made according to the rules of simulated annealing procedures. When the new hypothesis is rejected, a random choice is made among the misclassified examples for the calculation of the next hypothesis.
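The random decision can be sketched with the standard Metropolis acceptance rule under the logarithmic schedule; the rule below (and the exponential form of the acceptance probability) is the common simulated annealing convention, given here as an assumption-laden sketch rather than the paper's exact procedure.

```python
import math
import random

def accept(delta_errors, k, gamma):
    """Simulated-annealing acceptance decision at step k.

    delta_errors: increase in the number of misclassified examples caused
    by the new hypothesis; an improvement (<= 0) is always accepted.
    c(k) = gamma / ln(k + 2) is the logarithmic cooling schedule.
    """
    if delta_errors <= 0:
        return True
    c_k = gamma / math.log(k + 2)
    # Metropolis rule: accept a worse hypothesis with prob. exp(-delta/c(k))
    return random.random() < math.exp(-delta_errors / c_k)
```

As k grows, c(k) shrinks and worse hypotheses are accepted ever more rarely, while early steps still allow escapes from poor hypotheses.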
To describe our extension of the Perceptron algorithm in more detail, we have to define the
The logarithmic cooling schedule
We are focusing on a special type of inhomogeneous Markov chain where the value c(k) changes in accordance with c(k)=Γ/ln(k+2).
The choice of c(k) is motivated by Hajek’s Theorem [18] on logarithmic cooling schedules for inhomogeneous Markov chains. To explain Hajek’s result, we first need to introduce some parameters characterizing local minima of the objective function:
Definition 1 A configuration η′ is said to be reachable at height h from η if ∃η0,η1,…, such that
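For orientation, Hajek's result can be summarized as follows; this is the standard formulation (our notation η for configurations and f for the objective function), sketched rather than quoted:

```latex
% Reachability (sketch): \eta' is reachable at height h from \eta if there
% is a chain \eta = \eta_0, \eta_1, \ldots, \eta_r = \eta' of neighbouring
% configurations with f(\eta_k) \le h for all k.  Let \Gamma^{*} be the
% maximum escape depth over all local, non-global minima.  Then
\[
  \lim_{k\to\infty}\Pr\{\,f(\eta_k)\ \text{is globally minimal}\,\} = 1
  \quad\Longleftrightarrow\quad
  \sum_{k=0}^{\infty} e^{-\Gamma^{*}/c(k)} = \infty ,
\]
% and the logarithmic schedule c(k) = \Gamma/\ln(k+2) satisfies this
% condition precisely when \Gamma \ge \Gamma^{*}, since then
% e^{-\Gamma^{*}/c(k)} = (k+2)^{-\Gamma^{*}/\Gamma} and the series diverges.
```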
Computational experiments
In most applications, simulated annealing-based heuristics are designed for homogeneous Markov chains, where the convergence to the Boltzmann distribution at fixed temperatures is important for the performance of the algorithm, see [1]. We utilized the general framework of inhomogeneous Markov chains described in [4] for the design of a pattern classification heuristic. In particular, we paid attention to the choice of the parameter Γ which is crucial to the quality of solutions as well as to
Concluding remarks
We performed computational experiments with an extension of the Perceptron algorithm by a simulated annealing-based heuristic that employs the logarithmic cooling schedule c(k)=Γ/ln(k+2), where Γ is a parameter of the underlying configuration space. The experiments were performed on fragments of liver CT images. The image data are the only input to the algorithm, i.e. no feature extraction or preprocessing is performed. The fragments are of size 119×119 with eight-bit grey levels. From 348
Acknowledgements
The authors would like to thank Eike Hein and Daniela Melzer for preparing the image material. The research has been partially supported by the Strategic Research Programme at The Chinese University of Hong Kong under Grant No. SRP 9505, by a Hong Kong Government RGC Earmarked Grant Ref. No. CUHK 4010/98E, and by the AIF Research Programme under Grant No. FKV 0352401N7.
References (35)
- et al. Stochastic simulations of two-dimensional composite packings. J. Comput. Phys. (1997)
- Metropolis, simulated annealing, and iterated energy transformation algorithms: theory and experiments. J. Complexity (1996)
- et al. Evolving artificial neural networks for screening features from mammograms. Artif. Intell. Med. (1998)
- et al. Feature selection for optimized skin tumour recognition using genetic algorithms. Artif. Intell. Med. (1999)
- Computer-assisted cervical cancer screening using neural networks. Cancer Lett. (1994)
- Use of artificial neural networks in modeling associations of discriminant factors: towards an intelligent selective breast cancer screening. Artif. Intell. Med. (1999)
- Aarts EHL, Korst JHM. Simulated annealing and Boltzmann machines: a stochastic approach. New York: Wiley,...
- Agmon S. The relaxation method for linear inequalities. Can. J. Math. (1954)
- Albrecht A, Wong CK. On logarithmic simulated annealing. In: van Leeuwen J, Watanabe O, Hagiya M, Mosses PD, Ito T,...
- et al. Neural network approach for differential diagnosis of interstitial lung diseases: a pilot study. Radiology (1990)
- The perceptron algorithm is fast for nonmalicious distributions. Neural Comput.
- A polynomial-time algorithm for learning noisy linear threshold functions. Algorithmica
- Learning linear threshold approximations using perceptrons. Neural Comput.
- A thermodynamical approach to the travelling salesman problem: an efficient simulation algorithm. Institute of Physics and Biophysics, Comenius University, Bratislava, 1982. J. Optim. Theory Appl.
- An expert system for the detection of cervical cancer cells using knowledge-based image analyser. Artif. Intell. Med.
1 On leave from IBM T.J. Watson Research Center, P.O. Box 210, Yorktown Heights, NY, USA.