The Use of a Supervised k-Means Algorithm on Real-Valued Data with Applications in Health

Al-Harbi, Sami H.; Rayward-Smith, Vic J.

doi:10.1007/3-540-45034-3_58

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2718)

Included in the following conference series:

International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems

Abstract

k-means is traditionally viewed as an unsupervised algorithm for clustering a heterogeneous population into a number of more homogeneous groups of objects. However, it is not guaranteed to group objects of the same type (class) together. In such cases, some supervision is needed to place objects that have the same class label in one cluster. This paper demonstrates how the popular k-means clustering algorithm can be profitably modified for use as a classifier. The output field itself cannot be used in the clustering, but it is used to develop a suitable metric defined on the other fields. The proposed algorithm combines Simulated Annealing and the modified k-means algorithm. We also apply the proposed algorithm to real data sets, where it yields improvements in confidence compared with C4.5.
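The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, whose details are not available in this preview. It assumes the metric is a squared Euclidean distance with one non-negative weight per field, that simulated annealing searches over those weights, and that cluster purity stands in for the confidence measure. The function names (`weighted_kmeans`, `purity`, `anneal_weights`), the cooling schedule, and all parameter values are hypothetical.

```python
import math
import random
from collections import Counter


def weighted_dist2(x, c, w):
    """Squared Euclidean distance with a non-negative weight per field (assumed form)."""
    return sum(wi * (xi - ci) ** 2 for xi, ci, wi in zip(x, c, w))


def weighted_kmeans(X, k, w, iters=25, seed=0):
    """Ordinary k-means, except distances use the field weights w.
    The class labels are never seen here."""
    rng = random.Random(seed)  # fixed seed: the same w always gives the same clustering
    centroids = [list(x) for x in rng.sample(X, k)]
    assign = [0] * len(X)
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: weighted_dist2(x, centroids[j], w))
                  for x in X]
        for j in range(k):
            members = [x for x, a in zip(X, assign) if a == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def purity(assign, y):
    """Fraction of records whose class matches the majority class of their cluster."""
    by_cluster = {}
    for a, label in zip(assign, y):
        by_cluster.setdefault(a, Counter())[label] += 1
    return sum(max(c.values()) for c in by_cluster.values()) / len(y)


def anneal_weights(X, y, k, steps=200, t0=0.1, cooling=0.98, seed=1):
    """Simulated annealing over field weights; labels are used only to score a metric."""
    rng = random.Random(seed)
    w = [1.0] * len(X[0])
    score = purity(weighted_kmeans(X, k, w), y)
    best_w, best_score, temp = list(w), score, t0
    for _ in range(steps):
        cand = list(w)
        i = rng.randrange(len(cand))
        cand[i] = max(0.0, cand[i] + rng.gauss(0.0, 0.3))  # perturb one weight
        cand_score = purity(weighted_kmeans(X, k, cand), y)
        delta = cand_score - score
        # Accept improvements always; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            w, score = cand, cand_score
            if score > best_score:
                best_w, best_score = list(w), score
        temp *= cooling  # geometric cooling (an assumed schedule)
    return best_w, best_score
```

Calling `anneal_weights(X, y, k)` on a list of real-valued records `X` with class labels `y` returns the best weights found and their purity. Because the search starts from uniform weights and tracks the best state visited, the returned purity is never lower than that of unweighted k-means under the same initialisation.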

Author information

Authors and Affiliations

School of Information Systems, University of East Anglia, Norwich, NR4 7TJ, UK
Sami H. Al-Harbi & Vic J. Rayward-Smith

Editor information

Editors and Affiliations

Dept. of Computer Science, Loughborough University, Loughborough, LE11 3TU, England
Paul W. H. Chung & Chris Hinde
Dept. of Computer Science, Southwest Texas State University, 601 University Drive, San Marcos, TX, 78666, USA
Moonis Ali

About this paper

Cite this paper

Al-Harbi, S.H., Rayward-Smith, V.J. (2003). The Use of a Supervised k-Means Algorithm on Real-Valued Data with Applications in Health. In: Chung, P.W.H., Hinde, C., Ali, M. (eds) Developments in Applied Artificial Intelligence. IEA/AIE 2003. Lecture Notes in Computer Science, vol 2718. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45034-3_58

DOI: https://doi.org/10.1007/3-540-45034-3_58
Published: 24 June 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40455-2
Online ISBN: 978-3-540-45034-4
eBook Packages: Springer Book Archive
