“Copasetic Clustering”: Making Sense of Large-Scale Images

Fraser, Karl; O’Neill, Paul; Wang, Zidong; Liu, Xiaohui

doi:10.1007/978-3-540-30537-8_11

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3327)

Abstract

In an information-rich world, the task of data analysis is becoming ever more complex. Even with the processing capability of modern technology, important details often become saturated and are thus lost amongst the volume of data. With analysis problems ranging from discovering credit card fraud to tracking terrorist activities, the phrase “a needle in a haystack” has never been more apt. To deal with large data sets, current approaches require that the data be sampled or summarised before true analysis can take place. In this paper we propose a novel pyramidic method, namely copasetic clustering, which focuses on the problem of applying traditional clustering techniques to large-scale data sets while using limited resources. A further benefit of the technique is its transparency into intermediate clustering steps; when applied to spatial data sets, this allows contextual information to be captured. The abilities of this technique are demonstrated using both synthetic and biological data.
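
The abstract gives no algorithmic detail, so the following is only a rough illustration of the general idea it describes: cluster small regions of a large image independently, so that each step needs little memory, then combine the results level by level while keeping every intermediate level available for inspection. It is not the authors' algorithm. The function names (`weighted_kmeans_1d`, `pyramid_cluster`), the square tiling, the 2x2 merging scheme, and the use of 1-D k-means on pixel intensities are all assumptions made for this sketch.

```python
from typing import List, Tuple

Centroid = Tuple[float, int]  # (mean intensity, number of pixels represented)


def weighted_kmeans_1d(points: List[Centroid], k: int = 2, iters: int = 20) -> List[Centroid]:
    """Weighted 1-D k-means (Lloyd's algorithm) over (value, weight) pairs."""
    lo = min(v for v, _ in points)
    hi = max(v for v, _ in points)
    # Deterministic start: spread the k centres evenly over the value range.
    centres = [lo + (hi - lo) * i / (k - 1) for i in range(k)] if k > 1 else [(lo + hi) / 2]
    weights = [0] * k
    for _ in range(iters):
        sums = [0.0] * k
        weights = [0] * k
        for v, w in points:
            # Assign each point to its nearest centre (ties go to the lower index).
            j = min(range(k), key=lambda c: abs(v - centres[c]))
            sums[j] += v * w
            weights[j] += w
        # Empty clusters keep their previous centre.
        new = [sums[j] / weights[j] if weights[j] else centres[j] for j in range(k)]
        if new == centres:
            break
        centres = new
    return [(centres[j], weights[j]) for j in range(k)]


def pyramid_cluster(image: List[List[float]], tile: int = 2, k: int = 2):
    """Return every pyramid level; levels[-1][0][0] summarises the whole image."""
    rows, cols = len(image), len(image[0])
    # Level 0: cluster each tile on its own, so only one tile is in use at a time.
    level = []
    for r in range(0, rows, tile):
        tile_row = []
        for c in range(0, cols, tile):
            pixels = [(float(image[y][x]), 1)
                      for y in range(r, min(r + tile, rows))
                      for x in range(c, min(c + tile, cols))]
            tile_row.append(weighted_kmeans_1d(pixels, k))
        level.append(tile_row)
    levels = [level]
    # Higher levels: merge each 2x2 group of neighbouring cells by re-clustering
    # their weighted centroids (an assumed merging rule, chosen for simplicity).
    while len(level) > 1 or len(level[0]) > 1:
        merged = []
        for r in range(0, len(level), 2):
            merged_row = []
            for c in range(0, len(level[0]), 2):
                pool = [cen for cells in level[r:r + 2]
                        for cell in cells[c:c + 2]
                        for cen in cell if cen[1] > 0]
                merged_row.append(weighted_kmeans_1d(pool, k))
            merged.append(merged_row)
        level = merged
        levels.append(level)
    return levels


# Toy 4x4 "image": dark left half, bright right half.
image = [[10, 10, 200, 200] for _ in range(4)]
levels = pyramid_cluster(image, tile=2, k=2)
top = levels[-1][0][0]
```

Because `levels` retains the per-tile results as well as the merged ones, a caller can inspect how a region was clustered locally before it was absorbed into a coarser summary, which is the kind of intermediate transparency the abstract refers to.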

Author information

Authors and Affiliations

Department of Information Systems and Computing, Brunel University, Uxbridge, Middlesex, UB8 3PH, UK
Karl Fraser, Paul O’Neill, Zidong Wang & Xiaohui Liu

Editor information

Editors and Affiliations

Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100080, China; College of Information Science and Technology, University of Nebraska at Omaha, 68182, Omaha, NE, USA
Yong Shi
Institute of Policy & Management, Chinese Academy of Sciences, P.O. Box, 100080, Beijing, P.R. China
Weixuan Xu
College of Information Science and Technology, University of Nebraska at Omaha, 68182, NE, USA
Zhengxin Chen

About this paper

Cite this paper

Fraser, K., O’Neill, P., Wang, Z., Liu, X. (2004). “Copasetic Clustering”: Making Sense of Large-Scale Images. In: Shi, Y., Xu, W., Chen, Z. (eds) Data Mining and Knowledge Management. CASDMKM 2004. Lecture Notes in Computer Science, vol 3327. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30537-8_11

DOI: https://doi.org/10.1007/978-3-540-30537-8_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23987-1
Online ISBN: 978-3-540-30537-8
eBook Packages: Computer Science, Computer Science (R0)
