Integration of Distributed Biological Data Using Modified K-Means Algorithm

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4819)

Abstract

Bioinformatics aims to answer biological questions and to support the work of biologists by providing search and analysis methods for research data. The Internet provides distributed environments in which the databases of various research groups can be accessed. However, very large quantities of data make it difficult to search and analyze data in these distributed environments. Data clustering can help when searching for data, but it is time-consuming because its execution time is directly proportional to the volume of data. In this paper we propose a distributed clustering scenario and a modified K-means algorithm for the efficient clustering of biological data, and demonstrate the performance improvement they bring.
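
The abstract does not describe the modified algorithm itself, so the following is only a minimal sketch of one common way to run standard K-means over data held at several sites. It is not the authors' method. Each site assigns its own points to the nearest centroid and reports per-cluster sums and counts, and a coordinator combines them to update the centroids. The names `local_stats`, `distributed_kmeans`, and `sites` are hypothetical and chosen for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def local_stats(points, centroids):
    """Per-site step (hypothetical helper): assign each local point to its
    nearest centroid and return per-cluster coordinate sums and counts.
    Only these summaries, not the raw data, need to leave the site."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p in points:
        k = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += p[d]
    return sums, counts


def distributed_kmeans(sites, centroids, max_iter=100, tol=1e-9):
    """Coordinator step: standard (unmodified) K-means updates computed
    from the summaries returned by every site."""
    centroids = [list(c) for c in centroids]
    dim = len(centroids[0])
    for _ in range(max_iter):
        total_sums = [[0.0] * dim for _ in centroids]
        total_counts = [0] * len(centroids)
        for points in sites:
            sums, counts = local_stats(points, centroids)
            for k in range(len(centroids)):
                total_counts[k] += counts[k]
                for d in range(dim):
                    total_sums[k][d] += sums[k][d]
        # An empty cluster keeps its previous centroid.
        updated = [
            [s / total_counts[k] for s in total_sums[k]]
            if total_counts[k] else centroids[k]
            for k in range(len(centroids))
        ]
        shift = max(dist(a, b) for a, b in zip(updated, centroids))
        centroids = updated
        if shift <= tol:
            break
    return centroids


# Toy data: three "sites", each holding a few 2-D points.
sites = [[(0, 0), (0, 1)], [(10, 10), (10, 11)], [(0.5, 0.5)]]
result = distributed_kmeans(sites, [(0, 0), (10, 10)])
```

Every iteration touches each point once per centroid, which is why the running time grows in proportion to the volume of data, as the abstract notes. Splitting the assignment step across sites spreads that cost without changing the result of the centralized algorithm.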

This study was supported by a grant from the Korea Health 21 R&D Project, Ministry of Health & Welfare, Republic of Korea (0412-MI01-0416-0002).

Editor information

Takashi Washio, Zhi-Hua Zhou, Joshua Zhexue Huang, Xiaohua Hu, Jinyan Li, Chao Xie, Jieyue He, Deqing Zou, Kuan-Ching Li, Mário M. Freire

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jeong, J., Ryu, B., Shin, D., Shin, D. (2007). Integration of Distributed Biological Data Using Modified K-Means Algorithm. In: Washio, T., et al. Emerging Technologies in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4819. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77018-3_46

  • DOI: https://doi.org/10.1007/978-3-540-77018-3_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77016-9

  • Online ISBN: 978-3-540-77018-3

  • eBook Packages: Computer Science, Computer Science (R0)
