Abstract

In recent years there has been increasing interest in analytical methods that learn patterns over large-scale data distributed over Peer-to-Peer (P2P) networks and support applications. Mining patterns in such a distributed and dynamic environment is challenging because centralizing the data is not feasible. In this paper, we propose a distributed classification technique based on relevance vector machines (RVM) and local model exchange among neighboring peers in a P2P network. In such networks, an efficient distributed classification algorithm is evaluated on the size of the resulting local models (communication efficiency) and on their prediction accuracy. RVM uses dramatically fewer kernel functions than a state-of-the-art support vector machine (SVM) while demonstrating comparable generalization performance. This makes RVM a suitable choice for learning compact and accurate local models at each peer in a P2P network. Our model propagation approach exchanges the resulting models with peers in a local neighborhood to produce a more accurate network-wide global model while keeping the communication cost low throughout the network. Through extensive experimental evaluations, we demonstrate that, by using more relevant and compact models, our approach outperforms the baseline model propagation approaches in terms of accuracy and communication cost.
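The local model exchange described in the abstract can be illustrated with a small, self-contained sketch. It is not the method from the chapter: the local learner below is a per-class centroid model standing in for an RVM (its prototypes play the role of the few retained relevance vectors), the fully connected three-peer topology is assumed, and combining models by majority vote is an assumption, since the abstract does not say how the global model is formed. All function names and data are hypothetical.

```python
from collections import Counter

# A "model" is a list of (prototype, label) pairs. It is a hypothetical
# stand-in for the relevance vectors an RVM would keep; it is NOT an RVM.

def train_local_model(samples):
    """Fit one centroid per class from (features, label) pairs."""
    sums, counts = {}, Counter()
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    return [([s / counts[y] for s in acc], y) for y, acc in sorted(sums.items())]

def predict(model, x):
    """Return the label of the nearest prototype (squared Euclidean distance)."""
    return min(model, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))[1]

def exchange(local_models, neighbors):
    """Each peer collects its own model plus those of its direct neighbors.

    Returns the per-peer model collections and the total number of
    prototypes sent, used here as a crude proxy for communication cost.
    """
    collected, cost = {}, 0
    for peer, model in local_models.items():
        collected[peer] = [model] + [local_models[n] for n in neighbors[peer]]
        cost += sum(len(local_models[n]) for n in neighbors[peer])
    return collected, cost

def vote(models, x):
    """Majority vote over the models a peer has collected (assumed rule)."""
    return Counter(predict(m, x) for m in models).most_common(1)[0][0]

# Toy data: class 0 lies near (0, 0), class 1 near (4, 4).
# Peer 2 holds a skewed sample, so its local model alone is unreliable.
peer_data = {
    0: [((0.0, 0.0), 0), ((1.0, 1.0), 0), ((4.0, 4.0), 1), ((5.0, 5.0), 1)],
    1: [((0.0, 1.0), 0), ((1.0, 0.0), 0), ((4.0, 5.0), 1), ((5.0, 4.0), 1)],
    2: [((-2.0, -2.0), 0), ((2.0, 2.0), 1), ((3.0, 3.0), 1)],
}
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # assumed topology

local_models = {p: train_local_model(d) for p, d in peer_data.items()}
collected, cost = exchange(local_models, neighbors)

query = (1.5, 1.5)  # closer to the class 0 region
local_only = predict(local_models[2], query)
after_exchange = vote(collected[2], query)
```

In this toy setup peer 2 misclassifies the query from its own data but recovers the class 0 label once its neighbors' models are included, and the cost grows with the number of prototypes per model, which is why a sparse learner is attractive for this kind of exchange.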

Acknowledgements

This work is funded by the Seventh Framework Programme of the European Commission through the project REDUCTION (No. 288254), www.reduction-project.eu.

Author information

Corresponding author

Correspondence to Muhammad Umer Khan.

Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Khan, M.U., Nanopoulos, A., Schmidt-Thieme, L. (2015). P2P RVM for Distributed Classification. In: Lausen, B., Krolak-Schwerdt, S., Böhmer, M. (eds) Data Science, Learning by Latent Structures, and Knowledge Discovery. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44983-7_13
