Distributed Classification of Data Streams: An Adaptive Technique

  • Conference paper
Big Data Analytics and Knowledge Discovery (DaWaK 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9263)

Abstract

Mining data streams is a critical task in current Big Data applications. Data stream mining algorithms usually run in resource-constrained environments, which impose novel requirements such as availability of resources and adaptivity. Following this trend, in this paper we propose a distributed data stream classification technique that has been tested on a real sensor network platform, namely Sun SPOT. The proposed technique offers several points of research innovation, which are also confirmed by its effectiveness and efficiency as assessed in our experimental campaign.

Author information

Corresponding author

Correspondence to Alfredo Cuzzocrea.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Cuzzocrea, A., Gaber, M.M., Shiddiqi, A.M. (2015). Distributed Classification of Data Streams: An Adaptive Technique. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2015. Lecture Notes in Computer Science, vol 9263. Springer, Cham. https://doi.org/10.1007/978-3-319-22729-0_23

  • DOI: https://doi.org/10.1007/978-3-319-22729-0_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22728-3

  • Online ISBN: 978-3-319-22729-0

  • eBook Packages: Computer Science, Computer Science (R0)
