Parallel Attribute Reduction Based on MapReduce

  • Conference paper
Rough Sets and Knowledge Technology (RSKT 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8818)

Abstract

As data volumes grow rapidly, many parallel attribute reduction algorithms have been studied. To improve efficiency, this paper proposes a new parallel attribute reduction algorithm based on MapReduce. It consists of three parts: parallel computation of a simplified decision table, parallel computation of attribute significance, and parallel computation of the decision table. Experiments on data sets of different sizes show that the algorithm can process massive data efficiently.
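
The abstract does not give the algorithm's details, so the following is only a minimal single-process sketch of how the first two parts could look in map/reduce style, not the authors' implementation. It assumes the simplified decision table keeps one row per equivalence class of the condition attributes, and that attribute significance is measured as the gain in positive-region size (a common rough-set choice). All function names are hypothetical, and `None` is used as an assumed marker for an inconsistent class.

```python
from collections import defaultdict
from itertools import chain


def map_phase(split, cond_idx, dec_idx):
    """Map: emit (condition-value tuple, decision value) for each row of one split."""
    for row in split:
        yield tuple(row[i] for i in cond_idx), row[dec_idx]


def shuffle(pairs):
    """Group mapper output by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, decisions):
    """Reduce: collapse one equivalence class to (key, decision, object count).

    The decision is None (assumed marker) when the class is inconsistent,
    i.e. its objects do not all share one decision value.
    """
    distinct = set(decisions)
    decision = distinct.pop() if len(distinct) == 1 else None
    return key, decision, len(decisions)


def simplify(splits, cond_idx, dec_idx):
    """Build the simplified table: one row per equivalence class, sorted by key."""
    pairs = chain.from_iterable(map_phase(s, cond_idx, dec_idx) for s in splits)
    rows = [reduce_phase(k, v) for k, v in shuffle(pairs).items()]
    return sorted(rows, key=lambda r: r[0])


def positive_region_size(splits, cond_idx, dec_idx):
    """Count the objects that fall into consistent equivalence classes."""
    return sum(n for _, d, n in simplify(splits, cond_idx, dec_idx) if d is not None)


def significance(splits, chosen, candidate, dec_idx):
    """Positive-region gain from adding `candidate` to the `chosen` attributes."""
    return (positive_region_size(splits, chosen + [candidate], dec_idx)
            - positive_region_size(splits, chosen, dec_idx))
```

Each element of `splits` plays the role of one input split handled by a mapper. On a real cluster the `shuffle` step would be performed by the MapReduce runtime rather than by an in-memory dictionary.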

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Xi, D., Wang, G., Zhang, X., Zhang, F. (2014). Parallel Attribute Reduction Based on MapReduce. In: Miao, D., Pedrycz, W., Ślȩzak, D., Peters, G., Hu, Q., Wang, R. (eds) Rough Sets and Knowledge Technology. RSKT 2014. Lecture Notes in Computer Science, vol 8818. Springer, Cham. https://doi.org/10.1007/978-3-319-11740-9_58

  • DOI: https://doi.org/10.1007/978-3-319-11740-9_58

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11739-3

  • Online ISBN: 978-3-319-11740-9

  • eBook Packages: Computer Science, Computer Science (R0)