Improving the Performance of kNN in the MapReduce Framework Using Locality Sensitive Hashing

Sikha Bagui, Arup Kumar Mondal, Subhash Bagui
Copyright: © 2019 | Volume: 10 | Issue: 4 | Pages: 16
ISSN: 1947-3532 | EISSN: 1947-3540 | EISBN13: 9781522566878 | DOI: 10.4018/IJDST.2019100101
MLA

Bagui, Sikha, et al. "Improving the Performance of kNN in the MapReduce Framework Using Locality Sensitive Hashing." IJDST vol.10, no.4 2019: pp.1-16. http://doi.org/10.4018/IJDST.2019100101

APA

Bagui, S., Mondal, A. K., & Bagui, S. (2019). Improving the Performance of kNN in the MapReduce Framework Using Locality Sensitive Hashing. International Journal of Distributed Systems and Technologies (IJDST), 10(4), 1-16. http://doi.org/10.4018/IJDST.2019100101

Chicago

Bagui, Sikha, Arup Kumar Mondal, and Subhash Bagui. "Improving the Performance of kNN in the MapReduce Framework Using Locality Sensitive Hashing," International Journal of Distributed Systems and Technologies (IJDST) 10, no.4: 1-16. http://doi.org/10.4018/IJDST.2019100101

Abstract

In this work, the authors present a parallel k-nearest neighbor (kNN) algorithm that uses locality sensitive hashing (LSH) to preprocess the data before it is classified with kNN in Hadoop's MapReduce framework. This parallel implementation is compared with the sequential (conventional) implementation. Because LSH places similar data objects in the same hash bucket, the iterative procedure that classifies a data object runs within a single hash bucket rather than over the whole data set, greatly reducing the computation time needed for classification. Several experiments showed that the parallel implementation performed better than the sequential implementation on very large datasets. The study also experimented with several map-side and reduce-side optimization features for the parallel implementation and presents some optimal map-side and reduce-side parameters. Among the map-side parameters, the block size and input split size were varied; among the reduce-side parameters, the number of planes was varied. The effects of each were studied.
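The bucket-restricted classification described in the abstract can be illustrated with a minimal, single-machine sketch. It is not the authors' Hadoop implementation. It assumes random-hyperplane (sign) hashing, which the abstract does not confirm beyond mentioning a "number of planes", and all names here (`make_planes`, `signature`, `build_buckets`, `classify`) are hypothetical. In a MapReduce setting, the bucket signature would plausibly serve as the map output key, with the kNN vote running per key on the reduce side.

```python
import random
from collections import Counter, defaultdict


def make_planes(num_planes, dim, seed=0):
    """Draw random hyperplane normals (assumed hash family, not from the paper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def signature(point, planes):
    """One bit per plane: 1 if the point lies on the non-negative side, else 0."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, point)) >= 0 else 0
        for plane in planes
    )


def build_buckets(train, planes):
    """Group labeled training points by their LSH signature."""
    buckets = defaultdict(list)
    for point, label in train:
        buckets[signature(point, planes)].append((point, label))
    return buckets


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, buckets, planes, k=3):
    """Majority vote among the k nearest points in the query's bucket only.

    Returns None when the query hashes to an empty bucket.
    """
    candidates = buckets.get(signature(query, planes), [])
    if not candidates:
        return None
    nearest = sorted(candidates, key=lambda item: sq_dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Small demo with two fixed, axis-aligned planes so the result is deterministic.
demo_planes = [[1.0, 0.0], [0.0, 1.0]]
demo_train = [
    ((1.0, 1.0), "a"), ((2.0, 3.0), "a"), ((3.0, 1.0), "a"),
    ((-1.0, -1.0), "b"), ((-2.0, -3.0), "b"), ((-3.0, -1.0), "b"),
]
demo_buckets = build_buckets(demo_train, demo_planes)
print(classify((2.0, 2.0), demo_buckets, demo_planes))  # prints: a
```

In this sketch, more planes produce more, smaller buckets: each query is compared against fewer points, at the cost of a higher chance that true neighbors land in a different bucket or that the bucket is empty. That trade-off is one plausible reason the number of planes is worth tuning.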
