Multi-Step Iterative Algorithm for Feature Selection on Dynamic Documents


Prafulla Bharat Bafna, Shailaja Shirwaikar, Dhanya Pramod
Copyright: © 2016 | Volume: 6 | Issue: 2 | Pages: 17
ISSN: 2155-6377 | EISSN: 2155-6385 | EISBN13: 9781466692497 | DOI: 10.4018/IJIRR.2016040102
Cite Article

MLA

Bafna, Prafulla Bharat, et al. "Multi-Step Iterative Algorithm for Feature Selection on Dynamic Documents." IJIRR vol.6, no.2 2016: pp.24-40. http://doi.org/10.4018/IJIRR.2016040102

APA

Bafna, P. B., Shirwaikar, S., & Pramod, D. (2016). Multi-Step Iterative Algorithm for Feature Selection on Dynamic Documents. International Journal of Information Retrieval Research (IJIRR), 6(2), 24-40. http://doi.org/10.4018/IJIRR.2016040102

Chicago

Bafna, Prafulla Bharat, Shailaja Shirwaikar, and Dhanya Pramod. "Multi-Step Iterative Algorithm for Feature Selection on Dynamic Documents," International Journal of Information Retrieval Research (IJIRR) 6, no.2: 24-40. http://doi.org/10.4018/IJIRR.2016040102


Abstract

The authors propose a clustering-based multistep iterative algorithm. Its key step groups terms by synonyms, taking advantage of a semantic relativity measure between terms. The term frequency of a synonym group is computed by weighting each term that appears in the document by its relativity measure with respect to the parent term of the group. This increases the importance of terms that individually appear infrequently but together show a strong presence. The authors ran experiments on different real and artificial datasets, such as NEWS 20, Reuters, emails, and research papers on different topics. The resulting entropy shows that the algorithm gives improved results on well-articulated documents, such as research papers. The improvement is marginal on documents where the message is emphasized by repetition of terms, specifically rapidly generated documents such as emails. The authors also observed that newly arrived documents are mapped appropriately based on their proximity to the semantic group.
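
The synonym-group term frequency described in the abstract can be illustrated with a minimal sketch. The abstract does not define the relativity measure or how groups are built, so the group table, the weights in [0, 1], and the function name `group_term_frequency` below are all hypothetical choices made for illustration, not the authors' implementation.

```python
from collections import Counter

# Hypothetical synonym groups (an assumption, not from the paper):
# parent term -> {member term: relativity to the parent, in [0, 1]}.
SYNONYM_GROUPS = {
    "car": {"car": 1.0, "automobile": 0.9, "vehicle": 0.6},
    "doctor": {"doctor": 1.0, "physician": 0.9},
}


def group_term_frequency(tokens, groups):
    """Sum each member's raw count, weighted by its relativity to the parent."""
    counts = Counter(tokens)  # missing terms count as 0
    return {
        parent: sum(counts[term] * weight for term, weight in members.items())
        for parent, members in groups.items()
    }


doc = "the automobile is a vehicle and the physician owns a car".split()
scores = group_term_frequency(doc, SYNONYM_GROUPS)
print(scores)
```

In this toy document each of "car", "automobile", and "vehicle" appears only once, yet the "car" group scores higher than any single term would, which is the effect the abstract attributes to grouping: terms that are individually infrequent show a strong presence together.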
