An Adaptive Iterative PCA-SVM Based Technique for Dimensionality Reduction to Support Fast Mining of Leukemia Data

  • Conference paper
Proceedings of the Second International Conference on Soft Computing for Problem Solving (SocProS 2012), December 28-30, 2012

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 236)

Abstract

The primary goal of a data mining technique is to detect and classify data from a large dataset without compromising the speed of the process. Data mining is the process of extracting patterns from a large dataset, so pattern discovery and mining are often time consuming. Each record in a dataset is represented by several columns, referred to as its dimensions. The identity of a record, however, does not depend equally on each of these dimensions. Scanning and processing the entire dataset for every query therefore reduces both the efficiency of the algorithm and the speed of processing. This problem can be alleviated significantly by identifying the intrinsic dimensionality of the data and applying classification only to the part of the dataset that corresponds to the intrinsic dimensions. Several algorithms have been proposed for identifying and reducing the intrinsic dimensions of data. Reducing the dimension, however, affects the classification rate, which may drop because fewer data points are available for the decision. In this work we propose a technique for classifying leukemia data that identifies and reduces the dimension of the training (knowledge) dataset through an iterative process of intrinsic dimensionality discovery and reduction based on Principal Component Analysis (PCA). The optimized dataset is then used to classify the given data with a Support Vector Machine (SVM) classifier. Results show that the proposed technique performs much better in terms of obtaining an optimized dataset and classification accuracy.
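The abstract does not spell out the iteration or the adaptation rule, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes that principal components are added one at a time until a fixed share of the total variance is retained (the 95% target is an assumption), and that a linear SVM is then trained on the reduced data by plain subgradient descent on the hinge loss. All function names, the synthetic two-class data, and every parameter value are hypothetical and chosen for illustration.

```python
import random

def center(X):
    """Subtract the per-column mean; returns (centered rows, column means)."""
    n, d = len(X), len(X[0])
    mu = [sum(r[j] for r in X) / n for j in range(d)]
    return [[r[j] - mu[j] for j in range(d)] for r in X], mu

def covariance(Xc):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(Xc), len(Xc[0])
    return [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(d)]
            for i in range(d)]

def matvec(C, v):
    return [sum(c * x for c, x in zip(row, v)) for row in C]

def leading_eigenpair(C, iters=300, seed=0):
    """Power iteration for the dominant eigenpair of a symmetric PSD matrix."""
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in C]
    for _ in range(iters):
        w = matvec(C, v)
        norm = sum(x * x for x in w) ** 0.5
        if norm < 1e-12:          # nothing left to extract
            break
        v = [x / norm for x in w]
    norm = sum(x * x for x in v) ** 0.5
    v = [x / norm for x in v]
    lam = sum(a * b for a, b in zip(v, matvec(C, v)))
    return lam, v

def iterative_pca(X, var_target=0.95):
    """Add one component per iteration until var_target of the variance is kept.
    The stopping rule is an assumption, not taken from the paper."""
    Xc, mu = center(X)
    C = covariance(Xc)
    d = len(C)
    total = sum(C[i][i] for i in range(d))
    comps, kept = [], 0.0
    while len(comps) < d and kept < var_target * total:
        lam, v = leading_eigenpair(C)
        comps.append(v)
        kept += lam
        # Deflate: remove the variance already explained by this component.
        C = [[C[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return mu, comps

def project(X, mu, comps):
    """Express each row in the coordinates of the retained components."""
    return [[sum((r[j] - mu[j]) * v[j] for j in range(len(mu))) for v in comps]
            for r in X]

def train_linear_svm(Z, y, lam=0.01, lr=0.1, epochs=200):
    """Batch subgradient descent on the L2-regularized hinge loss."""
    n, k = len(Z), len(Z[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wi for wi in w], 0.0
        for z, t in zip(Z, y):
            if t * (sum(wi * zi for wi, zi in zip(w, z)) + b) < 1:
                gw = [g - t * zi / n for g, zi in zip(gw, z)]
                gb -= t / n
        w = [wi - lr * g for wi, g in zip(w, gw)]
        b -= lr * gb
    return w, b

def predict(Z, w, b):
    return [1 if sum(wi * zi for wi, zi in zip(w, z)) + b >= 0 else -1 for z in Z]

def make_data(n_per_class=40, seed=1):
    """Synthetic stand-in for expression data: 5 columns, 1 informative direction."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (-1, 1):
        for _ in range(n_per_class):
            s = 3.0 * label + rng.gauss(0, 0.5)
            X.append([s, 0.5 * s + rng.gauss(0, 0.1)]
                     + [rng.gauss(0, 0.1) for _ in range(3)])
            y.append(label)
    return X, y

X, y = make_data()
mu, comps = iterative_pca(X)
Z = project(X, mu, comps)
w, b = train_linear_svm(Z, y)
accuracy = sum(p == t for p, t in zip(predict(Z, w, b), y)) / len(y)
print(len(comps), "of", len(X[0]), "dimensions kept; training accuracy:", accuracy)
```

The sketch reports accuracy on the training data only, which is enough to show the pipeline but says nothing about generalization. Real microarray data has far more columns than this toy example, so a practical version would rely on a numerical library and a held-out test set.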

Author information

Correspondence to Vikrant Sabnis.

Copyright information

© 2014 Springer India

About this paper

Cite this paper

Sabnis, V., Khare, N. (2014). An Adaptive Iterative PCA-SVM Based Technique for Dimensionality Reduction to Support Fast Mining of Leukemia Data. In: Babu, B., et al. Proceedings of the Second International Conference on Soft Computing for Problem Solving (SocProS 2012), December 28-30, 2012. Advances in Intelligent Systems and Computing, vol 236. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1602-5_152

  • DOI: https://doi.org/10.1007/978-81-322-1602-5_152

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-1601-8

  • Online ISBN: 978-81-322-1602-5

  • eBook Packages: Engineering, Engineering (R0)
