Combining analytic kernel models for energy-efficient data modeling and classification

The Journal of Supercomputing

Abstract

Energy-efficient computing has become a key challenge not only for data-center operations but also for many other energy-driven systems, with the focus on reducing all energy-related costs and operational expenses, as well as the corresponding environmental impacts. However, current intelligent data models are typically performance driven. For instance, most data-driven machine-learning approaches are known to incur a high computational cost in order to find the global optima. Designing more accurate intelligent data models to satisfy market needs will hence lead to a higher likelihood of energy waste due to the increased computational cost. This paper thus introduces an energy-efficient framework for large-scale data modeling and classification/prediction. It can achieve a predictive accuracy comparable to or better than that of state-of-the-art machine-learning models while maintaining a low computational cost when dealing with large-scale data. The effectiveness of the proposed approaches is demonstrated by our experiments with two large-scale KDD data sets: Mtv-1 and Mtv-2.

Acknowledgements

We are grateful to the Lincoln Laboratory at the Massachusetts Institute of Technology (MIT) in the U.S. for providing us with the Mtv-2 data set and for their invaluable discussions. Special thanks go to British Telecom (BT) and the Etisalat BT Innovation Center (EBTIC) for their constructive criticism of this work.

Author information

Corresponding author

Correspondence to Paul D. Yoo.

About this article

Cite this article

Yoo, P.D., Zomaya, A.Y. Combining analytic kernel models for energy-efficient data modeling and classification. J Supercomput 63, 790–799 (2013). https://doi.org/10.1007/s11227-012-0776-8

  • DOI: https://doi.org/10.1007/s11227-012-0776-8
