Learning-aided predictor integration for system performance prediction

Zhang, Jian; Figueiredo, Renato J.

doi:10.1007/s10586-007-0041-8

Published: 02 October 2007

Cluster Computing, volume 10, pages 425–442 (2007)

Jian Zhang¹ &
Renato J. Figueiredo¹

Abstract

Integrating multiple predictors promises higher prediction accuracy than any single predictor can achieve. The challenge is selecting the best predictor at any given moment. Traditionally, multiple predictors are run in parallel and the one that generates the best result is selected for prediction. In this paper, we propose a novel approach to predictor integration based on learning from historical predictions. Unlike the traditional approach, it does not require running all the predictors simultaneously. Instead, it uses classification algorithms such as k-Nearest Neighbor (k-NN) and Bayesian classification, together with dimension-reduction techniques such as Principal Component Analysis (PCA), to forecast the best predictor for the workload under study. Only the forecasted best predictor is then run for prediction. Our experimental results show that this approach achieved 20.18% higher best-predictor forecasting accuracy than the cumulative-MSE-based predictor selection used in the popular Network Weather Service system. In addition, it outperformed the observed most accurate single predictor in the pool for 44.23% of the performance traces.
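
The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-predictor pool, the window length `m`, the number of principal components, `k`, and all function names are assumptions chosen for illustration. Historical windows are labeled with whichever predictor was most accurate, the windows are reduced with PCA, and k-NN forecasts the best predictor for the newest window, so that only that predictor is run.

```python
import numpy as np

# Hypothetical predictor pool (an assumption, not the paper's pool):
# each predictor maps a window of past samples to a one-step-ahead forecast.
PREDICTORS = {
    "last": lambda w: w[-1],                     # repeat the last value
    "mean": lambda w: float(np.mean(w)),         # window average
    "trend": lambda w: w[-1] + (w[-1] - w[-2]),  # linear extrapolation
}
NAMES = list(PREDICTORS)

def label_windows(trace, m):
    """Training phase: run every predictor on each length-m window and
    label the window with the index of the lowest squared-error predictor."""
    X, y = [], []
    for t in range(m, len(trace)):
        w = trace[t - m:t]
        errs = [(PREDICTORS[n](w) - trace[t]) ** 2 for n in NAMES]
        X.append(w)
        y.append(int(np.argmin(errs)))
    return np.array(X), np.array(y)

def fit_pca(X, n_components):
    """PCA via SVD of the centered data; returns the mean and top components."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Map windows into the reduced PCA space."""
    return (X - mean) @ components.T

def knn_predict(Z_train, y_train, z, k):
    """Majority vote among the k nearest training windows (Euclidean distance)."""
    d = np.linalg.norm(Z_train - z, axis=1)
    nearest = y_train[np.argsort(d)[:k]]
    return int(np.bincount(nearest, minlength=len(NAMES)).argmax())

def forecast_next(trace, m, model, k=3):
    """Forecast the best predictor for the latest window, then run only it."""
    mean, comps, Z_train, y_train = model
    w = np.asarray(trace[-m:], dtype=float)
    best = knn_predict(Z_train, y_train, project(w, mean, comps), k)
    name = NAMES[best]
    return name, PREDICTORS[name](w)

# Usage on a synthetic trace (made-up data, for illustration only).
rng = np.random.default_rng(0)
history = np.sin(np.arange(200) / 5.0) + 0.05 * rng.standard_normal(200)
m = 8
X, y = label_windows(history, m)
mean, comps = fit_pca(X, n_components=2)
model = (mean, comps, project(X, mean, comps), y)
name, value = forecast_next(history, m, model)
print(f"forecasted best predictor: {name}, next-value forecast: {value:.3f}")
```

All predictors are run only while building the labeled history. At prediction time the cost is one projection, one nearest-neighbor lookup, and one predictor call, which is the saving the abstract attributes to the approach.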

Author information

Authors and Affiliations

Advanced Computing and Information Systems (ACIS) Laboratory, Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL, 32611, USA
Jian Zhang & Renato J. Figueiredo

Corresponding author

Correspondence to Jian Zhang.

About this article

Cite this article

Zhang, J., Figueiredo, R.J. Learning-aided predictor integration for system performance prediction. Cluster Comput 10, 425–442 (2007). https://doi.org/10.1007/s10586-007-0041-8

Received: 19 July 2007
Accepted: 13 August 2007
Published: 02 October 2007
Issue Date: December 2007
DOI: https://doi.org/10.1007/s10586-007-0041-8
