Abstract
In today’s scaled-out systems, co-scheduling data analytics work with high-priority user workloads is common because it makes better use of the vast available hardware. User workloads are dominated by periodic patterns, with alternating periods of high and low utilization, creating promising conditions for scheduling data analytics work during low-activity periods. To this end, we show that machine learning models can accurately predict user workload intensities and thereby suggest the most opportune times to co-schedule data analytics work. Yet machine learning models cannot predict the effects of performance interference when co-scheduling is employed, because this constitutes a “new” observation. In tiered storage systems specifically, the hierarchical design makes performance interference even more complex, so accurate performance prediction is more challenging. Here, we quantify the unknown performance effects of workload co-scheduling by augmenting machine learning models with queuing theory models, yielding a hybrid approach that can accurately predict performance and guide scheduling decisions in a tiered storage system. Using traces from commercial systems, we illustrate that queuing theory and machine learning models can be used in synergy to overcome their respective weaknesses and deliver robust co-scheduling solutions that achieve high performance.
Notes
The Wikipedia traces are publicly available [4]. Due to confidentiality agreements, the storage system trace and provider details cannot be made publicly available.
We assume \(t_w = 1\) min in our experimental evaluation, but this value could be adjusted according to the requirements of the specific system.
For presentation reasons, we use a 2-tiered storage system to explain our methodology, but it could be easily extended to storage systems with more tiers. This discussion also applies to caching.
References
Herodotou, H., Lim, H., Luo, G., Borisov, N., Dong, L., Cetin, F.B., Babu, S.: Starfish: a self-tuning system for big data analytics. CIDR 11, 261–272 (2011)
Cohen, J., Dolan, B., Dunlap, M., Hellerstein, J.M., Welton, C.: Mad skills: new analysis practices for big data. Proc. VLDB Endow. 2(2), 1481–1492 (2009)
Zhuang, Z., Ramachandra, H., Tran, C., Subramaniam, S., Botev, C., Xiong, C., Sridharan, B.: Capacity planning and headroom analysis for taming database replication latency: experiences with LinkedIn internet traffic. In: Proceedings of the 6th ACM/SPEC ICPE, pp. 39–50 (2015)
Urdaneta, G., Pierre, G., van Steen, M.: Wikipedia workload analysis for decentralized hosting. Elsevier Comput. Netw. 53(11), 1830–1845 (2009)
Xue, J., Yan, F., Riska, A., Smirni, E.: Storage workload isolation via tier warming: how models can help. In: Proceedings of the 11th ICAC, pp. 1–11 (2014)
Peters, M.: 3PAR: optimizing I/O service levels. ESG White Paper (2010)
Laliberte, B.: Automate and optimize a tiered storage environment—FAST! ESG White Paper (2009)
Amazon ElastiCache. http://aws.amazon.com/elasticache. Accessed 11 Mar 2015
Guerra, J., Pucha, H., Glider, J.S., Belluomini, W., Rangaswami, R.: Cost effective storage using extent based dynamic tiering. In: FAST, pp. 273–286 (2011)
Oh, Y., Choi, J., Lee, D., Noh, S.H.: Caching less for better performance: balancing cache size and update cost of flash memory cache in hybrid storage systems. In: FAST, pp. 313–326 (2012)
FIO Benchmark. http://www.freecode.com/projects/fio. Accessed 11 Mar 2015
Bjorkqvist, M., Chen, L.Y., Binder, W.: Opportunistic service provisioning in the cloud. In: 5th IEEE CLOUD, pp. 237–244 (2012)
Ansaloni, D., Chen, L.Y., Smirni, E., Binder, W.: Model-driven consolidation of Java workloads on multicores. In: 42nd IEEE/IFIP DSN, pp. 1–12 (2012)
Birke, R., Björkqvist, M., Chen, L.Y., Smirni, E., Engbersen, T.: (Big)data in a virtualized world: volume, velocity, and variety in cloud datacenters. In: FAST, pp. 177–189 (2014)
Leemis, L.M., Park, S.K.: Discrete-Event Simulation: A First Course. Pearson Prentice Hall, Upper Saddle River (2006)
George, B.: Time Series Analysis: Forecasting & Control, 3rd edn. Pearson Education India, Gurgaon (1994)
Goodwin, P.: The Holt-Winters approach to exponential smoothing: 50 years old and going strong. In: Foresight, pp. 30–34 (2010)
Frank, R.J., Davey, N., Hunt, S.P.: Time series prediction and neural networks. J. Intell. Robot. Syst. 31(1–3), 91–103 (2001)
Hassoun, M.H.: Fundamentals of Artificial Neural Networks, 1st edn. MIT Press, Cambridge (1995)
Hill, T., O’Connor, M., Remus, W.: Neural network models for time series forecasts. Manag. Sci. 42(7), 1082–1092 (1996)
Demuth, H., Beale, M., Hagan, M.: Neural network toolbox\(^{TM}\) 6, User Guide
Stokely, M., Mehrabian, A., Albrecht, C., Labelle, F., Merchant, A.: Projecting disk usage based on historical trends in a cloud environment. In: Proceedings of the 3rd Workshop on ScienceCloud, pp. 63–70 (2012)
Ross, S.M.: Introduction to Probability and Statistics for Engineers and Scientists. Academic Press, Cambridge (2009)
Tijms, H.C.: A first course in stochastic models. Wiley, New York (2003)
Lim, H.C., Babu, S., Chase, J.S.: Automated control for elastic storage. In: Proceedings of the 7th ICAC. ACM, pp. 1–10 (2010)
Cucinotta, T., Checconi, F., Abeni, L., Palopoli, L.: Self-tuning schedulers for legacy real-time applications. In: Proceedings of the 5th EuroSys, pp. 55–68 (2010)
Ferrer, A.J., Hernández, F., Tordsson, J., Elmroth, E., Ali-Eldin, A., Zsigri, C., Sirvent, R., Guitart, J., Badia, R.M., Djemame, K., et al.: Optimis: a holistic approach to cloud service provisioning. Futur. Gener. Comput. Syst. 28, 66–77 (2012)
Singh, R., Shenoy, P., Natu, M., Sadaphal, V., Vin, H.: Analytical modeling for what-if analysis in complex cloud computing applications. ACM SIGMETRICS Perform. Eval. Rev. 40(4), 53–62 (2013)
Zhang, Q., Cherkasova, L., Smirni, E.: A regression-based analytic model for dynamic resource provisioning of multi-tier applications. In: Proceedings of the 4th ICAC, pp. 27–36 (2007)
Yan, F., Riska, A., Smirni, E.: Busy bee: how to use traffic information for better scheduling of background tasks. In: Proceedings of the 3rd ACM/SPEC International Conference on Performance Engineering, pp. 145–156 (2012)
Cortez, P., Rio, M., Rocha, M., Sousa, P.: Multi-scale internet traffic forecasting using neural networks and time series methods. Expert Syst. 29(2), 143–155 (2012)
Li, J., Moore, A.W.: Forecasting web page views: methods and observations. J. Mach. Learn. Res. 9(10), 2217–2250 (2008)
Couceiro, M., Romano, P., Rodrigues, L.: A machine learning approach to performance prediction of total order broadcast protocols. In: 4th IEEE SASO, pp. 184–193 (2010)
Didona, D., Quaglia, F., Romano, P., Torre, E.: Enhancing performance prediction robustness by combining analytical modeling and machine learning. In: Proceedings of the 6th ACM/SPEC ICPE, pp. 145–156 (2015)
Acknowledgments
This work is supported by NSF Grant CCF-1218758.
About this article
Cite this article
Xue, J., Yan, F., Riska, A. et al. Scheduling data analytics work with performance guarantees: queuing and machine learning models in synergy. Cluster Comput 19, 849–864 (2016). https://doi.org/10.1007/s10586-016-0563-z
DOI: https://doi.org/10.1007/s10586-016-0563-z