Abstract
As email workloads keep rising, email servers must handle this explosive growth while still offering good quality of service to users. In this work, we model the workload of the email servers of four universities (two in Greece, one in the UK, and one in Australia). We model all types of email traffic, including user and system emails as well as spam. We first tested some of the most popular distributions for workload characterization and used statistical tests to evaluate the results. The significant differences in prediction accuracy across the four datasets led us to investigate a Recurrent Neural Network (RNN) as a time series model of the server workload, which is a first for this problem. Our results show that RNN modeling achieves high modeling accuracy in most cases on all four campus email traffic datasets.
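To make the distribution-fitting step concrete, the sketch below fits one candidate distribution to a workload series and scores the fit with a goodness-of-fit statistic. It is illustrative only: the abstract does not say which distributions or tests were used, so the exponential distribution, the Kolmogorov-Smirnov distance, the synthetic data, and all function names here are assumptions rather than the authors' method.

```python
import math
import random


def fit_exponential_rate(samples):
    """Maximum-likelihood rate of an exponential distribution: 1 / sample mean."""
    return len(samples) / sum(samples)


def exponential_cdf(rate):
    """Return the CDF of an exponential distribution with the given rate."""
    return lambda x: 1.0 - math.exp(-rate * x)


def ks_statistic(samples, cdf):
    """Largest gap between the empirical CDF of the samples and a model CDF."""
    ordered = sorted(samples)
    n = len(ordered)
    largest_gap = 0.0
    for i, x in enumerate(ordered):
        model = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        largest_gap = max(largest_gap, (i + 1) / n - model, model - i / n)
    return largest_gap


# Synthetic stand-in for a measured series (assumption: not the paper's data).
rng = random.Random(0)
samples = [rng.expovariate(2.0) for _ in range(2000)]

rate = fit_exponential_rate(samples)
d_fitted = ks_statistic(samples, exponential_cdf(rate))
# A deliberately mismatched model, to show how the distance reacts to a bad fit.
d_wrong = ks_statistic(samples, exponential_cdf(10.0 * rate))

print(f"fitted rate: {rate:.3f}")
print(f"KS distance, fitted model: {d_fitted:.4f}")
print(f"KS distance, mismatched model: {d_wrong:.4f}")
```

A smaller distance means the candidate distribution tracks the data more closely. Comparing such distances across datasets is one way the fit of a single distribution family could turn out to differ from one campus to another.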
Acknowledgements
We would like to sincerely thank Mr. Panagiotis Kontogiannis, Head of the Educational Computational Infrastructure at the Technical University of Crete; Mr. Martin Connell, Senior Systems Engineer at LJMU; and Mr. Mario Pinelli, Manager of Computer Services and IT at Murdoch University. Without their help in collecting the datasets, this research would not have been possible.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Boukoros, S., Nugaliyadde, A., Marnerides, A., Vassilakis, C., Koutsakis, P., Wong, K.W. (2017). Modeling Server Workloads for Campus Email Traffic Using Recurrent Neural Networks. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10638. Springer, Cham. https://doi.org/10.1007/978-3-319-70139-4_6
DOI: https://doi.org/10.1007/978-3-319-70139-4_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70138-7
Online ISBN: 978-3-319-70139-4
eBook Packages: Computer Science, Computer Science (R0)