A machine learning approach to classifying YouTube QoE based on encrypted network traffic

Multimedia Tools and Applications

Abstract

Due to the widespread use of encryption in Over-The-Top video streaming traffic, network operators generally lack insight into application-level quality indicators (e.g., video quality levels, buffer underruns, stalling duration). They are thus faced with the challenge of finding solutions for monitoring service performance and estimating customer Quality of Experience (QoE) degradations based solely on passive monitoring solutions deployed within their network. We address this challenge by considering the concrete case of YouTube, whereby we present a methodology for the classification of end users’ QoE when watching YouTube videos, based only on statistical properties of encrypted network traffic. We have developed a system called YouQ which includes tools for monitoring and analysis of application-level quality indicators and corresponding traffic traces. Collected data is then used for the development of machine learning models for QoE classification based on computed traffic features per video session. To test the YouQ system and methodology, we collected a dataset corresponding to 1060 different YouTube videos streamed across 39 different bandwidth scenarios, and tested various classification models. Classification accuracy was found to be up to 84% when using three QoE classes (“low”, “medium” or “high”) and up to 91% when using binary classification (classes “low” and “high”). To improve the models in the future, we discuss why and when prediction errors occur. Moreover, we have analysed YouTube’s adaptation algorithm, thus providing valuable insight into the logic behind the quality level selection strategy, which may also be of interest in improving future QoE estimation algorithms.

Acknowledgements

This work has been conducted in the scope of the project “Survey and analysis of monitoring solutions for YouTube network traffic and application layer KPIs” funded by Ericsson Nikola Tesla, Croatia. This work has also been supported in part by the Croatian Science Foundation under the project UIP-2014-09-5605 (Q-MANIC).

Author information

Corresponding author

Correspondence to Irena Orsolic.

Appendix A: Network traffic features

The table below explains all the extracted features. The features are calculated for the network traffic captured during the watch time of each video.

| Feature name | Feature description |
| --- | --- |
| avgPacketSize | Average packet size [bytes] |
| averageSizeThroughTime | Average of sizes of transferred data per 5 s interval [bytes] |
| minimalSizeThroughTime | Minimum of sizes of transferred data per 5 s interval [bytes] |
| maximalSizeThroughTime | Maximum of sizes of transferred data per 5 s interval [bytes] |
| sizeThroughTimeStdDev | Standard deviation of sizes of transferred data per 5 s interval [bytes] |
| sizeThroughTimeMedian | Median of sizes of transferred data per 5 s interval [bytes] |
| averageInterarrivalTime | Average packet interarrival time [s] |
| minimalInterarrivalTime | Minimal packet interarrival time [s] |
| maximalInterarrivalTime | Maximal packet interarrival time [s] |
| avgInterarrivalTimeThroughTime | Average of interarrival time averages per 5 s interval [s] |
| interarrivalTimeThroughTimeStdDev | Standard deviation of interarrival time averages per 5 s interval [s] |
| interarrivalTimeThroughTimeMedian | Median of interarrival time averages per 5 s interval [s] |
| effectiveThroughput | Average of average throughput values calculated per 5 s interval, including only those intervals where throughput per interval was higher than 0.3 Mbps [Mbps] |
| minThroughputThroughTime | Minimum of average throughputs per 5 s interval [Mbps] |
| maxThroughputThroughTime | Maximum of average throughputs per 5 s interval [Mbps] |
| throughputStdDev | Standard deviation of average throughputs per 5 s interval [Mbps] |
| throughputMedian | Median of average throughputs per 5 s interval [Mbps] |
| initialThroughput2 | Throughput in first 2 seconds [Mbps] |
| initialThroughput3 | Throughput in first 3 seconds [Mbps] |
| initialThroughput5 | Throughput in first 5 seconds [Mbps] |
| initialThroughput10 | Throughput in first 10 seconds [Mbps] |
| dupack | Number of duplicate acknowledgements |
| dupackOverAll | Ratio of duplicate acknowledgements |
| retransmission | Number of retransmissions |
| retransmissionOverAll | Retransmission ratio |
| ackLostSegment | Number of packets that acknowledge lost segment |
| ackLostSegmentOverAll | Ratio of packets that acknowledge lost segment |
| push | Number of packets with TCP flag push set |
| pushOverAll | Ratio of packets with TCP flag push set |
| reset | Number of packets with TCP flag reset set |
| resetOverAll | Ratio of packets with TCP flag reset set |
| numberOfFlows | Number of TCP flows established |
| numberOfServers | Number of contacted servers |
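
To make the interval-based definitions above concrete, the following is a minimal sketch of how the size and throughput features could be computed for one video session. It is not the YouQ implementation. It assumes that packets are available as (timestamp in seconds, size in bytes) pairs sorted by time, that intervals are counted from the first packet, that intervals without traffic count as 0 bytes, that 1 Mbps equals 10^6 bit/s, and that the standard deviation is the population form. The function names `interval_bytes`, `throughput_features` and `initial_throughput` are hypothetical.

```python
from statistics import mean, median, pstdev

INTERVAL_S = 5.0  # interval length used in the feature table
MIN_MBPS = 0.3    # threshold from the effectiveThroughput description


def interval_bytes(packets, interval_s=INTERVAL_S):
    """Sum packet sizes per fixed-length interval, counted from the first packet.

    Assumption: intervals in which no packet was seen are kept as 0 bytes.
    """
    if not packets:
        raise ValueError("at least one packet is required")
    start = packets[0][0]
    n_bins = int((packets[-1][0] - start) // interval_s) + 1
    bins = [0] * n_bins
    for ts, size in packets:
        bins[int((ts - start) // interval_s)] += size
    return bins


def throughput_features(packets, interval_s=INTERVAL_S):
    """Compute the per-interval size and throughput features of one session."""
    sizes = interval_bytes(packets, interval_s)
    # Average throughput of each interval in Mbps (assumed 10^6 bit/s).
    mbps = [b * 8 / interval_s / 1e6 for b in sizes]
    # effectiveThroughput only averages intervals above the 0.3 Mbps threshold.
    active = [t for t in mbps if t > MIN_MBPS]
    return {
        "averageSizeThroughTime": mean(sizes),
        "minimalSizeThroughTime": min(sizes),
        "maximalSizeThroughTime": max(sizes),
        "sizeThroughTimeStdDev": pstdev(sizes),  # population form (assumption)
        "sizeThroughTimeMedian": median(sizes),
        "effectiveThroughput": mean(active) if active else 0.0,
        "minThroughputThroughTime": min(mbps),
        "maxThroughputThroughTime": max(mbps),
        "throughputStdDev": pstdev(mbps),
        "throughputMedian": median(mbps),
    }


def initial_throughput(packets, seconds):
    """Throughput in Mbps over the first `seconds` seconds of the session."""
    start = packets[0][0]
    total = sum(size for ts, size in packets if ts - start < seconds)
    return total * 8 / seconds / 1e6


if __name__ == "__main__":
    # Hypothetical session: two large packets, a quiet interval, then a burst.
    demo = [(0.0, 1_000_000), (1.0, 1_000_000), (6.0, 100_000), (11.0, 500_000)]
    for name, value in throughput_features(demo).items():
        print(f"{name}: {value}")
    print("initialThroughput2:", initial_throughput(demo, 2))
```

In the demo session the second interval falls below the 0.3 Mbps threshold, so it lowers minThroughputThroughTime and throughputMedian but is excluded from effectiveThroughput. This separates "how fast was data delivered while the player was downloading" from "how much of the session was idle".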

Cite this article

Orsolic, I., Pevec, D., Suznjevic, M. et al. A machine learning approach to classifying YouTube QoE based on encrypted network traffic. Multimed Tools Appl 76, 22267–22301 (2017). https://doi.org/10.1007/s11042-017-4728-4

  • DOI: https://doi.org/10.1007/s11042-017-4728-4
