DOI: 10.1145/3605763.3625274
research-article

Entangled Clouds: Measuring the Hosting Infrastructure of the Free Contents Web

Published: 26 November 2023

ABSTRACT

Free content websites (FCWs) are a critical part of the Internet, and understanding them is essential given their wide use. This study statistically explores the global distribution of FCWs by analyzing their hosting network scale, cloud service provider, and country-level distribution, both in aggregate and per content category, and by contrasting these measurements with the characteristics of premium content websites (PCWs). Our study further contrasts the distribution of these websites with that of general websites sampled from the Alexa top-1M websites and explores their security attributes using various security indicators.

We found that FCWs and PCWs are hosted mainly in medium-scale networks, a scale that is shown to be associated with a high concentration of malicious websites. Moreover, FCWs' cloud- and country-level distributions are shown to be heavy-tailed, although with patterns distinct from those of PCWs. Our study contributes to understanding the FCW ecosystem through various quantitative analyses. The results highlight the possibility of containing the harm of FCWs, when malicious, through effective isolation and filtering, thanks to their network, cloud, and country-level concentration.

    • Published in

      CCSW '23: Proceedings of the 2023 on Cloud Computing Security Workshop
      November 2023
      95 pages
      ISBN: 9798400702594
      DOI: 10.1145/3605763
      Program Chairs: Francesco Regazzoni, Apostolos Fournaris

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 37 of 108 submissions, 34%
