DOI: 10.1145/3440943.3444717
research-article

Differential Privacy Protection with Group Onion Routing based on AI-based URL Classification

Published: 27 September 2021

Abstract

With the rapid spread of tablet computers, smartphones, and other mobile devices, wireless communication technology has matured and been deployed widely. Mobile Internet access has become a primary means of everyday communication, and its privacy protection has attracted considerable attention. The Onion Router, better known as Tor, is a technique for anonymous communication over the Internet that protects privacy and is not bound by regional restrictions. In addition, Tor can reach sites that ordinary search engines cannot find. In Tor's design, transmitted data is encrypted layer by layer, like an onion, before it reaches the server. We propose a system that uses machine learning to predict a URL's category before the URL is visited. Based on the predicted category, the system represents three privacy levels with three RSA key lengths in onion routing. Depending on the situation, the system can therefore balance security against time cost, which makes onion routing more flexible and efficient.
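The idea of predicting a URL's category and then choosing an RSA key length can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the classifier, the features, the categories, or the key lengths, so the character-trigram naive Bayes model, the three category names, the training URLs, and the 1024/2048/4096-bit sizes below are all assumptions chosen for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical mapping: the abstract says only that three RSA key lengths
# represent three privacy levels. Categories and sizes here are assumptions.
KEY_BITS_BY_CATEGORY = {
    "news": 1024,      # low sensitivity: favour speed
    "shopping": 2048,  # medium sensitivity
    "health": 4096,    # high sensitivity: favour security
}

# Made-up training URLs, for illustration only.
TRAINING = [
    ("https://example.com/news/world-headlines", "news"),
    ("https://example.org/daily-news/politics", "news"),
    ("https://example.com/shop/cart/checkout", "shopping"),
    ("https://example.org/store/shopping-deals", "shopping"),
    ("https://example.com/health/clinic-records", "health"),
    ("https://example.org/patient/health-portal", "health"),
]


def trigrams(url):
    """Split a lowercased URL into overlapping character 3-grams."""
    text = url.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)]


class UrlNaiveBayes:
    """Tiny multinomial naive Bayes over URL trigrams, add-one smoothing."""

    def __init__(self):
        self.gram_counts = defaultdict(Counter)  # category -> trigram counts
        self.doc_counts = Counter()              # category -> number of URLs
        self.vocab = set()                       # all trigrams seen in training

    def fit(self, samples):
        for url, category in samples:
            grams = trigrams(url)
            self.gram_counts[category].update(grams)
            self.doc_counts[category] += 1
            self.vocab.update(grams)

    def predict(self, url):
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for category, counts in self.gram_counts.items():
            # Log prior plus smoothed log likelihood of each trigram.
            score = math.log(self.doc_counts[category] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for gram in trigrams(url):
                score += math.log((counts[gram] + 1) / denom)
            if score > best_score:
                best, best_score = category, score
        return best


def key_bits_for(url, model):
    """Return the RSA key length assigned to the URL's predicted category."""
    return KEY_BITS_BY_CATEGORY[model.predict(url)]


model = UrlNaiveBayes()
model.fit(TRAINING)
print(key_bits_for("https://example.net/health/patient", model))
```

Keeping the category-to-key-length table separate from the classifier means the security and time-cost trade-off can be tuned without retraining the model.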

Cited By

  • (2022) Intelligent Garlic Routing for Securing Data Exchange in V2X Communication. 2022 IEEE Globecom Workshops (GC Wkshps), 286-291. DOI: 10.1109/GCWkshps56602.2022.10008525. Online publication date: 4-Dec-2022

    Published In

    ACM ICEA '20: Proceedings of the 2020 ACM International Conference on Intelligent Computing and its Emerging Applications
    December 2020
    219 pages
    ISBN:9781450383042
    DOI:10.1145/3440943

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Anonymous transmission
    2. Machine learning
    3. Privacy
    4. Tor
    5. URL classification

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • Ministry of Science and Technology, Taiwan

    Conference

    ACM ICEA '20