DOI: 10.1145/3133956.3138825
poster

POSTER: A PU Learning based System for Potential Malicious URL Detection

Published: 30 October 2017

Abstract

This paper describes a system based on PU learning (Positive and Unlabeled learning) for detecting potential URL attacks. Previous machine-learning solutions for this task mainly formalize it as a supervised learning problem. In some scenarios, however, the available data contains only a handful of known attack URLs together with a large number of unlabeled instances, which makes supervised learning paradigms infeasible. In this work, we formalize this setting as a PU learning problem and solve it by combining two strategies: a two-stage strategy and a cost-sensitive strategy. Experimental results show that the developed system can effectively find potential URL attacks. The system can either be deployed alongside an existing system as an aid, or be used to help cyber-security engineers discover potential attack patterns so that they can improve the existing system with significantly less effort.
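
To make the two strategies named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes toy two-dimensional feature vectors in place of real URL features, picks "reliable negatives" in the two-stage model by distance from the centroid of the known attacks, weights errors on known attacks more heavily in the cost-sensitive model, and combines the two models by averaging their scores. The function names, the `keep` fraction, the cost values, and the averaging rule are all illustrative assumptions.

```python
import math

def score(theta, x):
    """Estimated probability that feature vector x is an attack."""
    z = theta[-1] + sum(t * v for t, v in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, w, epochs=300, lr=0.5):
    """Weighted logistic regression fitted by batch gradient descent."""
    theta = [0.0] * (len(X[0]) + 1)  # feature weights, then bias
    total = sum(w)
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for xi, yi, wi in zip(X, y, w):
            err = wi * (score(theta, xi) - yi)
            for j, v in enumerate(xi):
                grad[j] += err * v
            grad[-1] += err
        theta = [t - lr * g / total for t, g in zip(theta, grad)]
    return theta

def two_stage(P, U, keep=0.5):
    """Stage 1: treat the unlabeled points farthest from the positive
    centroid as reliable negatives (an assumed heuristic).
    Stage 2: train a classifier on positives vs. reliable negatives."""
    centroid = [sum(col) / len(P) for col in zip(*P)]
    sq_dist = lambda x: sum((a - b) ** 2 for a, b in zip(x, centroid))
    reliable_neg = sorted(U, key=sq_dist, reverse=True)[:int(len(U) * keep)]
    X = P + reliable_neg
    y = [1] * len(P) + [0] * len(reliable_neg)
    return train_logreg(X, y, [1.0] * len(X))

def cost_sensitive(P, U, c_pos=5.0, c_unl=1.0):
    """Treat every unlabeled point as negative, but make mistakes on
    known positives cost more than mistakes on unlabeled points."""
    X = P + U
    y = [1] * len(P) + [0] * len(U)
    w = [c_pos] * len(P) + [c_unl] * len(U)
    return train_logreg(X, y, w)

# Hypothetical data: three known attacks and eight unlabeled URLs,
# the last two of which resemble the known attacks.
P = [(0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
U = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.1, 0.1),
     (0.2, 0.2), (0.05, 0.05), (0.9, 0.9), (0.8, 0.8)]

models = [two_stage(P, U), cost_sensitive(P, U)]
combined = lambda x: sum(score(m, x) for m in models) / len(models)

# Unlabeled URLs ranked from most to least suspicious.
ranked = sorted(U, key=combined, reverse=True)
print(ranked[:2])
```

In this sketch the highest-ranked unlabeled instances are the candidates an engineer would review first, which matches the intended use described in the abstract.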

    Published In

    CCS '17: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security
    October 2017
    2682 pages
    ISBN:9781450349468
    DOI:10.1145/3133956
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. machine learning
    2. pu learning
    3. url attack detection

    Qualifiers

    • Poster

    Funding Sources

    • NSFC
    • Collaborative Innovation Center of Novel Software Technology and Industrialization

    Conference

    CCS '17
    Acceptance Rates

    CCS '17 Paper Acceptance Rate 151 of 836 submissions, 18%;
    Overall Acceptance Rate 1,261 of 6,999 submissions, 18%

    Cited By

    • (2024) Interdisciplinary Strategies for the Resurrection of Antibiotic Failures into Cutting-Edge Herbicides. International Journal of Advanced Research in Science, Communication and Technology, pp. 1-3. DOI: 10.48175/IJARSCT-18601. Online publication date: 30-May-2024
    • (2024) Detecting Malicious Websites From the Perspective of System Provenance Analysis. IEEE Transactions on Dependable and Secure Computing 21:3, pp. 1406-1423. DOI: 10.1109/TDSC.2023.3277613. Online publication date: May-2024
    • (2024) A PU‐learning based approach for cross‐site scripting attacking reality detection. IET Networks. DOI: 10.1049/ntw2.12123. Online publication date: 2-Apr-2024
    • (2023) Prediction of Proteins in Cerebrospinal Fluid and Application to Glioma Biomarker Identification. Molecules 28:8, article 3617. DOI: 10.3390/molecules28083617. Online publication date: 21-Apr-2023
    • (2023) Positive-Unlabeled Learning With Label Distribution Alignment. IEEE Transactions on Pattern Analysis and Machine Intelligence 45:12, pp. 15345-15363. DOI: 10.1109/TPAMI.2023.3319431. Online publication date: 26-Sep-2023
    • (2023) Exploring Global and Local Information for Anomaly Detection with Normal Samples. 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 3422-3427. DOI: 10.1109/SMC53992.2023.10394490. Online publication date: 1-Oct-2023
    • (2023) An Abnormal Event Classification Model for Big Data Platforms Based on Semi-supervised Learning. 2023 9th International Conference on Systems and Informatics (ICSAI), pp. 1-6. DOI: 10.1109/ICSAI61474.2023.10423326. Online publication date: 16-Dec-2023
    • (2023) Self-paced and Reweighting PU Learning for Imbalanced Malicious Traffic Detection. GLOBECOM 2023 - 2023 IEEE Global Communications Conference, pp. 6018-6023. DOI: 10.1109/GLOBECOM54140.2023.10437512. Online publication date: 4-Dec-2023
    • (2023) A Review of Data-Driven Approaches for Malicious Website Detection. 2023 7th Asian Conference on Artificial Intelligence Technology (ACAIT), pp. 75-82. DOI: 10.1109/ACAIT60137.2023.10528600. Online publication date: 10-Nov-2023
    • (2023) RoSAS. Information Processing and Management: an International Journal 60:5. DOI: 10.1016/j.ipm.2023.103459. Online publication date: 1-Sep-2023
