DOI: 10.1145/2517312.2517322
research-article

Early security classification of Skype users via machine learning

Published: 04 November 2013

Abstract

We investigate possible improvements in online fraud detection based on information about users and their interactions. We develop, apply, and evaluate our methods in the context of Skype. Specifically, we aim to provide tools that identify fraudsters who have eluded the first line of detection systems and have been active for months. Our approach to automation is based on machine learning methods. We rely on a variety of features present in the data, including static user profiles (e.g., age), dynamic product usage (e.g., time series of calls), local social behavior (e.g., addition/deletion of friends), and global social features (e.g., PageRank). We introduce new techniques for pre-processing the dynamic (time series) features and fusing them with social features. We provide a thorough analysis of the usefulness of the different categories of features and of the effectiveness of our new techniques.
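
The four feature categories named in the abstract can be illustrated with a small sketch. This is not the paper's pipeline: the abstract does not describe the actual pre-processing or fusion techniques, so the summary statistics, the plain concatenation, and every name below (`pagerank`, `summarize_series`, `feature_vector`, the toy contact graph) are assumptions chosen only to show how static, dynamic, local social, and global social features might end up in one vector.

```python
from typing import Dict, List


def pagerank(graph: Dict[str, List[str]], damping: float = 0.85,
             iters: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank over a directed contact graph."""
    nodes = sorted(set(graph) | {v for nbrs in graph.values() for v in nbrs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        # Rank held by nodes with no outgoing edges is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            out = graph.get(u, [])
            for v in out:
                new[v] += damping * rank[u] / len(out)
        rank = new
    return rank


def summarize_series(calls_per_day: List[int]) -> List[float]:
    """Collapse a usage time series into fixed-length summary statistics.

    Assumption: mean, variance, and fraction of active days stand in for
    the paper's (unspecified) time-series pre-processing.
    """
    n = len(calls_per_day)
    mean = sum(calls_per_day) / n
    var = sum((c - mean) ** 2 for c in calls_per_day) / n
    active = sum(1 for c in calls_per_day if c > 0) / n
    return [mean, var, active]


def feature_vector(user: str, age: int, calls_per_day: List[int],
                   friends_added: int, friends_deleted: int,
                   ranks: Dict[str, float]) -> List[float]:
    """Concatenate static, dynamic, local social, and global social features."""
    static = [float(age)]
    dynamic = summarize_series(calls_per_day)
    local_social = [float(friends_added), float(friends_deleted)]
    global_social = [ranks[user]]
    return static + dynamic + local_social + global_social


# Hypothetical toy data: who has added whom as a contact.
contacts = {
    "alice": ["bob", "carol"],
    "bob": ["alice"],
    "carol": ["alice"],
    "dave": ["alice", "bob", "carol"],
}
ranks = pagerank(contacts)
vec = feature_vector("dave", 31, [0, 12, 15, 0, 20], 3, 1, ranks)
```

A vector like `vec` could then be passed to any supervised classifier. In this toy graph nobody has added `dave` as a contact, so his PageRank is the lowest, which is the kind of global signal a purely per-user profile would miss.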

Published In

AISec '13: Proceedings of the 2013 ACM workshop on Artificial intelligence and security
November 2013
116 pages
ISBN:9781450324885
DOI:10.1145/2517312

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. fraud detection
  2. machine learning
  3. social graph

Qualifiers

  • Research-article

Conference

CCS'13
Acceptance Rates

AISec '13 Paper Acceptance Rate 10 of 17 submissions, 59%;
Overall Acceptance Rate 94 of 231 submissions, 41%

Cited By

  • (2020) Who's calling? characterizing robocalls through audio and metadata analysis. Proceedings of the 29th USENIX Conference on Security Symposium, pp. 397-414. DOI: 10.5555/3489212.3489235. Online publication date: 12-Aug-2020.
  • (2020) Scalable and Imbalance-Resistant Machine Learning Models for Anti-money Laundering: A Two-Layered Approach. Enterprise Applications, Markets and Services in the Finance Industry, pp. 43-58. DOI: 10.1007/978-3-030-64466-6_3. Online publication date: 26-Nov-2020.
  • (2019) Detecting Fake Accounts in Online Social Networks at the Time of Registrations. Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp. 1423-1438. DOI: 10.1145/3319535.3363198. Online publication date: 6-Nov-2019.
  • (2018) Detecting telecommunication fraud by understanding the contents of a call. Cybersecurity 1:1. DOI: 10.1186/s42400-018-0008-5. Online publication date: 31-Aug-2018.
  • (2018) A Machine Learning Approach to Prevent Malicious Calls over Telephony Networks. 2018 IEEE Symposium on Security and Privacy (SP), pp. 53-69. DOI: 10.1109/SP.2018.00034. Online publication date: May-2018.
  • (2017) A Risk-Scoring Feedback Model for Webpages and Web Users Based on Browsing Behavior. ACM Transactions on Intelligent Systems and Technology 8:4, pp. 1-21. DOI: 10.1145/2928274. Online publication date: 6-May-2017.
  • (2017) A novel approach to detect advertising spammer in micro-blog. 2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), pp. 1809-1813. DOI: 10.1109/IAEAC.2017.8054325. Online publication date: Mar-2017.
  • (2016) Strengthening Weak Identities Through Inter-Domain Trust Transfer. Proceedings of the 25th International Conference on World Wide Web, pp. 1249-1259. DOI: 10.1145/2872427.2883015. Online publication date: 11-Apr-2016.
  • (2015) A survey of user classification in social networks. 2015 6th IEEE International Conference on Software Engineering and Service Science (ICSESS), pp. 1038-1041. DOI: 10.1109/ICSESS.2015.7339230. Online publication date: Sep-2015.
  • (2015) Complex Symbolic Sequence Encodings for Predictive Monitoring of Business Processes. Proceedings of the 13th International Conference on Business Process Management - Volume 9253, pp. 297-313. DOI: 10.1007/978-3-319-23063-4_21. Online publication date: 31-Aug-2015.
