Abstract
Phone number recycling (PNR) is the practice in which a mobile operator reclaims a disconnected number and reassigns it to a new owner. It threatens the reliability of existing authentication solutions on e-commerce platforms. Specifically, the new owner of a reassigned number can access the application account with which the number is associated and may perform fraudulent activities. Existing solutions that rely on a reassigned number database from mobile operators are costly for e-commerce platforms with large user bases. Thus, alternative solutions that depend only on information available to the applications are imperative. In this work, we study the problem of detecting accounts that have been compromised owing to the reassignment of phone numbers. Our analysis of Meituan's real-world dataset shows that compromised accounts have unique statistical features and temporal patterns. Based on these observations, we propose a novel model called the temporal pattern and statistical feature fusion model (TSF), which integrates a temporal pattern encoder and a statistical feature encoder to capture behavioral evolutionary interaction and significant operation features. Extensive experiments on the Meituan and IEEE-CIS datasets show that TSF significantly outperforms the baselines, demonstrating its effectiveness in detecting accounts compromised through reassigned numbers.
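Abstract (translated from Chinese)
"Phone number recycling" refers to the practice in which mobile operators reclaim the numbers of deactivated phones and reassign them to new owners. This practice threatens the reliability of existing authentication solutions on e-commerce platforms. Specifically, the new owner of a reassigned number can use the application accounts previously bound to that number and may conduct fraudulent activities through them. For e-commerce platforms with large user bases, existing solutions based on mobile operators' reassigned-number databases are costly. Therefore, a solution that relies only on application information is urgently needed. This paper investigates the problem of detecting compromised accounts caused by phone number recycling. Analysis of a real-world Meituan dataset shows that compromised accounts have distinctive statistical features and temporal patterns. Based on these observations, we propose a temporal pattern and statistical feature fusion model (TSF). The model comprises a temporal pattern encoder and a statistical feature encoder, which aim to capture the temporal evolution patterns and key behavioral features that effectively distinguish normal accounts from abnormal ones. Extensive experiments on the Meituan and IEEE-CIS datasets show that TSF clearly outperforms the other baseline models, further validating the effectiveness of the proposed model.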
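The fusion idea summarized in the abstract can be illustrated with a minimal sketch. This is not the TSF architecture, whose encoders are not specified in this section; it only assumes that an account is described by a time-ordered list of operations, and it replaces the learned encoders with two hand-written stand-ins: an exponentially decayed operation count for the temporal view and three aggregate statistics for the statistical view. The operation names in `OPS`, the `decay` parameter, and all function names are hypothetical.

```python
import math

# Hypothetical operation vocabulary; the paper's real operation types are not listed here.
OPS = ("login", "change_password", "order", "payment")


def temporal_encoding(events, decay=0.5):
    """Stand-in temporal view: decayed count per operation, recent events weigh more.

    events: list of (timestamp_in_days, operation) tuples sorted by time.
    """
    vec = [0.0] * len(OPS)
    if not events:
        return vec
    t_last = events[-1][0]
    for t, op in events:
        vec[OPS.index(op)] += math.exp(-decay * (t_last - t))
    return vec


def statistical_features(events):
    """Stand-in statistical view: event count, mean inter-event gap, distinct-operation ratio."""
    if not events:
        return [0.0, 0.0, 0.0]
    gaps = [b[0] - a[0] for a, b in zip(events, events[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    distinct_ratio = len({op for _, op in events}) / len(OPS)
    return [float(len(events)), mean_gap, distinct_ratio]


def fuse(events):
    """Fuse the two views by simple concatenation (an assumption, not the paper's fusion)."""
    return temporal_encoding(events) + statistical_features(events)


def score(events, weights, bias=0.0):
    """Logistic score over the fused vector; weights would normally be learned."""
    z = bias + sum(w * x for w, x in zip(weights, fuse(events)))
    return 1.0 / (1.0 + math.exp(-z))


# Toy account: a long quiet period followed by a login and a password change.
events = [(0.0, "login"), (1.0, "order"), (10.0, "login"), (10.0, "change_password")]
fused = fuse(events)
```

Concatenation keeps the two views independent, so either stand-in can be swapped for a learned encoder without changing the scoring step. The weights passed to `score` are placeholders; with all-zero weights the score is exactly 0.5, which is a convenient sanity check.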
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Author information
Contributions
All the authors contributed to the study conception and design. Min GAO, Shutong CHEN, Yangbo GAO, Zhenhua ZHANG, Yu CHEN, and Yang CHEN proposed the motivation of the study. Min GAO, Shutong CHEN, and Qiongzan YE performed the experiments. Min GAO drafted the paper. All the authors commented on previous versions of the paper. Min GAO, Yupeng LI, Xin WANG, and Yang CHEN revised the paper. All the authors read and approved the final version of the paper.
Ethics declarations
All the authors declare that they have no conflict of interest.
Additional information
Project supported by the National Natural Science Foundation of China (Nos. 62072115, 62202402, 61971145, and 61602122), the Shanghai Science and Technology Innovation Action Plan Project (No. 22510713600), the Guangdong Basic and Applied Basic Research Foundation, China (Nos. 2022A1515011583 and 2023A1515011562), the One-off Tier 2 Start-up Grant (2020/2021) of Hong Kong Baptist University (Ref. RCOFSGT2/20-21/COMM/002), the Startup Grant (Tier 1) for New Academics AY2020/21 of Hong Kong Baptist University, the Germany/Hong Kong Joint Research Scheme sponsored by the Research Grants Council of Hong Kong, China, and the German Academic Exchange Service of Germany (No. G-HKBU203/22), and Meituan.
List of supplementary materials
1 IEEE-CIS dataset information
2 Experimental setup for baseline methods
3 Performance of TSF for the IEEE-CIS dataset
4 Description of statistical features for the Meituan dataset
Fig. S1 Precision–recall curves of TSF and baselines for the IEEE-CIS dataset
Fig. S2 Ablation study for the IEEE-CIS dataset
Table S1 Performances of TSF and baselines for the IEEE-CIS dataset
Table S2 Results of the ablation study for the IEEE-CIS dataset
Table S3 Statistical features of each account used in the statistical feature encoder of TSF
About this article
Cite this article
Gao, M., Chen, S., Gao, Y. et al. Detecting compromised accounts caused by phone number recycling on e-commerce platforms: taking Meituan as an example. Front Inform Technol Electron Eng 25, 1077–1095 (2024). https://doi.org/10.1631/FITEE.2300291
DOI: https://doi.org/10.1631/FITEE.2300291