IP2vec: an IP node representation model for IP geolocation

  • Research Article
  • Frontiers of Computer Science

Abstract

IP geolocation is essential for the territorial analysis of sensitive network entities, location-based services (LBS), and network fraud detection. It has both theoretical significance and practical value, and measurement-based IP geolocation is an active research topic. However, existing IP geolocation algorithms cannot effectively exploit either the distance information carried by delay or the connection relations between nodes, which results in high geolocation error. The difficulty lies in obtaining the mapping between delay, node connection relations, and geographical location. Drawing on network representation learning, we propose a representation learning model for IP nodes (IP2vec for short) and apply it to street-level IP geolocation. The IP2vec model vectorizes nodes according to the connection relations and delay between them, so that the IP vectors reflect the distance and topological proximity between IP nodes. The street-level IP geolocation algorithm based on the IP2vec model proceeds in four steps. First, we measure the landmarks and the target IP to obtain delay and path information, from which we construct the network topology. Second, we use the IP2vec model to obtain IP vectors from the network topology. Third, we train a neural network to fit the mapping between the vectors and the locations of the landmarks. Finally, we feed the vector of the target IP into the neural network to obtain the geographical location of the target IP. The algorithm can accurately infer the geographical locations of target IPs from the delay and topological proximity embedded in the IP vectors. Cross-validation experiments on 10023 target IPs in New York, Beijing, Hong Kong, and Zhengzhou demonstrate that the proposed algorithm achieves street-level geolocation. Compared with the existing algorithms Hop-Hot, IP-geolocater, and SLG, the proposed algorithm reduces the mean geolocation error by 33%, 39%, and 51%, respectively.
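The abstract does not describe the internals of IP2vec, so the following is only an illustrative sketch of the first two steps under stated assumptions. It assumes that each measurement is a traceroute-like path of (IP, RTT) hops, that the delay of a link can be approximated by the RTT difference between adjacent hops, and that node sequences are sampled by random walks that prefer low-delay links, in the spirit of node2vec-style methods. The function names, the `eps` parameter, and the sample addresses are hypothetical.

```python
import random
from collections import defaultdict


def build_topology(paths):
    """Build an undirected graph from traceroute-like paths.

    Each path is a list of (ip, rtt_ms) hops. The edge weight is the
    absolute RTT difference between adjacent hops (an assumed delay proxy,
    not necessarily what the paper uses).
    """
    graph = defaultdict(dict)
    for path in paths:
        for (a, rtt_a), (b, rtt_b) in zip(path, path[1:]):
            delay = abs(rtt_b - rtt_a)
            # Keep the smallest delay observed for a link across measurements.
            best = min(delay, graph[a].get(b, float("inf")))
            graph[a][b] = graph[b][a] = best
    return graph


def delay_biased_walk(graph, start, length, rng, eps=0.1):
    """Random walk that prefers low-delay neighbours.

    The transition weight 1 / (delay + eps) is an assumption made for
    illustration; eps avoids division by zero on zero-delay links.
    """
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbours = list(graph[current])
        if not neighbours:
            break
        weights = [1.0 / (graph[current][n] + eps) for n in neighbours]
        walk.append(rng.choices(neighbours, weights=weights, k=1)[0])
    return walk


# Hypothetical measurements: (hop IP, RTT in ms) as seen from one probe.
paths = [
    [("10.0.0.1", 1.0), ("10.0.1.1", 3.0), ("198.51.100.7", 3.4)],  # landmark
    [("10.0.0.1", 1.1), ("10.0.1.1", 3.1), ("198.51.100.9", 3.3)],  # target
]
graph = build_topology(paths)
rng = random.Random(0)  # fixed seed so the sampled walks are reproducible
walks = [delay_biased_walk(graph, node, 5, rng) for node in list(graph)]
```

In a fuller pipeline, walks like these could be passed to a skip-gram model to produce node vectors, and a regressor trained on landmark vectors and coordinates could then predict the target's location. Those stages are omitted here because the abstract gives no detail about them.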

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. U1804263, U1736214, 62172435) and the Zhongyuan Science and Technology Innovation Leading Talent Project (No. 214200510019).

Author information

Corresponding author

Correspondence to Meijuan Yin.

Ethics declarations

Competing interests: The authors declare that they have no competing interests or financial conflicts to disclose.

Additional information

Fan Zhang received his BS degree from Xiangtan University, China in 2017 and his MS degree from the State Key Laboratory of Mathematical Engineering and Advanced Computing, China in 2020. He has been with the State Key Laboratory of Mathematical Engineering and Advanced Computing since July 2017. His research interests include network security, network measurement, and network geolocation. He has received support from the National Natural Science Foundation of China and the Basic and Frontier Technology Research Program of Henan Province.

Meijuan Yin received her PhD degree in computer application from the State Key Laboratory of Mathematical Engineering and Advanced Computing, China in 2012. She is now an associate professor at the laboratory. Her current research interests include data mining, social network analysis, and information security.

Fenlin Liu received the BS degree from the Zhengzhou Science and Technology Institute, China in 1986, the MS degree from the Harbin Institute of Technology, China in 1992, and the PhD degree from Northeast University, China in 1998. He is currently a professor with the Zhengzhou Science and Technology Institute, China. He has authored or co-authored more than 90 refereed international journal and conference papers. His research interests include network topology and network geolocation.

Xiangyang Luo is currently a professor at Zhengzhou Science and Technology Institute and the State Key Laboratory of Mathematical Engineering and Advanced Computing, China. His research interests lie in multimedia security and cyberspace surveying and mapping. He is the author or co-author of more than 100 refereed international journal and conference papers. He has obtained the support of the National Natural Science Foundation of China and the National Key R&D Program of China.

Shuodi Zu received his BS and MS degrees from the State Key Laboratory of Mathematical Engineering and Advanced Computing, China in 2016. He has been with the State Key Laboratory of Mathematical Engineering and Advanced Computing since July 2012. His research interests include network security, network measurement, and network geolocation. He has received support from the National Natural Science Foundation of China and the Basic and Frontier Technology Research Program of Henan Province.

Cite this article

Zhang, F., Yin, M., Liu, F. et al. IP2vec: an IP node representation model for IP geolocation. Front. Comput. Sci. 18, 186506 (2024). https://doi.org/10.1007/s11704-023-2616-9

  • DOI: https://doi.org/10.1007/s11704-023-2616-9
