DOI: 10.1145/3366424.3391264

Entity Resolution in Dynamic Heterogeneous Networks

Published: 20 April 2020

Abstract

Networks evolve continuously over time, not only through the addition and deletion of links and nodes but also through changes in the importance of edges. Even though many networks contain this type of temporal weighting, the vast majority of research in network representation learning and classification has focused on static snapshots of the graph, largely ignoring the temporal dynamics. In this work, we describe two approaches for incorporating weighted temporal information into network embedding methods such as Graph Convolutional Networks (GCNs). The first approach aggregates time-weighted edges and nodes; the second uses temporal random walks to find relevant convolution nodes. With experiments on public and proprietary datasets, we demonstrate the effectiveness of the proposed TimeSage method for link prediction tasks. By applying these predictions, we show improvements in our task of identifying fraudulent actors on a large e-commerce website selling software as subscriptions.
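The two ideas named in the abstract can be illustrated with a minimal sketch. This is not the TimeSage implementation: the toy edge list, the function names, the exponential decay weighting, and the rule that walk timestamps must strictly increase are all assumptions made here for illustration only.

```python
import math
import random
from collections import defaultdict

# Hypothetical toy data: (source, target, timestamp) edges.
EDGES = [
    ("a", "b", 1.0),
    ("b", "c", 2.0),
    ("a", "c", 5.0),
    ("c", "d", 6.0),
    ("b", "d", 0.5),
]


def build_adjacency(edges):
    """Undirected adjacency list: node -> list of (neighbor, timestamp)."""
    adj = defaultdict(list)
    for u, v, t in edges:
        adj[u].append((v, t))
        adj[v].append((u, t))
    return adj


def time_weighted_mean(node, adj, features, now, decay=0.5):
    """Average neighbor features, weighting each edge by exp(-decay * age).

    Exponential decay is an assumed weighting; recent edges count more.
    """
    dim = len(next(iter(features.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for nbr, t in adj[node]:
        w = math.exp(-decay * (now - t))
        weight_sum += w
        for i, x in enumerate(features[nbr]):
            total[i] += w * x
    if weight_sum == 0.0:
        return total  # isolated node: zero vector
    return [x / weight_sum for x in total]


def temporal_random_walk(start, adj, length, rng):
    """Random walk that only follows edges with strictly later timestamps.

    The walk stops early when no time-respecting edge is available.
    """
    walk = [start]
    current, last_t = start, float("-inf")
    for _ in range(length):
        candidates = [(n, t) for n, t in adj[current] if t > last_t]
        if not candidates:
            break
        current, last_t = rng.choice(candidates)
        walk.append(current)
    return walk
```

Calling `time_weighted_mean("a", build_adjacency(EDGES), features, now=5.0)` with a hypothetical `features` dictionary of per-node vectors gives an aggregate dominated by the most recent neighbor, while `temporal_random_walk` yields a node sequence that could serve as a candidate set of convolution neighbors.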

      Information & Contributors

      Information

      Published In

      WWW '20: Companion Proceedings of the Web Conference 2020
      April 2020
      854 pages
      ISBN:9781450370240
      DOI:10.1145/3366424

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. entity resolution
      2. graph representation
      3. neural networks

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      WWW '20
      Sponsor:
      WWW '20: The Web Conference 2020
      April 20 - 24, 2020
      Taipei, Taiwan

      Acceptance Rates

      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (last 12 months): 121
      • Downloads (last 6 weeks): 22
      Reflects downloads up to 02 Mar 2025.

      Citations

      Cited By

      • (2024) Enriching Relations with Additional Attributes for ER. Proceedings of the VLDB Endowment 17:11 (3109-3123). DOI: 10.14778/3681954.3681987. Online publication date: 30-Aug-2024
      • (2023) A relation-aware heterogeneous graph convolutional network for relationship prediction. Information Sciences 623 (311-323). DOI: 10.1016/j.ins.2022.12.059. Online publication date: Apr-2023
      • (2023) TE-DyGE: Temporal Evolution-Enhanced Dynamic Graph Embedding Network. Database Systems for Advanced Applications (183-198). DOI: 10.1007/978-3-031-30675-4_13. Online publication date: 17-Apr-2023
      • (2022) Entity Resolution in graph databases: comparison study. 2022 3rd International Conference on Embedded & Distributed Systems (EDiS) (111-116). DOI: 10.1109/EDiS57230.2022.9996480. Online publication date: 2-Nov-2022
