A Survey of Link Mining Tasks for Analyzing Noisy and Incomplete Networks

Abstract

Many data sets of interest today are best described as networks or graphs of interlinked entities. Examples include Web and text collections, social networks and social media sites, information, transaction, and communication networks, and all manner of scientific networks, including biological networks. Unfortunately, the data collection and extraction process used to gather these network data sets is often imprecise, noisy, and/or incomplete. In this chapter, we review a collection of link mining algorithms that are well suited to analyzing and making inferences about networks, especially when the data is noisy or missing.

Acknowledgments

The work was supported by NSF Grant #0746930.

Corresponding author

Correspondence to Lise Getoor.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Namata, G.M., Sharara, H., Getoor, L. (2010). A Survey of Link Mining Tasks for Analyzing Noisy and Incomplete Networks. In: Yu, P., Han, J., Faloutsos, C. (eds) Link Mining: Models, Algorithms, and Applications. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-6515-8_4
