Lagrangian relaxations for multiple network alignment

Data Mining and Knowledge Discovery

Abstract

We propose a principled approach to the problem of aligning multiple partially overlapping networks. The objective is to map multiple graphs into a single graph while preserving vertex and edge similarities. The problem is inspired by the task of integrating partial views of a family tree (genealogical network) into one unified network, but it also has applications in, for example, social and biological networks. Our approach, called Flan, generalizes the facility location problem by adding a non-linear term to capture edge similarities and to infer the underlying entity network. The problem is solved using an alternating optimization procedure with a Lagrangian relaxation. Flan can leverage prior information on the number of entities: when this information is available, Flan is shown to work robustly without needing any ground truth data to fine-tune method parameters. Additionally, we present three multiple-network extensions to Natalie, an existing state-of-the-art pairwise alignment method. Extensive experiments on synthetic and real-world datasets, covering social and genealogical networks, attest to the effectiveness of the proposed approaches, which clearly outperform IsoRankN, a popular multiple network alignment method.
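
To make the Lagrangian-relaxation ingredient concrete, the following is a minimal sketch for the classical uncapacitated facility location problem. It is not Flan itself: it omits the non-linear edge term and the alternating optimization, and the function name `lagrangian_ufl`, its parameters, and the step-size rule are illustrative assumptions rather than the authors' implementation. The constraint that every client is assigned exactly once is relaxed with multipliers, the relaxed problem is solved independently per facility, and a simple rounding step yields a feasible upper bound.

```python
def lagrangian_ufl(open_cost, assign_cost, iterations=200, step0=1.0):
    """Subgradient Lagrangian relaxation for plain facility location.

    open_cost[i]      : cost of opening facility i
    assign_cost[i][j] : cost of assigning client j to facility i
    Returns (lower_bound, upper_bound, assignment), where assignment[j]
    is the facility serving client j in the best feasible solution found.
    All names here are hypothetical, chosen for illustration only.
    """
    n_fac, n_cli = len(open_cost), len(assign_cost[0])
    # One multiplier per relaxed constraint "client j is assigned exactly once".
    lam = [min(assign_cost[i][j] for i in range(n_fac)) for j in range(n_cli)]
    best_lb, best_ub, best_assign = float("-inf"), float("inf"), None

    for _ in range(iterations):
        # Relaxed problem: facility i opens iff its reduced value is negative.
        values, opened, x_count = [], [], [0] * n_cli
        for i in range(n_fac):
            reduced = [min(0.0, assign_cost[i][j] - lam[j]) for j in range(n_cli)]
            value = open_cost[i] + sum(reduced)
            values.append(value)
            if value < 0:
                opened.append(i)
                for j in range(n_cli):
                    if reduced[j] < 0:
                        x_count[j] += 1
        lb = sum(lam) + sum(v for v in values if v < 0)
        best_lb = max(best_lb, lb)

        # Feasibility heuristic: serve each client from the cheapest opened
        # facility (or the single most attractive one if none opened).
        candidates = opened or [min(range(n_fac), key=values.__getitem__)]
        assign = [min(candidates, key=lambda i: assign_cost[i][j])
                  for j in range(n_cli)]
        ub = (sum(open_cost[i] for i in set(assign))
              + sum(assign_cost[assign[j]][j] for j in range(n_cli)))
        if ub < best_ub:
            best_ub, best_assign = ub, assign

        # Subgradient of the dual: 1 minus the number of times j was assigned.
        grad = [1 - c for c in x_count]
        norm = sum(g * g for g in grad)
        if norm == 0 or best_ub - best_lb < 1e-9:
            break
        step = step0 * (best_ub - lb) / norm  # Polyak-style step (an assumption)
        lam = [l + step * g for l, g in zip(lam, grad)]

    return best_lb, best_ub, best_assign


# Toy instance: two facilities, three clients.
lb, ub, assign = lagrangian_ufl([3, 3], [[1, 1, 5], [5, 1, 1]])
print(lb, ub, assign)
```

The difference between the returned upper and lower bounds is the duality gap; a small gap certifies that the feasible solution is close to optimal.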

Notes

  1. The code is available at: https://github.com/ekQ/flan.

  2. As in the case of Natalie, we assume an ordering of the graphs and consider aligning vertex i either with itself or with any vertex from graphs \(g'=1,\ldots ,g-1\). In other words, we avoid simultaneously considering vertex i as an entity for vertex j and vertex j as an entity for i, which we have observed to result in larger duality gaps.

  3. The implementation of the feasibility heuristics is available at: https://github.com/ekQ/flan.

  4. For simplicity, we write “\(\min _{B} \text {objective}\)” although the objective is minimized only w.r.t. the elements \(B_{j\ell }\), where \((j,\ell ) \notin E_I\).

  5. The implementation is available at https://www.cs.purdue.edu/homes/dgleich/codes/netalign/ and has been used in Bayati et al. (2013) and Malmi et al. (2016).
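
The candidate restriction described in note 2 can be sketched as follows. This is an illustrative reading of the note, not code from the Flan repository: the function name `candidate_entities` and the `graph_of` mapping are hypothetical, and any additional similarity-based pruning of candidates is left out.

```python
def candidate_entities(graph_of):
    """Map each vertex to the entities it may be aligned with.

    graph_of[v] is the 1-based index g of the graph containing vertex v.
    A vertex may be aligned with itself or with any vertex from an
    earlier graph g' < g, so no two vertices are candidates for each other.
    """
    return {
        v: [v] + [u for u, gu in graph_of.items() if gu < g]
        for v, g in graph_of.items()
    }


cands = candidate_entities({"a1": 1, "b1": 1, "a2": 2, "a3": 3})
print(cands)
```

Because candidates only ever point to earlier graphs, the pair (i, j) and the pair (j, i) never appear together, which is the symmetry the note says leads to larger duality gaps.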

References

  • Althaus E, Canzar S (2008) A Lagrangian relaxation approach for the multiple sequence alignment problem. J Comb Optim 16(2):127–154

  • Barabási AL, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512

  • Bayati M, Gleich DF, Saberi A, Wang Y (2013) Message-passing algorithms for sparse network alignment. ACM Trans Knowl Discov Data 7(1):3

  • Bezdek JC, Hathaway RJ (2003) Convergence of alternating optimization. Neural Parallel Sci Comput 11(4):351–368

  • Bhattacharya I, Getoor L (2007) Collective entity resolution in relational data. ACM Trans Knowl Discov Data 1(1):5

  • Christen P (2012) Data matching: concepts and techniques for record linkage, entity resolution, and duplicate detection. Springer, Berlin

  • Christen P, Vatsalan D, Fu Z (2015) Advanced record linkage methods and privacy aspects for population reconstruction—a survey and case studies. In: Population reconstruction. Springer, pp 87–110

  • Clark C, Kalita J (2014) A comparison of algorithms for the pairwise alignment of biological networks. Bioinformatics 30(16):2351–2359

  • Conte D, Foggia P, Sansone C, Vento M (2004) Thirty years of graph matching in pattern recognition. Int J Pattern Recognit Artif Intell 18(3):265–298

  • Cornuejols G, Fisher ML, Nemhauser GL (1977) Location of bank accounts to optimize float: an analytic study of exact and approximate algorithms. Manag Sci 23(8):789–810

  • Efremova J, Ranjbar-Sahraei B, Rahmani H, Oliehoek FA, Calders T, Tuyls K, Weiss G (2015) Multi-source entity resolution for genealogical data. In: Population reconstruction. Springer, pp 129–154

  • El-Kebir M, Heringa J, Klau GW (2015) Natalie 2.0: sparse global network alignment as a special case of quadratic assignment. Algorithms 8(4):1035–1051

  • Elmsallati A, Clark C, Kalita J (2015) Global alignment of protein–protein interaction networks: a survey. IEEE/ACM Trans Comput Biol Bioinform PP(99):1-1. doi:10.1109/TCBB.2015.2474391

  • Fisher ML (1981) The Lagrangian relaxation method for solving integer programming problems. Manag Sci 27:1–18

  • Goga O, Loiseau P, Sommer R, Teixeira R, Gummadi KP (2015) On the reliability of profile matching across large online social networks. In: Proceedings of the 21st ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 1799–1808

  • Hochbaum DS (1982) Heuristics for the fixed cost median problem. Math Program 22(1):148–162

  • Hu J, Kehr B, Reinert K (2013) NetCoffee: a fast and accurate global alignment approach to identify functionally conserved proteins in multiple networks. Bioinformatics 30(4):540–548

  • Klau GW (2009) A new graph-based method for pairwise global network alignment. BMC Bioinform 10(Suppl 1):S59

  • Kouki P, Marcum C, Koehly L, Getoor L (2016) Entity resolution in familial networks. In: Proceedings of the 12th workshop on mining and learning with graphs

  • Liao CS, Lu K, Baym M, Singh R, Berger B (2009) IsoRankN: spectral methods for global alignment of multiple protein networks. Bioinformatics 25(12):i253–i258. doi:10.1093/bioinformatics/btp203

  • Magnani M, Micenkova B, Rossi L (2013) Combinatorial analysis of multiple networks. arXiv:1303.4986

  • Malmi E, Terzi E, Gionis A (2016) Active network alignment: a matching-based approach. arXiv:1610.05516

  • Sahraeian SME, Yoon BJ (2013) SMETANA: accurate and scalable algorithm for probabilistic alignment of large-scale biological networks. PLOS ONE 8(7):e67995

  • Shor NZ (2012) Minimization methods for non-differentiable functions, vol 3. Springer, New York

  • Singh R, Xu J, Berger B (2008) Global alignment of multiple protein interaction networks with application to functional orthology detection. Proc Natl Acad Sci 105(35):12763–12768

  • Singla P, Domingos P (2006) Entity resolution with Markov logic. In: Proceedings of the sixth international conference on data mining, ICDM’06. IEEE, pp 572–582

  • Vazirani VV (2001) Approximation algorithms. Springer, New York

  • Winkler WE (1990) String comparator metrics and enhanced decision rules in the Fellegi–Sunter model of record linkage. In: Proceedings of the section on survey research methods. American Statistical Association, pp 354–359

  • Zhai Y, Liu B (2005) Web data extraction based on partial tree alignment. In: Proceedings of the 14th international conference on world wide web. ACM, pp 76–85

  • Zhang J, Yu PS (2015) Multiple anonymized social networks alignment. In: Proceedings of the IEEE international conference on data mining, ICDM’15. IEEE

Acknowledgements

The authors are grateful to Pekka Valta and the Genealogical Society of Finland for providing the family tree dataset, to Jukka Suomela for useful discussions on Flan, to Gunnar W. Klau for his advice on extending Natalie to multiple networks, and to the anonymous reviewers for their constructive comments. This work was supported by Academy of Finland Project “Nestor” (286211).

Author information

Corresponding author

Correspondence to Eric Malmi.

Additional information

Responsible editors: Thomas Gärtner, Mirco Nanni, Andrea Passerini and Celine Robardet.

About this article

Cite this article

Malmi, E., Chawla, S. & Gionis, A. Lagrangian relaxations for multiple network alignment. Data Min Knowl Disc 31, 1331–1358 (2017). https://doi.org/10.1007/s10618-017-0505-2

  • DOI: https://doi.org/10.1007/s10618-017-0505-2
