Beyond rankings: comparing directed acyclic graphs

Malmi, Eric; Tatti, Nikolaj; Gionis, Aristides

doi:10.1007/s10618-015-0406-1

Published: 14 March 2015

Volume 29, pages 1233–1257, (2015)
Data Mining and Knowledge Discovery

Abstract

Defining appropriate distance measures among rankings is a classic area of study which has led to many useful applications. In this paper, we propose a more general abstraction of preference data, namely directed acyclic graphs (DAGs), and introduce a measure for comparing DAGs, given that a vertex correspondence between the DAGs is known. We study the properties of this measure and use it to aggregate and cluster a set of DAGs. We show that these problems are $\mathbf{NP}$-hard and present efficient methods to obtain solutions with approximation guarantees. In addition to preference data, these methods turn out to have other interesting applications, such as the analysis of a collection of information cascades in a network. We test the methods on synthetic and real-world datasets, showing that the methods can be used to, e.g., find a set of influential individuals related to a set of topics in a network or to discover meaningful and occasionally surprising clustering structure.

Notes

Most often the Kendall-tau distance is defined to be a value between 0 and 1 by normalizing with the total number of vertex pairs $\binom{|V|}{2}$.
The dataset can be downloaded at http://users.ics.aalto.fi/emalmi/artist_preference_data.zip.
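
The normalization mentioned in the first note can be illustrated with a minimal sketch. It assumes two rankings given as Python lists that order the same items without ties; the function name `kendall_tau_distance` and the `normalize` flag are illustrative choices, not taken from the paper. It covers only the classic distance between rankings, not the DAG measure the paper introduces.

```python
from itertools import combinations
from math import comb


def kendall_tau_distance(rank_a, rank_b, normalize=True):
    """Count item pairs that the two rankings order differently.

    With normalize=True the count is divided by the number of item
    pairs, C(|V|, 2), so the result lies between 0 and 1.
    """
    # Both rankings must be tie-free orderings of the same item set.
    if (
        len(rank_a) != len(rank_b)
        or len(set(rank_a)) != len(rank_a)
        or set(rank_a) != set(rank_b)
    ):
        raise ValueError("rankings must order the same items without repeats")

    # Position of every item in each ranking.
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}

    # A pair is discordant when the two rankings disagree on its order.
    discordant = sum(
        1
        for u, v in combinations(rank_a, 2)
        if (pos_a[u] - pos_a[v]) * (pos_b[u] - pos_b[v]) < 0
    )

    if not normalize:
        return discordant
    n_pairs = comb(len(rank_a), 2)
    return discordant / n_pairs if n_pairs else 0.0
```

Under this sketch, identical rankings give 0 and a fully reversed ranking gives 1 after normalization, while `normalize=False` returns the raw number of disagreeing pairs.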

Acknowledgments

The authors are grateful to Nicola Barbieri for providing the Last.fm dataset. We also thank the anonymous reviewers for their constructive feedback. This work was supported by Academy of Finland grant 118653 (ALGODAN).

Author information

Authors and Affiliations

HIIT and Department of Computer Science, Aalto University, Espoo, Finland
Eric Malmi, Nikolaj Tatti & Aristides Gionis

Corresponding author

Correspondence to Eric Malmi.

Additional information

Responsible editors: Joao Gama, Indre Zliobaite, Alipio Jorge, Concha Bielza.

About this article

Cite this article

Malmi, E., Tatti, N. & Gionis, A. Beyond rankings: comparing directed acyclic graphs. Data Min Knowl Disc 29, 1233–1257 (2015). https://doi.org/10.1007/s10618-015-0406-1

Received: 31 August 2014
Accepted: 09 February 2015
Published: 14 March 2015
Issue Date: September 2015
DOI: https://doi.org/10.1007/s10618-015-0406-1
