Abstract
Submodular functions appear in a considerable number of important natural language processing problems, such as text summarization and dataset selection. Current graph-based approaches to these problems pay no special attention to submodularity and do not learn the graph model. Instead, they set each edge weight roughly proportional to the similarity of its two endpoints. We argue that such shallow modeling needs to be replaced by a deeper approach that learns the graph edge weights. To this end, we propose a new method for learning the graph model corresponding to the submodular function that is to be maximized. On a number of real-world networks, our method leads to a 50% error reduction compared to the previously used baseline methods. Furthermore, we apply our proposed method, followed by an influence maximization algorithm, to two NLP tasks: text summarization and k-means initialization for topic selection. Using these case studies, we experimentally show the advantage of our learning method over the previous shallow methods.







Cite this article
Vardasbi, A., Faili, H. & Asadpour, M. Solving submodular text processing problems using influence graphs. Soc. Netw. Anal. Min. 9, 21 (2019). https://doi.org/10.1007/s13278-019-0559-9