Abstract
We are on the cusp of analyzing a variety of data being collected in every walk of life: social, biological, health-care, corporate, and climate, to name a few. We are also in search of models and analytical techniques that can accommodate more complex and increasingly large data (scalability). The ability to analyze large, complex, disparate data for a broad set of analysis objectives differentiates big data analytics from mining, which is narrower in scope. Hence, flexibility of analysis (as distinct from scalability) is important. Concomitantly, efficiency is important due to the large number of analysis needs. Our ultimate goal is to move from vertical analysis, in which data corresponding to one of the 4 V’s is analyzed individually, to holistic (also termed fusion-based) analysis that corresponds to all or a subset of the V’s!
In order to accomplish the above, we are always in search of more effective models to represent data and of analysis techniques that support flexibility of analysis, efficiency, and scalability. We want to use techniques that have worked well, whether for modeling, efficiency, or scalability. We also want to extend these techniques and/or develop new and improved ones to accommodate more complex, diverse, and larger data.
The goal of this paper is to give the reader an understanding of data analysis approaches using graphs. Our thesis is that there are several ways in which a graph representation can be used, both for modeling and for analysis. We take the reader through the evolution of graph usage and relevance, leading to the current use of multilayer networks (MLNs), or multiplexes, for modeling and analysis. Graphs are not new, but how they are used for big data analytics is undergoing a transformation that is important to understand. The hope is that the reader understands the path that has led us to this juncture and how graph usage is being extended!
Notes
1. Other aggregation approaches have the same problem.
Acknowledgment
We would like to thank Dr. Sanjukta Bhowmick for her collaboration with us on multilayer network analysis.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Chakravarthy, S., Santra, A., Komar, K.S. (2019). Why Multilayer Networks Instead of Simple Graphs? Modeling Effectiveness and Analysis Flexibility and Efficiency! In: Madria, S., Fournier-Viger, P., Chaudhary, S., Reddy, P. (eds) Big Data Analytics. BDA 2019. Lecture Notes in Computer Science, vol 11932. Springer, Cham. https://doi.org/10.1007/978-3-030-37188-3_14
DOI: https://doi.org/10.1007/978-3-030-37188-3_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37187-6
Online ISBN: 978-3-030-37188-3
eBook Packages: Computer Science, Computer Science (R0)