Khanan: Performance Comparison and Programming $\alpha$-Miner Algorithm in Column-Oriented and Relational Database Query Languages

Sachdev, Astha; Gupta, Kunal; Sureka, Ashish

doi:10.1007/978-3-319-27057-9_12


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9498)

Included in the following conference series:

International Conference on Big Data Analytics


Abstract

Process-Aware Information Systems (PAIS) support business processes and generate large volumes of event logs as those processes execute. Each entry in an event log is represented as a tuple of CaseID, Timestamp, Activity and Actor. Process Mining is an emerging field that analyzes event logs to discover, enhance and improve business processes and to check conformance between run-time and design-time business processes. The large volumes of event logs generated are stored in databases. Relational databases perform well for a certain class of applications, but there is another class of applications for which they do not scale well. NoSQL database systems emerged to address these scalability challenges. Discovering a process model (workflow) from event logs is one of the most challenging and important Process Mining tasks, and the $\alpha$-miner algorithm is one of the first and most widely used Process Discovery techniques. Our objective is to investigate which kind of database (relational or NoSQL) performs better for a Process Discovery application under Process Mining. We implement the $\alpha$-miner algorithm on a relational (row-oriented) database and on a NoSQL (column-oriented) database, in each database's query language, so that the application is tightly coupled to the database. We then benchmark and compare the $\alpha$-miner algorithm on the row-oriented and the column-oriented database on several aspects: time taken to load large datasets, disk usage, stepwise execution time and compression technique.
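As an illustration of what "implementing the algorithm in a database query language" can look like, the following is a minimal sketch of the first steps of the $\alpha$-miner: deriving the direct-succession relation from an event log with a SQL query, then classifying pairs as causal or parallel. It is not the authors' implementation. SQLite (through Python's `sqlite3` module) stands in for the databases studied in the chapter, and the table name `event_log`, its column names, the toy data, and the helper functions are all hypothetical. The sketch also assumes timestamps are unique within a case.

```python
import sqlite3

# Hypothetical toy event log: (case_id, ts, activity, actor).
EVENTS = [
    (1, 1, "a", "u1"), (1, 2, "b", "u2"), (1, 3, "c", "u1"), (1, 4, "d", "u3"),
    (2, 1, "a", "u1"), (2, 2, "c", "u2"), (2, 3, "b", "u1"), (2, 4, "d", "u3"),
    (3, 1, "a", "u2"), (3, 2, "e", "u1"), (3, 3, "d", "u3"),
]


def direct_succession(events):
    """Return the set of pairs (x, y) where y directly follows x in some case."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE event_log "
        "(case_id INTEGER, ts INTEGER, activity TEXT, actor TEXT)"
    )
    conn.executemany("INSERT INTO event_log VALUES (?, ?, ?, ?)", events)
    # Self-join: pair each event with the next event of the same case,
    # i.e. the one with the smallest later timestamp.
    rows = conn.execute(
        """
        SELECT DISTINCT x.activity, y.activity
        FROM event_log AS x
        JOIN event_log AS y
          ON y.case_id = x.case_id
         AND y.ts = (SELECT MIN(z.ts)
                     FROM event_log AS z
                     WHERE z.case_id = x.case_id AND z.ts > x.ts)
        """
    ).fetchall()
    conn.close()
    return set(rows)


def footprint(succ):
    """Split direct succession into causal (x -> y) and parallel (x || y) pairs."""
    causal = {(a, b) for (a, b) in succ if (b, a) not in succ}
    parallel = {(a, b) for (a, b) in succ if (b, a) in succ}
    return causal, parallel


succ = direct_succession(EVENTS)
causal, parallel = footprint(succ)
print(sorted(causal))
print(sorted(parallel))
```

In this toy log, activities `b` and `c` occur in both orders, so they are classified as parallel, while `a` is causally followed by `b`, `c` and `e`. The remaining $\alpha$-miner steps (building the place sets and the resulting workflow net) are omitted here.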



Author information

Authors and Affiliations

Indraprastha Institute of Information Technology, Delhi (IIITD), New Delhi, India
Astha Sachdev & Kunal Gupta
Software Analytics Research Lab (SARL), New Delhi, India
Ashish Sureka


Corresponding author

Correspondence to Ashish Sureka.

Editor information

Editors and Affiliations

University of Delhi, Delhi, India
Naveen Kumar
University of Delhi, Delhi, India
Vasudha Bhatnagar


About this paper

Cite this paper

Sachdev, A., Gupta, K., Sureka, A. (2015). Khanan: Performance Comparison and Programming $\alpha$-Miner Algorithm in Column-Oriented and Relational Database Query Languages. In: Kumar, N., Bhatnagar, V. (eds) Big Data Analytics. BDA 2015. Lecture Notes in Computer Science, vol 9498. Springer, Cham. https://doi.org/10.1007/978-3-319-27057-9_12


DOI: https://doi.org/10.1007/978-3-319-27057-9_12
Published: 25 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27056-2
Online ISBN: 978-3-319-27057-9
eBook Packages: Computer Science, Computer Science (R0)
