A distributed architecture for efficient parallelization and computation of knowledge-based temporal abstractions

Shabtai, Asaf; Shahar, Yuval; Elovici, Yuval

doi:10.1007/s10844-011-0190-3

Published: 24 December 2011

Volume 39, pages 249–286, (2012)

Journal of Intelligent Information Systems

Abstract

Today, data storage capabilities as well as computational power are rapidly increasing. On the one hand, this improvement makes it possible to generate and store a great amount of temporal (time-oriented) data for future query, analysis, and discovery of new knowledge. On the other hand, systems and experts are encountering new problems in processing this increased amount of data. The rapid growth in stored time-oriented data necessitates the development of new methods for handling, processing, and interpreting large amounts of temporal data. One approach is to use an automatic summarization process based on predefined knowledge, such as the Knowledge-Based Temporal-Abstraction (KBTA) method. This method enables one to summarize and reduce the amount of raw data by creating higher-level interpretations based on predefined domain knowledge. Unfortunately, the task of temporal abstraction is inherently computationally expensive, especially when an enormous volume of multivariate data has to be handled and when complex patterns need to be considered. In this research, we address the scalability problem of a temporal-abstraction task that involves processing very large amounts of raw data. We propose a new computational framework, the Distributed KBTA (DKBTA), which distributes the abstraction process among several parallel computational nodes in order to achieve an acceptable computation time. The DKBTA framework distributes the temporal-abstraction process along one or more computational axes. Each axis enables parallelization of one or more of the temporal-abstraction tasks into which the main temporal-abstraction task is decomposed, for example by subject group, concept type, or abstraction type. We have implemented the DKBTA framework and have evaluated it in a preliminary fashion in the medical and the information security domains, with encouraging results.
In our small-scale evaluation, only distribution along the subjects axis, and sometimes along the concept-type axis, seemed to enhance performance consistently, and only for computations involving individual subjects rather than functions of sets of subjects; this observation might, however, depend on the number of processing units. Additionally, because communication between the processing units was based on TCP, we could not observe any speedup even when using two processing units on the same machine. In further evaluations we plan to use a shared-memory architecture to exchange data between processing units.
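Distribution along the subjects axis can be illustrated with a minimal sketch. It is not the DKBTA implementation: the sample data, the threshold, and every function name are hypothetical. The state abstraction is reduced to a single threshold, and consecutive samples in the same state are merged without the gap-based interpolation knowledge that a KBTA knowledge base would supply. Threads stand in for the separate computational nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical raw data: (subject_id, timestamp, value) samples of one parameter.
RAW = [
    ("s1", 1, 36.8), ("s1", 2, 38.5), ("s1", 3, 38.9), ("s1", 4, 37.0),
    ("s2", 1, 39.1), ("s2", 2, 39.0),
    ("s3", 1, 36.5),
]

def classify(value, threshold=38.0):
    """Illustrative state abstraction: map a raw value to a symbolic state."""
    return "HIGH" if value >= threshold else "NORMAL"

def abstract_subject(samples):
    """Merge consecutive same-state samples into (state, start, end) intervals."""
    intervals = []
    for t, v in sorted(samples):
        state = classify(v)
        if intervals and intervals[-1][0] == state:
            # Same state as the previous interval: extend its end time.
            intervals[-1] = (state, intervals[-1][1], t)
        else:
            intervals.append((state, t, t))
    return intervals

def partition_by_subject(raw, n_parts):
    """Assign each subject's samples to one of n_parts partitions, round-robin."""
    by_subject = {}
    for subject, t, v in raw:
        by_subject.setdefault(subject, []).append((t, v))
    parts = [dict() for _ in range(n_parts)]
    for i, subject in enumerate(sorted(by_subject)):
        parts[i % n_parts][subject] = by_subject[subject]
    return parts

def process_partition(part):
    """Work done by one (simulated) node: abstract every subject it was given."""
    return {subject: abstract_subject(samples) for subject, samples in part.items()}

def distributed_abstraction(raw, n_workers=2):
    """Split the subjects across workers, abstract in parallel, merge the results."""
    parts = partition_by_subject(raw, n_workers)
    merged = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for result in pool.map(process_partition, parts):
            merged.update(result)
    return merged

if __name__ == "__main__":
    for subject, intervals in sorted(distributed_abstraction(RAW).items()):
        print(subject, intervals)
```

Because each subject's intervals depend only on that subject's samples, the partitions need no communication until the final merge, which is consistent with the observation that this axis helped for computations on individual subjects. A function over a set of subjects would need data from several partitions, so it would not decompose this cleanly.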

Author information

Authors and Affiliations

Department of Information Systems Engineering and Deutsche Telekom Laboratories at Ben-Gurion University, Ben-Gurion University, Beer-Sheva, Israel
Asaf Shabtai, Yuval Shahar & Yuval Elovici

Corresponding author

Correspondence to Asaf Shabtai.

Appendix A—Examples of the KBTA’s ontological entities

About this article

Cite this article

Shabtai, A., Shahar, Y. & Elovici, Y. A distributed architecture for efficient parallelization and computation of knowledge-based temporal abstractions. J Intell Inf Syst 39, 249–286 (2012). https://doi.org/10.1007/s10844-011-0190-3

Received: 18 January 2011
Revised: 17 September 2011
Accepted: 25 November 2011
Published: 24 December 2011
Issue Date: August 2012
DOI: https://doi.org/10.1007/s10844-011-0190-3
