A distributed architecture for efficient parallelization and computation of knowledge-based temporal abstractions

Journal of Intelligent Information Systems

Abstract

Data storage capacity and computational power are increasing rapidly. On the one hand, this makes it possible to generate and store large amounts of temporal (time-oriented) data for future querying, analysis, and discovery of new knowledge. On the other hand, systems and experts encounter new problems in processing this growing volume of data. The rapid growth in stored time-oriented data necessitates new methods for handling, processing, and interpreting large amounts of temporal data. One approach is to use an automatic summarization process based on predefined knowledge, such as the Knowledge-Based Temporal-Abstraction (KBTA) method. This method summarizes and reduces the amount of raw data by creating higher-level interpretations based on predefined domain knowledge. Unfortunately, temporal abstraction is inherently computationally expensive, especially when an enormous volume of multivariate data has to be handled and when complex patterns need to be considered. In this research, we address the scalability problem of a temporal-abstraction task that involves processing very large amounts of raw data. We propose a new computational framework, the Distributed KBTA (DKBTA), which efficiently distributes the abstraction process among several parallel computational nodes in order to achieve an acceptable computation time. The DKBTA framework distributes the temporal-abstraction process along one or more computational axes. Each axis enables parallelization of one or more of the subtasks into which the main temporal-abstraction task is decomposed, for example by subject group, concept type, or abstraction type. We have implemented the DKBTA framework and evaluated it in a preliminary fashion in the medical and information security domains, with encouraging results.
In our small-scale evaluation, only distribution along the subjects axis, and sometimes along the concept-type axis, seemed to enhance performance consistently, and only for computations involving individual subjects rather than functions of sets of subjects; this observation, however, might depend on the number of processing units. Additionally, because communication between the processing units was based on TCP, we could not observe any speedup even when using two processing units on the same machine. In further evaluations we plan to use a shared-memory architecture to exchange data between processing units.
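
To make the idea of distribution along the subjects axis concrete, the following is a minimal sketch and not the DKBTA implementation. It assumes raw data arrive as (subject, time, value) records, that a state abstraction is a simple threshold classification, and that consecutive samples with the same state are joined into one interval when they lie within a maximal gap. The threshold table, the gap value, and all function names are hypothetical and chosen only for illustration. Because each subject's abstractions depend only on that subject's data, the partitions can be processed independently.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical domain knowledge (illustrative only, not taken from the paper):
# upper bounds mapping a raw value to a state label.
STATE_THRESHOLDS = [(37.5, "NORMAL"), (39.0, "HIGH")]
DEFAULT_STATE = "VERY_HIGH"
MAX_GAP = 2  # assumed maximal time gap across which equal states are joined


def classify(value):
    """Map a raw value to a state label using the threshold table."""
    for upper, label in STATE_THRESHOLDS:
        if value < upper:
            return label
    return DEFAULT_STATE


def abstract_subject(samples):
    """Turn one subject's (time, value) samples into (start, end, state) intervals."""
    intervals = []
    for t, v in sorted(samples):
        state = classify(v)
        if intervals and intervals[-1][2] == state and t - intervals[-1][1] <= MAX_GAP:
            start = intervals[-1][0]
            intervals[-1] = (start, t, state)  # extend the current interval
        else:
            intervals.append((t, t, state))  # open a new interval
    return intervals


def abstract_by_subject(records, workers=2):
    """Partition records by subject and abstract each partition concurrently."""
    by_subject = defaultdict(list)
    for subject, t, v in records:
        by_subject[subject].append((t, v))
    subjects = sorted(by_subject)
    # A thread pool keeps the sketch simple; it only illustrates the decomposition.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(abstract_subject, [by_subject[s] for s in subjects]))
    return dict(zip(subjects, results))


records = [
    ("p1", 1, 36.8), ("p1", 2, 37.0), ("p1", 3, 38.2), ("p1", 4, 38.4),
    ("p2", 1, 39.5), ("p2", 2, 39.6), ("p2", 6, 39.7),
]
result = abstract_by_subject(records)
print(result)
```

In this sketch a thread pool stands in for the parallel computational nodes. For CPU-bound abstraction in Python, a process pool or separate machines would be the more realistic choice, at the cost of the inter-process communication overhead that the evaluation above points to.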

Author information

Corresponding author

Correspondence to Asaf Shabtai.

Appendix A—Examples of the KBTA’s ontological entities

[Figure a: examples of the KBTA’s ontological entities]

About this article

Cite this article

Shabtai, A., Shahar, Y. & Elovici, Y. A distributed architecture for efficient parallelization and computation of knowledge-based temporal abstractions. J Intell Inf Syst 39, 249–286 (2012). https://doi.org/10.1007/s10844-011-0190-3

  • DOI: https://doi.org/10.1007/s10844-011-0190-3