Application of density-based outlier detection to database activity monitoring

Information Systems Frontiers

Abstract

To prevent internal data leakage, database activity monitoring uses software agents to analyze protocol traffic over networks and to observe local database activities. However, the large volume of data collected in this way has been a significant barrier to effective monitoring and analysis of database activities. In this paper, we present an approach to database activity monitoring that combines a density-based outlier detection method with a commercial database activity monitoring solution. To compute outliers efficiently, we exploit a kd-tree index and an approximate k-nearest neighbors (ANN) search method, which significantly reduce the outlier computation time. The proposed methodology was successfully applied to a very large log dataset collected from the Korea Atomic Energy Research Institute (KAERI). The results show that the proposed method can effectively detect outliers among database activities in a shorter computation time.
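The combination described in the abstract (a density-based outlier score computed over a kd-tree with approximate nearest-neighbor search) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the local outlier factor (LOF) as the density-based score, uses SciPy's `cKDTree` with its `eps` parameter as a stand-in for the ANN search, and runs on synthetic two-dimensional points rather than database logs. The function name `lof_scores` and the values of `k` and `eps` are chosen only for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree


def lof_scores(points, k=5, eps=0.0):
    """Return a LOF-style score per row of `points` (higher = more outlying).

    eps > 0 lets the kd-tree return approximate neighbours: the k-th
    reported neighbour is at most (1 + eps) times the true k-th distance.
    """
    tree = cKDTree(points)
    # Query k + 1 neighbours because each point's nearest neighbour is itself.
    dist, idx = tree.query(points, k=k + 1, eps=eps)
    dist, idx = dist[:, 1:], idx[:, 1:]

    # k-distance of every point: distance to its k-th reported neighbour.
    k_dist = dist[:, -1]
    # Reachability distance from p to neighbour o: max(k-distance(o), d(p, o)).
    reach = np.maximum(dist, k_dist[idx])
    # Local reachability density: inverse of the mean reachability distance.
    lrd = 1.0 / (reach.mean(axis=1) + 1e-12)
    # LOF: mean density of the neighbours divided by the point's own density.
    return lrd[idx].mean(axis=1) / lrd


# Synthetic data (an assumption for the sketch): one dense cluster plus
# a single point placed far away from it.
rng = np.random.default_rng(0)
cluster = rng.normal(0.0, 1.0, size=(200, 2))
far_point = np.array([[8.0, 8.0]])
data = np.vstack([cluster, far_point])

scores = lof_scores(data, k=10, eps=0.1)
print("index of highest score:", int(np.argmax(scores)))
```

Points inside the cluster receive scores close to 1, while the isolated point receives a much larger score. Raising `eps` trades neighbor accuracy for fewer tree nodes visited, which is the kind of saving the abstract attributes to ANN search.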

Notes

  1. It should be noted that the application of the LOF algorithm explained in the following section is not restricted to a specific solution; provided that database transaction logs are collected, any solution can be utilized.

References

  • Agyemang, M., & Ezeife, C. I. (2004). LSC-Mine: algorithm for mining local outliers. Proceedings of the 15th Information Resource Management Association (IRMA) International Conference, New Orleans, vol. 1 (pp. 5–8). Hershey: IRM Press.

  • Arya, S., Mount, D. M., Netanyahu, N. S., Silverman, R., & Wu, A. (1998). An optimal algorithm for approximate nearest neighbor searching. Journal of the ACM, 45(6), 891–923.

  • Barnett, V., & Lewis, T. (1994). Outliers in statistical data. John Wiley.

  • Bentley, J. L. (1990). K-d trees for semidynamic point sets. Proceedings of the 6th Annual ACM Symposium on Computational Geometry (pp. 187–197). New York: ACM Press.

  • Breunig, M. M., Kriegel, H. P., Ng, R. T., & Sander, J. (2000). LOF: identifying density-based local outliers. Proceedings of the ACM SIGMOD Conference, Dallas, Texas (pp. 93–104). New York: ACM.

  • Chaudhary, A., Szalay, A. S., Szalay, E. S., & Moore, A. W. (2002). Very fast outlier detection in large multidimensional data sets. The ACM SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery, Madison, Wisconsin. New York: ACM Press.

  • Friedman, J. H., Bentley, J. L., & Finkel, R. A. (1977). An algorithm for finding best matches in logarithmic expected time. ACM Transactions on Mathematical Software, 3(3), 209–226.

  • Gartner (2007). Hype cycle for regulations and related standards, ID: G00141115. http://www.gartner.com/DisplayDocument?id=500178.

  • Gartner (2009). Hype cycle for governance, risk and compliance technologies, ID: G00168610. http://www.gartner.com/DisplayDocument?id=1080715.

  • Hawkins, D. (1980). Identification of outliers. Chapman and Hall.

  • Lazarevic, A., Ertoz, L., Kumar, V., Ozgur, A., & Srivastava, J. (2003). A comparative study of anomaly detection schemes in network intrusion detection. Proceedings of the Third SIAM International Conference on Data Mining, San Francisco, CA.

  • Pokrajac, D., Lazarevic, A., & Latecki, L. J. (2007). Incremental local outlier detection for data streams. IEEE Symposium on Computational Intelligence and Data Mining (CIDM), Honolulu, Hawaii (pp. 504–515). New York: IEEE Press.

  • Richardson, R. (2008). CSI/FBI Computer crime and security survey, available at http://www.goscsi.com.

  • Somansa (2009). Electronics data and management solution. http://www.somansatech.com.

  • Yuhanna, N., Heffner, R., & Schwaber, C. (2005). Comprehensive Database Security Requires Native DBMS Features and Third-Party Tools, Forrester. http://www.forrester.com/Research/Document/Excerpt/0,7211,36301,00.html.

Author information

Corresponding author

Correspondence to Nam Wook Cho.

About this article

Cite this article

Kim, S., Cho, N.W., Lee, Y.J. et al. Application of density-based outlier detection to database activity monitoring. Inf Syst Front 15, 55–65 (2013). https://doi.org/10.1007/s10796-010-9266-9

  • DOI: https://doi.org/10.1007/s10796-010-9266-9
