Abstract
Log files are an important by-product of any computing system, including database systems, and they contain a large amount of historical data. Although many algorithms have been designed to exploit the information stored in such files, many of them can still be improved in terms of execution time and memory usage. This paper introduces a binary-based approach for mining the frequency of data items in database transaction log files. The data structures and algorithms are presented in the order of the methodology stages carried out in this work: before, during, and after scanning the log file. Initial experiments show a significant improvement in the execution time needed to perform frequency analysis of a database transaction log file. To validate the approach, its performance was also compared against the popular Apriori algorithm; initial results show shorter execution time with the binary-based approach.
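The abstract does not spell out the data structures used, so the following is only an illustrative sketch of what a binary-based frequency count over a transaction log could look like, not the authors' algorithm. It assumes each log entry is a list of data item names, and all function names (`build_item_index`, `encode`, `count_frequencies`) are hypothetical. The three steps loosely mirror the pre-scan, scan, and post-scan stages mentioned above.

```python
def build_item_index(transactions):
    """Pre-scan (assumed): give every distinct item its own bit position."""
    index = {}
    for items in transactions:
        for item in items:
            if item not in index:
                index[item] = len(index)
    return index


def encode(items, index):
    """Encode one transaction as an integer bitmask (one bit per item)."""
    mask = 0
    for item in items:
        mask |= 1 << index[item]
    return mask


def count_frequencies(transactions):
    """Scan the log once over the bitmasks, then map counts back to items."""
    index = build_item_index(transactions)
    counts = [0] * len(index)

    # Scan stage (assumed): walk the set bits of each transaction mask.
    for items in transactions:
        mask = encode(items, index)
        pos = 0
        while mask:
            if mask & 1:
                counts[pos] += 1
            mask >>= 1
            pos += 1

    # Post-scan stage (assumed): report the frequency of each item by name.
    return {item: counts[pos] for item, pos in index.items()}


# Hypothetical four-entry log; the item names are placeholders.
log = [["A", "B"], ["B", "C"], ["A", "B", "C"], ["B"]]
print(count_frequencies(log))  # {'A': 2, 'B': 4, 'C': 2}
```

Because each transaction is a single integer, checking whether it contains a set of items reduces to one bitwise AND against a query mask. That is one plausible reason a binary encoding could cut execution time, though the preview does not confirm that the paper works this way.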
Copyright information
© 2014 Springer Science+Business Media Singapore
About this paper
Cite this paper
Fageeri, S.O., Ahmad, R., Baharum, B. (2014). A Log File Analysis Technique Using Binary-Based Approach. In: Herawan, T., Deris, M., Abawajy, J. (eds) Proceedings of the First International Conference on Advanced Data and Information Engineering (DaEng-2013). Lecture Notes in Electrical Engineering, vol 285. Springer, Singapore. https://doi.org/10.1007/978-981-4585-18-7_1
DOI: https://doi.org/10.1007/978-981-4585-18-7_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-4585-17-0
Online ISBN: 978-981-4585-18-7
eBook Packages: Engineering, Engineering (R0)