Abstract
To improve the quality of web data mining algorithms, this paper summarizes the advantages and disadvantages of several web data source models: the web log, the application server log, the client-side log, the packet sniffer, and the 5-gram united events model (UEM5). Based on this analysis, a new 4-gram united events model (UEM4) is proposed. Simulation experiments were conducted to compare the performance of UEM4 with that of the web log and UEM5. The experimental results show that the web log has the worst session identification performance. UEM5 has high accuracy and the best online and offline performance, but it requires the application system to support session identification. UEM4 does not require the application system to support session identification, yet it still achieves good session identification accuracy and performance. UEM4 can therefore be used in e-commerce to provide high-quality data sources for web mining algorithms and to improve the quality of intelligent services.
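The abstract does not say how sessions are identified in each data source. As background only, the sketch below shows a timeout heuristic that is commonly applied to plain web logs: requests from the same client belong to one session until the gap between consecutive requests exceeds a threshold. It is not the UEM4 or UEM5 method. The function name `identify_sessions`, the record layout, the use of the client address as the client key, and the 30-minute threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def identify_sessions(entries, timeout=timedelta(minutes=30)):
    """Split (client, timestamp, url) records into sessions.

    Assumption: a gap longer than `timeout` between two consecutive
    requests from the same client starts a new session. This is a
    generic heuristic, not the model proposed in the paper.
    """
    # Collect the requests of each client.
    by_client = defaultdict(list)
    for client, ts, url in entries:
        by_client[client].append((ts, url))

    sessions = []
    for client, hits in by_client.items():
        hits.sort(key=lambda h: h[0])  # order requests by time
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit[0] - prev[0] > timeout:
                # Gap too long: close the session and start a new one.
                sessions.append((client, current))
                current = []
            current.append(hit)
        sessions.append((client, current))
    return sessions


# Hypothetical log records: (client address, request time, requested path).
log = [
    ("10.0.0.1", datetime(2017, 1, 1, 10, 0), "/home"),
    ("10.0.0.2", datetime(2017, 1, 1, 10, 5), "/home"),
    ("10.0.0.1", datetime(2017, 1, 1, 10, 10), "/cart"),
    ("10.0.0.1", datetime(2017, 1, 1, 11, 0), "/home"),
]
sessions = identify_sessions(log)
```

A heuristic of this kind relies only on the client address and request times, so users who share an address are merged and a long pause splits one visit in two. This may be part of the reason the web log scores worst on session identification in the experiments.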
Cite this article
Zou, X. A four-gram unified event model for web mining. Cluster Comput 21, 967–975 (2018). https://doi.org/10.1007/s10586-017-0988-z
DOI: https://doi.org/10.1007/s10586-017-0988-z