Abstract
With the emergence of the World Wide Web, Web sites have become a key communication channel for organizations. In this context, analyzing and improving Web communication is essential to better satisfy the objectives of the target audience. Web communication analysis is traditionally performed by Web analytics software, which produces long lists of audience metrics. These metrics carry little semantics and are too detailed to be exploited by organization managers and chief editors, who need summarized, conceptual information to make decisions. Our solution for obtaining such conceptual metrics is to analyze the content of the Web pages output by the Web server. In this paper, we first present a list of methods that we conceived to mine the output Web pages. Then, we explain how term weights in these pages can be used as audience metrics, and how they can be aggregated using OLAP tools to obtain concept-based metrics. Finally, we present the concept-based metrics that we obtained with our prototype WASA and SQL Server OLAP tools.
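The idea of turning term weights into audience metrics and rolling them up to concepts can be illustrated with a minimal Python sketch. It is not the WASA implementation and does not use SQL Server OLAP tools. The page data, the concept hierarchy, the function names `term_metrics` and `roll_up`, and the weighting itself (term occurrences multiplied by page views) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical data: how often each output page was served, and its term counts.
page_views = {"/products": 120, "/support": 30}
page_terms = {
    "/products": Counter({"laptop": 3, "printer": 1}),
    "/support": Counter({"printer": 2, "warranty": 4}),
}

# Hypothetical concept hierarchy (child -> parent); top-level concepts map to None.
parent = {
    "laptop": "hardware",
    "printer": "hardware",
    "warranty": "services",
    "hardware": None,
    "services": None,
}


def term_metrics(views_by_page, terms_by_page):
    """Weight each term by occurrences times page views (an assumed weighting)."""
    totals = Counter()
    for page, views in views_by_page.items():
        for term, count in terms_by_page.get(page, {}).items():
            totals[term] += views * count
    return totals


def roll_up(term_totals, hierarchy):
    """Add each term's metric to the term itself and to all of its ancestors,
    in the spirit of an OLAP roll-up along a dimension hierarchy."""
    concept_totals = defaultdict(int)
    for term, value in term_totals.items():
        node = term
        while node is not None:
            concept_totals[node] += value
            node = hierarchy.get(node)  # unknown nodes end the walk
    return dict(concept_totals)


terms = term_metrics(page_views, page_terms)
concepts = roll_up(terms, parent)
for name in ("hardware", "services"):
    print(name, concepts[name])
```

In this sketch a manager would read the totals for "hardware" and "services" instead of the per-term or per-page figures, which is the kind of summarization the abstract describes.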
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Norguet, JP., Zimányi, E., Steinberger, R. (2006). Improving Web Sites with Web Usage Mining, Web Content Mining, and Semantic Analysis. In: Wiedermann, J., Tel, G., Pokorný, J., Bieliková, M., Štuller, J. (eds) SOFSEM 2006: Theory and Practice of Computer Science. SOFSEM 2006. Lecture Notes in Computer Science, vol 3831. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11611257_41
DOI: https://doi.org/10.1007/11611257_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31198-0
Online ISBN: 978-3-540-32217-7
eBook Packages: Computer Science (R0)