Abstract
Users prefer navigating subjects through organized topics drawn from abundant resources to scanning lists of pages retrieved by search engines. We propose a framework that clusters frequent itemsets (sets of common words) into topics, produces a hierarchical list, and then generates a topic sequence from a collection of documents. The framework regenerates the next sequence when a user clicks a topic. Treating browsing to a topic as a kind of search for that topic, the framework issues a query whose keywords are the feature terms in the document representation of the selected topic. The ranking method used in the search process combines content analysis that retains the spatial information of the search keywords with link analysis of the documents. Experiments with an implementation of the navigation generating system show that a navigation list can be settled from clustering results with regard to the variance ratio of between-cluster and within-cluster distances. Agglomerative clustering is used to restructure the extracted topics in order to produce a hierarchical navigation list.
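To make the notion of frequent itemsets (sets of common words) concrete, the following is a minimal sketch that counts word sets shared by several documents. It is a brute-force enumeration for illustration only, not the mining procedure used in the paper; the function name `frequent_itemsets`, the parameters `min_support` and `max_size`, and the toy documents are all assumptions introduced here.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(docs, min_support=2, max_size=2):
    """Return {itemset: support} for word sets found in at least min_support docs.

    Hypothetical helper: enumerates every word combination up to max_size,
    which is only practical for very small vocabularies.
    """
    counts = Counter()
    for doc in docs:
        words = set(doc.lower().split())  # each document counts a word once
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(words), size):
                counts[frozenset(combo)] += 1
    return {items: n for items, n in counts.items() if n >= min_support}


# Toy documents (assumed data, not from the paper).
docs = [
    "web mining clustering",
    "web mining ranking",
    "clustering topics navigation",
]
itemsets = frequent_itemsets(docs)
for items, support in sorted(itemsets.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(items), support)
```

In this toy collection the pair {"mining", "web"} appears in two documents, so it survives the support threshold and could serve as a candidate topic label; a real collection would need a proper mining algorithm and stop-word handling.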
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Purwitasari, D., Okazaki, Y., Watanabe, K. (2008). Data Mining for Navigation Generating System with Unorganized Web Resources. In: Lovrek, I., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2008. Lecture Notes in Computer Science, vol 5177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85563-7_76
DOI: https://doi.org/10.1007/978-3-540-85563-7_76
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85562-0
Online ISBN: 978-3-540-85563-7
eBook Packages: Computer Science, Computer Science (R0)