Abstract
The objective of this work is to build a search engine on latent semantic indexing (LSI), a classical method that produces optimal low-rank approximations of a term-document matrix and has been used for textual information mining. The technique is applied to mine the content of web documents using their keyword features. Experimental results show that, by combining textual and latent features, LSI can extract the underlying semantic structure of web documents and thus improve search engine performance significantly.
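To make the idea concrete, the following is a minimal sketch of generic LSI retrieval, not the authors' implementation, which the abstract does not describe. It assumes raw term counts as keyword features, a truncated SVD of the term-document matrix, and cosine similarity in the latent space. The toy corpus and the function names (build_term_document_matrix, lsi_fit, lsi_rank) are hypothetical.

```python
import numpy as np

def build_term_document_matrix(docs):
    """Return a terms-by-documents count matrix and the term index."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    return A, index

def lsi_fit(A, k):
    """Keep the k largest singular triplets of A (rank-k approximation)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def lsi_rank(query, index, Uk, sk, Vtk):
    """Score every document against the query in the k-dim latent space."""
    q = np.zeros(Uk.shape[0])
    for w in query.lower().split():
        if w in index:  # words outside the vocabulary are ignored
            q[index[w]] += 1.0
    q_hat = Uk.T @ q                   # query coordinates: U_k^T q
    doc_vecs = (sk[:, None] * Vtk).T   # document coordinates: rows of (S_k V_k^T)^T
    eps = 1e-12                        # guards against division by zero
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_hat)
    return doc_vecs @ q_hat / (norms + eps)

# Hypothetical toy corpus: three vehicle documents, two fruit documents.
docs = [
    "car engine repair",
    "automobile engine service",
    "car automobile dealer",
    "banana fruit smoothie",
    "fruit salad",
]
A, index = build_term_document_matrix(docs)
Uk, sk, Vtk = lsi_fit(A, k=2)
scores = lsi_rank("car", index, Uk, sk, Vtk)
for doc, score in sorted(zip(docs, scores), key=lambda p: -p[1]):
    print(f"{score:6.3f}  {doc}")
```

In this toy setting the second document never mentions "car", yet it scores close to the documents that do, because it shares "engine" and "automobile" with them and so falls on the same latent dimension. A plain keyword match would miss it.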
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jianxiong, Y., Watada, J. (2010). Wise Search Engine Based on LSI. In: Cao, L., Bazzan, A.L.C., Gorodetsky, V., Mitkas, P.A., Weiss, G., Yu, P.S. (eds) Agents and Data Mining Interaction. ADMI 2010. Lecture Notes in Computer Science, vol 5980. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15420-1_11
DOI: https://doi.org/10.1007/978-3-642-15420-1_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15419-5
Online ISBN: 978-3-642-15420-1
eBook Packages: Computer Science (R0)