Summary
The term soft computing refers to a family of techniques based on fuzzy logic, evolutionary computing, artificial neural networks, probabilistic reasoning, rough sets, and chaotic computing. With the discovery that the Web is structured according to social networks exhibiting the small-world property, the idea of using taxonomy principles has appeared as a complement to traditional keyword searching. One technique that has emerged from this principle is the “web-as-brain” metaphor, which is yielding new, associative retrieval techniques based on artificial neural networks (ANNs). The present paper proposes a unified formal framework for three major methods used for Web retrieval tasks: PageRank, HITS, and I2R. The paper shows that these three techniques, although they stem from different paradigms, can be integrated into one unified formal view. The conceptual and notational framework is given by ANNs and the generic network equation. It is shown that the PageRank, HITS, and I2R methods can be formally obtained from the generic equation as particular cases by making assumptions that reflect the corresponding underlying paradigm. The unified formal view sheds new light on these methods: they are only seemingly different from one another. They are particular ANNs stemming from the same equation, and they differ in whether they are dynamic (a page’s importance varies in time) or static (a page’s importance is constant in time) and in the way they connect the pages to each other. The paper also gives a detailed mathematical analysis of the computational complexity of winner-take-all (WTA) based IR techniques, using the I2R method for illustration. This analysis is important because (i) it shows that intuition may be misleading (contrary to intuition, a WTA-based algorithm yielding circles is not always “hard”), and (ii) it can serve as a model for the analysis of other methods.
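For orientation, two of the three methods named in the summary, PageRank and HITS, can be sketched as iterative score propagation over a link graph. The sketch below uses the common textbook formulations, not the chapter's derivation from the generic network equation. The function names, the damping value 0.85, the iteration count, and the four-page `toy_web` graph are all assumptions made for illustration. I2R is omitted because the summary does not give enough detail to reproduce it.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank; `links` maps each page to the pages it links to.

    Assumes every linked page also appears as a key of `links`.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the same baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            # A page without out-links spreads its score over all pages.
            targets = links[p] if links[p] else pages
            share = damping * rank[p] / len(targets)
            for q in targets:
                new_rank[q] += share
        rank = new_rank
    return rank


def hits(links, iterations=100):
    """Textbook HITS; returns (hub, authority) scores, L2-normalised."""
    pages = sorted(links)
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority of a page: sum of hub scores of pages linking to it.
        auth = {p: 0.0 for p in pages}
        for p in pages:
            for q in links[p]:
                auth[q] += hub[p]
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {p: v / norm for p, v in auth.items()}
        # Hub score of a page: sum of authority scores of pages it links to.
        hub = {p: sum(auth[q] for q in links[p]) for p in pages}
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {p: v / norm for p, v in hub.items()}
    return hub, auth


# Hypothetical four-page Web: A links to B and C, B to C, C to A, D to C.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
hubs, auths = hits(toy_web)
```

In both functions a page's score is recomputed from the scores of the pages connected to it. The two methods differ mainly in how the links are weighted and in which direction the scores flow.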
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Dominich, S., Skrop, A., Tuza, Z. (2006). Formal Theory of Connectionist Web Retrieval. In: Herrera-Viedma, E., Pasi, G., Crestani, F. (eds) Soft Computing in Web Information Retrieval. Studies in Fuzziness and Soft Computing, vol 197. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-31590-X_9
DOI: https://doi.org/10.1007/3-540-31590-X_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31588-9
Online ISBN: 978-3-540-31590-2
eBook Packages: Engineering, Engineering (R0)