Abstract
In heterogeneous web information spaces, users struggle to search efficiently for relevant information. This paper proposes a mediator agent system that estimates the semantics of unknown web spaces by learning from the fragments gathered during users' focused crawling. The process consists of three tasks: (i) gathering semantic information about web spaces from personal agents while they perform focused crawling in unknown spaces, (ii) reorganizing this information using an ontology alignment algorithm, and (iii) providing relevant semantic information to personal agents immediately before focused crawling. This enables a personal agent to recognize its user's behavior in semantically heterogeneous spaces and to predict the user's search context. For the experiments, we implemented a comparison-shopping system over heterogeneous web spaces. The results show that the proposed method supported users efficiently and also reduced network traffic.
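The three mediator tasks can be illustrated with a minimal Python sketch. It is not the system described in the paper: the `Mediator` class, its method names, the 0.7 threshold, and the shop names are all hypothetical, and plain string similarity from the standard library's `difflib` stands in for an ontology alignment algorithm, which would normally also use structural and semantic evidence.

```python
from difflib import SequenceMatcher


def label_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1].

    Assumption: a simple stand-in for a real ontology alignment measure.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


class Mediator:
    """Hypothetical mediator that collects and aligns concept labels."""

    def __init__(self, threshold: float = 0.7):
        self.threshold = threshold  # assumed cut-off for accepting a match
        self.fragments = {}         # web space name -> set of concept labels

    def gather(self, space, labels):
        """Task (i): store labels a personal agent reports while crawling."""
        self.fragments.setdefault(space, set()).update(labels)

    def align(self, source, target):
        """Task (ii): map each source label to its best-matching target label."""
        targets = sorted(self.fragments.get(target, ()))
        mapping = {}
        for label in sorted(self.fragments.get(source, ())):
            best = max(targets,
                       key=lambda t: label_similarity(label, t),
                       default=None)
            if best is not None and label_similarity(label, best) >= self.threshold:
                mapping[label] = best
        return mapping

    def provide(self, known_space, unknown_space):
        """Task (iii): hand the mapping to an agent before it starts crawling."""
        return self.align(known_space, unknown_space)


mediator = Mediator()
mediator.gather("shopA", ["Notebook", "Price"])
mediator.gather("shopB", ["Notebooks", "price", "Shipping"])
hints = mediator.provide("shopA", "shopB")
print(hints)
```

In this sketch, an agent that already knows the vocabulary of `shopA` receives a mapping onto the labels observed in `shopB`, so it does not have to rediscover them by crawling. Labels with no sufficiently similar counterpart, such as `Shipping`, are left unmapped.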
Additional information
An erratum to this article can be found at http://dx.doi.org/10.1007/s10791-007-9024-x
Cite this article
Jung, J.J. Ontological framework based on contextual mediation for collaborative information retrieval. Inf Retrieval 10, 85–109 (2007). https://doi.org/10.1007/s10791-006-9013-5
DOI: https://doi.org/10.1007/s10791-006-9013-5