BiCWS: Mining Cognitive Differences from Bilingual Web Search Results

Huang, Xiaojiang; Wan, Xiaojun; Xiao, Jianguo

doi:10.1007/978-3-642-35063-4_5

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7651)

Included in the following conference series:

International Conference on Web Information Systems Engineering

Abstract

In this paper we propose BiCWS, a novel comparative web search system that mines cognitive differences from web search results in a multi-language setting. A topic is represented by two queries in two languages that are translations of each other. The search results for both queries are first retrieved with a general web search engine, and the bilingual facets of the topic are then mined with a bilingual search results clustering algorithm, which leverages the semantics in Wikipedia to improve clustering performance. Finally, the semantic distributions of the search results over the mined facets are presented visually, reflecting the cognitive differences between the two language communities. Experimental results show the effectiveness of the proposed system.
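
As a rough illustration of the final step only, the sketch below computes, for each language, the share of search results that fall into each facet and ranks the facets by the gap between the two shares. It assumes some clustering step has already assigned every result to a facet. The function names (`facet_distribution`, `facet_gaps`), the input layout, and the toy data are hypothetical and are not taken from the paper, which may compute and present these distributions differently.

```python
from collections import Counter

def facet_distribution(results, language):
    """Return the fraction of `language` results assigned to each facet."""
    facets = [facet for lang, facet in results if lang == language]
    counts = Counter(facets)
    total = len(facets)
    # No results for this language: return an empty distribution
    # instead of dividing by zero.
    return {facet: n / total for facet, n in counts.items()} if total else {}

def facet_gaps(results, lang_a, lang_b):
    """Rank facets by the absolute difference in share between two languages."""
    dist_a = facet_distribution(results, lang_a)
    dist_b = facet_distribution(results, lang_b)
    facets = set(dist_a) | set(dist_b)
    # Positive gap: the facet is more prominent in lang_a; negative: in lang_b.
    gaps = {f: dist_a.get(f, 0.0) - dist_b.get(f, 0.0) for f in facets}
    # Largest gaps first; ties are broken alphabetically for a stable order.
    return sorted(gaps.items(), key=lambda kv: (-abs(kv[1]), kv[0]))

# Toy input (made up): (language, facet) pairs from a hypothetical clustering step.
results = [
    ("en", "economy"), ("en", "economy"), ("en", "travel"), ("en", "history"),
    ("zh", "economy"), ("zh", "travel"), ("zh", "travel"), ("zh", "travel"),
]

for facet, gap in facet_gaps(results, "en", "zh"):
    print(f"{facet}: {gap:+.2f}")
```

A facet that appears in only one language still shows up in the ranking, because its share in the other language is treated as zero.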

Author information

Authors and Affiliations

Institute of Computer Science and Technology & The MOE Key Laboratory of Computational Linguistics, Peking University, Beijing, 100871, China
Xiaojiang Huang, Xiaojun Wan & Jianguo Xiao

Editor information

Editors and Affiliations

School of Computer Science, Fudan University, 825 Zhangheng Rd., Shanghai, 201203, China
X. Sean Wang
Department of Computer Science, College of Engineering, Science and Engineering Offices, The University of Illinois at Chicago, 851 South Morgan Street (M/C 152), 60607-7053, Chicago, Illinois, USA
Isabel Cruz
Department of Informatics and Telecommunications, University of Athens, GR15784, Ilisia, Athens, Greece
Alex Delis
Centre for Applied Informatics, Victoria University, PO Box 14428, 8001, Melbourne, VIC, Australia
Guangyan Huang

Cite this paper

Huang, X., Wan, X., Xiao, J. (2012). BiCWS: Mining Cognitive Differences from Bilingual Web Search Results. In: Wang, X.S., Cruz, I., Delis, A., Huang, G. (eds) Web Information Systems Engineering - WISE 2012. WISE 2012. Lecture Notes in Computer Science, vol 7651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35063-4_5

DOI: https://doi.org/10.1007/978-3-642-35063-4_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35062-7
Online ISBN: 978-3-642-35063-4
eBook Packages: Computer Science, Computer Science (R0)
