We propose a new way of browsing bilingual web sites: concurrent browsing with automatic similar-content synchronization and viewpoint retrieval facilities. Our prototype browser system, the Bilingual Comparative Web Browser (B-CWB), presents bilingual web pages concurrently and keeps their contents automatically synchronized. The B-CWB allows users to browse multiple web news sites concurrently and to compare the viewpoints of news articles written in different languages (English and Japanese). Our viewpoint retrieval is based on detecting similarities and differences. We describe how pages are categorized in terms of viewpoint: entire similarity, content difference, and subject difference. Content synchronization means that a user operation (scrolling or clicking) on one web page does not necessarily invoke the same operation on the other web page; instead, the other page is updated so that the similarity of content between the pages is preserved. For example, scrolling a web page may invoke passage-level viewpoint retrieval on the other web page. Clicking a web page (and obtaining a new web page) invokes page-level viewpoint retrieval within the other site's pages through the use of an English-Japanese dictionary.
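To make the idea of dictionary-based page-level retrieval concrete, the following is a minimal sketch and not the B-CWB implementation. It assumes that pages have already been reduced to lists of terms, that a small hypothetical lookup table (`EN_JA`) stands in for a full English-Japanese dictionary, and that cosine similarity over term counts is an acceptable stand-in for the similarity measure, which the abstract does not specify. All function names are illustrative.

```python
import math
from collections import Counter

# Hypothetical toy dictionary; a real system would use a full
# English-Japanese lexicon rather than this hand-written table.
EN_JA = {
    "election": ["選挙"],
    "president": ["大統領"],
    "economy": ["経済"],
    "earthquake": ["地震"],
}


def translate_terms(en_terms, dictionary):
    """Map English terms to a bag of Japanese terms.

    Terms missing from the dictionary are silently dropped.
    """
    bag = Counter()
    for term in en_terms:
        for ja_term in dictionary.get(term.lower(), []):
            bag[ja_term] += 1
    return bag


def cosine(a, b):
    """Cosine similarity between two term-count bags (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_page(en_terms, ja_pages, dictionary=EN_JA):
    """Return (score, page_id) for the Japanese page closest to the English terms.

    ja_pages maps a page id to a list of already-tokenized Japanese terms.
    """
    query = translate_terms(en_terms, dictionary)
    scored = [
        (cosine(query, Counter(terms)), page_id)
        for page_id, terms in ja_pages.items()
    ]
    return max(scored) if scored else (0.0, None)
```

In this sketch, a click on an English article would trigger `most_similar_page` with that article's terms and the candidate pages of the Japanese site. Passage-level retrieval on scrolling could reuse the same functions with passages in place of whole pages, again as an assumption rather than a description of the published system.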
Nadamoto, A., Ma, Q. & Tanaka, K. B-CWB: Bilingual Comparative Web Browser Based on Content-Synchronization and Viewpoint Retrieval. World Wide Web 8, 347–367 (2005). https://doi.org/10.1007/s11280-005-1316-8