ABSTRACT
In this paper, we investigate the contextualization of news documents with geographic and visual information. We propose a matrix factorization approach to analyze the location relevance of each news document, and a method to enrich each document with a set of web images. For location relevance analysis, we first perform toponym extraction and expansion to obtain a toponym list from the news documents. We then use matrix factorization to estimate the location-document relevance scores while simultaneously capturing the correlation of locations and documents. For image enrichment, we generate multiple queries from each news document for image search and then employ a fusion approach to collect a set of images from the search results. Based on the location relevance analysis and image enrichment, we introduce a news browsing system named NewsMap, which lets users read news by browsing a map and retrieve news with location queries. News documents are presented together with their enriched images to help users grasp the content quickly. Extensive experiments demonstrate the effectiveness of our approaches.
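The abstract does not give the exact factorization model, so the following is only a minimal sketch of the general idea: a generic regularized matrix factorization fitted by stochastic gradient descent, not the authors' formulation. It assumes that some initial location-document scores are available (for example from toponym counts, which is a hypothetical choice here), and the names `factorize` and `predict` and all parameter values are illustrative.

```python
import random


def factorize(observed, n_locations, n_docs, rank=2, lr=0.05, reg=0.02,
              epochs=500, seed=0):
    """Fit low-rank factors U (locations) and V (documents) by SGD.

    observed: dict mapping (location_index, doc_index) -> initial score.
    Returns (U, V), each a list of `rank`-dimensional latent vectors.
    """
    rng = random.Random(seed)
    # Small random initialization of the latent factors.
    U = [[rng.uniform(-0.1, 0.1) for _ in range(rank)]
         for _ in range(n_locations)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(rank)]
         for _ in range(n_docs)]
    entries = list(observed.items())
    for _ in range(epochs):
        rng.shuffle(entries)
        for (i, j), score in entries:
            pred = sum(a * b for a, b in zip(U[i], V[j]))
            err = score - pred
            for k in range(rank):
                u, v = U[i][k], V[j][k]
                # Gradient step on squared error with L2 regularization.
                U[i][k] += lr * (err * v - reg * u)
                V[j][k] += lr * (err * u - reg * v)
    return U, V


def predict(U, V, i, j):
    """Estimated relevance of location i to document j."""
    return sum(a * b for a, b in zip(U[i], V[j]))


# Hypothetical toy data: 3 locations, 4 documents, 7 observed scores.
toy_scores = {
    (0, 0): 1.0, (0, 1): 0.8, (0, 3): 0.1,
    (1, 1): 0.2, (1, 2): 0.9,
    (2, 2): 0.7, (2, 3): 1.0,
}
U, V = factorize(toy_scores, n_locations=3, n_docs=4)
# Score for a pair that was never observed, filled in by the shared factors.
estimate = predict(U, V, 1, 0)
```

Because every location and every document is represented in the same low-dimensional space, scores for unobserved pairs are inferred from pairs that share a location or a document, which is one way a factorization can capture correlations while estimating relevance.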