A learning-to-rank method for information updating task

Pham, Minh Quang Nhat; Nguyen, Minh Le; Ngo, Bach Xuan; Shimazu, Akira

doi:10.1007/s10489-012-0343-2

Published: 11 March 2012

Applied Intelligence, Volume 37, pages 499–510 (2012)



Abstract

This paper addresses the information updating task: determining the most appropriate location in an existing document at which to place a new piece of related information. We formalize the task as a learning-to-rank problem and propose a heuristic method that automatically assigns labels to training examples by exploiting the structural information of documents. With this formulation, state-of-the-art learning-to-rank algorithms can be applied to the task. To compensate for the lack of semantic information, we incorporate semantic features derived from word clusters, which further improves updating performance. We apply the method to updating Wikipedia biographical articles and legal documents. Experimental results on both the Wikipedia biographical data set and the legal data set show that the proposed learning-to-rank method with cluster-based features outperforms previously reported methods for the information updating task.
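The formulation summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the three-level grading scheme (2 for the true insertion paragraph, 1 for other paragraphs in the same section, 0 elsewhere), the two overlap features, the hand-set weights, and the names `assign_labels`, `features`, and `rank` are all assumptions made for illustration. The word-to-cluster map stands in for clusters that would be induced from unlabeled text.

```python
from typing import Dict, List, Tuple

Location = Tuple[int, int]  # (section index, paragraph index)


def assign_labels(sections: List[List[str]], target: Location) -> List[List[int]]:
    """Grade each paragraph from document structure (assumed scheme):
    2 = true insertion point, 1 = same section, 0 = any other section."""
    target_section = target[0]
    return [
        [2 if (s, p) == target else 1 if s == target_section else 0
         for p in range(len(paragraphs))]
        for s, paragraphs in enumerate(sections)
    ]


def features(sentence: str, paragraph: str, clusters: Dict[str, str]) -> List[int]:
    """Two illustrative features: shared words and shared word clusters."""
    s_words = set(sentence.lower().split())
    p_words = set(paragraph.lower().split())
    s_clusters = {clusters[w] for w in s_words if w in clusters}
    p_clusters = {clusters[w] for w in p_words if w in clusters}
    return [len(s_words & p_words), len(s_clusters & p_clusters)]


def rank(sentence: str, sections: List[List[str]],
         clusters: Dict[str, str], weights: List[float]) -> List[Location]:
    """Score every candidate paragraph with a linear model, best first."""
    scored = []
    for s, paragraphs in enumerate(sections):
        for p, paragraph in enumerate(paragraphs):
            f = features(sentence, paragraph, clusters)
            score = sum(w * x for w, x in zip(weights, f))
            scored.append((score, (s, p)))
    # sort() is stable, so ties keep document order.
    scored.sort(key=lambda item: -item[0])
    return [location for _, location in scored]


# Toy document: two sections, three candidate paragraphs.
sections = [["he was born in paris", "he studied law"], ["he won a prize"]]
# Hypothetical cluster map: "award" and "prize" share a cluster.
clusters = {"award": "c1", "prize": "c1", "born": "c2"}

labels = assign_labels(sections, target=(1, 0))
ranking = rank("she received an award", sections, clusters, weights=[1.0, 1.0])
```

In this toy case the new sentence shares no surface word with any paragraph, so only the cluster feature separates the candidates, which is the kind of gap the cluster-based features are meant to close. In a real system the weights would be learned from the graded labels by a learning-to-rank algorithm rather than set by hand.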


Notes

http://stats.wikimedia.org/EN/TablesWikipediaEN.htm.
The data and code used in the experiments are available for download at www.jaist.ac.jp/~s1020010/update_task.tar.gz.


Acknowledgements

This research was partially supported by the Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Young Scientific Research 22700139.

Author information

Authors and Affiliations

Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa, 923-1292, Japan
Minh Quang Nhat Pham, Minh Le Nguyen, Bach Xuan Ngo & Akira Shimazu

Corresponding author

Correspondence to Minh Quang Nhat Pham.

About this article

Cite this article

Pham, M.Q.N., Nguyen, M.L., Ngo, B.X. et al. A learning-to-rank method for information updating task. Appl Intell 37, 499–510 (2012). https://doi.org/10.1007/s10489-012-0343-2

Issue Date: December 2012
