Automatic Indexing from a Thesaurus Using Bayesian Networks: Application to the Classification of Parliamentary Initiatives

de Campos, Luis M.; Fernández-Luna, Juan M.; Huete, Juan F.; Romero, Alfonso E.

doi:10.1007/978-3-540-75256-1_75


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4724)

Included in the following conference series:

European Conference on Symbolic and Quantitative Approaches to Reasoning and Uncertainty


Abstract

We propose a method that, given a document to be classified, automatically generates an ordered set of appropriate descriptors extracted from a thesaurus. The method creates a Bayesian network to model the thesaurus and uses probabilistic inference to select the descriptors that have a high posterior probability of being relevant given the available evidence (the document to be classified). We apply the method to the classification of parliamentary initiatives in the regional Parliament of Andalucía, Spain, using the Eurovoc thesaurus.
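
The abstract does not specify the network structure or the inference procedure, so the following is only a toy sketch of the general idea of ranking thesaurus descriptors by posterior probability given a document. It assumes a two-layer network in which observed terms are parents of each descriptor and are combined with a noisy-OR; the thesaurus entries, the link weights, the threshold, and the function names are all hypothetical and are not taken from the paper or from Eurovoc.

```python
from math import prod

# Hypothetical toy thesaurus: descriptor -> {term: link weight}.
# Weights are made-up illustration values, not Eurovoc data.
THESAURUS = {
    "water policy": {"water": 0.6, "reservoir": 0.5, "irrigation": 0.4},
    "public health": {"health": 0.7, "hospital": 0.5},
    "education": {"school": 0.6, "teacher": 0.5},
}


def descriptor_posterior(term_weights, document_terms):
    """Noisy-OR estimate of P(descriptor relevant | observed terms).

    Each term present in the document independently "activates" the
    descriptor with probability equal to its weight (an assumption of
    this sketch); absent terms contribute nothing.
    """
    inhibit = prod(1.0 - w for t, w in term_weights.items() if t in document_terms)
    return 1.0 - inhibit


def rank_descriptors(document, thesaurus=THESAURUS, threshold=0.5):
    """Return (descriptor, posterior) pairs at or above threshold, best first."""
    terms = set(document.lower().split())
    scored = [(d, descriptor_posterior(w, terms)) for d, w in thesaurus.items()]
    kept = [(d, p) for d, p in scored if p >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_descriptors` on the text of an initiative returns the descriptors whose estimated posterior reaches the threshold, ordered from most to least probable. A real system would need proper tokenization, the thesaurus hierarchy, and weights estimated from data.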



Author information

Authors and Affiliations

Departamento de Ciencias de la Computación e Inteligencia Artificial, E.T.S.I. Informática y de Telecomunicaciones, Universidad de Granada, 18071, Granada, Spain
Luis M. de Campos, Juan M. Fernández-Luna, Juan F. Huete & Alfonso E. Romero


Editor information

Editors and Affiliations

LARODEC, Institut Supérieur de Gestion de Tunis, 41 Avenue de la liberté, 2000, Le Bardo, Tunisie
Khaled Mellouli


About this paper

Cite this paper

de Campos, L.M., Fernández-Luna, J.M., Huete, J.F., Romero, A.E. (2007). Automatic Indexing from a Thesaurus Using Bayesian Networks: Application to the Classification of Parliamentary Initiatives. In: Mellouli, K. (eds) Symbolic and Quantitative Approaches to Reasoning with Uncertainty. ECSQARU 2007. Lecture Notes in Computer Science, vol 4724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75256-1_75


DOI: https://doi.org/10.1007/978-3-540-75256-1_75
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75255-4
Online ISBN: 978-3-540-75256-1
eBook Packages: Computer Science, Computer Science (R0)
