Incorporating External Information in Bayesian Classifiers Via Linear Feature Transformations

Pahikkala, Tapio; Boberg, Jorma; Mylläri, Aleksandr; Salakoski, Tapio

doi:10.1007/11816508_41

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4139)

Included in the following conference series:

International Conference on Natural Language Processing (in Finland)

Abstract

The naive Bayes classifier is frequently used in natural language processing tasks. Inspired by a modified version of the method called the flexible Bayes classifier, we explore the use of linear feature transformations together with Bayesian classifiers, because they provide an elegant way to endow the classifier with external information that is relevant to the task. While the flexible Bayes classifier uses kernel density estimation to obtain the class-conditional probabilities of continuous-valued attributes, we use linear transformations to smooth the feature frequency counts of discrete-valued attributes. We evaluate the method on the context-sensitive spelling error correction problem using the Reuters corpus. For this task, we define a positional feature transformation and a word feature transformation that take advantage of the positional information of the context words and the part-of-speech information of words, respectively. Our experimental results show that the performance of Bayesian classifiers in natural language disambiguation tasks can be improved with the proposed transformations and that incorporating external information via linear feature transformations is a promising research direction.
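
The idea of smoothing discrete feature counts with a linear transformation can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the transformation matrices, so the Gaussian positional weights, the `width` and `alpha` parameters, the add-`alpha` normalization, all function names, and the toy confusion set ("peace" vs. "piece") are assumptions made only for illustration. Features are (word, position offset) pairs, and each smoothed count is a weighted sum of the raw counts of the same word at nearby positions.

```python
import math
from collections import defaultdict

def positional_kernel(p, q, width=1.0):
    """Assumed Gaussian weight between context positions p and q."""
    return math.exp(-((p - q) ** 2) / (2.0 * width ** 2))

def smooth_counts(counts, positions, width=1.0):
    """Linear map on counts: smoothed[(w, p)] = sum_q K(p, q) * counts[(w, q)]."""
    smoothed = defaultdict(float)
    for (word, q), c in counts.items():
        for p in positions:
            smoothed[(word, p)] += positional_kernel(p, q, width) * c
    return dict(smoothed)

def train(examples, positions, width=1.0):
    """examples: list of (label, {position_offset: word}) pairs."""
    raw = defaultdict(lambda: defaultdict(float))
    priors = defaultdict(int)
    words = set()
    for label, context in examples:
        priors[label] += 1
        for pos, word in context.items():
            raw[label][(word, pos)] += 1.0
            words.add(word)
    return {
        "priors": dict(priors),
        # Per-class raw counts are replaced by their transformed versions.
        "counts": {lab: smooth_counts(c, positions, width) for lab, c in raw.items()},
        "n_features": len(words) * len(positions),
    }

def log_score(label, context, model, alpha=1.0):
    """Naive Bayes log score computed from smoothed counts (add-alpha, assumed)."""
    n_examples = sum(model["priors"].values())
    counts = model["counts"][label]
    total = sum(counts.values())
    score = math.log(model["priors"][label] / n_examples)
    for pos, word in context.items():
        num = counts.get((word, pos), 0.0) + alpha
        score += math.log(num / (total + alpha * model["n_features"]))
    return score

def classify(context, model):
    """Return the label with the highest log score."""
    return max(model["priors"], key=lambda lab: log_score(lab, context, model))

# Toy training data for a hypothetical confusion set {"peace", "piece"}.
examples = [
    ("peace", {-1: "world", 1: "treaty"}),
    ("peace", {-2: "world", 1: "talks"}),
    ("piece", {-1: "a", 1: "of"}),
    ("piece", {-1: "small", 1: "of"}),
]
model = train(examples, positions=[-2, -1, 1, 2])
```

In this toy setup, `classify({-2: "a"}, model)` prefers "piece" even though "a" was never observed at offset -2, because the transformation spreads part of the count observed at offset -1 to the neighbouring position. With unsmoothed counts the two classes would tie on that feature.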

Author information

Authors and Affiliations

Turku Centre for Computer Science (TUCS), Department of Information Technology, University of Turku, Lemminkäisenkatu 14 A, FIN-20520, Turku, Finland
Tapio Pahikkala, Jorma Boberg, Aleksandr Mylläri & Tapio Salakoski

Editor information

Editors and Affiliations

Turku Centre for Computer Science (TUCS), Department of Information Technology, University of Turku, Joukahaisenkatu 3-5 B, FIN-20520, Turku, Finland
Tapio Salakoski
Turku Centre for Computer Science (TUCS) and Department of IT, University of Turku, Lemminkäisenkatu 14 A, 20520, Turku, Finland
Filip Ginter & Sampo Pyysalo
Department of Information Technology, University of Turku, Lemminkäisenkatu 14–18 A, FIN-20520, Turku, Finland
Tapio Pahikkala

About this paper

Cite this paper

Pahikkala, T., Boberg, J., Mylläri, A., Salakoski, T. (2006). Incorporating External Information in Bayesian Classifiers Via Linear Feature Transformations. In: Salakoski, T., Ginter, F., Pyysalo, S., Pahikkala, T. (eds) Advances in Natural Language Processing. FinTAL 2006. Lecture Notes in Computer Science, vol 4139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11816508_41

DOI: https://doi.org/10.1007/11816508_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37334-6
Online ISBN: 978-3-540-37336-0
eBook Packages: Computer Science, Computer Science (R0)
