Learning Class-Informed Semantic Similarity

Wang, Tinghua; Li, Wei

doi:10.1007/978-3-319-46675-0_48

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9949)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

The exponential kernel, which models semantic similarity by means of a diffusion process on a graph defined by lexicon and co-occurrence information, has been successfully applied to text categorization. However, the diffusion is an unsupervised process and therefore fails to exploit the class information available in a supervised classification scenario. To address this limitation, we present a class-informed exponential kernel that uses the class knowledge of the training documents in addition to the co-occurrence knowledge. The basic idea is to construct an augmented term-document matrix by encoding class information as additional terms and appending them to the training documents. Diffusion is then performed on the augmented term-document matrix. In this way, words belonging to the same class are indirectly drawn closer to each other, and the class-specific word correlations are strengthened. The proposed approach is demonstrated on several variants of the popular 20 Newsgroups data set.
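The construction described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the kernel form S = exp(λG) with G the term co-occurrence matrix, the function name `class_informed_exponential_kernel`, and the parameters `lam` and `class_weight` are all assumptions made here for illustration, and the chapter itself may define the diffusion, the weighting of class terms, and the decay factor differently.

```python
import numpy as np

def class_informed_exponential_kernel(X, labels, n_classes, lam=0.1, class_weight=1.0):
    """Hypothetical sketch: term-term similarity from diffusion on an
    augmented term-document matrix.

    X      : (n_terms, n_docs) term-document matrix of the training set
    labels : class index (0 .. n_classes-1) of each training document
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n_terms, n_docs = X.shape

    # Encode class information as extra "class terms": one row per class,
    # non-zero only in the documents that belong to that class.
    C = np.zeros((n_classes, n_docs))
    C[labels, np.arange(n_docs)] = class_weight
    X_aug = np.vstack([X, C])

    # Term-term co-occurrence matrix of the augmented vocabulary.
    G = X_aug @ X_aug.T

    # Assumed diffusion: matrix exponential exp(lam * G). G is symmetric,
    # so it can be computed from the eigendecomposition.
    evals, evecs = np.linalg.eigh(G)
    S_aug = (evecs * np.exp(lam * evals)) @ evecs.T

    # Unseen documents carry no class terms, so keep only the block that
    # relates the original terms to each other.
    return S_aug[:n_terms, :n_terms]

# Toy data: four terms, four documents, no two terms ever co-occur.
X = np.eye(4)
labels = [0, 0, 1, 1]

# class_weight=0 switches the class terms off (plain unsupervised diffusion).
S_plain = class_informed_exponential_kernel(X, labels, 2, lam=0.5, class_weight=0.0)
S_class = class_informed_exponential_kernel(X, labels, 2, lam=0.5, class_weight=1.0)

print("same class, plain:         ", S_plain[0, 1])
print("same class, class-informed:", S_class[0, 1])
print("different class:           ", S_class[0, 2])
```

In this toy example terms 0 and 1 never share a document, so plain diffusion leaves them unrelated. Once the class terms are added, a path through the shared class term connects them and their similarity becomes positive, while terms from different classes stay unrelated. A document kernel could then be formed as d1ᵀ S d2 for term vectors d1 and d2, again under the assumptions stated above.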

Acknowledgements

This work is supported in part by the National Natural Science Foundation of China (No. 61562003), the Natural Science Foundation of Jiangxi Province of China (Nos. 20151BAB207029 and 20161BAB202070), the China Scholarship Council (No. 201508360144) and the “Bai Ren Yuan Hang” Project of Jiangxi Province of China in 2015.

Author information

Authors and Affiliations

School of Mathematics and Computer Science, Gannan Normal University, Ganzhou, 341000, People’s Republic of China
Tinghua Wang & Wei Li

Corresponding author

Correspondence to Tinghua Wang.

Editor information

Editors and Affiliations

The University of Tokyo, Tokyo, Japan
Akira Hirose
Kobe University, Kobe, Japan
Seiichi Ozawa
Okinawa Institute of Science and Technology Graduate University, Onna, Japan
Kenji Doya
Nara Institute of Science and Technology, Ikoma, Japan
Kazushi Ikeda
Kyungpook National University, Daegu, Korea (Republic of)
Minho Lee
Chinese Academy of Sciences, Beijing, China
Derong Liu

About this paper

Cite this paper

Wang, T., Li, W. (2016). Learning Class-Informed Semantic Similarity. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol 9949. Springer, Cham. https://doi.org/10.1007/978-3-319-46675-0_48

DOI: https://doi.org/10.1007/978-3-319-46675-0_48
Published: 29 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46674-3
Online ISBN: 978-3-319-46675-0
eBook Packages: Computer Science, Computer Science (R0)
