Arabic Question Classification Using Support Vector Machines and Convolutional Neural Networks

Aouichat, Asma; Hadj Ameur, Mohamed Seghir; Geussoum, Ahmed

doi:10.1007/978-3-319-91947-8_12

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10859)

Included in the following conference series:

International Conference on Applications of Natural Language to Information Systems

Abstract

Question Classification is an important task in Question Answering systems and Information Retrieval, among other NLP systems. Given a question, the aim of Question Classification is to find the correct type of answer for it. The focus of this paper is on Arabic question classification. We present a novel approach that combines a Support Vector Machine (SVM) and a Convolutional Neural Network (CNN). The method works in two stages: in the first stage, an SVM model identifies the coarse (main) question class; in the second stage, for each coarse class returned by the SVM model, a CNN model predicts the subclass (finer class) within that main class. Our tests show that this approach to Arabic Question Classification yields very promising results.
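
The two-stage routing described in the abstract can be sketched as follows. This is only an illustration of the coarse-then-fine control flow, not the authors' implementation: the keyword rules stand in for the trained SVM and CNN models, and the class names (`HUMAN`, `LOCATION`, `ENTITY`, `city`, and so on), the `TwoStageClassifier` name, and the `"other"` fallback are all hypothetical. English questions are used purely for readability.

```python
from typing import Callable, Dict, Tuple

# A classifier is any callable that maps a question string to a label.
Classifier = Callable[[str], str]


class TwoStageClassifier:
    """Coarse-then-fine cascade: one coarse model, one fine model per coarse class."""

    def __init__(self, coarse_model: Classifier, fine_models: Dict[str, Classifier]):
        self.coarse_model = coarse_model  # stands in for the SVM stage
        self.fine_models = fine_models    # stands in for the per-class CNN stage

    def predict(self, question: str) -> Tuple[str, str]:
        # Stage 1: predict the coarse (main) class.
        coarse = self.coarse_model(question)
        # Stage 2: route the question to the fine model of that coarse class.
        fine_model = self.fine_models.get(coarse)
        # Assumption: fall back to a generic subclass when no fine model exists.
        fine = fine_model(question) if fine_model is not None else "other"
        return coarse, fine


# Hypothetical keyword rules, used only so the sketch runs without trained models.
def toy_coarse(question: str) -> str:
    q = question.lower()
    if q.startswith("who"):
        return "HUMAN"
    if q.startswith("where"):
        return "LOCATION"
    return "ENTITY"


def toy_human_fine(question: str) -> str:
    return "group" if "team" in question.lower() else "individual"


def toy_location_fine(question: str) -> str:
    return "city" if "city" in question.lower() else "country"


classifier = TwoStageClassifier(
    toy_coarse,
    {"HUMAN": toy_human_fine, "LOCATION": toy_location_fine},
)
print(classifier.predict("Where is the city of Algiers?"))
```

In a real system, each callable would wrap a trained model; the cascade only requires that the coarse labels match the keys of the fine-model dictionary.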

Notes

1. Closer scrutiny of the patterns used in [2] shows that they do not cover all the possible uses of interrogative patterns across different contexts and settings.
2. https://github.com/aboSamoor/polyglot.
3. https://radimrehurek.com/gensim/.
4. The dataset is available at http://cogcomp.org/Data/QA/QC/.
5. http://scikit-learn.org/.
6. https://keras.io/.

References

Abdelnasser, H., Ragab, M., Mohamed, R., Mohamed, A., Farouk, B., El-Makky, N., Torki, M.: Al-Bayan: an Arabic question answering system for the Holy Quran. In: Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing (ANLP), pp. 57–64 (2014)
Al Chalabi, H.M., Ray, S.K., Shaalan, K.: Question classification for Arabic question answering systems. In: International Conference on Information and Communication Technology Research (ICTRC), Abu Dhabi, United Arab Emirates, pp. 310–313. IEEE (2015)
Aouichat, A., Guessoum, A.: Building TALAA-AFAQ, a corpus of Arabic FActoid question-answers for a question answering system. In: Frasincar, F., Ittoo, A., Nguyen, L.M., Métais, E. (eds.) NLDB 2017. LNCS, vol. 10260, pp. 380–386. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59569-6_46
Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
Chollet, F., et al.: Keras (2015). https://github.com/fchollet/keras
Dahou, A., Xiong, S., Zhou, J., Haddoud, M.H., Duan, P.: Word embeddings and convolutional neural network for Arabic sentiment classification. In: The 26th International Conference on Computational Linguistics: Proceedings of COLING 2016, Technical Papers, pp. 2418–2427 (2016)
Eisele, A., Chen, Y.: MultiUN: a multilingual corpus from united nation documents. In: Tapias, D., Rosner, M., Piperidis, S., Odjik, J., Mariani, J., Maegaard, B., Choukri, K., Chair, N.C.C. (eds.) Proceedings of the Seventh Conference on International Language Resources and Evaluation, pp. 2868–2872. European Language Resources Association (ELRA), May 2010
Hasan, A.M., Zakaria, L.Q.: Question classification using support vector machine and pattern matching. J. Theor. Appl. Inf. Technol. 87(2), 259–265 (2016)
Hearst, M.A., Dumais, S.T., Osuna, E., Platt, J., Scholkopf, B.: Support vector machines. IEEE Intell. Syst. Appl. 13(4), 18–28 (1998)
Huang, Z., Thint, M., Qin, Z.: Question classification using head words and their hypernyms. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, 25–27 October (Sat-Mon) in Waikiki, Honolulu, Hawaii, pp. 927–936. Association for Computational Linguistics (2008)
Kim, Y.: Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 (2014)
Kingma, D., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Li, X., Roth, D.: Learning question classifiers. In: Proceedings of the 19th International Conference on Computational Linguistics-Volume 1, Taipei, Taiwan, 24 August–01 September, pp. 1–7. Association for Computational Linguistics (2002)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, NIPS 2013, pp. 3111–3119. Curran Associates Inc., USA (2013). http://dl.acm.org/citation.cfm?id=2999792.2999959
Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML-2010), pp. 807–814 (2010)
Nyberg, E., Mitamura, T., Carbonnell, J., Callan, J., Collins-Thompson, K., Czuba, K., Duggan, M., Hiyakumoto, L., Hu, N., Huang, Y.: The JAVELIN question-answering system at TREC 2002. NIST SPEC. PUBL. SP 251, 128–137 (2003)
Ray, S.K., Singh, S., Joshi, B.P.: A semantic approach for question classification using WordNet and Wikipedia. Pattern Recognit. Lett. 31(13), 1935–1943 (2010)
Salton, G., Wong, A., Yang, C.S.: A vector space model for automatic indexing. Commun. ACM 18(11), 613–620 (1975). https://doi.org/10.1145/361219.361220
Scherer, D., Müller, A., Behnke, S.: Evaluation of pooling operations in convolutional architectures for object recognition. In: Diamantaras, K., Duch, W., Iliadis, L.S. (eds.) ICANN 2010. LNCS, vol. 6354, pp. 92–101. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15825-4_10
Silva, J., Coheur, L., Mendes, A.C., Wichert, A.: From symbolic to sub-symbolic information in question classification. Artif. Intell. Rev. 35(2), 137–154 (2011)
Skilling, J.: Maximum Entropy and Bayesian Methods. Springer Science & Business Media, Netherlands (1988)
Vapnik, V.: The Nature of Statistical Learning Theory. Springer Science & Business Media, New York (2013)

Author information

Authors and Affiliations

NLP and Machine Learning Research Group (TALAA), Laboratory for Research in Artificial Intelligence (LRIA), University of Science and Technology Houari Boumediene (USTHB), Algiers, Algeria
Asma Aouichat, Mohamed Seghir Hadj Ameur & Ahmed Geussoum

Corresponding author

Correspondence to Asma Aouichat.

Editor information

Editors and Affiliations

Université de Franche-Comté, Besançon, France
Max Silberztein
Conservatoire National des Arts et Métiers, Paris, France
Faten Atigui
Conservatoire National des Arts et Métiers, Paris, France
Elena Kornyshova
Conservatoire National des Arts et Métiers, Paris, France
Elisabeth Métais
University of Salford, Manchester, United Kingdom
Farid Meziane

Cite this paper

Aouichat, A., Hadj Ameur, M.S., Geussoum, A. (2018). Arabic Question Classification Using Support Vector Machines and Convolutional Neural Networks. In: Silberztein, M., Atigui, F., Kornyshova, E., Métais, E., Meziane, F. (eds) Natural Language Processing and Information Systems. NLDB 2018. Lecture Notes in Computer Science, vol 10859. Springer, Cham. https://doi.org/10.1007/978-3-319-91947-8_12

DOI: https://doi.org/10.1007/978-3-319-91947-8_12
Published: 22 May 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91946-1
Online ISBN: 978-3-319-91947-8
eBook Packages: Computer Science, Computer Science (R0)