Abstract
As society develops, more and more people are concerned with education, including preschool, primary and secondary, and adult education. These people want to retrieve educational content from the large amount of information available on the Internet. From a technical point of view, this requires distinguishing educational from non-educational data. This paper addresses the educational versus non-educational text classification problem using deep Gaussian processes (DGPs). Before training the DGP, word2vec is used to construct vector representations of the text data. The DGP regression model is then fitted to the processed data. Experiments on real-world text data demonstrate the feasibility of the DGP for this text classification problem. The promising results show that the proposed method is valid and outperforms related methods such as the GP and the sparse GP.
H. Wang and J. Zhao contributed equally to this work.
Notes
1. Word2vec is an efficient tool released by Google for representing words as real-valued vectors. It can be used from Python through the gensim toolkit.
2. \(\mathbf {z}_l\) is omitted in this paper to simplify the notation.
3. \(q^{\setminus 1}(\mathbf {u})\) is the variational cavity distribution of \(\mathbf {u}\).
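As a rough illustration of the preprocessing mentioned in the abstract and in note 1, the sketch below turns a tokenized text into a single feature vector and maps a real-valued regression output to a class label. It is not the authors' implementation: averaging word vectors, the labels +1 and -1, the threshold of 0, the toy vectors, and the names `TOY_VECTORS`, `document_vector` and `label_from_regression` are all assumptions made for illustration. In practice the word vectors would come from a trained word2vec model rather than a hand-written dictionary.

```python
from typing import Dict, List, Sequence

# Toy 3-dimensional word vectors, invented for illustration only.
# In practice these would come from a trained word2vec model.
TOY_VECTORS: Dict[str, List[float]] = {
    "school": [0.9, 0.1, 0.0],
    "teacher": [0.8, 0.2, 0.1],
    "exam": [0.7, 0.0, 0.2],
    "stock": [0.0, 0.9, 0.3],
    "market": [0.1, 0.8, 0.4],
}


def document_vector(
    tokens: Sequence[str],
    vectors: Dict[str, List[float]],
    dim: int = 3,
) -> List[float]:
    """Average the vectors of known tokens (an assumed pooling choice).

    Tokens without a vector are skipped; if no token is known,
    a zero vector of length `dim` is returned.
    """
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    # zip(*known) iterates over the vector components column by column.
    return [sum(column) / len(known) for column in zip(*known)]


def label_from_regression(score: float, threshold: float = 0.0) -> int:
    """Map a regression output to +1 (educational) or -1 (non-educational).

    The +1/-1 coding and the threshold are assumptions of this sketch.
    """
    return 1 if score >= threshold else -1
```

A document vector produced this way would serve as one input row for the regression model, and `label_from_regression` would be applied to the model's predictive mean for that row.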
Acknowledgments
The first two authors, Huijuan Wang and Jing Zhao, are joint first authors. This work is sponsored by the Shanghai Sailing Program. The corresponding author, Shiliang Sun, also acknowledges support from NSFC Projects 61673179 and 61370175, and the Shanghai Knowledge Service Platform Project (No. ZF1213).
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Wang, H., Zhao, J., Tang, Z., Sun, S. (2017). Educational and Non-educational Text Classification Based on Deep Gaussian Processes. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10634. Springer, Cham. https://doi.org/10.1007/978-3-319-70087-8_44
DOI: https://doi.org/10.1007/978-3-319-70087-8_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70086-1
Online ISBN: 978-3-319-70087-8
eBook Packages: Computer Science (R0)