Abstract
Short Message Service (SMS) is now in widespread use in many countries. Researchers typically apply conventional text classifiers to filter SMS spam, but most do not take the actual characteristics of SMS spam into account: the messages are miscellaneous in content, short, and highly variable. Conventional classifiers are therefore poorly suited to direct use in SMS spam filtering. In this paper, we propose using a Biterm Topic Model (BTM) to identify SMS spam. The BTM can effectively learn latent semantic features from an SMS spam corpus. Our experiments show that the BTM learns higher-quality topic features from the SMS spam corpus and is more effective in the SMS spam filtering task.
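To make the term concrete: in a biterm topic model, a biterm is an unordered pair of words that co-occur in the same short text, and the model is fitted on biterms pooled over the whole corpus rather than on per-document word counts. The following is a minimal sketch of the biterm extraction step only, assuming whitespace tokenization and treating each whole message as the co-occurrence context. The function names and sample messages are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) in one short message."""
    # Sorting each pair makes (a, b) and (b, a) the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def corpus_biterms(messages):
    """Count biterms over a whole corpus of short messages."""
    counts = Counter()
    for message in messages:
        # Assumption: lowercase + whitespace split stands in for real tokenization.
        counts.update(extract_biterms(message.lower().split()))
    return counts


# Hypothetical toy corpus, for illustration only.
messages = ["win free prize now", "free prize call now", "see you at lunch"]
counts = corpus_biterms(messages)
```

Pooling biterms across messages is what lets the model see co-occurrence evidence that a single very short message cannot provide on its own; topic inference over these counts is a separate step not shown here.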
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Ma, J., Zhang, Y., Zhang, L. (2017). Mobile Spam Filtering base on BTM Topic Model. In: Xhafa, F., Barolli, L., Amato, F. (eds) Advances on P2P, Parallel, Grid, Cloud and Internet Computing. 3PGCIC 2016. Lecture Notes on Data Engineering and Communications Technologies, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-319-49109-7_63
DOI: https://doi.org/10.1007/978-3-319-49109-7_63
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49108-0
Online ISBN: 978-3-319-49109-7
eBook Packages: Engineering, Engineering (R0)