Mobile Spam Filtering base on BTM Topic Model

Ma, Jialin; Zhang, Yongjun; Zhang, Lin

doi:10.1007/978-3-319-49109-7_63

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 1)

Included in the following conference series:

International Conference on P2P, Parallel, Grid, Cloud and Internet Computing

Abstract

At present, Short Message Service (SMS) is widespread in many countries. Many researchers use conventional text classifiers to filter SMS spam, but most do not take the actual characteristics of SMS spam into consideration: the content of spam messages is miscellaneous, short, and variable. Traditional classifiers are therefore not well suited to direct use for SMS spam filtering. In this paper, we propose to utilize a Biterm Topic Model (BTM) to identify SMS spam. The BTM can effectively learn latent semantic features from an SMS spam corpus. The experiments in our work show that the BTM learns higher-quality topic features from the SMS spam corpus and is more effective in the task of SMS spam filtering.
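To make the term "biterm" concrete: in the BTM literature, a biterm is an unordered pair of words that co-occur in the same short text, and the model learns topics from the corpus-wide collection of such pairs rather than from per-document word counts. The abstract does not describe the authors' preprocessing, so the following is only a minimal sketch of the biterm-extraction step. The function name `extract_biterms` and the toy messages are hypothetical, and topic inference itself (for example by Gibbs sampling) is not shown.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return all unordered word pairs (biterms) from one short message.

    Assumption: the whole message is treated as a single context window,
    which is the usual choice for very short texts such as SMS.
    """
    # Sort each pair so that (a, b) and (b, a) count as the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


# Hypothetical toy messages, already lowercased and tokenized.
messages = [
    ["free", "prize", "call", "now"],
    ["call", "me", "later"],
    ["free", "prize", "now"],
]

# Aggregate biterms over the whole corpus; a BTM-style model would fit
# its topic distributions to this corpus-level collection.
biterm_counts = Counter()
for tokens in messages:
    biterm_counts.update(extract_biterms(tokens))
```

Pooling co-occurrences across all messages in this way is what lets a biterm-based model cope with the sparsity of individual short messages, since a single SMS rarely contains enough words to estimate a reliable per-document topic mixture.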

Author information

Authors and Affiliations

Huaiyin Institute of Technology, Huaian, China
Jialin Ma, Yongjun Zhang & Lin Zhang
College of Computer and Information, Hohai University, Nanjing, China
Jialin Ma & Yongjun Zhang

Corresponding author

Correspondence to Jialin Ma.

Editor information

Editors and Affiliations

Technical University of Catalonia, Campus Nord, Ed. Omega (Room 109), Barcelona, Spain
Fatos Xhafa
Fukuoka Institute of Technology, Fukuoka, Japan
Leonard Barolli
Università degli Studi di Napoli Federico II, Napoli, Italy
Flora Amato

About this paper

Cite this paper

Ma, J., Zhang, Y., Zhang, L. (2017). Mobile Spam Filtering base on BTM Topic Model. In: Xhafa, F., Barolli, L., Amato, F. (eds) Advances on P2P, Parallel, Grid, Cloud and Internet Computing. 3PGCIC 2016. Lecture Notes on Data Engineering and Communications Technologies, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-319-49109-7_63

DOI: https://doi.org/10.1007/978-3-319-49109-7_63
Published: 22 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49108-0
Online ISBN: 978-3-319-49109-7
eBook Packages: Engineering, Engineering (R0)
