Mobile Spam Filtering base on BTM Topic Model

  • Conference paper
Advances on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC 2016)

Abstract

Short Message Service (SMS) is now widespread in many countries, and researchers commonly filter SMS spam with conventional text classifiers. Most of this work, however, does not take the actual characteristics of SMS spam into account: the messages are miscellaneous in content, short, and highly variable. Traditional classifiers are therefore poorly suited to direct use in SMS spam filtering. In this paper, we propose using a Biterm Topic Model (BTM) to identify SMS spam. The BTM can effectively learn latent semantic features from an SMS spam corpus. Our experiments show that the BTM learns higher-quality topic features from the SMS spam corpus and is more effective in the task of SMS spam filtering.
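As a rough illustration of why a biterm model suits short messages, the sketch below extracts biterms, that is, unordered pairs of words that co-occur in the same short text, and pools them over a toy corpus. It is a minimal sketch of the input a BTM works on, not the authors' implementation: the function name `extract_biterms` and the sample messages are hypothetical, and tokenization is assumed to have been done already.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered pair of tokens in one short message.

    Each pair is sorted so that ("free", "win") and ("win", "free")
    count as the same biterm.
    """
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


# Hypothetical, already-tokenized messages (not from the paper's corpus).
messages = [
    ["win", "free", "prize"],
    ["free", "prize", "call"],
]

# A biterm topic model works on the biterms pooled over the whole corpus,
# rather than on the sparse word counts of each individual message.
corpus_biterms = Counter(
    biterm for tokens in messages for biterm in extract_biterms(tokens)
)
```

Pooling the pairs across messages is what offsets the sparsity of any single short message: a pair such as ("free", "prize") accumulates evidence from every message in which it appears.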

Corresponding author

Correspondence to Jialin Ma.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Ma, J., Zhang, Y., Zhang, L. (2017). Mobile Spam Filtering base on BTM Topic Model. In: Xhafa, F., Barolli, L., Amato, F. (eds) Advances on P2P, Parallel, Grid, Cloud and Internet Computing. 3PGCIC 2016. Lecture Notes on Data Engineering and Communications Technologies, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-319-49109-7_63

  • DOI: https://doi.org/10.1007/978-3-319-49109-7_63

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49108-0

  • Online ISBN: 978-3-319-49109-7

  • eBook Packages: Engineering, Engineering (R0)
