User Identification for Instant Messages

  • Conference paper
Neural Information Processing (ICONIP 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7064)

Abstract

In this paper, we study how to recognize a user's identity from instant messages. Given the special characteristics of chat text, we focus on three problems: first, how to extract features from chat text; second, how the size of the training data affects the user model; and third, which classification model suits this problem. The chat corpus used in this paper was collected from a Chinese IM tool, and different feature selection methods and classification models are evaluated on it.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ding, Y., Meng, X., Chai, G., Tang, Y. (2011). User Identification for Instant Messages. In: Lu, BL., Zhang, L., Kwok, J. (eds) Neural Information Processing. ICONIP 2011. Lecture Notes in Computer Science, vol 7064. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24965-5_13

  • DOI: https://doi.org/10.1007/978-3-642-24965-5_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24964-8

  • Online ISBN: 978-3-642-24965-5

  • eBook Packages: Computer Science, Computer Science (R0)
