Detection and Correction Scheme of Internet Chat Lingo Based on Statistic and Pinyin Similarity

  • Conference paper
Information Computing and Applications (ICICA 2012)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 307)

Abstract

The development of the Internet has promoted the use of Internet chat lingo. This type of language is diverse and irregular, which makes it difficult for natural language processing. In this paper, based on the characteristics of Chinese and of Internet chat lingo, we propose a method for lingo detection and correction based on statistics and pinyin similarity. The method applies a bigram model to detect the boundaries of lingo terms and then corrects them using pinyin similarity. Experimental results and analysis show that the method can effectively detect and correct chat lingo.
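The abstract names the two stages but gives no formulas, so the following is only a minimal sketch of how such a pipeline might look, not the authors' implementation. It assumes a character-level bigram model with add-one smoothing, a probability threshold chosen for the toy data, and a pinyin similarity defined as one minus the normalized Levenshtein distance. The corpus, the lexicon, the hand-written pinyin strings, and all function names are hypothetical.

```python
from collections import Counter


def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def pinyin_similarity(p, q):
    """Assumed measure: 1 - normalized edit distance between pinyin strings."""
    longest = max(len(p), len(q))
    return 1.0 if longest == 0 else 1.0 - levenshtein(p, q) / longest


def train_bigram(corpus):
    """Count character unigrams and adjacent character pairs."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        unigrams.update(sentence)
        bigrams.update(zip(sentence, sentence[1:]))
    return unigrams, bigrams


def low_probability_bigrams(sentence, unigrams, bigrams, threshold=0.15):
    """Indices i where the smoothed P(sentence[i+1] | sentence[i]) < threshold."""
    vocab = len(unigrams)
    flagged = []
    for i, (a, b) in enumerate(zip(sentence, sentence[1:])):
        prob = (bigrams[(a, b)] + 1) / (unigrams[a] + vocab)
        if prob < threshold:
            flagged.append(i)
    return flagged


def correct(lingo_pinyin, lexicon, min_similarity=0.6):
    """Return the lexicon word whose pinyin is most similar, or None."""
    best_word, best_score = None, min_similarity
    for word, pinyin in lexicon.items():
        score = pinyin_similarity(lingo_pinyin, pinyin)
        if score >= best_score:
            best_word, best_score = word, score
    return best_word


# Hypothetical toy data: a tiny "standard" corpus and a word -> pinyin lexicon.
CORPUS = ["我喜欢你", "你喜欢什么", "我是同学"]
LEXICON = {"喜欢": "xihuan", "什么": "shenme", "同学": "tongxue"}

unigrams, bigrams = train_bigram(CORPUS)
flagged = low_probability_bigrams("我稀饭你", unigrams, bigrams)
suggestion = correct("xifan", LEXICON)  # pinyin of the lingo term 稀饭
```

In this toy setting every adjacent pair in "我稀饭你" is unseen and falls below the threshold, while the pairs in "我喜欢你" do not; the pinyin "xifan" is then matched to 喜欢 ("xihuan"). A realistic system would need a large corpus, a real character-to-pinyin table, and a tuned threshold.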

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Han, B., Li, Z. (2012). Detection and Correction Scheme of Internet Chat Lingo Based on Statistic and Pinyin Similarity. In: Liu, C., Wang, L., Yang, A. (eds) Information Computing and Applications. ICICA 2012. Communications in Computer and Information Science, vol 307. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34038-3_24

  • DOI: https://doi.org/10.1007/978-3-642-34038-3_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34037-6

  • Online ISBN: 978-3-642-34038-3

  • eBook Packages: Computer Science, Computer Science (R0)
