Measures to Detect Word Substitution in Intercepted Communication

  • Conference paper
Intelligence and Security Informatics (ISI 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3975)

Abstract

Those who want to conceal the content of their communications can do so by replacing words that might trigger attention by other words or locutions that seem more ordinary. We address the problem of discovering such substitutions when the original and substitute words have the same natural frequency. We construct a number of measures, all of which search for local discontinuities in properties such as string and bag-of-words frequency. Each of these measures individually is a weak detector. However, we show that combining them produces a detector that is reasonably effective.
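As a rough illustration of the idea in the abstract, the sketch below scores one word in a sentence with two weak frequency-based measures and combines them by voting. It is not the authors' implementation: the toy corpus, the function names (`string_score`, `bag_score`, `is_suspicious`), the thresholds, and the voting rule are all assumptions made for this example, and a real system would draw its frequencies from a large reference corpus.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy reference corpus standing in for large-scale frequency data.
CORPUS = [
    "the bomb is in position",
    "the bomb will go off tonight",
    "the package is in the mail",
    "dinner is in the oven",
    "the alarm will go off tonight",
]


def tokenize(sentence):
    """Lowercase and split on whitespace."""
    return sentence.lower().split()


# Reference counts: adjacent word pairs, and unordered pairs sharing a sentence.
BIGRAMS = Counter()
COOCCURRENCE = Counter()
for _sentence in CORPUS:
    _tokens = tokenize(_sentence)
    BIGRAMS.update(zip(_tokens, _tokens[1:]))
    for _a, _b in combinations(sorted(set(_tokens)), 2):
        COOCCURRENCE[(_a, _b)] += 1


def string_score(tokens, i):
    """String measure: how often tokens[i] appears next to its actual neighbours."""
    score = 0
    if i > 0:
        score += BIGRAMS[(tokens[i - 1], tokens[i])]
    if i + 1 < len(tokens):
        score += BIGRAMS[(tokens[i], tokens[i + 1])]
    return score


def bag_score(tokens, i):
    """Bag-of-words measure: how often tokens[i] shares a sentence with the other words."""
    word = tokens[i]
    others = set(tokens) - {word}
    return sum(COOCCURRENCE[tuple(sorted((word, other)))] for other in others)


def is_suspicious(sentence, i, string_min=1, bag_min=2, min_votes=2):
    """Flag position i when at least min_votes measures fall below their threshold."""
    tokens = tokenize(sentence)
    votes = int(string_score(tokens, i) < string_min)
    votes += int(bag_score(tokens, i) < bag_min)
    return votes >= min_votes


if __name__ == "__main__":
    print(is_suspicious("the bomb will go off tonight", 1))    # False
    print(is_suspicious("the dinner will go off tonight", 1))  # True
```

On this toy data the original word scores 3 and 6 on the two measures, while the substituted word scores 0 and 1, so only the substitution is flagged. Requiring agreement between measures is one simple way to combine weak detectors; other combination rules are equally plausible.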

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fong, S., Skillicorn, D.B., Roussinov, D. (2006). Measures to Detect Word Substitution in Intercepted Communication. In: Mehrotra, S., Zeng, D.D., Chen, H., Thuraisingham, B., Wang, FY. (eds) Intelligence and Security Informatics. ISI 2006. Lecture Notes in Computer Science, vol 3975. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11760146_17

  • DOI: https://doi.org/10.1007/11760146_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34478-0

  • Online ISBN: 978-3-540-34479-7

  • eBook Packages: Computer Science, Computer Science (R0)