Abstract
Those who want to conceal the content of their communications can do so by replacing words that might attract attention with other words or locutions that seem more ordinary. We address the problem of discovering such substitutions when the original and substitute words have the same natural frequency. We construct a number of measures, all of which search for local discontinuities in properties such as string and bag-of-words frequency. Each of these measures is individually a weak detector. However, we show that combining them produces a detector that is reasonably effective.
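As a rough illustration of combining several weak frequency-based measures, the following minimal sketch scores a suspect word by how well it fits its left neighbour, its right neighbour, and the other content words of its sentence, then counts how many measures fall below a threshold. It is not the set of measures constructed in the paper: the toy corpus, the stopword list, the threshold, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy reference corpus; "bomb" and "alcohol" both occur twice,
# mimicking a substitution between words of equal natural frequency.
CORPUS = [
    "the bomb is in position".split(),
    "the bomb is ready".split(),
    "the alcohol was cheap".split(),
    "he drank alcohol slowly".split(),
]
STOPWORDS = {"the", "is", "in", "was", "he"}  # tiny illustrative list


def build_counts(corpus):
    """Count single words and adjacent word pairs in the corpus."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in corpus:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def left_score(tokens, i, unigrams, bigrams):
    """Share of the word's corpus occurrences preceded by its left neighbour."""
    if i == 0:
        return 1.0  # no left neighbour, so no evidence against the word
    if unigrams[tokens[i]] == 0:
        return 0.0  # word never seen in the corpus
    return bigrams[(tokens[i - 1], tokens[i])] / unigrams[tokens[i]]


def right_score(tokens, i, unigrams, bigrams):
    """Share of the word's corpus occurrences followed by its right neighbour."""
    if i == len(tokens) - 1:
        return 1.0  # no right neighbour, so no evidence against the word
    if unigrams[tokens[i]] == 0:
        return 0.0
    return bigrams[(tokens[i], tokens[i + 1])] / unigrams[tokens[i]]


def bag_score(tokens, i, corpus):
    """Share of corpus sentences with the word that also share a content word."""
    target = tokens[i]
    context = set(tokens) - STOPWORDS - {target}
    containing = [s for s in corpus if target in s]
    if not containing:
        return 0.0
    if not context:
        return 1.0  # nothing to compare against
    return sum(1 for s in containing if context & set(s)) / len(containing)


def count_votes(tokens, i, corpus, threshold=0.25):
    """Number of weak measures (out of three) that find the word out of place."""
    unigrams, bigrams = build_counts(corpus)
    scores = [
        left_score(tokens, i, unigrams, bigrams),
        right_score(tokens, i, unigrams, bigrams),
        bag_score(tokens, i, corpus),
    ]
    return sum(score < threshold for score in scores)


if __name__ == "__main__":
    for text in ("the bomb is in position", "the alcohol is in position"):
        votes = count_votes(text.split(), 1, CORPUS)
        print(text, "->", votes, "of 3 measures flag word 1")
```

A simple vote count is only one possible way to combine measures; a trained classifier over the individual scores would be another, and the choice here is made purely for brevity.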
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fong, S., Skillicorn, D.B., Roussinov, D. (2006). Measures to Detect Word Substitution in Intercepted Communication. In: Mehrotra, S., Zeng, D.D., Chen, H., Thuraisingham, B., Wang, FY. (eds) Intelligence and Security Informatics. ISI 2006. Lecture Notes in Computer Science, vol 3975. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11760146_17
DOI: https://doi.org/10.1007/11760146_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34478-0
Online ISBN: 978-3-540-34479-7
eBook Packages: Computer Science, Computer Science (R0)