Abstract
This paper addresses the problem of automatically identifying pedophile content. We present a filename categorization system trained to identify suspicious files on P2P networks. In our initial experiments, we used regular pornography data as a substitute for child pornography. Our system separates filenames of pornographic media from other filenames with an accuracy of 91–97%.
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Panchenko, A., Beaufort, R., Naets, H., Fairon, C. (2013). Towards Detection of Child Sexual Abuse Media: Categorization of the Associated Filenames. In: Serdyukov, P., et al. Advances in Information Retrieval. ECIR 2013. Lecture Notes in Computer Science, vol 7814. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36973-5_82
DOI: https://doi.org/10.1007/978-3-642-36973-5_82
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36972-8
Online ISBN: 978-3-642-36973-5
eBook Packages: Computer Science (R0)