Abstract
This paper presents the design and implementation of a prototype that recognizes grooming attacks in the context of child online protection (COP) using a hybrid Natural Language Processing and Machine Learning model based on Convolutional Neural Networks (CNN). The solution uses a vector representation of words as its semantic model and is implemented in TensorFlow. It evaluates grooming classification on text (dialogue) prepared asynchronously in a controlled environment, and the paper describes the methodologies, techniques and frameworks used in its development. The model produces a high number of false positives, and therefore low precision and F-score, but achieves 88.4% accuracy and an AUROC (Area Under the Receiver Operating Characteristic curve) of 0.81.
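High accuracy alongside low precision, as reported in the abstract, is what one would expect when grooming dialogues are rare relative to benign ones. The Python sketch below illustrates this with a hypothetical confusion matrix. The counts are invented for illustration and are not the paper's results; they were chosen only so that accuracy comes out at 88.4%. The names `ConfusionCounts`, `accuracy`, `precision`, `recall` and `f1_score` are likewise assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion matrix; 'positive' means a dialogue flagged as grooming."""
    tp: int  # grooming dialogues correctly flagged
    fp: int  # benign dialogues wrongly flagged (false positives)
    fn: int  # grooming dialogues missed
    tn: int  # benign dialogues correctly passed


def accuracy(c: ConfusionCounts) -> float:
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def precision(c: ConfusionCounts) -> float:
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: ConfusionCounts) -> float:
    actual = c.tp + c.fn
    return c.tp / actual if actual else 0.0


def f1_score(c: ConfusionCounts) -> float:
    # Harmonic mean of precision and recall, written in terms of raw counts.
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


# HYPOTHETICAL counts: 1000 dialogues, of which only 50 are grooming.
counts = ConfusionCounts(tp=40, fp=106, fn=10, tn=844)

print(f"accuracy  = {accuracy(counts):.3f}")
print(f"precision = {precision(counts):.3f}")
print(f"recall    = {recall(counts):.3f}")
print(f"F1        = {f1_score(counts):.3f}")
```

In this invented example the large pool of true negatives dominates accuracy, while the 106 false positives outnumber the 40 true positives and pull precision and the F-score down.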
Acknowledgment
This work was supported by Universidad de Caldas and Universidad Tecnológica de Pereira in their research groups GITIR and SIRIUS.
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Muñoz, F., Isaza, G., Castillo, L. (2021). SMARTSEC4COP: Smart Cyber-Grooming Detection Using Natural Language Processing and Convolutional Neural Networks. In: Dong, Y., Herrera-Viedma, E., Matsui, K., Omatsu, S., González Briones, A., Rodríguez González, S. (eds) Distributed Computing and Artificial Intelligence, 17th International Conference. DCAI 2020. Advances in Intelligent Systems and Computing, vol 1237. Springer, Cham. https://doi.org/10.1007/978-3-030-53036-5_2
DOI: https://doi.org/10.1007/978-3-030-53036-5_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-53035-8
Online ISBN: 978-3-030-53036-5
eBook Packages: Intelligent Technologies and Robotics (R0)