Abstract
The widespread use of deception in written content has created a need for methods that automatically profile and identify deceivers. In particular, identifying deception based on demographic data such as gender, age, and religion has become important because of ethical and security concerns. Previous work on deception detection has studied the role of gender using statistical approaches and domain-specific data. This work explores gender detection in open-domain truths and lies using a machine learning approach. First, we collect a deception dataset consisting of truths and lies from male and female participants. Second, we extract a large feature set consisting of n-grams, shallow and deep syntactic features, semantic features derived from a psycholinguistic lexicon, and features derived from readability metrics. Third, we build deception classifiers that predict participants’ gender with classification accuracies ranging from 60% to 70%. In addition, we present an analysis of differences in the linguistic style used by deceivers given their reported gender.
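As a rough illustration of the feature extraction step described above, the following minimal sketch builds unigram and bigram counts together with two crude readability proxies for a single statement. It is not the authors' pipeline: the function names, the tokenization rule, and the two proxies (words per sentence, characters per word) are assumptions chosen for illustration, since the abstract does not say which readability metrics were used, and the syntactic and lexicon-based features are omitted.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (hypothetical rule)."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(tokens, n):
    """Count contiguous n-grams, keyed as space-joined strings."""
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def readability_proxies(text):
    """Two crude stand-ins for readability metrics (an assumption, not the paper's)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = tokenize(text)
    if not tokens or not sentences:
        return {"words_per_sentence": 0.0, "chars_per_word": 0.0}
    return {
        "words_per_sentence": len(tokens) / len(sentences),
        "chars_per_word": sum(len(t) for t in tokens) / len(tokens),
    }


def extract_features(text):
    """Merge n-gram counts and readability proxies into one sparse feature dict."""
    tokens = tokenize(text)
    features = {}
    for n in (1, 2):
        for gram, count in ngram_counts(tokens, n).items():
            features[f"{n}gram={gram}"] = float(count)
    for name, value in readability_proxies(text).items():
        features[f"read={name}"] = value
    return features


# Invented example statement, not taken from the paper's dataset.
sample = "I never lie. I always tell the truth."
features = extract_features(sample)
```

A dictionary of this shape could then be vectorized and passed to any standard classifier trained on gender labels; which learner the authors used is not stated in the abstract.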
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Pérez-Rosas, V., Mihalcea, R. (2014). Gender Differences in Deceivers Writing Style. In: Gelbukh, A., Espinoza, F.C., Galicia-Haro, S.N. (eds) Human-Inspired Computing and Its Applications. MICAI 2014. Lecture Notes in Computer Science, vol 8856. Springer, Cham. https://doi.org/10.1007/978-3-319-13647-9_17
DOI: https://doi.org/10.1007/978-3-319-13647-9_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13646-2
Online ISBN: 978-3-319-13647-9
eBook Packages: Computer Science (R0)