Gender Differences in Deceivers Writing Style

Pérez-Rosas, Verónica; Mihalcea, Rada

doi:10.1007/978-3-319-13647-9_17

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8856)

Included in the following conference series:

Mexican International Conference on Artificial Intelligence

Abstract

The widespread use of deception in written content has motivated the need for methods to automatically profile and identify deceivers. In particular, identifying deception based on demographic data such as gender, age, and religion has become important due to ethical and security concerns. Previous work on deception detection has studied the role of gender using statistical approaches and domain-specific data. This work explores gender detection in open-domain truths and lies using a machine learning approach. First, we collect a deception dataset consisting of truths and lies from male and female participants. Second, we extract a large feature set consisting of n-grams, shallow and deep syntactic features, semantic features derived from a psycholinguistic lexicon, and features derived from readability metrics. Third, we build deception classifiers able to predict participants' gender with classification accuracies ranging from 60% to 70%. In addition, we present an analysis of differences in the linguistic style used by deceivers given their reported gender.
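
To make the feature-extraction step in the abstract more concrete, here is a minimal sketch of two of the feature families it names: word n-grams and a readability metric. It is not the authors' implementation. The tokenizer, the unigram-plus-bigram cutoff, the helper names, and the choice of the Automated Readability Index (ARI) are all assumptions made for illustration; the abstract does not say which readability metrics or n-gram orders were used.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens (assumed tokenizer, deliberately simple)."""
    return re.findall(r"[a-z]+", text.lower())


def ngram_counts(tokens, n_max=2):
    """Count all word n-grams for n = 1..n_max (n_max=2 is an assumption)."""
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def automated_readability_index(text):
    """ARI = 4.71 * chars/words + 0.5 * words/sentences - 21.43.

    ARI is one standard readability metric, chosen here as an example only.
    """
    words = text.split()
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    chars = sum(1 for c in text if c.isalnum())
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


def extract_features(text):
    """Merge n-gram counts and the readability score into one feature dict."""
    features = dict(ngram_counts(tokenize(text)))
    features["__ari__"] = automated_readability_index(text)
    return features


if __name__ == "__main__":
    # Invented placeholder sentence, not taken from the paper's dataset.
    sample = "I never lie. I never lie!"
    for name, value in sorted(extract_features(sample).items()):
        print(name, value)
```

A feature dictionary of this kind could then be vectorized and passed to any standard classifier trained on gender labels. The syntactic and lexicon-based features mentioned in the abstract would need a parser and the lexicon itself, so they are left out of this sketch.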

Author information

Authors and Affiliations

University of North Texas, USA
Verónica Pérez-Rosas & Rada Mihalcea
University of Michigan, USA
Verónica Pérez-Rosas & Rada Mihalcea

Editor information

Editors and Affiliations

Centro de Investigación en Computación, Instituto Politécnico Nacional, Av. Juan Dios Bátiz s/n, Col. Nueva Industrial Vallejo, 07738, Mexico City, Mexico
Alexander Gelbukh
Área Académica de Computación y Electrónica, Carretera Pachuca-Tulancingo, Universidad Autónoma del Estado de Hidalgo, Km. 4.5, Col. Carboneras, Mineral de la Reforma, 42180, Hidalgo, Mexico
Félix Castro Espinoza
Facultad de ciencias, Universidad Autónoma Nacional de México, Ciudad Universitaria, México DF, Mexico
Sofía N. Galicia-Haro

Cite this paper

Pérez-Rosas, V., Mihalcea, R. (2014). Gender Differences in Deceivers Writing Style. In: Gelbukh, A., Espinoza, F.C., Galicia-Haro, S.N. (eds) Human-Inspired Computing and Its Applications. MICAI 2014. Lecture Notes in Computer Science, vol 8856. Springer, Cham. https://doi.org/10.1007/978-3-319-13647-9_17

DOI: https://doi.org/10.1007/978-3-319-13647-9_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13646-2
Online ISBN: 978-3-319-13647-9
eBook Packages: Computer Science, Computer Science (R0)
