Abstract
Context
Research has shown gender differences in problem-solving, and gender biases in how software supports it. GenderMag has five problem-solving facets related to gender-inclusiveness: motivation for using software, information processing style, computer self-efficacy, attitude towards risk, and ways of learning new technology. Some facet values are more frequent in women, others in men. The role these facets may play when interacting with social goal models is unexplored.
Objectives
We evaluated the impact of different levels of GenderMag facets on creating, modifying, understanding, and reviewing iStar 2.0 models.
Methods
We performed a quasi-experiment and characterised 180 participants according to each GenderMag facet. Participants performed creation, modification, understanding, and reviewing tasks on iStar 2.0. We measured their accuracy, speed, and ease, using metrics of task success, time, and effort, collected with eye-tracking, EEG and EDA sensors, and participants’ feedback.
Results
Although participants with facet levels frequently seen in women had lower speed when compared to those with facet levels more often observed in men, their accuracy was higher. There were also statistically significant differences in visual and mental effort, and stress. Overall, participants were able to create, modify, and understand the models reasonably well, but struggled when reviewing them.
Conclusions
Participants with a comprehensive information processing style and a conservative attitude towards risk (characteristics frequently seen in females) solved the tasks with lower speed but higher accuracy. Participants with a selective information processing style (a characteristic frequently seen in males) were better able to separate what was relevant from what was not. The complementarity of these results suggests that there is more to gain from leveraging people’s diversity.
Acknowledgements
We thank NOVA LINCS UID/CEC/04516/2019 and FCT-MCTES SFRH/BD/108492/2015 for financial support.
Additional information
Communicated by: Kelly Blincoe, Daniela Damian, and Anna Perini
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article belongs to the Topical Collection: Requirements Engineering
About this article
Cite this article
Gralha, C., Goulão, M. & Araujo, J. Are there gender differences when interacting with social goal models?. Empir Software Eng 25, 5416–5453 (2020). https://doi.org/10.1007/s10664-020-09883-y
DOI: https://doi.org/10.1007/s10664-020-09883-y