ABSTRACT
Expert evaluation is an attractive method within games user research (GUR) because it gives stakeholders low-cost, rapid feedback, but it does not always accurately reflect the general player's experience. Testing the game with real users (also called playtesting) helps bridge this gap by giving game developers an in-depth look into the player experience. However, playtesting is resource-intensive and time-consuming, making it difficult to implement within the tight time frames of industry game development. AI can help to mitigate some of these issues by providing an automated way to simulate player behaviour and experience. In this paper, we introduce PathOS+, a playtesting interface that uses AI playtesting data to enhance expert evaluation. Results from a study conducted with expert participants show how PathOS+ could contribute to game design and assist developers and researchers in conducting expert evaluations. This is an important contribution because it provides games user researchers and designers with a fast, low-cost and effective game evaluation approach, one that has the potential to make game evaluation more accessible to indie and smaller game studios.
Index Terms
- Charting the Uncharted with GUR: How AI Playtesting Can Supplement Expert Evaluation