Charting the Uncharted with GUR: How AI Playtesting Can Supplement Expert Evaluation

Authors:
Atiya Nova (Ontario Tech University, Canada)
Stevie Sansalone (Ontario Tech University, Canada)
Raquel Robinson (Ontario Tech University, Canada)
Pejman Mirza-Babaei (Ontario Tech University, Canada)
FDG '22: Proceedings of the 17th International Conference on the Foundations of Digital Games, September 2022, Article No.: 28, Pages 1–12. https://doi.org/10.1145/3555858.3555880

Published: 04 November 2022

ABSTRACT

Expert evaluation is a valued method within games user research (GUR) because it gives stakeholders low-cost, rapid feedback, but it does not always accurately reflect the general player’s experience. Testing the game with real users (also called playtesting) helps bridge this gap by giving game developers an in-depth look at the player experience. However, playtesting is resource-intensive and time-consuming, making it difficult to implement within the tight time frames of industry game development. AI can help mitigate some of these issues by providing an automated way to simulate player behaviour and experience. In this paper, we introduce PathOS+, a playtesting interface that uses AI playtesting data to enhance expert evaluation. Results from a study conducted with expert participants show how PathOS+ could contribute to game design and assist developers and researchers in conducting expert evaluations. This is an important contribution because it provides games user researchers and designers with a fast, low-cost, and effective game evaluation approach that has the potential to make game evaluation more accessible to indie and smaller game studios.

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Distributed artificial intelligence
      1. Intelligent agents
2. Human-centered computing
  1. Human computer interaction (HCI)
    1. HCI design and evaluation methods
      1. User studies

Published in
FDG '22: Proceedings of the 17th International Conference on the Foundations of Digital Games
September 2022
664 pages
ISBN: 9781450397957
DOI: 10.1145/3555858
Editors: Kostas Karpouzis, Stefano Gualeni, Johanna Pirker, Allan Fowler
Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
artificial intelligence
expert evaluation
game development
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 152 of 415 submissions, 37%