ABSTRACT
Evaluating the output of content generators remains one of the key open research challenges in Procedural Content Generation (PCG). This paper presents a collection of metrics for evaluating the quality of platform game levels and analyzes how well these metrics capture the human-perceived difficulty, visual aesthetics, and enjoyment of these levels. We show empirically, in the context of Infinite Mario Bros (IMB), that some of the proposed metrics yield correlation values with human ratings that are near empirical upper bounds derived from a human inter-rater agreement study. We also show that a simple linear regression model using a subset of our metrics as input features substantially outperforms a previous approach that uses a neural network to predict human-perceived difficulty, visual aesthetics, and enjoyment in IMB levels.
Index Terms
- Understanding mario: an evaluation of design metrics for platformers