
Fueling Prediction of Player Decisions: Foundations of Feature Engineering for Optimized Behavior Modeling in Serious Games

Original research · Published in Technology, Knowledge and Learning

Abstract

As a digital learning medium, serious games can be powerful, immersive educational vehicles that also provide large data streams for understanding player behavior. Educational data mining and learning analytics can effectively leverage big data in this context to heighten insight into student trajectories and behavior profiles. In applying these methods, distilling event-stream data down to a set of salient features for analysis (i.e., feature engineering) is a vital element of robust modeling. This paper presents a process for systematic game-based feature engineering to optimize insight into player behavior: the IDEFA framework (Integrated Design of Event-stream Features for Analysis). IDEFA aligns game design and data collection for high-resolution feature engineering, honed through critical, iterative interplay with analysis. Building on recent research in game-based data mining, we empirically investigate the application of IDEFA in serious games. Results show that behavioral models built on the full engineered feature set produced more meaningful results than models built without feature engineering, yielding greater insight into impactful learning interactions and into the play trajectories that characterize groups of players. This discovery of emergent player behavior is fueled by the data framework, the resultant base data stream, and the rigorous feature-creation process put forward in IDEFA, which integrates iterative design, feature engineering, and analysis for optimal insight into serious play.
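To make the feature-engineering step concrete, the sketch below shows one way a raw event stream could be aggregated into per-player features for modeling. It is an illustrative example under assumed conventions, not the paper's implementation; the event fields (player_id, timestamp, action, level) and the derived feature names are hypothetical.

```python
# Minimal sketch (not the paper's pipeline): distill a raw telemetry event
# stream into one feature row per player. Event fields and feature names
# are assumptions for illustration.
from collections import defaultdict

def engineer_features(events):
    """Aggregate telemetry events into per-player feature dictionaries.

    Each event is assumed to be a dict such as:
      {"player_id": "p01", "timestamp": 12.5, "action": "tool_select", "level": 3}
    """
    per_player = defaultdict(list)
    for event in events:
        per_player[event["player_id"]].append(event)

    features = {}
    for player_id, evts in per_player.items():
        evts.sort(key=lambda e: e["timestamp"])
        timestamps = [e["timestamp"] for e in evts]
        actions = [e["action"] for e in evts]
        features[player_id] = {
            "n_events": len(evts),                             # raw activity volume
            "n_unique_actions": len(set(actions)),             # breadth of interaction
            "session_length": timestamps[-1] - timestamps[0],  # observed play time
            "mean_gap": (timestamps[-1] - timestamps[0]) / max(len(evts) - 1, 1),  # pacing
            "max_level_reached": max(e["level"] for e in evts),  # progression proxy
        }
    return features
```

The resulting per-player rows could then feed clustering or classification models of the kind the paper evaluates.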



Notes

  1. It is worth noting that this approach is useful across game genres. In nonlinear and open-world games, for example, multiple kinds of play progression may be recorded at once: in Skyrim, an open-world role-playing game in which players are not on a linear track, several open quests can be tracked simultaneously, along with the player's current experience ("xp") level, i.e., points accumulated as the player does more in the game. A minimal sketch of such a progression record appears after these notes.

  2. Other ETL (extract, transform, load) processes may be involved to convert the data into a final, legible form that reflects the data framework design; an illustrative ETL sketch also follows these notes.

  3. Source generalized for purposes of blinding the manuscript.

  4. www.adageapi.org.

  5. https://www.glasslabgames.org.

  6. http://www.ageoflearning.com/.

  7. https://www.glasslabgames.org/.
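Following up on Note 1, the snippet below is a minimal, hypothetical sketch of how nonlinear progression state (several open quests plus a running experience total) might be represented for telemetry logging. The class and field names are illustrative assumptions, not an actual game or ADAGE API.

```python
# Hypothetical progression record for a nonlinear / open-world game:
# several quests may be open at once, alongside a cumulative xp total.
from dataclasses import dataclass, field

@dataclass
class ProgressionState:
    open_quests: set = field(default_factory=set)        # accepted but unfinished quests
    completed_quests: set = field(default_factory=set)   # finished quests
    xp: int = 0                                           # cumulative experience points

    def accept_quest(self, quest_id: str) -> None:
        self.open_quests.add(quest_id)

    def complete_quest(self, quest_id: str, reward_xp: int) -> None:
        self.open_quests.discard(quest_id)
        self.completed_quests.add(quest_id)
        self.xp += reward_xp

    def snapshot(self) -> dict:
        # What a telemetry event might record alongside each player action.
        return {"open_quests": sorted(self.open_quests), "xp": self.xp}
```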
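For Note 2, here is a hedged illustration of the kind of ETL pass that might convert raw event logs into a flat, analysis-ready table reflecting a data-framework schema. The file formats, paths, and field names are assumptions, not the project's actual pipeline.

```python
# Illustrative ETL sketch (assumed formats): extract JSON-lines telemetry,
# transform it to a fixed schema, and load it into a CSV for analysis.
import csv
import json

def etl(raw_log_path: str, output_csv_path: str) -> None:
    # Extract: one JSON event per line in the raw telemetry log.
    with open(raw_log_path) as raw:
        events = [json.loads(line) for line in raw if line.strip()]

    # Transform: keep only the fields the downstream analysis expects.
    fieldnames = ["player_id", "timestamp", "action", "context"]
    rows = [{k: e.get(k, "") for k in fieldnames} for e in events]

    # Load: write the cleaned records to a flat file for the modeling step.
    with open(output_csv_path, "w", newline="") as out:
        writer = csv.DictWriter(out, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```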


Acknowledgements

This work was made possible by a grant from the National Science Foundation (DRL-1119383), although the views expressed herein are those of the authors and do not necessarily represent those of the funding agency. We would also like to thank Richard Halverson, Constance Steinkuehler, Kurt Squire, and Matthew Berland.

Author information

Correspondence to V. Elizabeth Owen.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Owen, V.E., Baker, R.S. Fueling Prediction of Player Decisions: Foundations of Feature Engineering for Optimized Behavior Modeling in Serious Games. Tech Know Learn 25, 225–250 (2020). https://doi.org/10.1007/s10758-018-9393-9

