ABSTRACT
This work introduces a robust software trainer for the guitar, called Fret Ferret, which uses the contextual multi-armed bandit class of algorithms, together with implicit signals from the user, to customize lessons to each user automatically and in real time. This avoids the need to tune a large collection of game parameters by hand to reach a desired difficulty level. We discuss the consequences of using these algorithms to drive new styles of human-computer interaction that can accelerate learning and help users reach new levels of skill. We elaborate on the details of our algorithms, how they inform our user interface design, and how we address the challenge of scaling a large number of machine learning models to adapt to many mini-games and to each user individually. Finally, we describe future work that can further improve the performance and capabilities of our software.
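To make the approach described above concrete, the following is a minimal, hypothetical sketch of a contextual multi-armed bandit using Thompson sampling with a Bayesian linear regression posterior per arm. The class and function names (`LinearThompsonArm`, `choose_exercise`), the feature dimension, and the reward model are all illustrative assumptions, not the paper's actual implementation; in a trainer like the one described, each arm could correspond to an exercise and the context to features of the user's recent performance.

```python
import numpy as np

rng = np.random.default_rng(0)

class LinearThompsonArm:
    """Illustrative Bayesian linear regression posterior for one exercise (arm)."""

    def __init__(self, dim, noise_var=1.0, prior_var=1.0):
        self.A = np.eye(dim) / prior_var  # posterior precision matrix
        self.b = np.zeros(dim)            # precision-weighted mean vector
        self.noise_var = noise_var

    def sample_weights(self):
        # Draw one plausible weight vector from the current posterior.
        cov = np.linalg.inv(self.A)
        mean = cov @ self.b
        return rng.multivariate_normal(mean, cov)

    def update(self, x, reward):
        # Standard Bayesian linear regression update with known noise variance.
        self.A += np.outer(x, x) / self.noise_var
        self.b += x * reward / self.noise_var

def choose_exercise(arms, context):
    """Thompson sampling: score each arm with sampled weights, pick the best."""
    scores = [arm.sample_weights() @ context for arm in arms]
    return int(np.argmax(scores))

# Hypothetical usage: three candidate exercises, a 2-dimensional user context.
arms = [LinearThompsonArm(dim=2) for _ in range(3)]
context = np.array([1.0, 0.5])
chosen = choose_exercise(arms, context)
arms[chosen].update(context, reward=1.0)
```

Because each arm maintains a full posterior rather than a point estimate, exploration falls out naturally: uncertain exercises are occasionally sampled optimistically, which is one way an adaptive trainer can balance reinforcing known material against probing new difficulty levels.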