Learning control of finite Markov chains with an explicit trade-off between estimation and control | IEEE Journals & Magazine | IEEE Xplore