Performance Investigation of UCB Policy in Q-learning | IEEE Conference Publication | IEEE Xplore