Comparison study of two reinforcement learning based real-time control policies for two-machine-one-buffer production system | IEEE Conference Publication | IEEE Xplore