A sandpile model for reliable actor-critic reinforcement learning | IEEE Conference Publication | IEEE Xplore