VLC and D2D Heterogeneous Network Optimization: A Reinforcement Learning Approach Based on Equilibrium Problems With Equilibrium Constraints