Exploring the potentialities of deep reinforcement learning for incentive-based demand response in a cluster of small commercial buildings