Peak shaving in district heating exploiting reinforcement learning and agent-based modelling