
Extracting a transferable rule-based controller from deep reinforcement learning agents with validation across multiple scenarios for heat pump systems with thermal storage / Piscitelli, Marco Savino; Mele, Alessandro Aniello; Razzano, Giuseppe; Capozzoli, Alfonso. - In: APPLIED ENERGY. - ISSN 0306-2619. - 412:(2026). [10.1016/j.apenergy.2026.127661]

Extracting a transferable rule-based controller from deep reinforcement learning agents with validation across multiple scenarios for heat pump systems with thermal storage

Piscitelli, Marco Savino; Mele, Alessandro Aniello; Razzano, Giuseppe; Capozzoli, Alfonso
2026

Abstract

This paper proposes a framework to derive interpretable and transferable Rule Extraction (RE) controllers from Deep Reinforcement Learning (DRL) control agents. Despite their strong performance in predictive energy management, model-free and model-based controllers face limited adoption due to complexity, computational cost, and low interpretability. The proposed RE approach translates DRL policies into simple, transparent, and scalable rule sets. The approach was tested on a heating system composed of an air-to-water Heat Pump (HP), a Thermal Energy Storage (TES) tank that also serves as a hydraulic separator, and fan coil units in a single-zone building, under the climate conditions of Turin (Northern Italy). The methodology follows a multi-stage process. High-performing DRL agents were first identified through co-simulation and an optimization routine. A decision-tree procedure then extracted if-else rules, which were aggregated into a single interpretable RE controller. This controller was benchmarked against the best DRL agent and a standard Rule-Based Controller (RBC). In the reference case, the RE controller outperformed the RBC and matched the DRL agent, retaining the original control logic. The RE controller was then deployed without retraining across eight scenarios with different occupancy profiles, climates, and TES sizes. In all cases, it ensured stable indoor temperature control and outperformed the RBC, achieving at least 7% energy savings, 23% peak power reduction, and about 13% improvement in heat pump COP. The RE controller was also compared with scenario-specific DRL agents optimized for each context. Without retraining, it achieved on average 74% of their energy savings and 86% of their peak power reduction relative to the RBC. The results confirmed the robustness and transferability of the framework, highlighting its potential for real-world deployment.
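The core idea of the abstract's rule-extraction stage can be illustrated with a minimal sketch: collect state-action pairs from a trained policy and distil them into a shallow decision tree whose branches read as if-else rules. This is only an illustrative analogue, not the paper's implementation; the toy policy, state variables (`T_indoor`, `T_tank`), and thresholds below are hypothetical stand-ins.

```python
# Minimal sketch of policy distillation into if-else rules.
# The "DRL policy" here is a hypothetical stand-in function.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Hypothetical states: [indoor temperature (deg C), TES tank temperature (deg C)]
states = np.column_stack([rng.uniform(16, 24, 2000), rng.uniform(30, 55, 2000)])

def toy_policy(s):
    # Stand-in for a trained DRL agent: heat pump ON (1) when the zone is
    # cold and the storage tank is depleted, OFF (0) otherwise.
    return ((s[:, 0] < 20.0) & (s[:, 1] < 45.0)).astype(int)

actions = toy_policy(states)

# A shallow tree keeps the extracted rule set small and human-readable.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(states, actions)
print(export_text(tree, feature_names=["T_indoor", "T_tank"]))

# Fidelity: fraction of states where the tree reproduces the policy's action.
fidelity = (tree.predict(states) == actions).mean()
print(f"fidelity = {fidelity:.2f}")
```

Printing the tree with `export_text` yields nested if-else conditions on the state variables, which is the interpretable artefact the RE approach aggregates into a controller; the fidelity score quantifies how faithfully the extracted rules mimic the original policy.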
Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3009354