Optimizing Battery Storage Systems in Energy Microgrids: A Reinforcement Learning Approach Comparing Multiple Reward Functions / Ghione, Giorgia; Randazzo, Vincenzo; Badami, Marco; Pasero, Eros. - ELECTRONIC. - (2024), pp. 642-646. (Paper presented at the 8th IEEE International Forum on Research and Technologies for Society and Industry Innovation, RTSI 2024, held at Politecnico di Milano - Polo Territoriale di Lecco, Italy, in 2024) [10.1109/rtsi61910.2024.10761708].
Optimizing Battery Storage Systems in Energy Microgrids: A Reinforcement Learning Approach Comparing Multiple Reward Functions
Ghione, Giorgia;Randazzo, Vincenzo;Badami, Marco;Pasero, Eros
2024
Abstract
Battery Storage Systems (BSS) are increasingly utilized to enhance renewable energy consumption and operational stability in energy microgrids. However, the uncertainty characterizing renewable energy generation poses significant challenges for the optimal control of BSS. Multiple Reinforcement Learning (RL) approaches have been presented to solve this optimization problem, yet a comparison of the different training targets for these RL systems is rarely performed. This work compares different reward functions that enable efficient BSS usage in the power plant of a transport hub, while expressing the problem as a Partially Observable Markov Decision Process (POMDP). A Proximal Policy Optimization (PPO) algorithm is trained using reward functions derived from financial targets and BSS efficiency objectives. Results indicate that reward functions aligning BSS usage with market trends lead to superior performance compared to traditional earnings-based objectives. Furthermore, limitations regarding training episode numbers and reward normalization are identified, suggesting avenues for future research. This study contributes to advancing RL-based approaches for optimal BSS management in energy microgrid environments.
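To illustrate the distinction the abstract draws between earnings-based and market-trend-aligned training targets, the following is a minimal sketch of two candidate reward functions for a BSS controller. The exact formulations used in the paper are not given here; both functions, their signatures, and the sign convention (positive power = discharge) are illustrative assumptions only.

```python
def earnings_reward(power_kw: float, price_eur_per_kwh: float, dt_h: float = 1.0) -> float:
    """Hypothetical earnings-based reward: revenue earned over the time step.

    Discharging (power_kw > 0) sells energy at the current market price;
    charging (power_kw < 0) incurs a cost at that same price.
    """
    return power_kw * price_eur_per_kwh * dt_h


def market_trend_reward(power_kw: float, price_eur_per_kwh: float, avg_price_eur_per_kwh: float) -> float:
    """Hypothetical trend-aligned reward: reward actions that follow the market.

    Discharging when the price is above its recent average, or charging when
    it is below, yields a positive reward; acting against the trend is penalized.
    """
    return power_kw * (price_eur_per_kwh - avg_price_eur_per_kwh)
```

Under this sketch, the trend-aligned signal rewards charging during low-price periods (a negative power times a negative price deviation is positive), whereas the pure earnings signal always penalizes charging, which is one plausible reason a trend-aligned objective could steer the policy toward better arbitrage behavior.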
https://hdl.handle.net/11583/2996290