The transportation sector is seeing the flourishing of one of the most interesting technologies, autonomous driving (AD). In particular, Cooperative Adaptive Cruise Control (CACC) systems ensure higher levels both of safety and comfort, enhancing at the same time the reduction of energy consumption. In this framework a real-time velocity planner for a Battery Electric Vehicle, based on a Deep Reinforcement Learning algorithm called Deep Deterministic Policy Gradient (DDPG), has been developed, aiming at maximizing energy savings, and improving comfort, thanks to the exchange of information on distance, speed and acceleration through the exploitation of vehicle-to-vehicle technology (V2V). The aforementioned DDPG algorithm relies on a multi-objective reward function that is adaptive to different driving cycles. The simulation results show how the agent can obtain good results on standard cycles, such as WLTP, UDDS and AUDC, and on real-world driving cycles. Moreover, it displays great adaptability to driving cycles different from the training one.

Acceleration control strategy for Battery Electric Vehicle based on Deep Reinforcement Learning in V2V driving / Acquarone, Matteo; Borneo, Angelo; Misul, Daniela Anna. - ELETTRONICO. - (2022), pp. 202-207. ((Intervento presentato al convegno 2022 IEEE Transportation Electrification Conference and Expo, ITEC 2022 tenutosi a Anaheim, CA, USA nel 15-17 June 2022 [10.1109/ITEC53557.2022.9813785].

Acceleration control strategy for Battery Electric Vehicle based on Deep Reinforcement Learning in V2V driving

Acquarone, Matteo;Borneo, Angelo;Misul, Daniela Anna
2022

Abstract

The transportation sector is seeing the flourishing of one of the most interesting technologies, autonomous driving (AD). In particular, Cooperative Adaptive Cruise Control (CACC) systems ensure higher levels both of safety and comfort, enhancing at the same time the reduction of energy consumption. In this framework a real-time velocity planner for a Battery Electric Vehicle, based on a Deep Reinforcement Learning algorithm called Deep Deterministic Policy Gradient (DDPG), has been developed, aiming at maximizing energy savings, and improving comfort, thanks to the exchange of information on distance, speed and acceleration through the exploitation of vehicle-to-vehicle technology (V2V). The aforementioned DDPG algorithm relies on a multi-objective reward function that is adaptive to different driving cycles. The simulation results show how the agent can obtain good results on standard cycles, such as WLTP, UDDS and AUDC, and on real-world driving cycles. Moreover, it displays great adaptability to driving cycles different from the training one.
File in questo prodotto:
Non ci sono file associati a questo prodotto.
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11583/2972990
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo