Reinforcement Learning based Coverage Planning for UAVs Fleets / Bromo, Cosimo; Godio, Simone; Guglieri, Giorgio. - (2023). (Paper presented at the AIAA SciTech Forum, held in National Harbor, MD, 23-27 January 2023) [10.2514/6.2023-1149].

Reinforcement Learning based Coverage Planning for UAVs Fleets

Bromo, Cosimo; Godio, Simone; Guglieri, Giorgio
2023

Abstract

This paper proposes a Reinforcement Learning (RL) approach for coverage planning of unexplored areas with obstacles, applied to fleets of Unmanned Aerial Vehicles (UAVs). The goal is to reduce the steps and the energy needed to achieve full coverage while avoiding collisions with fixed obstacles and other fleet members. This objective is accomplished through an RL-based algorithm in which UAVs are trained concurrently in a simulated environment to maximize their individually explored areas while remaining uniformly distributed. This mixed cooperative-competitive behaviour is learned through a Convolutional Neural Network (CNN), running on each fleet unit, which outputs a suitable waypoint to be reached on the basis of all UAV locations and the areas already explored. The training process follows a novel approach: all UAV trajectories collected during simulated episodes are gathered to update a shared policy function. In the test phase, the learned behaviour is exploited in a decentralized way. Trained fleets are tested in simulated fields with different obstacle configurations, and performance is assessed in terms of both strategic distribution and exploration capability on maps of different complexity levels. Results show that fleets of 2 to 10 drones reach full coverage of the test maps while spreading efficiently in the environment.
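The centralized-training, decentralized-execution scheme outlined in the abstract can be illustrated in a few lines. The sketch below is not the authors' code: the 4-channel grid observation (obstacles, explored cells, own position, other UAV positions), the network sizes, the 8 discrete waypoint choices, and the REINFORCE-style update are all illustrative assumptions made here; the paper may encode observations and update the shared policy differently.

    # Minimal sketch, NOT the authors' implementation: observation layout,
    # network sizes, action set, and update rule are all assumptions.
    import torch
    import torch.nn as nn

    GRID = 16       # assumed map resolution (cells per side)
    N_ACTIONS = 8   # assumed discrete waypoint choices (e.g. 8 headings)

    class WaypointPolicy(nn.Module):
        """Shared CNN policy: one copy runs on every fleet unit at test time."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                # 4 input channels (assumed encoding): obstacles, explored
                # cells, own position, positions of the other UAVs
                nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                nn.Flatten(),
                nn.Linear(32 * GRID * GRID, N_ACTIONS),
            )

        def forward(self, obs):                # obs: (B, 4, GRID, GRID)
            return torch.distributions.Categorical(logits=self.net(obs))

    def shared_update(policy, optimizer, trajectories):
        """Update ONE policy from the trajectories of ALL UAVs, mirroring
        the abstract's pooling of fleet experience during training."""
        loss = torch.tensor(0.0)
        for obs, actions, returns in trajectories:  # one tuple per UAV
            dist = policy(obs)
            loss = loss - (dist.log_prob(actions) * returns).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

At test time each UAV would evaluate the same trained weights on its local copy of the shared exploration map, so waypoint selection needs no central controller; this matches the decentralized test phase the abstract describes.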
Files in this record:

File: 6.2023-1149.pdf (not available for download)
Type: 2a Post-print editorial version / Version of Record
License: Non-public - Private/restricted access
Size: 1.46 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2974817