Reinforcement Learning based Coverage Planning for UAVs Fleets / Bromo, Cosimo; Godio, Simone; Guglieri, Giorgio. - (2023). (Paper presented at the AIAA SciTech Forum held in National Harbor, MD, 23-27 January 2023) [10.2514/6.2023-1149].
Reinforcement Learning based Coverage Planning for UAVs Fleets
Bromo, Cosimo;Godio, Simone;Guglieri, Giorgio
2023
Abstract
This paper proposes a Reinforcement Learning (RL) approach for coverage planning of unexplored areas with obstacles, applied to fleets of Unmanned Aerial Vehicles (UAVs). The goal is to reduce the steps and the energy needed to achieve full coverage while avoiding collisions with fixed obstacles and other fleet members. This objective is accomplished through an RL-based algorithm in which UAVs are trained concurrently in a simulated environment to maximize their individually explored areas while remaining uniformly distributed. This mixed cooperative-competitive behaviour is learned through a Convolutional Neural Network (CNN), running on each fleet unit, which outputs a suitable waypoint to reach on the basis of all UAV locations and the already explored areas. The training process follows a novel approach: all UAVs' trajectories collected during simulated episodes are gathered to update a shared policy function. In the test phase, the learned behaviour is exploited in a decentralized way. Trained fleets are tested in simulated fields with different obstacle configurations, and performance is assessed in terms of both strategic distribution and exploration capability on maps of different complexity levels. Results show that fleets of 2 to 10 drones reach full coverage of the test maps while spreading efficiently in the environment.
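The abstract's centralized-training / decentralized-execution scheme (trajectories from every UAV pooled into one shared policy update, with each fleet unit then acting on its own copy of the policy) can be sketched minimally as follows. Everything here is a hypothetical illustration, not the paper's implementation: the `SharedPolicyTrainer` class, the tabular action preferences standing in for CNN weights, and the REINFORCE-style baseline update are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field

@dataclass
class Transition:
    state: tuple   # e.g. a summary of UAV positions and the coverage map
    action: int    # index of the chosen waypoint
    reward: float  # coverage gain minus collision/overlap penalties (assumed shaping)
    agent_id: int  # which fleet member generated this transition

@dataclass
class SharedPolicyTrainer:
    """Centralized-training sketch: one policy, trajectories from all UAVs."""
    n_actions: int
    lr: float = 0.1
    # one preference value per discrete waypoint action (stand-in for CNN weights)
    prefs: list = field(default_factory=list)

    def __post_init__(self):
        self.prefs = [0.0] * self.n_actions

    def update(self, episode: list) -> None:
        """REINFORCE-like update pooled over every agent's transitions."""
        baseline = sum(t.reward for t in episode) / len(episode)
        for t in episode:
            # nudge the taken action's preference toward above-baseline rewards,
            # regardless of which UAV produced the transition
            self.prefs[t.action] += self.lr * (t.reward - baseline)

    def act(self) -> int:
        # decentralized execution: each UAV greedily picks its own waypoint
        return max(range(self.n_actions), key=lambda a: self.prefs[a])
```

The key design point mirrored from the abstract is that `update` is indifferent to `agent_id`: transitions from all fleet members flow into the same parameter vector, while `act` needs no communication at test time.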
File | Type | License | Size | Format
---|---|---|---|---
6.2023-1149.pdf (not available) | 2a Post-print / Version of Record | Non-public - private/restricted access | 1.46 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2974817