Robot Movement Using the Trust Region Policy Optimization / Ali, Romisaa. - In: PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY. - ISSN 1307-6884. - 17:(In press).

Robot Movement Using the Trust Region Policy Optimization

Ali, Romisaa
In press

Abstract

Policy gradient methods are a family of deep reinforcement learning algorithms that combine deep neural networks (DNNs) with reinforcement learning (RL) to find the optimum of a control problem through experience gained from the interaction between a robot and its surroundings. Earlier policy gradient algorithms were unable to handle the two types of error, over-estimation and under-estimation, introduced by the deep neural network model. This article discusses a state-of-the-art (SOTA) policy gradient technique, Trust Region Policy Optimization (TRPO), by applying it in various environments and comparing it with another policy gradient method, Proximal Policy Optimization (PPO), to explain their robust optimization. The technique is used to gather experience data during various training phases, after observing the impact of hyperparameters on neural network performance.

Files in this record:

paper_acceptance_letter.pdf
Access: open access
Description: acceptance and invitation letter
Type: Other attached material
License: PUBLIC - All rights reserved
Size: 209.79 kB
Format: Adobe PDF

robot+movinglast+(Autosaved).pdf
Access: open access
Description: this paper was accepted for oral presentation at ICRMCA 2023: International Conference on Robot Motion Control and Automation.
Type: 2. Post-print / Author's Accepted Manuscript
License: PUBLIC - All rights reserved
Size: 611.08 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2973080