Online Single-Machine Scheduling via Reinforcement Learning / Li, Yuanyuan; Fadda, Edoardo; Manerba, Daniele; Roohnavazfar, Mina; Tadei, Roberto; Terzo, Olivier (STUDIES IN COMPUTATIONAL INTELLIGENCE). - In: Recent Advances in Computational Optimization. - Electronic. - [s.l.] : Springer, 2022. - ISBN 978-3-030-82399-3. - pp. 103-122 [10.1007/978-3-030-82397-9_5]
Online Single-Machine Scheduling via Reinforcement Learning
Edoardo Fadda; Mina Roohnavazfar; Roberto Tadei
2022
Abstract
Online scheduling has been an attractive field of research for over three decades. Some recent developments suggest that Reinforcement Learning (RL) techniques can effectively deal with online scheduling issues. Driven by an industrial application, in this paper we apply four of the most important RL techniques, namely Q-learning, Sarsa, Watkins’s Q(λ), and Sarsa(λ), to the online single-machine scheduling problem. Our main goal is to provide insights into how such techniques perform in the scheduling process. We will consider the minimization of two different and widely used objective functions: the total tardiness and the total earliness and tardiness of the jobs. The computational experiments show that Watkins’s Q(λ) performs best in minimizing the total tardiness. At the same time, it seems that the RL approaches are not very effective in minimizing the total earliness and tardiness over large time horizons.

File | Access | Description | Type | License | Size | Format
---|---|---|---|---|---|---
Online_Single_Machine_Scheduling_via_Reinforcement_Learning.pdf | Open access | | 2. Post-print / Author's Accepted Manuscript | Public - All rights reserved | 1.52 MB | Adobe PDF
2020 Online Single-Machine Scheduling via Reinforcement Learning.pdf | Restricted access | Book Chapter | 2a. Post-print, publisher's version / Version of Record | Non-public - Private/restricted access | 250.32 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2896776