Online Single-Machine Scheduling via Reinforcement Learning / Li, Yuanyuan; Fadda, Edoardo; Manerba, Daniele; Roohnavazfar, Mina; Tadei, Roberto; Terzo, Olivier (Studies in Computational Intelligence). - In: Recent Advances in Computational Optimization [electronic resource]. - [s.l.]: Springer, 2022. - ISBN 978-3-030-82399-3. - pp. 103-122 [10.1007/978-3-030-82397-9_5]
Online Single-Machine Scheduling via Reinforcement Learning
Edoardo Fadda; Mina Roohnavazfar; Roberto Tadei
2022
Abstract
Online scheduling has been an attractive field of research for over three decades. Some recent developments suggest that Reinforcement Learning (RL) techniques can effectively deal with online scheduling issues. Driven by an industrial application, in this paper we apply four of the most important RL techniques, namely Q-learning, Sarsa, Watkins’s Q(λ), and Sarsa(λ), to the online single-machine scheduling problem. Our main goal is to provide insights into how such techniques perform in the scheduling process. We will consider the minimization of two different and widely used objective functions: the total tardiness and the total earliness and tardiness of the jobs. The computational experiments show that Watkins’s Q(λ) performs best in minimizing the total tardiness. At the same time, it seems that the RL approaches are not very effective in minimizing the total earliness and tardiness over large time horizons.
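For context on the objectives named in the abstract: in the standard formulation, the tardiness of a job j is max(0, C_j − d_j) and its earliness is max(0, d_j − C_j), where C_j is the job's completion time and d_j its due date; the two objective functions sum these quantities over all jobs. The sketch below illustrates how a tabular RL agent can be applied to such an online single-machine problem. It is a minimal toy example, not the chapter's model: the state encoding (queue-length bucket plus a lateness flag), the action set (choosing between the EDD and SPT dispatching rules), the reward (negative incremental tardiness), and the random job generator are all assumptions made for illustration, and only plain Q-learning is shown, whereas the chapter also evaluates Sarsa, Watkins's Q(λ), and Sarsa(λ).

```python
# Minimal sketch of tabular Q-learning for online single-machine dispatching.
# Illustrative only, NOT the chapter's exact model: state, actions, reward,
# and instance generator below are assumptions made for this example.
import random
from collections import defaultdict

def simulate_episode(Q, jobs, alpha=0.1, gamma=0.95, eps=0.1, learn=True):
    """Dispatch queued jobs one at a time; each action picks a dispatching rule.

    jobs: list of (release_time, processing_time, due_date) tuples.
    Returns the total tardiness of the episode.
    """
    t = 0.0
    queue = []
    pending = sorted(jobs)           # jobs not yet released, sorted by release time
    total_tardiness = 0.0
    prev = None                      # (state, action) still waiting for its update

    while pending or queue:
        # Release jobs whose release time has passed; if idle, jump to next release.
        while pending and pending[0][0] <= t:
            queue.append(pending.pop(0))
        if not queue:
            t = pending[0][0]
            continue

        # Coarse state: bucketed queue length and whether some queued job is already late.
        state = (min(len(queue), 5), any(t > d for _, _, d in queue))

        # Epsilon-greedy choice between two dispatching rules.
        if learn and random.random() < eps:
            action = random.randrange(2)
        else:
            action = max(range(2), key=lambda a: Q[(state, a)])

        # Action 0 = EDD (earliest due date), action 1 = SPT (shortest processing time).
        key = (lambda j: j[2]) if action == 0 else (lambda j: j[1])
        job = min(queue, key=key)
        queue.remove(job)

        t += job[1]                              # process the selected job
        tardiness = max(0.0, t - job[2])
        total_tardiness += tardiness
        reward = -tardiness

        if learn and prev is not None:
            ps, pa = prev
            best_next = max(Q[(state, a)] for a in range(2))
            # Q-learning update: bootstrap on the greedy action in the successor state.
            Q[(ps, pa)] += alpha * (prev_reward + gamma * best_next - Q[(ps, pa)])
        prev, prev_reward = (state, action), reward

    if learn and prev is not None:               # terminal update, no bootstrap
        ps, pa = prev
        Q[(ps, pa)] += alpha * (prev_reward - Q[(ps, pa)])
    return total_tardiness

if __name__ == "__main__":
    random.seed(0)
    Q = defaultdict(float)
    instance = [(random.uniform(0, 50), random.uniform(1, 10), random.uniform(10, 80))
                for _ in range(20)]
    for _ in range(500):
        simulate_episode(Q, instance)
    print("total tardiness after training:", simulate_episode(Q, instance, learn=False))
```

Under this formulation, the eligibility-trace variants studied in the chapter, Watkins's Q(λ) and Sarsa(λ), would differ mainly in propagating each update back to earlier state-action pairs, while Sarsa would bootstrap on the action actually taken rather than the greedy one.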
| File | Description | Type | License | Size | Format | Access |
|---|---|---|---|---|---|---|
| Online_Single_Machine_Scheduling_via_Reinforcement_Learning.pdf | - | 2. Post-print / Author's Accepted Manuscript | Public - All rights reserved | 1.52 MB | Adobe PDF | Open access |
| 2020 Online Single-Machine Scheduling via Reinforcement Learning.pdf | Book Chapter | 2a. Post-print, publisher's version / Version of Record | Non-public - Private/restricted access | 250.32 kB | Adobe PDF | Restricted access (request a copy) |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2896776