Background: We consider a variant of the traveling salesperson problem motivated by the case of a company delivering furniture. The problem is both dynamic, due to random arrivals of delivery requests, and multiperiod, due to flexibility in delivering items within a time window of a few days. A sequence of daily routes must be selected over time, and both volume and route duration constraints are relevant.Moreover, customers are irregularly distributed in clusters with high or low density. When receiving a request from a low-density cluster, we may consider the possibility of waiting for further requests from the same cluster, which involves a tradeoff between total traveled distance and service quality. Methods: We designed alternative decision policies based on approximate dynamic programming principles. We compared policy and cost function approximations, tuning their parameters by simulation-based optimization. Results: We compared the decision policies by realistic out-of-sample simulations. A simple trigger-based decision policy was able to achieve a good compromise among possibly conflicting objectives,without resorting to full-fledged multiobjective models. Conclusions: The insights into the relative strengths and weaknesses of the tested policies pave the way to practical extensions. Due to its computational efficiency, the trigger policy may be improved by base-policy rollout and integrated within a multi-vehicle routing architecture.

Learning decision rules for a stochastic multiperiod capacitated traveling salesperson problem with irregularly clustered customers / Mutailifu, Subei; Brandimarte, Paolo; Maimaiti, Aili. - In: LOGISTICS. - ISSN 2305-6290. - ELETTRONICO. - 9:3(2025), pp. 1-24. [10.3390/logistics9030119]

Learning decision rules for a stochastic multiperiod capacitated traveling salesperson problem with irregularly clustered customers

Brandimarte,Paolo;
2025

Abstract

Background: We consider a variant of the traveling salesperson problem motivated by the case of a company delivering furniture. The problem is both dynamic, due to random arrivals of delivery requests, and multiperiod, due to flexibility in delivering items within a time window of a few days. A sequence of daily routes must be selected over time, and both volume and route duration constraints are relevant.Moreover, customers are irregularly distributed in clusters with high or low density. When receiving a request from a low-density cluster, we may consider the possibility of waiting for further requests from the same cluster, which involves a tradeoff between total traveled distance and service quality. Methods: We designed alternative decision policies based on approximate dynamic programming principles. We compared policy and cost function approximations, tuning their parameters by simulation-based optimization. Results: We compared the decision policies by realistic out-of-sample simulations. A simple trigger-based decision policy was able to achieve a good compromise among possibly conflicting objectives,without resorting to full-fledged multiobjective models. Conclusions: The insights into the relative strengths and weaknesses of the tested policies pave the way to practical extensions. Due to its computational efficiency, the trigger policy may be improved by base-policy rollout and integrated within a multi-vehicle routing architecture.
2025
File in questo prodotto:
File Dimensione Formato  
logistics-09-00119.pdf

accesso aperto

Descrizione: File PDF (open access)
Tipologia: 2. Post-print / Author's Accepted Manuscript
Licenza: Creative commons
Dimensione 395.72 kB
Formato Adobe PDF
395.72 kB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11583/3002465