Background: We consider a variant of the traveling salesperson problem motivated by the case of a company delivering furniture. The problem is both dynamic, due to random arrivals of delivery requests, and multiperiod, due to flexibility in delivering items within a time window of a few days. A sequence of daily routes must be selected over time, and both volume and route duration constraints are relevant.Moreover, customers are irregularly distributed in clusters with high or low density. When receiving a request from a low-density cluster, we may consider the possibility of waiting for further requests from the same cluster, which involves a tradeoff between total traveled distance and service quality. Methods: We designed alternative decision policies based on approximate dynamic programming principles. We compared policy and cost function approximations, tuning their parameters by simulation-based optimization. Results: We compared the decision policies by realistic out-of-sample simulations. A simple trigger-based decision policy was able to achieve a good compromise among possibly conflicting objectives,without resorting to full-fledged multiobjective models. Conclusions: The insights into the relative strengths and weaknesses of the tested policies pave the way to practical extensions. Due to its computational efficiency, the trigger policy may be improved by base-policy rollout and integrated within a multi-vehicle routing architecture.
Learning decision rules for a stochastic multiperiod capacitated traveling salesperson problem with irregularly clustered customers / Mutailifu, Subei; Brandimarte, Paolo; Maimaiti, Aili. - In: LOGISTICS. - ISSN 2305-6290. - ELETTRONICO. - 9:3(2025), pp. 1-24. [10.3390/logistics9030119]
Learning decision rules for a stochastic multiperiod capacitated traveling salesperson problem with irregularly clustered customers
Brandimarte,Paolo;
2025
Abstract
Background: We consider a variant of the traveling salesperson problem motivated by the case of a company delivering furniture. The problem is both dynamic, due to random arrivals of delivery requests, and multiperiod, due to flexibility in delivering items within a time window of a few days. A sequence of daily routes must be selected over time, and both volume and route duration constraints are relevant.Moreover, customers are irregularly distributed in clusters with high or low density. When receiving a request from a low-density cluster, we may consider the possibility of waiting for further requests from the same cluster, which involves a tradeoff between total traveled distance and service quality. Methods: We designed alternative decision policies based on approximate dynamic programming principles. We compared policy and cost function approximations, tuning their parameters by simulation-based optimization. Results: We compared the decision policies by realistic out-of-sample simulations. A simple trigger-based decision policy was able to achieve a good compromise among possibly conflicting objectives,without resorting to full-fledged multiobjective models. Conclusions: The insights into the relative strengths and weaknesses of the tested policies pave the way to practical extensions. Due to its computational efficiency, the trigger policy may be improved by base-policy rollout and integrated within a multi-vehicle routing architecture.File | Dimensione | Formato | |
---|---|---|---|
logistics-09-00119.pdf
accesso aperto
Descrizione: File PDF (open access)
Tipologia:
2. Post-print / Author's Accepted Manuscript
Licenza:
Creative commons
Dimensione
395.72 kB
Formato
Adobe PDF
|
395.72 kB | Adobe PDF | Visualizza/Apri |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/11583/3002465