Automated machine learning pipeline for robust project cost and duration forecasting / Ottaviani, F. M.; Ballesteros-Perez, P.; Narbaev, T. - In: AUTOMATION IN CONSTRUCTION. - ISSN 0926-5805. - ELECTRONIC. - 178:(2025). [10.1016/j.autcon.2025.106426]

Automated machine learning pipeline for robust project cost and duration forecasting

Ottaviani F. M.; Narbaev T.
2025

Abstract

Projects often face issues that trigger project controls, where Estimates at Completion (EACs) play a crucial role in determining the scope of corrective actions. Recent studies have applied supervised Machine Learning (ML) regression techniques to develop EAC models, using features derived from the Earned Value Management (EVM) and Earned Schedule Management (ESM) methodologies. However, these studies overlook several underfitting and overfitting issues that can compromise model robustness and lead to biased results. This paper introduces an ML pipeline designed to address these issues through automated procedures for data balancing and augmentation, feature engineering, and model training and evaluation. The pipeline was tested with 30 ML techniques on a dataset of 50 real-world construction projects. Results show that the EAC models developed through the pipeline achieve higher accuracy, precision, and timeliness than the EVM and ESM models. These findings validate the pipeline and offer practitioners an automated framework for developing robust, ML-based EAC models.
Files in this item:
File: 1-s2.0-S0926580525004662-main (3).pdf
Access: open access
Type: 2a Post-print, publisher's version / Version of Record
License: Creative Commons
Size: 1.91 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3002311