Projects often face issues that trigger project controls, where Estimates at Completion (EACs) play a crucial role in determining the scope of corrective actions. Recent studies have applied supervised Machine Learning (ML) regression techniques to develop EAC models, utilizing features derived from Earned Value Management (EVM) and Earned Schedule Management (ESM) methodologies. However, these studies overlook several underfitting and overfitting issues that could compromise model robustness, leading to biased results. This paper introduces an ML pipeline designed to address these issues through automated procedures for data balancing and augmentation, feature engineering, and model training and evaluation. The pipeline was tested with 30 ML techniques on a dataset of 50 real-world construction projects. Results show that the EAC models developed through the pipeline achieve higher accuracy, precision, and timeliness than the EVM- and ESM-based ones. These findings validate the pipeline and offer practitioners an automated framework for developing robust, ML-based EAC models.
Automated machine learning pipeline for robust project cost and duration forecasting / Ottaviani, F. M.; Ballesteros-Perez, P.; Narbaev, T. - In: AUTOMATION IN CONSTRUCTION. - ISSN 0926-5805. - ELECTRONIC. - 178:(2025). [10.1016/j.autcon.2025.106426]
Automated machine learning pipeline for robust project cost and duration forecasting
Ottaviani F. M.; Narbaev T.
2025
File | Size | Format |
---|---|---|
1-s2.0-S0926580525004662-main (3).pdf (open access; type: 2a Post-print, publisher's version / Version of Record; license: Creative Commons) | 1.91 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/3002311