Risk-based predictive modelling for audit verification: evidence from EU-funded programmes / Verna, Elisa; Genta, Gianfranco; Galetto, Maurizio. - In: QUALITY AND QUANTITY. - ISSN 1573-7845. - ELECTRONIC. - (2026). [10.1007/s11135-026-02725-x]
Risk-based predictive modelling for audit verification: evidence from EU-funded programmes
Verna, Elisa; Genta, Gianfranco; Galetto, Maurizio
2026
Abstract
This study proposes a machine learning framework to support risk‑based verification of expenditure declarations in European Structural and Investment Funds, reflecting the current regulatory emphasis on proportional and data‑driven audit strategies. Quantitatively, the problem is formulated as an imbalanced three-class classification task with ordered outcomes on high-dimensional administrative data; the ordinal structure is exploited ex post in evaluation and error interpretation. The framework classifies expense documents as validated, partially validated, or not validated, and provides audit authorities with interpretable probability estimates for each case. A predictive model was trained and validated on more than ninety thousand expense documents from the Italian Regional Operational Programme co‑funded by the European Regional Development Fund (2014–2020). Methodological challenges—ordered outcomes, severe class imbalance, and mixed‑type features—were addressed through targeted preprocessing and the CatBoost gradient‑boosting algorithm. The model achieved satisfactory predictive performance, offering probabilistic outputs aligned with the ordered structure of audit outcomes. Variable‑importance analysis confirmed the relevance of both financial and administrative variables in predicting irregularities. The framework is designed with operational integration in mind and could underpin risk‑based sampling in expenditure verification, subject to further validation across time, programmes, and beneficiary structures. Departing from a literature that largely focuses on binary classification or fraud detection, the study addresses the understudied challenge of multi‑class prediction in public expenditure control and provides an interpretable prototype decision‑support tool. 
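As an illustration of how per-class probability estimates over ordered outcomes might feed risk-based sampling and ordinal-aware evaluation, the following is a minimal sketch only. The scoring rule (expected class rank), the function names, the document IDs, and the probabilities are all hypothetical assumptions made for this example; the abstract does not state how the study ranks cases or which ordinal metric it uses.

```python
from typing import Sequence

# Ordered audit outcomes, least to most severe (class names from the abstract).
CLASSES = ("validated", "partially validated", "not validated")


def risk_score(probs: Sequence[float]) -> float:
    """Expected class rank: 0 = surely validated, 2 = surely not validated.

    Assumption: the expected rank is used as a severity score; this is an
    illustrative choice, not the method reported in the study.
    """
    if len(probs) != len(CLASSES):
        raise ValueError("one probability per class is required")
    return sum(rank * p for rank, p in enumerate(probs))


def rank_for_verification(predictions: dict[str, Sequence[float]]) -> list[str]:
    """Return document IDs sorted from highest to lowest expected severity."""
    return sorted(
        predictions,
        key=lambda doc_id: risk_score(predictions[doc_id]),
        reverse=True,
    )


def ordinal_mae(true_ranks: Sequence[int], predicted_ranks: Sequence[int]) -> float:
    """Mean absolute distance between true and predicted class ranks.

    Unlike plain accuracy, confusing 'validated' with 'not validated'
    costs more than confusing adjacent classes.
    """
    pairs = list(zip(true_ranks, predicted_ranks))
    return sum(abs(t - p) for t, p in pairs) / len(pairs)


# Hypothetical model outputs: P(validated), P(partially), P(not validated).
predictions = {
    "DOC-A": (0.90, 0.08, 0.02),
    "DOC-B": (0.20, 0.30, 0.50),
    "DOC-C": (0.55, 0.40, 0.05),
}
queue = rank_for_verification(predictions)
```

Under these assumed inputs, documents with more probability mass on the severe outcomes appear first in `queue`, so a verification team with limited capacity could work from the top of the list.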
The model could support public authorities in prioritizing controls and allocating resources more efficiently, contributing to the modernization of European Union fund management and promoting data‑driven, proportionate oversight, conditional on governance arrangements and external validation.

| File | Access | Description | Type | Licence | Size | Format |
|---|---|---|---|---|---|---|
| Finpiemonte_s11135-026-02725-x.pdf | Open access | Full article | 2a Post-print publisher's version / Version of Record | Creative Commons | 1.57 MB | Adobe PDF |
https://hdl.handle.net/11583/3009356
