Residual moisture (RM) in freeze-dried products is one of the most important critical quality attributes (CQAs) to monitor, since it affects the stability of the active pharmaceutical ingredient (API). The standard experimental method for measuring RM is Karl Fischer (KF) titration, which is a destructive and time-consuming technique. Near-infrared (NIR) spectroscopy has therefore been widely investigated in recent decades as an alternative tool to quantify RM. In the present paper, a novel method was developed, based on NIR spectroscopy combined with machine learning tools, for the prediction of RM in freeze-dried products. Two types of models were used: a linear regression model and a neural-network-based one. The architecture of the neural network was chosen so as to optimize the prediction of RM by minimizing the root mean square error on the dataset used in the learning step. Parity plots and absolute error plots were also reported, allowing a visual evaluation of the results. Several factors were considered when developing the model, namely the range of wavelengths, the shape of the spectra and the type of model. The possibility of developing the model with a smaller dataset, obtained from just one product, and then applying it to a wider range of products was investigated, as was the performance of a model developed on a dataset encompassing several products. Different formulations were analyzed: the main part of the dataset consisted of solutions with different percentages of sucrose (3%, 6% and 9%); a smaller part was made up of sucrose-arginine mixtures at different percentages; and only one formulation contained another excipient, trehalose.
The product-specific model for the 6% sucrose mixture was found to be consistent in predicting RM in the other sucrose-containing mixtures and in the one containing trehalose, but it failed for the dataset with a higher percentage of arginine. Therefore, a global model was developed by including a certain percentage of all the available data in the calibration phase. The results presented and discussed in this paper demonstrate the higher accuracy and robustness of the machine-learning-based model with respect to the linear models.
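
As an illustration of the evaluation quantities named in the abstract, the following minimal sketch computes the root mean square error and the absolute errors between reference and predicted RM values, and builds a calibration set by taking a fixed fraction of every formulation, as the global model is described as doing. It is not the authors' code: the function names, the formulation labels, the 50% fraction and all numbers are hypothetical placeholders, and the abstract does not state which percentage was actually used.

```python
import math
import random


def rmse(measured, predicted):
    """Root mean square error between reference (e.g. KF) and predicted RM values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def absolute_errors(measured, predicted):
    """Per-sample absolute error, the quantity shown in an absolute error plot."""
    return [abs(m - p) for m, p in zip(measured, predicted)]


def calibration_split(samples_by_formulation, fraction, seed=0):
    """Put `fraction` of each formulation's samples in calibration, the rest in validation."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    calibration, validation = {}, {}
    for name, samples in samples_by_formulation.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        n_cal = max(1, round(fraction * len(shuffled)))
        calibration[name] = shuffled[:n_cal]
        validation[name] = shuffled[n_cal:]
    return calibration, validation


# Placeholder values only: these are NOT data from the paper.
reference_rm = [1.0, 2.0, 3.0]
predicted_rm = [1.1, 1.9, 3.2]
example_rmse = rmse(reference_rm, predicted_rm)
example_abs_err = absolute_errors(reference_rm, predicted_rm)

# Hypothetical sample identifiers grouped by hypothetical formulation labels.
dataset = {"sucrose_6pct": list(range(10)), "sucrose_arginine": list(range(4))}
cal, val = calibration_split(dataset, fraction=0.5)
```

A parity plot would then show `predicted_rm` against `reference_rm`, with points on the diagonal indicating perfect agreement. Splitting per formulation, rather than over the pooled samples, is one way to make sure every product is represented in the calibration phase.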

Use of machine learning tools and NIR spectra to estimate residual moisture in freeze-dried products / Massei, Ambra; Falco, Nunzia; Fissore, Davide. - In: SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY. - ISSN 1386-1425. - STAMPA. - 293:(2023), p. 122485. [10.1016/j.saa.2023.122485]


Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2976207