Beyond the Spectrum: How an AI Autoencoder Deciphers the Chemical Fingerprint in SERS Data / Sparavigna, Amelia Carolina. - Electronic. - (2025). [10.5281/zenodo.16895315]
Beyond the Spectrum: How an AI Autoencoder Deciphers the Chemical Fingerprint in SERS Data
Amelia Carolina Sparavigna
2025
Abstract
Surface-Enhanced Raman Scattering (SERS) spectroscopy is a powerful technique for detecting trace amounts of molecules, but SERS spectra are difficult to analyze because of their inherent noise and complexity. In this study, we propose an approach that uses deep learning to overcome these limitations. We designed and implemented an autoencoder, an artificial neural network trained to learn the essential features of a large dataset of SERS spectra of various metabolites. Our methodology goes beyond traditional denoising techniques: it processes the spectra to extract a compressed, low-dimensional representation that effectively acts as a "chemical fingerprint." We then applied a K-means clustering algorithm to this learned representation. Our results demonstrate that clustering on the autoencoder's representation groups the metabolites into distinct clusters based on their underlying chemical and structural properties, even for spectra that appear visually dissimilar. The clusters reveal meaningful correlations, such as the segregation of sulfur-containing amino acids and tryptophan derivatives, confirming the model's ability to identify fundamental spectral signatures. This work not only validates the use of autoencoders for precise and efficient classification of SERS data but also lays the groundwork for a comprehensive library of "pseudo-spectra." These pseudo-spectra, which represent the ideal chemical fingerprints of molecules, can serve as a reference for future research and as a robust tool for identifying and analyzing unknown SERS samples.

File | Size | Format
---|---|---
SERS-autoencoder.pdf (open access; type: 1. Preprint / submitted version [pre-review]; license: Creative Commons) | 1.21 MB | Adobe PDF
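
The pipeline summarized in the abstract (compress each spectrum with an autoencoder, then run K-means on the compressed codes) can be illustrated with a minimal, dependency-free sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic single-peak "spectra", the sizes `N_POINTS`, `LATENT`, and `K`, the one-hidden-layer tanh architecture, and the learning rate. The abstract does not specify the actual network or dataset.

```python
import math
import random

random.seed(0)

N_POINTS, LATENT, K = 30, 2, 2  # illustrative sizes, not from the paper

def synthetic_spectrum(center):
    """One Gaussian peak plus small noise: a toy stand-in for a SERS spectrum."""
    return [math.exp(-((i - center) ** 2) / 8.0) + random.gauss(0, 0.05)
            for i in range(N_POINTS)]

# Two hypothetical "metabolite families" with peaks at different positions.
spectra = ([synthetic_spectrum(8) for _ in range(10)]
           + [synthetic_spectrum(22) for _ in range(10)])

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

W_enc, b_enc = rand_matrix(LATENT, N_POINTS), [0.0] * LATENT
W_dec, b_dec = rand_matrix(N_POINTS, LATENT), [0.0] * N_POINTS

def encode(x):
    """Map a spectrum to its low-dimensional code (the 'fingerprint')."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W_enc, b_enc)]

def decode(z):
    """Reconstruct a spectrum from its code with a linear layer."""
    return [sum(w * zi for w, zi in zip(row, z)) + b
            for row, b in zip(W_dec, b_dec)]

def train_epoch(lr=0.02):
    """One pass of stochastic gradient descent on the reconstruction error."""
    total = 0.0
    for x in spectra:
        z = encode(x)
        err = [xh - xi for xh, xi in zip(decode(z), x)]
        total += sum(e * e for e in err) / N_POINTS
        # Gradient w.r.t. the code, computed before the decoder is updated.
        grad_z = [sum(err[i] * W_dec[i][j] for i in range(N_POINTS))
                  for j in range(LATENT)]
        for i in range(N_POINTS):
            for j in range(LATENT):
                W_dec[i][j] -= lr * err[i] * z[j]
            b_dec[i] -= lr * err[i]
        for j in range(LATENT):
            g = grad_z[j] * (1 - z[j] ** 2)  # derivative of tanh
            for i in range(N_POINTS):
                W_enc[j][i] -= lr * g * x[i]
            b_enc[j] -= lr * g
    return total / len(spectra)

losses = [train_epoch() for _ in range(300)]
codes = [encode(x) for x in spectra]

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = random.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k),
                      key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centers[c])))
                  for pt in points]
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

labels = kmeans(codes, K)
print("mean reconstruction error, first and last epoch:", losses[0], losses[-1])
print("cluster labels:", labels)
```

The clustering step sees only the two-number codes, never the 30-point spectra, which is the sense in which the compressed representation plays the role of a fingerprint. A real analysis would presumably use measured spectra, a deeper network, and an established machine-learning library rather than hand-written gradients.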
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/3002456