Cancer is a leading cause of death worldwide and would benefit from diagnostic approaches that detect it in its early stages. However, despite much research and investment, early cancer diagnosis remains underdeveloped. Owing to its high sensitivity, surface-enhanced Raman spectroscopy (SERS)-based detection of biomarkers has attracted growing interest in this area. Oligonucleotides are an important class of genetic biomarkers because their alterations can be linked to the disease before symptom onset. We propose a machine-learning (ML)-enabled framework that analyzes complex direct SERS spectra of short, single-stranded DNA and RNA targets to identify relevant mutations in genetic biomarkers, which are key disease indicators. First, using ad hoc-synthesized colloidal silver nanoparticles as SERS substrates, we analyze single-base mutations in ssDNA and RNA sequences with a direct SERS-sensing approach. Then, we propose an ML-based hypothesis test to identify these changes and differentiate the mutated sequences from the corresponding native ones. Rooted in "functional data analysis," this ML approach leverages the rich information and dependencies within SERS spectral data for improved modeling and detection capability. Tested on a large set of DNA and RNA SERS data, including data from miR-21 (a known cancer miRNA biomarker), our approach accurately differentiates SERS spectra obtained from different oligonucleotides and outperforms various data-driven methods across several performance metrics, including accuracy, sensitivity, specificity, and F1 score. Hence, this work represents a step forward in developing the combined use of SERS and ML as an effective method for disease diagnosis with real applicability in the clinic.
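For readers unfamiliar with the four performance metrics named in the abstract, the sketch below shows how accuracy, sensitivity, specificity, and F1 score are conventionally computed for a two-class problem. It is not the authors' code: the function name `classification_metrics`, the label encoding (1 for a mutated sequence, 0 for the native one), and the toy labels are all assumptions made for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Standard binary-classification metrics from true and predicted labels.

    Hypothetical helper: `positive` marks the class treated as "mutated".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # true-positive rate (recall)
    specificity = ratio(tn, tn + fp)   # true-negative rate
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy labels only (not data from the paper): 1 = mutated, 0 = native.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Sensitivity and specificity are reported separately because a detector can score well on overall accuracy while still missing mutated sequences or flagging native ones.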

Discrimination of Genetic Biomarkers of Disease through Machine-Learning-Based Hypothesis Testing of Direct SERS Spectra of DNA and RNA / Chheda, J.; Fang, Y.; Deriu, C.; Ezzat, A. A.; Fabris, L. - In: ACS SENSORS. - ISSN 2379-3694. - 9:5 (2024), pp. 2488-2498. [10.1021/acssensors.4c00166]

Discrimination of Genetic Biomarkers of Disease through Machine-Learning-Based Hypothesis Testing of Direct SERS Spectra of DNA and RNA

Deriu C.; Fabris L.
2024


Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2995423