Effect of vowel context in cepstral and entropy analysis of pathological voices / Selamtzis, Andreas; Castellana, Antonella; Salvi, Giampiero; Carullo, Alessio; Astolfi, Arianna. - In: BIOMEDICAL SIGNAL PROCESSING AND CONTROL. - ISSN 1746-8094. - STAMPA. - 47:(2019), pp. 350-357. [10.1016/j.bspc.2018.08.021]

Effect of vowel context in cepstral and entropy analysis of pathological voices

Antonella Castellana;Alessio Carullo;Arianna Astolfi
2019

Abstract

This study investigates the effect of vowel context (excerpted from speech versus sustained) on two voice quality measures: the cepstral peak prominence smoothed (CPPS) and sample entropy (SampEn). Thirty-one dysphonic subjects with different types of organic dysphonia and thirty-one controls read a phonetically balanced text and phonated sustained [a:] vowels in comfortable pitch and loudness. All the [a:] vowels of the read text were excerpted by automatic speech recognition and phonetic (forced) alignment. CPPS and SampEn were calculated for all excerpted vowels of each subject, forming one distribution of CPPS and SampEn values per subject. The sustained vowels were analyzed using a 41 ms window, forming another distribution of CPPS and SampEn values per subject. Two speech-language pathologists performed a perceptual evaluation of the dysphonic subjects’ voice quality from the recorded text. The power of discriminating the dysphonic group from the controls for SampEn and CPPS was assessed for the excerpted and sustained vowels with the Receiver-Operator Characteristic (ROC) analysis. The best discrimination in terms of Area Under Curve (AUC) for CPPS occurred using the mean of the excerpted vowel distributions (AUC=0.85) and for SampEn using the 95th percentile of the sustained vowel distributions (AUC=0.84). CPPS and SampEn were found to be negatively correlated, and the largest correlation was found between the corresponding 95th percentiles of their distributions (Pearson, r=−0.83, p < 10⁻³). A strong correlation was also found between the 95th percentile of SampEn distributions and the perceptual quality of breathiness (Pearson, r=0.83, p < 10⁻³). The results suggest that depending on the acoustic voice quality measure, sustained vowels can be more effective than excerpted vowels for detecting dysphonia. Additionally, when using CPPS or SampEn there is an advantage of using the measures’ distributions rather than their average values.
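The analysis chain described in the abstract (a per-window entropy value, a per-subject distribution summarized by its mean or 95th percentile, and group discrimination scored by AUC) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the embedding dimension `m=2`, and the tolerance `r = 0.2 * std` are assumptions taken from common textbook usage of sample entropy, and the AUC is computed with the rank-based (Mann-Whitney) formulation rather than by tracing a ROC curve.

```python
import numpy as np


def sample_entropy(x, m=2, r_factor=0.2):
    """Textbook sample entropy of a 1-D signal (assumed parameters, not the paper's)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    r = r_factor * x.std()  # tolerance as a fraction of the signal's standard deviation

    def count_matches(length):
        # Use the same number of templates (n - m) for both template lengths.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        # Chebyshev distance between every pair of templates.
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # Count pairs within tolerance, excluding self-matches on the diagonal.
        return int(np.sum(dist <= r)) - len(templates)

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined when no matches are found
    return float(-np.log(a / b))


def summarize(values):
    """Summarize one subject's distribution of a measure by mean and 95th percentile."""
    values = np.asarray(values, dtype=float)
    return {"mean": float(values.mean()), "p95": float(np.percentile(values, 95))}


def auc(scores_pos, scores_neg):
    """Probability that a positive-group score exceeds a negative-group score.

    Ties count as one half. For a measure that is lower in the positive
    group, negate the scores before calling this function.
    """
    pos = np.asarray(scores_pos, dtype=float)
    neg = np.asarray(scores_neg, dtype=float)
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))
```

In use, one would compute `sample_entropy` on each analysis window of a subject's vowels, pass the resulting values to `summarize`, and then call `auc` with the chosen summary statistic of the dysphonic subjects as `scores_pos` and of the controls as `scores_neg`. The pairwise distance matrix grows quadratically with window length, so this sketch suits short windows only.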
Files in this record:

Effect of vowel context in cepstral and entropy analysis of pathological voices.pdf
Availability: not available
Type: 2a Post-print, publisher's version / Version of Record
License: Not public - private/restricted access
Size: 464.24 kB
Format: Adobe PDF

Selamtzis et al Effect of vowel context in cepstral and entropy analysis of pathological voices-convertito.pdf
Availability: Open Access from 13/09/2020
Type: 2. Post-print / Author's Accepted Manuscript
License: Creative Commons
Size: 547.49 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2728266