This study investigates the effect of vowel context (excerpted from speech versus sustained) on two voice quality measures: the cepstral peak prominence smoothed (CPPS) and sample entropy (SampEn). Thirty-one dysphonic subjects with different types of organic dysphonia and thirty-one controls read a phonetically balanced text and phonated sustained [a:] vowels in comfortable pitch and loudness. All the [a:] vowels of the read text were excerpted by automatic speech recognition and phonetic (forced) alignment. CPPS and SampEn were calculated for all excerpted vowels of each subject, forming one distribution of CPPS and SampEn values per subject. The sustained vowels were analyzed using a 41 ms window, forming another distribution of CPPS and SampEn values per subject. Two speech-language pathologists performed a perceptual evaluation of the dysphonic subjects’ voice quality from the recorded text. The power of discriminating the dysphonic group from the controls for SampEn and CPPS was assessed for the excerpted and sustained vowels with the Receiver-Operator Characteristic (ROC) analysis. The best discrimination in terms of Area Under Curve (AUC) for CPPS occurred using the mean of the excerpted vowel distributions (AUC=0.85) and for SampEn using the 95th percentile of the sustained vowel distributions (AUC=0.84). CPPS and SampEn were found to be negatively correlated, and the largest correlation was found between the corresponding 95th percentiles of their distributions (Pearson, r=−0.83, p < 10^-3). A strong correlation was also found between the 95th percentile of SampEn distributions and the perceptual quality of breathiness (Pearson, r=0.83, p < 10^-3). The results suggest that depending on the acoustic voice quality measure, sustained vowels can be more effective than excerpted vowels for detecting dysphonia. Additionally, when using CPPS or SampEn there is an advantage of using the measures’ distributions rather than their average values.
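The analysis chain described in the abstract (a per-frame entropy value, a per-subject distribution summarized by its mean or 95th percentile, and group discrimination measured by AUC) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `sample_entropy`, `summarize`, and `auc` are hypothetical, and the embedding dimension `m=2` and tolerance `r_factor=0.2` are common textbook defaults assumed here, since the abstract does not state the parameters used. CPPS is not covered by the sketch.

```python
import numpy as np


def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy of a 1-D signal (m and r_factor are assumed defaults)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    r = r_factor * x.std()  # tolerance as a fraction of the signal's standard deviation

    def count_matches(length):
        # Use the same n - m template start points for both template lengths.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        # Chebyshev distance between every pair of templates.
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # Count each unordered pair once; self-matches are excluded by k=1.
        return int(np.sum(dist[np.triu_indices(n - m, k=1)] <= r))

    b = count_matches(m)       # matching template pairs of length m
    a = count_matches(m + 1)   # matching template pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")    # undefined ratio: no matches found
    return float(-np.log(a / b))


def summarize(values):
    """Summarize one subject's distribution of per-frame (or per-vowel) values."""
    values = np.asarray(values, dtype=float)
    return {"mean": float(values.mean()), "p95": float(np.percentile(values, 95))}


def auc(pos_scores, neg_scores):
    """Rank-based AUC: probability that a positive scores higher than a negative.

    Ties count as one half. Assumes higher scores indicate the positive group.
    """
    pos = np.asarray(pos_scores, dtype=float)[:, None]
    neg = np.asarray(neg_scores, dtype=float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))
```

In such a setup, each subject would contribute one summary value (for example `summarize(frame_values)["p95"]`), and `auc` would compare the summaries of the dysphonic group against those of the controls. For a measure that is expected to be lower in the dysphonic group, the scores would need to be negated before calling `auc`. The pairwise distance matrix grows quadratically with frame length, so a long frame at a high sampling rate may need a more memory-efficient implementation.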
Effect of vowel context in cepstral and entropy analysis of pathological voices / Selamtzis, Andreas; Castellana, Antonella; Salvi, Giampiero; Carullo, Alessio; Astolfi, Arianna. - In: BIOMEDICAL SIGNAL PROCESSING AND CONTROL. - ISSN 1746-8094. - PRINT. - 47:(2019), pp. 350-357. [10.1016/j.bspc.2018.08.021]
Effect of vowel context in cepstral and entropy analysis of pathological voices
Antonella Castellana;Alessio Carullo;Arianna Astolfi
2019
File | Availability | Type | License | Size | Format
---|---|---|---|---|---
Effect of vowel context in cepstral and entropy analysis of pathological voices.pdf | Not available | 2a Post-print, publisher's version / Version of Record | Non-public - Private/restricted access | 464.24 kB | Adobe PDF
Selamtzis et al Effect of vowel context in cepstral and entropy analysis of pathological voices-convertito.pdf | Open Access from 13/09/2020 | 2. Post-print / Author's Accepted Manuscript | Creative Commons | 547.49 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2728266