A system for recognizing isolated utterances belonging to a very large vocabulary is presented that follows a two-pass strategy. The first step, hypothesization, consists of the selection of a subset of word candidates, starting from the segmentation of speech into six broad phonetic classes. This module is implemented through a dynamic programming algorithm working in a three-dimensional space. The search is performed on a tree representing a coarse description of the lexicon. The second step is the search for the best of the N candidates according to a maximum-likelihood criterion. Each word candidate is represented by a graph of subword hidden Markov models, and a tree structure of the whole word subset is built on-line for an efficient implementation of the Viterbi algorithm. A comparison with a direct approach that does not use the hypothesization module shows that the two-pass approach has the same performance with an 80% reduction in computational complexity.
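The second pass described above scores each word candidate, modeled as a graph of subword HMMs, with the Viterbi algorithm. As a minimal illustrative sketch only (not the system described in the paper, and using a toy two-state model rather than subword HMM graphs), a log-domain Viterbi decoder might look like:

```python
import math

def viterbi(obs, states, log_init, log_trans, log_emit):
    """Return the best log-likelihood and state path for observation
    sequence `obs` through an HMM given in log probabilities."""
    # delta[s] = best log-probability of any path ending in state s
    delta = {s: log_init[s] + log_emit[s][obs[0]] for s in states}
    back = []  # backpointers, one dict per time step after the first
    for o in obs[1:]:
        prev, delta, ptr = delta, {}, {}
        for s in states:
            # best predecessor of s at this time step
            p_best = max(states, key=lambda p: prev[p] + log_trans[p][s])
            delta[s] = prev[p_best] + log_trans[p_best][s] + log_emit[s][o]
            ptr[s] = p_best
        back.append(ptr)
    # trace back the most likely state sequence
    last = max(states, key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return delta[last], list(reversed(path))

NEG = -math.inf
# Toy two-state left-to-right model over a two-symbol alphabet.
states = [0, 1]
log_init = {0: 0.0, 1: NEG}
log_trans = {0: {0: math.log(0.6), 1: math.log(0.4)},
             1: {0: NEG, 1: 0.0}}
log_emit = {0: {'a': math.log(0.9), 'b': math.log(0.1)},
            1: {'a': math.log(0.2), 'b': math.log(0.8)}}
score, path = viterbi(['a', 'a', 'b'], states, log_init, log_trans, log_emit)
```

In the paper's setting the same recursion would run over the on-line lexical tree, so that word candidates sharing subword-model prefixes share computation; the toy model above only shows the core dynamic-programming step.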
Fissore, L.; Laface, P.; Micca, G.; Pieraccini, R. (1988). Very large vocabulary isolated utterance recognition: a comparison between one pass and two pass strategies. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP-88), New York (USA), 1988, pp. 203-206. doi: 10.1109/ICASSP.1988.196549.
Very large vocabulary isolated utterance recognition: a comparison between one pass and two pass strategies
Laface, Pietro
1988
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2584459