This paper describes a Continuous Speech Understanding System that allows information services to be accessed over the telephone line. It accepts queries within a restricted semantic domain, expressed in free but syntactically correct natural language, with a lexicon on the order of 800 words. In the implementation described here, a user can access an electronic mailbox or a train information service through a PABX telephone line. The architecture of the system is based on two main modules that represent and use different knowledge sources. A speaker-independent recognition module generates, for each utterance, a lattice of word hypotheses; this lattice is the interface to an understanding module that performs the syntactic and semantic analysis. The recognition module is based on Hidden Markov Models of subword units and performs acoustic decoding with a beam search strategy. The understanding module finds the most likely sequence of words and represents its meaning in a format that facilitates access to a database. It uses a modified caseframe analysis guided by the scores of the word hypotheses. Experiments were performed with 600 sentences from 10 speakers on the E-Mail application task. Using 15 Gaussian mixtures per state, a word accuracy of 75.7% was obtained with a test vocabulary of 787 words and no linguistic constraints. Linguistic processing of the corresponding lattices achieved a sentence understanding rate of 82%.
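To make the role of the word lattice concrete, the following is a minimal sketch of how a best-scoring word sequence could be extracted from a lattice of scored word hypotheses. It is not the paper's method: the understanding module described above uses a modified caseframe analysis, whereas this sketch substitutes a plain dynamic-programming search with no syntactic or semantic knowledge. The `WordHyp` fields, the frame indexing, the log-score convention, and the example words and scores are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class WordHyp(NamedTuple):
    """One hypothetical lattice entry (field names are assumptions)."""
    word: str
    start: int    # first frame covered (inclusive)
    end: int      # frame after the last one covered (exclusive), end > start
    score: float  # log-domain acoustic score, higher is better


def best_path(lattice, n_frames):
    """Return (total_score, words) for the best chain of adjacent
    hypotheses spanning frames 0..n_frames, or None if no chain exists."""
    by_start = defaultdict(list)
    for hyp in lattice:
        by_start[hyp.start].append(hyp)

    # best[t] = best (score, words) among chains ending exactly at frame t
    best = {0: (0.0, [])}
    for t in range(n_frames):
        if t not in best:
            continue  # no chain reaches this frame boundary
        score, words = best[t]
        for hyp in by_start[t]:
            candidate = score + hyp.score
            if hyp.end not in best or candidate > best[hyp.end][0]:
                best[hyp.end] = (candidate, words + [hyp.word])
    return best.get(n_frames)


# Invented example lattice with competing hypotheses for the same spans.
example = [
    WordHyp("read", 0, 3, -2.0),
    WordHyp("red", 0, 3, -2.5),
    WordHyp("my", 3, 5, -1.0),
    WordHyp("mail", 5, 9, -1.5),
    WordHyp("male", 5, 9, -3.0),
]
result = best_path(example, 9)
```

With the invented scores above, `result` holds the chain "read my mail". A knowledge-guided analysis such as the one in the paper could prefer a different path through the same lattice, because it weighs linguistic constraints together with the acoustic scores.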

A Speech Understanding System for Information Retrieval / P. Baggia; L. Fissore; E. Giachin; G. Micca; C. Rullent; P. Laface. - In: INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE. - ISSN 0218-0014. - Print. - 8:1(1994), pp. 71-97. [10.1142/S0218001494000048]

A Speech Understanding System for Information Retrieval

LAFACE, Pietro
1994


Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2584379