Spoken Language Generation and Understanding
NATO ADVANCED STUDY INSTITUTES SERIES
Proceedings of the Advanced Study Institute Programme, which aims
at the dissemination of advanced knowledge and
the formation of contacts among scientists from different countries
The series is published by an international board of publishers in conjunction
with NATO Scientific Affairs Division
A Life Sciences, B Physics:
    Plenum Publishing Corporation, London and New York
C Mathematical and Physical Sciences:
    D. Reidel Publishing Company, Dordrecht, Boston and London
D Behavioural and Social Sciences, E Applied Sciences:
    Sijthoff & Noordhoff International Publishers,
    Alphen aan den Rijn and Germantown, U.S.A.
Series C - Mathematical and Physical Sciences
Volume 59 - Spoken Language Generation and Understanding
Spoken Language
Generation
and Understanding
Proceedings of the NATO Advanced Study Institute
held at Bonas, France, June 26 - July 7, 1979
edited by
J. C. SIMON
Institut de Programmation
Universite Pierre et Marie Curie, Paris VI, France
D. Reidel Publishing Company
Dordrecht: Holland / Boston: U.S.A. / London: England
Published in cooperation with NATO Scientific Affairs Division
Library of Congress Cataloging in Publication Data
NATO Advanced Study Institute on speech, 1st, Bonas, France, 1979
Spoken language generation and understanding.
(NATO advanced study institute series: Series C, Mathematical and
physical sciences; v. 59)
Includes index.
1. Speech processing systems-Congresses. 2. Speech perception-
Congresses. I. Simon, Jean Claude, 1923 - II. Title. III. Series.
TK7882.S65N37 1979 621.3819'598 80-26841
ISBN-13: 978-94-009-9093-7 e-ISBN-13: 978-94-009-9091-3
DOI: 10.1007/978-94-009-9091-3
Published by D. Reidel Publishing Company
P.O. Box 17, 3300 AA Dordrecht, Holland
Sold and distributed in the U.S.A. and Canada
by Kluwer Boston Inc.,
190 Old Derby Street, Hingham, MA 02043, U.S.A.
In all other countries, sold and distributed
by Kluwer Academic Publishers Group,
P.O. Box 322, 3300 AH Dordrecht, Holland
D. Reidel Publishing Company is a member of the Kluwer Group
All Rights Reserved
Copyright © 1980 by D. Reidel Publishing Company, Dordrecht, Holland
Softcover reprint of the hardcover 1st edition 1980
No part of the material protected by this copyright notice may be reproduced or utilized
in any form or by any means, electronic or mechanical, including photocopying,
recording or by any informational storage and retrieval system,
without written permission from the copyright owner
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
PREFACE / How to Read the Book: Some Comments xi
LIST OF PARTICIPANTS xv
§ 1. AN OVERVIEW, WITH AN EMPHASIS ON PSYCHOLOGY
*D.R. HILL / Spoken Language Generation and Understanding
by Machine: A Problems and Applications
Oriented Overview 3
*W.D. MARSLEN-WILSON / Speech Understanding as a
Psychological Process 39
M.J. UNDERWOOD / What the Engineers Would Like to Know
from the Psychologists 69
C.J. DARWIN and A. DONOVAN / Perceptual Studies of Speech
Rhythm: Isochrony and Intonation 77
K. SHIRAI and M. HONDA / Estimation of Articulatory Motion
from Speech Waves and Its Application for
Automatic Recognition 87
§ 2. ACOUSTIC AND PHONEMIC
*J.J. WOLF / Speech Signal Processing and Feature Extraction 103
*J.S. BRIDLE / Pattern Recognition Techniques for
Speech Recognition 129
*T. SAKAI / Automatic Mapping of Acoustic Features into
Phonemic Labels 147
*R. DE MORI / Automatic Phoneme Recognition in Continuous
Speech: A Syntactic Approach 191
*K.S. FU / Syntactic Approach to Pattern Recognition 221
P. ALINAT / Phoneme Recognition Using a Cochlear Model 253
(* marks review papers)
*W.J. HESS / Pitch Determination of Speech Signals
- A Survey 263
M. BAUDRY and B. DUPEYRAT / Speech Analysis Using Syntactic
Methods and a Pitch Synchronous Formant Detector
on the Direct Signal 279
G. CHOLLET / Variability of Vowel Formant Frequency in
Different Speech Styles 293
§ 3. LEXICON, SYNTAX AND SEMANTIC
*J-P. HATON / The Representation and Use of a Lexicon in
Automatic Speech Recognition and Understanding 311
*W.A. WOODS / Control of Syntax and Semantics in Continuous
Speech Understanding 337
R. DE MORI and L. SAITTA / A Classification Method Based
on Fuzzy Naming Relations over Finite Languages 365
§ 4. SPEECH SYNTHESIS
*J. ALLEN / Speech Synthesis from Text 383
*J-S. LIENARD / An Over-view of Speech Synthesis 397
W.K. ENDRES and H.E. WOLF / Speech Synthesis for an
Unlimited Vocabulary, a Powerful Tool for Inquiry
and Information Services 413
X. RODET / Time-Domain Formant-Wave-Function Synthesis 429
R. LINGGARD and F.J. MARLOW / A Programmable, Digital
Speech Synthesiser 443
D. CHRISTINAZ, K.M. COLBY, S. GRAHAM, R. PARKISON and
L. CHIN / An Intelligent Speech Prosthesis with
a Lexical Memory 455
§ 5. SYSTEMS AND APPLICATIONS
K. NAKATA / Industrial Applications of Speech Recognition 471
J. MARIANI / Some Points Concerning Speech Communication
with Computers 475
P. MEIER / Secure Speech Communication over a CCITT-Speech
Channel 485
S. MAITRA / Speech Compression/Recognition with Robust
Features 497
C. BELLISSANT / A Real-Time System for Speech Recognition 505
M-C. HATON / Speech Training of Deaf Children Using
the SIRENE System: First Results and
Conclusions 517
G. MERCIER, A. NOUHEN, P. QUINTON and J. SIROUX / The KEAL
Speech Understanding System 525
R. BISIANI / The LOCUST System 545
J.M. PIERREL and J.P. HATON / The MYRTILLE II Speech
Understanding System 553
INDEX OF NAMES 571
ACKNOWLEDGEMENTS
In the first place, I wish to thank specially my colleagues
J.P. HATON and R. DE MORI, who have helped me to put together
this ASI and to make it a success on the spot. Dr L.C.W. POLS
has also given valuable advice.
The editing committee should also be thanked for
helping with the publication of this book, particularly Drs W.
HESS and D.R. HILL.
On the other hand, only through the financial support
and the framework provided by the NATO Scientific Affairs Division
could such a coherent, high-level meeting be made possible.
The material support of IRIA and Institut de Programmation
should also be gratefully acknowledged.
Finally, I wish, as the director of the contributions
and also as the animateur of the Centre Culturel de BONAS, to
thank all the participants of the ASI for their friendly comprehension
during their stay ...
Director of the NATO ASI, J.C. SIMON, Institut de Programmation
Universite Pierre et Marie Curie, 4 place Jussieu, 75230, Paris
Cedex 05.
Advisory Committee: J.P. HATON, Informatique, Universite Nancy I,
Case Officielle 140, 54037 Nancy Cedex. France.
R. DE MORI, Universita di Torino, Istituto di Scienze
dell'Informazione, Corso M. d'Azeglio 42, Torino, Ita.
Editing Committee: J.S. BRIDLE (G.B.), J.P. HATON (Fr.), HESS
(R.F.A.), HILL (Can.), DE MORI (It.), J.C. SIMON (Fr.), J.J. WOLF
(U.S.A.).
J. C. Simon (ed.), Spoken Language Generation and Understanding, IX.
Copyright © 1980 by D. Reidel Publishing Company.
PREFACE
HOW TO READ THE BOOK: SOME COMMENTS
This book is the lasting result of the first NATO
Advanced Study Institute on Speech, held at the Centre Culturel
du Chateau de Bonas, from June 26 to July 7, 1979.
The intent of a NATO ASI is primarily to provide high
level tutorial coverage of a field in which research is active;
undoubtedly speech generation and understanding is one at the
present time.
Thus 12 surveys are offered by some of the best specialists
in the field. As a consequence the book may be considered
as a reference book on speech. The surveys are marked by a *
in the Table of Contents.
However, half of the meeting was devoted to discussions
and presentations of research. A reviewing Committee decided to
ask a number of participants to submit a contribution which would
complete the tutorials or would present original work.
A beginner in the subject should start by reading the
reviews of Hill, Wolf, Bridle, Haton, Woods, Allen, Lienard,
preferably in the above order. He would then be familiar with the
terms, the problems and the techniques of speech understanding
(analysis) and generation (synthesis).
A reader who is already advanced in the field would also
profit from the tutorials and will easily find his way among the
five sections:
1. An overview, with the emphasis on psychology
An effort was made to bring to the ASI the research results of
psycholinguistics. The review of Marslen-Wilson is of prime
interest in that respect.
2. Acoustics and Phonemics
Speech recognition is a multilevel process. This section describes
the very first levels of signal treatment. Many different
techniques have been studied and implemented in experimental
systems. As in other fields of pattern recognition, preprocessing
and the determination of first-level features relate more to an
art than to a science. A special emphasis has been given to the
syntactic approach by De Mori and K.S. Fu.
Though most of the data are extracted from vocoder type
systems, some direct measurements on the speech signal itself
are also in use, for example, the number of zero-crossings.
Hess, Baudry and Dupeyrat give examples of the use of operators
on the signal itself to determine the pitch and the formant
frequencies.
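As an illustration of such a direct measurement, the zero-crossing
count of a frame can be sketched in a few lines of Python. This is a
minimal sketch, not taken from any paper in this volume; the function
names `zero_crossings` and `sine_frame`, the 8 kHz sampling rate and
the two test tones are assumptions chosen for the example.

```python
import math


def zero_crossings(samples):
    """Count sign changes between successive non-zero samples."""
    count = 0
    prev_sign = 0
    for x in samples:
        sign = (x > 0) - (x < 0)
        if sign == 0:
            continue  # exact zeros carry no sign information; skip them
        if prev_sign != 0 and sign != prev_sign:
            count += 1
        prev_sign = sign
    return count


def sine_frame(freq_hz, rate_hz, n_samples):
    """Hypothetical test signal: a pure tone sampled at rate_hz."""
    return [math.sin(2.0 * math.pi * freq_hz * i / rate_hz)
            for i in range(n_samples)]


if __name__ == "__main__":
    rate = 8000                        # assumed sampling rate in Hz
    low = sine_frame(100, rate, 800)   # 0.1 s of a 100 Hz tone
    high = sine_frame(1000, rate, 800) # 0.1 s of a 1000 Hz tone
    print("100 Hz tone:", zero_crossings(low))
    print("1000 Hz tone:", zero_crossings(high))
```

Over a frame of fixed length, a higher count points to more
high-frequency content, which is why this measure is commonly used as
an inexpensive cue alongside spectral features.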
An agreement seems to exist on what are the essential
features to be measured at these first levels:
- for voiced speech, the pitch peak instants or frequency;
- the instantaneous frequency and amplitude of the different formants.
Further, many efforts have been made to detect the
phonemes, as they are proposed by the phoneticians. In our own
judgement, they seem a concrete reality. But the variability
of their representation in the signal and even in the instantaneous
spectrum seems very great; cf. Chollet, Allen. This casts some
doubt on the possibility of reliable detection.
A lot of research work still has to be done at these
acoustic and phonemic levels to achieve the same reliability as
an ordinary human being hearing meaningless words in difficult
surroundings (noisy or distorted).
3. Lexicon, Syntax, and Semantics
Since recognition is a multilevel process, the upper levels may
contribute to the lower-level detections. In other words, the
meaning of the sentence may allow one to correct faulty phonemic
determinations. There is strong evidence that a human being
hearing speech under poor conditions does make this sort of
restoration. The syntax and the semantics of a sentence may thus
help to obtain the correct determination, even if the first level
detections were faulty.
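The restoration described above can be illustrated with a toy lexical
lookup: a faulty phoneme string is replaced by the lexicon entry
closest to it in edit distance. This is a minimal sketch under stated
assumptions, not the method of any system described in this volume;
the lexicon, the phoneme symbols and the names `edit_distance` and
`closest_word` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences, with unit costs."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


# Hypothetical toy lexicon: word -> phoneme sequence (symbols are illustrative).
LEXICON = {
    "speech": ("s", "p", "iy", "ch"),
    "peach": ("p", "iy", "ch"),
    "speed": ("s", "p", "iy", "d"),
}


def closest_word(phonemes, lexicon=LEXICON):
    """Return the lexicon word whose phoneme string is nearest to the input."""
    return min(lexicon, key=lambda word: edit_distance(phonemes, lexicon[word]))


if __name__ == "__main__":
    observed = ("s", "b", "iy", "ch")  # /p/ wrongly detected as /b/
    print(closest_word(observed))
```

The uniform costs are a simplification made for the example; weighting
substitutions by acoustic confusability, or adding syntactic and
semantic constraints, would be natural refinements.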
A lot of work has been done along these lines, in
particular in the U.S.A. Woods' paper gives an excellent review.
Certainly there is a lot more to understanding continuous
speech than to understanding isolated words, with or
without the word being in a dictionary. But the initial hopes
of understanding very faulty phonemic determinations have not
been fulfilled; and one of the main results of these important
studies is that a better phonemic determination should be obtained
in the first place.
4. Speech synthesis
The generation of speech from a written text transformed in a