Data Driven Approaches to Speech and Language Processing

Chollet, Gérard; McTait, Kevin; Petrovska-Delacrétaz, Dijana

doi:10.1007/11520153_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3445)

Included in the following conference series:

International School on Neural Networks, Initiated by IIASS and EMFCSC

Abstract

Speech and language processing systems can be categorised according to whether they rely on predefined linguistic information and rules or are data driven, exploiting machine learning techniques to automatically extract and process relevant units of information, which are then indexed and retrieved as appropriate. For example, most state-of-the-art automatic speech processing systems rely on a representation based on predefined phonetic symbols. The use of language-dependent representations, whilst linguistically intuitive, has several drawbacks, such as limited portability across languages and long development time. In this article, we therefore review and present our recent experiments exploiting the idea inherent in the ALISP (Automatic Language Independent Speech Processing) approach, with particular respect to speech processing, in which the intermediate representation between the acoustic and linguistic levels is automatically inferred from speech data. We then present prospective directions in which the ALISP principles could be exploited in different domains such as audio, speech, text, image and video processing.

References

Abe, M., Nakamura, S., Shikano, K., Kuwabara, H.: Voice Conversion Through Vector Quantization. In: Proceedings ICASSP, New York, pp. 565–568 (1988)
Google Scholar
Aho, A.V.: Data Structures and Algorithms. Addison-Wesley, Reading (1983)
MATH Google Scholar
Ahlbom, G., Bimbot, F., Chollet, G.: Modeling Spectral Speech Transitions using Temporal Decomposition Techniques. In: Proceedings IEEE ICASSP, Dallas, pp. 13–16 (1987)
Google Scholar
Aleksic, P., Williams, J., Katsaggelos, A.: Speech-To-Video Synthesis Using MPEG-4 Compliant Visual Features. IEEE Trans. Circuits and Systems for Video Technology 14(5), 682–692 (2004)
Article Google Scholar
Ammicht, E., Gorin A.L., Alonso T.: Knowledge Collection for Natural Spoken Dialog Systems. In: Proceedings EUROSPEECH, Budapest, Hungary (1999).
Google Scholar
Atal B.: Efficient Coding of LPC Parameters by Temporal Decomposition. In: Proceedings ICASSP, pp. 81–84 (1983)
Google Scholar
Baudoin, G., Cernocky, J., Chollet, G.: Quantization of Spectral Sequences using Variable Length Spectral Segments for Speech Coding at Very Low Bit Rate. In: Proceedings EUROSPEECH, Rhodes, pp. 1295–1298 (1997)
Google Scholar
Baudoin, G., Cernocky, J., Gournay, P., Chollet, G.: Codage de la parole à bas et très bas débit. Annales des télécommunications 55, 462–482 (2000)
Google Scholar
Baudoin, G., Cernocky, J., El Chami, F., Charbit, M., Chollet, G., Petrovska- Delacretaz, D.: Advances in Very Low Bit Rate Speech Coding using Recognition and Synthesis Techniques. In: Proceedings of the 5th Text, Speech and Dialog Workshop, Brno, pp. 269–276. Czech Republic (2002) ISBN 3-540-44129-8
Google Scholar
Bayer, R., Unterauer, K.: Prefix B-Trees. ACM Transactions on Database Systems 2(1), 11–26 (1977)
Article Google Scholar
Berger, A., Brown, P., Della Pietra, S., Della Pietra, V., Gillett, J., Lafferty, J., Mercer, R., Printz, H., Ures, L.: The Candide System for Machine Translation. In: Proceedings of the ARPA Workshop on Human Language Technology (1994)
Google Scholar
Bimbot, F., Chollet, G., Deleglise, P., Montacié, C.: Temporal Decomposition and Acoustic-Phonetic decoding of Speech. In: Proceedings IEEE ICASSP, New York, pp. 445–448 (1988)
Google Scholar
Bimbot, F., Deleglise, P., Chollet, G.: Speech Synthesis by Structured Segments using Temporal Decomposition. In: Proceedings EUROSPEECH, Paris, pp. 183–186 (1989)
Google Scholar
Bimbot, F., Pieraccini, R., Levin, E., Atal, B.: Variable Length Sequence Modelling: Multigrams. IEEE Signal Processing Letters 2(6), 111–113 (1995)
Article Google Scholar
Black, E., Jelinek, F., Lafferty, J.D., Magerman, D.M., Mercer, R.L., Roukos, S.: Towards History-Based Grammars: Using Richer Models for Probabilistic Parsing. In: Proceedings DARPA Speech and Natural Language Workshop, Harriman, NY, pp. 134–139 (1992)
Google Scholar
Black, A., Brown, R.D., Frederking, R., Singh, R., Moody, J., Steinbrecher, E.: TONGUES: Rapid Development of a Speech-to-Speech Translation System. In: Proceedings of HLT 2002: Second International Conference on Human Language Technology Research, San Diego, CA , pp. 24–27 (2002)
Google Scholar
Blouet, R., Mokbel, C., Mokbel, H., Sanchez-Soto, E., Chollet, G., Greige, H.: BECARS: A Free Software for Speaker Verification. In: Proceedings ODYSSEY 2004 - The Speaker and Language Recognition Workshop, Toledo, Spain, pp. 145–148 (2004)
Google Scholar
Bregler, C., Covell, M., Slaney, M.: Video Rewrite: Driving Visual Speech with Audio. In: Proceedings ACM SIGGRAPH 1997 (1997)
Google Scholar
Brown, P.F., Della Pietre, S.A., Della Pietra, V.J., Mercer, R.: Word-Sense Disambiguation using Statistical Methods. In: Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, Berkeley, CA, pp. 264–270 (1991)
Google Scholar
Brown, P.F., Cocke, J., Della Pietra, S.A., Della Pietra, V.J., Jelinek, F., Mercer, R., Roossin, P.: A Statistical Approach to Language Translation. In: Coling Budapest: Proceedings of the 12th International Conference on Computational Linguistics, Budapest, Hungary, pp. 71–77 (1998)
Google Scholar
Brown, P.F., Cocke, J., Della Pietra, S.A., Della Pietra, V.J., Jelinek, F., Lafferty, J., Mercer, R.L., Roossin, P.S.: A Statistical Approach to Machine Translation. Computational Linguistics 16, 79–85 (1990)
Google Scholar
Brown, P.F., Della Pietra, S.A., Della Pietra, V.J., Mercer, R.L.: The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics 19, 263–311 (1993)
Google Scholar
Brown, R.D.: Example-Based Machine Translation in the PANGLOSS System. In: COLING 1996: The 16th International Conference on Computational Linguistics, Copenhagen, Denmark, pp. 169–174 (1996)
Google Scholar
Brown, R.D.: Automated Dictionary Extraction for Knowledge-Free Example- Based Translation. In: Proceedings of the 7th International Conference on Theoretical and Methodological Issues in Machine Translation, Santa Fe, New Mexico, pp. 111–118 (1997)
Google Scholar
Brown, R.D., Frederking, R.E.: Applying Statistical Language Modelling to Symbolic Machine Translation. In: Proceedings of the Sixth International Conference on Theoretical and Methodological Issues in Machine Translation, Leuven, Belgium, pp. 354–372 (1995)
Google Scholar
Cappe, O., Stylianou, Y., Moulines, E.: Statistical Methods For Voice Quality Transformation. In: Proceedings of EUROSPEECH 1995, Madrid, Spain, pp. 447–450 (1995)
Google Scholar
Carpenter, G., Grossberg, S.: A Massively Parallel Architecture for a Self- Organizing Neural Pattern Recognition Machine. Proceedings of Computer Vision, Graphics and Image Processing 37, 54–115 (1987)
Article Google Scholar
Casacuberta, F., Vidal, E., Vilar, J.-M.: Architectures for Speech-to-Speech Translation using Finite-State Models. In: Proceedings of the Workshop on Speech-to- Speech Translation: Algorithms and Systems, Philadelphia, pp. 39–44 (2002)
Google Scholar
Cernocky, J., Baudoin, G., Chollet, G.: Speech Spectrum Representation and Coding using Multigrams with Distance. In: Proceedings IEEE ICASSP, Munich, pp. 1343–1346 (1997)
Google Scholar
Cernocky, J., Baudoin, G., Chollet, G.: Segmental Vocoder - Going Beyond the Phonetic Approach. In: Proceedings IEEE ICASSP, Seattle, pp. 605–608 (1998) ISBN 0-7803-4428-6
Google Scholar
Cernocky, J., Baudoin, G., Chollet, G.: Very Low Bit Rate Segmental Speech Coding using Automatically Derived Units. In: Proceedings RADIOELEKTRONIKA, Brno, Czech Republic, pp. 224–227 (1998) ISBN 80-214-0983-5
Google Scholar
Cernocky, J., Petrovska-Delacretaz, D., Pigeon, S., Verlinde, P., Chollet, G.: A Segmental Approach to Text-Independent Speaker Verification. In: Proceedings EUROSPEECH, Budapest, vol. 5, pp. 2203–2206 (1999)
Google Scholar
Cernocky, J., Kopecek I., Baudoin, G., Chollet, G.: Very Low Bit Rate Speech Coding: Comparison of Data-Driven Units with Syllable Segments. In: Proceedings of the Text, Speech and Dialog Workshop, Pilsen, Czech Republic, pp. 257–262 (1999) ISBN 3-540- 66494-7
Google Scholar
Cernocky, J., Baudoin, G., Petrovska-Delacretaz, D., Chollet, G.: Vers une analyse acoustico-phonétique de la parole indépendante de la langue, basée sur ALISP. Revue Parole 17, 191–226 (2001) ISSN 1373-1955
Google Scholar
Charniak, E.: Statistical Language Learning. MIT Press, Cambridge (1993)
Google Scholar
Charniak, E.: Statistical Parsing with a Context-Free Grammar and Word Statistics. In: Proceedings of the 14th National Conference on Artificial Intelligence (AAAI 1997), Menlo Park, CA, pp. 598–603 (1997)
Google Scholar
Chollet, G., Galliano, J.-F., Lefevre, J.-P., Viara, E.: On the Generation and Use of a Segment Dictionary for Speech Coding, Synthesis and Recognition. In: Proceedings IEEE ICASSP, Boston, pp. 1328–1331 (1983)
Google Scholar
Chollet, G., Grenier, Y., Marcus, S.: Segmentation and Non-Stationary Modeling of Speech. In: Proceedings EUSIPCO, The Hague (1986)
Google Scholar
Chollet, G., Cernocky, J., Constantinescu, A., Deligne, S., Bimbot, F.: Toward ALISP: Automatic Language Independent Speech Processing. In: Ponting, K., Moore, R. (eds.) Computational Models for Speech Pattern Processing, pp. 375–387. Springer, Heidelberg (1999) ISBN 3-540-65478-X
Google Scholar
Chollet, G., Cernocky, J., Gravier, G., Hennebert, J., Petrovska-Delacretaz, D., Yvon, F.: Toward Fully Automatic Speech Processing Techniques for Interactive Voice Servers. In: Chollet, G., Di Benedetto, M.-G., Esposito, A., Marinaro, M. (eds.) Speech Processing, Recognition and Artificial Neural Networks, Springer, Heidelberg (1999)
Google Scholar
Chollet, G., Cernocky, J., Baudoin, G.: Unsupervised Learning for Very Low Bit Rate Coding. In: Proceedings of SCI-ISAS 2000, Orlando (2000)
Google Scholar
Chu-Carroll, J., Carpenter, B.: Vector-based Natural Language Call Routing. Computational Linguistcs 25(3), 361–388 (1999)
Google Scholar
Church, K.: A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text. In: Proceedings Second Conference on Applied Natural Language Processing, ACL, Austin, Texas, pp. 136–143 (1988)
Google Scholar
Collins, B., Cunningham, P.: Adaptation Guided Retrieval in EBMT: A Case- Based Approach to Machine Translation. In: Smith, I., Faltings, B.V. (eds.) EWCBR 1996. LNCS, vol. 1168, pp. 91–104. Springer, Heidelberg (1996)
Chapter Google Scholar
Cutting, D., Pedersen, J.: Optimizations for Dynamic Inverted Index Maintenance. In: Proceedings 13th International Conference on Research and Development in Information Retrieval, Brussels, Belgium, pp. 405–411 (1990)
Google Scholar
Cutting, D., Kupiec, J., Pedersen, J., Sibun, P.: A Practical Part-of-Speech Tagger. In: Third Conference on Applied Natural Language Processing, Trento, Italy, pp. 133–140 (1992)
Google Scholar
Daelemans, W., Zavrel, J., Berck, S.: MBT: A Memory Based Part of Speech Tagger-Generator. In: Proceedings of the 4th Workshop on Very Large Corpora, Copenhagen, Denmark, pp. 14–27 (1996)
Google Scholar
Dagan, I., Perreira, F., Lee, L.: Similarity Based Estimation ofWord Co-occurence Probabilities. In: Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 272–278 (1994)
Google Scholar
Damper, R.I. (ed.): Data Driven Techniques in Speech Synthesis. Kluwer, Dordrecht (2001)
Google Scholar
Deligne, S., Bimbot, F.: Language Modeling by Variable Length Sequences: Theoretical Formulation and Evaluation of Multigrams. In: Proceedings ICASSP, Munich, pp. 1731–1734 (1997)
Google Scholar
Deligne, S., Bimbot, F.: Inference of Variable-length Linguistic and Acoustic Units by Multigrams. Speech Communication 23, 223–241 (1997)
Article Google Scholar
Deligne, S., Yvon, F., Bimbot, F.: Introducing Statistical Dependencies and Structural Constraints in Variable-Length Sequence Models. In: Proceedings of the 3rd International Colloquium on Grammatical Inference: Learning Syntax from Sentences, Montpellier, France, pp. 156–167 (1996)
Google Scholar
Doddington, G., Martin, A., Przybocki, M., Reynolds, D.: The NIST Speaker Recognition Evaluation - Overview, Methodology, Systems, Results, Perspectives. Speech Communications 31(2-3), 225–254 (2000)
Article Google Scholar
Dorr, B. J., Jordan, P. W., Benoit, J. W.: A Survey of Current Paradigms in Machine Translation. Technical Report: LAMP-TR-027, UMIACS-TR-98-72, CSTR- 3961, University of Maryland, College Park (December 1998)
Google Scholar
Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification, 2nd edn. John Wiley and Sons, Chichester (2001)
MATH Google Scholar
Du Jeu, C., Charbit, M., Chollet, G.: Very Low Rate Speech Compression by Indexation of Polyphones. In: Proceedings of EUROSPEECH, Geneva, pp. 1085–1088 (2003)
Google Scholar
Eatock, J.P., Mason, J.S.: A Quantitative Assessment of the Relative Speaker Discriminant Properties of Phonemes. In: Proceedings ICASSP, vol. 1, pp. 133–136 (1994)
Google Scholar
El Hannani, A., Petrovska-Delacretaz, D., Chollet, G.: Linear and Non-linear Fusion of ALISP- and GMM-Based Systems for Text-Independent Speaker Verification. In: Proceedings of ISCA Workshop: A Speaker Odyssey, Toledo, Spain, pp. 111–116 (2004)
Google Scholar
Farinas, J., Obrecht, R.A.: Modélisation phonotactique de grandes classes phonétiques en vue d’une approche différenciée en identification automatique des langues. In: Proceedings 18ème colloque GRETSI sur le traitement du signal et des images, Toulouse, France (2001)
Google Scholar
Frakes, W.B., Baeza-Yates, R.: Information Retrieval: Data Structures and Algorithms. Prentice-Hall, Englewood Cliffs (1992)
Google Scholar
Fukunaga, K.: Statistical Pattern Recognition, 2nd edn. Academic Press, London (1990)
MATH Google Scholar
Gailly, J.-L., Nelson, M.: The Data Compression Book. John Wiley and Sons, Chichester (1995)
Google Scholar
Gale, W., Church, K.W., Yarowsky, D.: Work on Statistical Methods for Word Sense Disambiguation. In: Proceedings of the AAAI Fall Symposium: Probabilistic Approaches to Natural Language, Cambridge, MA, pp. 54–60 (1992)
Google Scholar
Gonnet, G.H., Baeza-Yates, R.: Handbook of Algorithms and Data Structures, 2nd edn. Addison-Wesley, Reading (1991)
Google Scholar
Gorin, A.L., Petrovska-Delacrétaz, D., Riccardi, G., Wright, J.H.: Learning Spoken Language without Transcriptions. In: Proceedings IEEE Workshop on Automatic Speech Recognition and Understanding (1999)
Google Scholar
Gorin, A.L.: How I Help You? Speech Communication 23, 113–127 (1997)
Article Google Scholar
Gorin, A.L.: On Automated Language Acquisition. Journal of the Acoustical Society of America JASA 97(6), 3441–3461 (1995)
Article Google Scholar
Gorin, A.L., Levinson, S., Sankar, A.: An Experiment in Spoken Language Acquisition. Proceedings IEEE Transactions on Speech and Audio 2, 224–240 (1994)
Article Google Scholar
Haines, D., Croft, W.B.: Relevance Feedback and Inference Networks. In: Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval, Pittsburg, Penn, pp. 2–11 (1993)
Google Scholar
Hankerson, D., Harris, G.A., Johnson, P.D.: Introduction to Information Theory and Data Compression. CRC Press, Boca Raton (2003)
Book MATH Google Scholar
Harbeck, S., Ohler, U.: Multigrams for Language Identification. In: Proceedings EUROSPEECH, Budapest, Hungary (1999)
Google Scholar
Harman, D., Baeza-Yates, R., Fox, E., Lee, W.: Inverted Files. In: Frakes, W.B., Baeza-Yates, R. (eds.) Information Retrieval: Data Structures and Algorithms, Prentice Hall, Englewood Cliffs (1992)
Google Scholar
Ho, Y.: Application of Minimal Perfect Hashing in Main Memory Indexing. MITLCS-TM-508 (1994)
Google Scholar
Jensen, F.V.: Bayesian Networks and Decision Graphs. Springer (2001)
Google Scholar
Jelinek, F.: Self-Organized Language Modeling for Speech Recognition. In: Waibel, A., Lee, K.F. (eds.) Readings in Speech Recognition, pp. 450–506. Morgan Kaufmann Publishers, San Mateo (1990)
Google Scholar
Jelinek, F.: Statistical Methods for Speech Recognition. MIT Press, Cambridge (1999)
Google Scholar
Kain, A., Macon, M.W.: Spectral Voice Conversion for Text to Speech Synthesis. In: Proceedings ICASSP 88, New York, vol. 1, pp. 285–288 (1998)
Google Scholar
Kain, A., Macon, M.W.: Design and Evaluation of a Voice Conversion Algorithm Based on Spectral Envelope Mapping and Residual Prediction. In: Proceedingsd ICASSP 2001, Salt Lake City, USA (2001)
Google Scholar
Kaji, H., Kida, Y., Morimoto, Y.: Learning Translation Templates from Bilingual Text. In: Proceedings of the 14th Conference on Computational Linguistics, Nantes, France, vol. 2, pp. 672–678 (1992)
Google Scholar
Karam, W., Mokbel, C., Aversano, G., Pelachaud, C., Chollet, G.: An Audiovisual Imposture Scenario by Talking Face Animation. In: Chollet, G., Esposito, A., Faundez, M., Marinaro, M. (eds.) Nonlinear Speech Processing: Algorithms and Analysis, Springer, Heidelberg (2005) (in this volume)
Google Scholar
Knuth, D.E.: The Art of Computer Programming. Addison Wesley, Reading (1973)
Google Scholar
Kohonen, T.: Self Organizing Maps. Springer, Heidelberg (1995)
Google Scholar
Koza, J.R.: Genetic Programming. MIT Press, Cambridge (1992)
MATH Google Scholar
Kuo, H.-K.J., Lee, C.-H.: A Portability Study on Natural Language Call Steering. In: Proceedings EUROSPEECH, Aalborg, Denmark (2001)
Google Scholar
Lamel, L.F, Gauvain, J.-L., Eskénazi, M.: BREF, A Large Vocabulary Spoken Corpus for French. In: Proceedings of the European Conference on Speech Technology, EUROSPEECH, pp. 505–508 (1991)
Google Scholar
Laroche, J., Stylianou, Y., Moulines, E.: HNM: A Simple, Efficient Harmonic Plus Noise Model for Speech. In: Proceedings of IEEE ASSPWorkshop on Applications of Signal Processing to Audio and Acoustics (1993)
Google Scholar
Lee, K.-S., Cox, R.V.: A Segmental Speech Coder Based on a Concatenative TTS. Speech Communication 38(1), 89–100 (2002)
Article MATH MathSciNet Google Scholar
Levenshtein, V.I.: Binary Codes Capable of Correcting Deletions, Insertions and Reversals. Cybernetics and Control Theory 10, 707–710 (1966)
MathSciNet Google Scholar
Levin, L., Lavie, A., Woszczyna, M., Gates, D., Gavaldá, M., Koll, D., Waibel, A.: The Janus-III Translation System: Speech-to-Speech Translation in Multiple Domains. Machine Translation Archive 15(1-2), 3–25 (2000)
Article MATH Google Scholar
Lloyd-Thomas, H., Parris, E., Wright, J.W.: Recurrent Substrings and Data Fusion for Language Recognition. In: Proceedings ICSLP, Sydney, Australia (1998)
Google Scholar
Lowrance, R., Wagner, R.A.: An Extension of the String-to-String Correction Problem. Journal of the Association of Computing Machinery 22(2), 177–183 (1975)
MATH MathSciNet Google Scholar
Manning, C.D., Schutze, H.: Foundations of Statistical Natural Language Processing. MIT Press, Cambridge (1999)
MATH Google Scholar
Martin, A., Przybocki, M.: The NIST Speaker Recognition Evaluations: 1996-2001. In: Proceedings Odyssey 2001, Crete, Greece, pp. 39–42 (2001)
Google Scholar
Marcu, D., Wong, W.: A Phrase-Based, Joint Probability Model for Statistical Machine Translation. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Philadelphia, PA, pp. 133–139 (2002)
Google Scholar
Mc-Tait, K.: Translation Patterns, Linguistic Knowledge and Complexity in an Approach to EBMT. In: Carl, M., Way, A. (eds.) Recent Advances in Example-Based Machine Translation, Kluwer Academic Press, Amsterdam (2003)
Google Scholar
McTait, K.: Translation Pattern Extraction and Recombination for Example- Based Machine Translation. Ph.D. Thesis, University of Manchester Institute of Science and Technology, Manchester, UK (2001)
Google Scholar
McTait, K., Trujillo, A.: A Language-Neutral Sparse-Data Algorithm for Extracting Translation Patterns. In: Proceedings of the 8th International Conference on Theoretical and Methodological Issues in Machine Translation TMI 1999, Chester, UK, pp. 98–108 (1999)
Google Scholar
McTait, K., Olohan, M., Trujillo, A.: A Building Blocks Approach to Translation Memory. In: Proceedings of the 21st ASLIB International Conference on Translating and the Computer, London, UK (1999)
Google Scholar
Melamed, I.D.: A Word-To-Word Model of Translation Equivalence. In: 35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics, Madrid, Spain, pp. 490–497 (1997)
Google Scholar
Merialdo, B.: Tagging English Text with a Probabilistic Model. Computational Linguistics 20(2), 155–172 (1994)
Google Scholar
Metze, F., McDonough, J., Soltau, H., Waibel, A., Lavie, A., Burger, S., Langley, C., Levin, L., Schultz, T., Pianesi, F., Cattoni, R., Lazzari, G., Mana, N., Pianta, E.: The NESPOLE! Speech-to-Speech Translation System. In: Proceedings of HLT 2002 Human Language Technology Conference, San Diego, CA (2002)
Google Scholar
Mitchell, T.M.: Machine Learning. McGraw-Hill, New York (1997)
MATH Google Scholar
Mitchell, T.M.: Machine Learning and Data Mining. Communications of the ACM 42(11), 30–36 (1999)
Article Google Scholar
Morimoto, T., Takezawa, T., Yato, F., Sagayama, S., Tashiro, M., Nagata, M., Kurematsu, A.: ATR’s Speech Translation System: ASURA. In: Proceedings EUROSPEECH 1993, pp. 1291–1295 (1993)
Google Scholar
Nagao, M.: A Framework of a Mechanical Translation between Japenese and English by Analogy Principle. In: Elithorn, A., Banerji, R. (eds.) Artificial and Human Intelligence., pp. 173–180. NATO Publications (1984)
Google Scholar
Nakamura, S.: Fusion of Audio-Visual Information for Integrated Speech Processing. In: Bigun, J., Smeraldi, F. (eds.) AVBPA 2001. LNCS, vol. 2091, pp. 127–143. Springer, Heidelberg (2001)
Chapter Google Scholar
Navrátil, J.: Spoken Language Recognition: A Step Towards Multilinguality. IEEE Trans. Audio and Speech Processing 9(6), 678–685 (2001)
Article Google Scholar
Nevill-Manning, C.G.: Inferring Sequential Structure. PhD Thesis, Univ. of Waikato (1996)
Google Scholar
Nirenburg, S., Beale, S., Domashnev, C.: A Full-Text Experiment in Example- Based Machine Translation. In: Proceedings of the International Conference on New Methods in Language Processing (NeMLaP), Manchester, UK, pp. 78–87 (1994)
Google Scholar
Nirenburg, S., Domashnev, C., Grannes, D.J.: Two Approaches to Matching in Example-Based Machine Translation. In: Proceedings of the Fifth International Conference on Theoretical and Methodological Issues in Machine Translation, TMI 1993: MT in the Next Generation, Kyoto, Japan, pp. 47–57 (1993)
Google Scholar
Olivier, D.C.: Stochastic Grammars and Language Acquisition Mechanism. Ph.D. Thesis, Harvard University (1968)
Google Scholar
Pasquariello, S., Pelachaud, C.: Greta: A Simple Facial Animation Engine. In: 6th Online World Conference on Soft Computing in Industrial Applications, Session on Soft Computing for Intelligent 3D Agents (September 2001)
Google Scholar
Perrot, P., Aversano, G., Chollet, G., Charbit, M.: Voice Forgery Using ALISP: Indexation in a Client Memory. To appear in proc. of ICASSP 2005
Google Scholar
Petrovska-Delacrétaz, D., Černocký, J., Hennebert, J., Chollet, G.: Text-Independent Speaker Verification Using Automatically Labeled Acoustic Segments. In: ICLSP, Sydney, Australia (1998)
Google Scholar
Petrovska-Delacretaz, D., Cernocky, J., Hennebert, J., Chollet, G.: Segmental Approaches to Automatic Speaker Verification. Digital Signal Processing: A Review Journal 10(1/2/3), 198–212 (2000)
Article Google Scholar
Petrovska-Delacrétaz, D., Gorin, A.L.,Wright, J.H., Riccardi G.: Detecting Acoustic Morphemes in Lattices for Spoken Language Understanding. In: Proceedings ICSLP, Beijing, China (2000)
Google Scholar
Petrovska-Delacretaz, D., Gorin, A.L., Riccardi, G., Wright, J.H.: Detecting Acoustic Morphemes in Lattices for Spoken Language Understanding. In: Proceedings of ICASSP, Beijing, China (2000)
Google Scholar
Petrovska-Delacretaz, D., Chollet, G.: Searching Through a Speech Memory for Efficient Coding, Recognition and Synthesis. In: Braun, A., Masthoff, H. (eds.) Phonetics and its Applications, pp. 453–464. Franz Steiner Verlag, Stuttgart (2002) ISBN 8094-5
Google Scholar
Petrovska-Delacretaz, D., Abalo, M., El Hannani, A., Chollet, G.: Data-Driven Speech Segmentation for Speaker Verification and Language Identification. In: Proceedings of NOLISP, Le Croisic (2003)
Google Scholar
Petrovska-Delacretaz, D., El Hannani, A., Chollet, G.: Searching through a Speech Memory for Text-Independent Speaker Verification. In: Kittler, J., Nixon, M.S. (eds.) AVBPA 2003. LNCS, vol. 2688, p. 84. Springer, Heidelberg (2003)
Google Scholar
Pighin, F., Szeliski, R., Salesin, D.: Modeling and Animating Realistic Faces from Images. International Journal of Computer Vision 50(2), 143–169 (2002)
Article MATH Google Scholar
Planas, E., Furuse, O.: Formalizing Translation Memory. In: Carl, M., Way, A. (eds.) Recent Advances in Example-Based Machine Translation., Kluwer Academic Press, Amsterdam (2003)
Google Scholar
Prudon, R., d’Alessandro, C.: A Selection/Concatenation Text-to-Speech Synthesis System: Database Development, System Design, Comparative Evaluation. In: Proceedings of the 4th Speech Synthesis Workshop, Pitlochy, Scotland (2001)
Google Scholar
Przybocki, M., Martin, A.: NIST’s Assessment of Text Independent Speaker Recognition Performance 2002. In: The Advent of Biometrics on the Internet, A COST 275 Workshop in Rome, Italy, November 7-8 (2002)
Google Scholar
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA (1993)
Google Scholar
Reynolds, D.A., Quatieri, T.F., Dunn, R.B.: Speaker Verification Using Adapted Gaussian Mixture Models. DSP, Special Issue on the NIST 1999 Evaluations 10(1-3), 19–41 (2000)
Google Scholar
Ribeiro, C.M., Trancoso, I.M.: Improving Speaker Recognisability in Phonetic Vocoders. In: Proceedings of ICSLP, Sydney (1998)
Google Scholar
Ribeiro, C.M., Trancoso, I.M.: Phonetic Vocoder Assessment. In: Proceedings ICSLP, Beijing, vol. 3, pp. 830–833 (2000)
Google Scholar
Roy, D.: Learning Words from Sights and Sounds: A Computational Model. Ph.D. Thesis, MIT (1999)
Google Scholar
Sadler, V., Vendelmans, R.: Pilot Implementation of a Bilingual Knowledge Bank. In: Proceedings of the 13th International Conference on Computational Linguistics, Helsinki, vol. 3, pp. 449–451 (1990)
Google Scholar
Salton, G., McGill, M.S.: Introduction to Modern Information Retrieval. McGraw-Hill, New York (1983)
MATH Google Scholar
Sayood, K.: Introduction to Data Compression. Morgan Kaufmann, San Francisco (2000)
Google Scholar
Shiraki, Y., Honda, M.: LPC Speech Coding based on VLSQ. Proceedings IEEE Trans. on ASSP 3(9) (1988)
Google Scholar
Schroeter, J., Graf, H.P., Beutnagel, M., Cosatto, E., Syrdal, A., Conkie, A., Stylianou, Y.: Multimodal Speech Synthesis. In: Proceedings IEEE International Conference on Multimedia and Expo., NY, pp. 571–578 (2000)
Google Scholar
Simard, P.Y., Le Cun, Y., Denker, J.S.: Memory Based Character Recognition using a Transformation Invariant Metric. In: Proceedings of ICPR, Jerusalem, pp. 262–267 (1994)
Google Scholar
Simard, M., Langlais, P.: Sub-Sentential Exploitation of Translation Memories. In: MT Summit VIII: Machine Translation in the Information Age, Santiago de Compostela, Spain, pp. 335–339 (2001)
Google Scholar
Simons, A., Cox, S.: Generation of Mouth Shapes for a Synthetic Talking Head. Proceedings Inst. Acoust. 12, 475–482 (1990)
Google Scholar
Smith, T.C., Witten, I.H.: Learning Language using Genetic Algorithms. In: Wermter, S., Riloff, E., Scheler, G. (eds.) Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing, pp. 132–145. Springer, NY (1996)
Google Scholar
Somers, H., McLean, I., Jones, D.: Experiments in Multilingual Example-Based Generation. In: Proceedings CSNLP 1994: 3rd Conference on the Cognitive Science of Natural Language Processing, Dublin, Ireland
Google Scholar
Stolcke, A.: An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities. Computational Linguistics 21(2), 165–201 (1995)
MathSciNet Google Scholar
Stylianou, Y., Cappé, O., Moulines, E.: Statistical Methods for Voice Quality Transformation. In: Proceedings of EUROSPEECH, Madrid, pp. 447–450 (1995)
Google Scholar
Stylianou, Y., Cappé, O., Moulines, E.: Continuous Probabilistic Transform for Voice Conversion. Proceedings IEEE Transactions on SAP 6(2), 131–142 (1998)
Google Scholar
Suhm, B., Geutner, P., Kemp, T., Lavie, A., Mayfield, L., McNair, A.E., Rogina, I., Schultz, T., Sloboda, T., Ward, W., Woszczyna, M., Waibel, A.: JANUS: Towards Multilingual Spoken Language Translation. In: Proceedings ARPA Spoken Language Technology Workshop, Austin, TX (1995)
Google Scholar
Sumita, E., Tsutsumi, Y.: A Translation Aid System Using Flexible Text Retrieval Based on Syntax-Matching. In: TMI 1988 Proceedings Supplement, Pittsburgh (1988) (pages not numbered)
Google Scholar
Tamura, M., Masuko, T., Kobayashi, T., Tokuda, K.: Visual Speech Synthesis Based on Parameter Generation from HMM: Speech-Driven and Text-and-Speech- Driven Approaches. In: Proceedings Auditory-Visual Speech Processing (1998)
Google Scholar
Thomas, H.L., Parris, E., Wright, J.: Reccurent Substrings and Data Fusion for Language Recognition. In: Proceedings ICASSP 2000, Instanbul, Turkey, vol. 2, pp. 169–173 (2000)
Google Scholar
Tomokiyo, M., Chollet, G.: A Proposal to Represent Speech Control Mechanisms within the Universal Networking Digital Language. In: Proceedings of the International Conference on the Convergence of Knowledge, Culture, Language and Information Technologies, Alexandria, Egypt (2003)
Google Scholar
Turcato, D.: Automatically Creating Bilingual Lexicons for Machine Translation from Bilingual Text. In: Proceedings COLING-ACL 1998. 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Montreal, Canada, pp. 1299–1305 (1998)
Google Scholar
Utsuro, T., Matsumoto, Y., Nagao, M.: Lexical Knowledge Acquisition from Bilingual Corpora. In: Proceedings of the fifteenth [sic] International Conference on Computational Linguistics, COLING 1992, Nantes, France, pp. 581–587 (1992)
Google Scholar
Valbret, H., Moulines, E., Tubach, J.-P.: Voice Transformation using PSOLA Technique. In: Proceedings ICASSP 1992, vol. 1, pp. 145–148 (1992)
Google Scholar
Valiant, L.G.: A Theory of the Learnable. Communications of the ACM 27(11), 1134–1142 (1984)
Article MATH Google Scholar
Vogel, S., Och, F.J., Tillmann, C., Nießen, S., Sawaf, H., Ney, H.: Statistical Methods for Machine Translation. In: Wahlster, W. (ed.) Verbmobil: Foundations of Speech-to-Speech Translation, Springer, Berlin (2000)
Google Scholar
Wahlster, W.: First Results of Verbmobil: Translation Assistance for Spontaneous Dialogues. In: Proceedings ATR International Workshop on Speech Translation, Kyoto, Japan (1993)
Google Scholar
Waibel, A., Finke, M., Gates, D., Gavaldà, M., Kemp, T., Lavie, A., Maier, M., Mayfield, M., McNair, A., Rogina, I., Shima, K., Sloboda, T., Woszczyna, M., Zhan, P., Zeppenfeld, T.: Janus II - Advances in Spontaneous Speech Translation. In: International Conference on Acoustics, Speech and Signal Processing, Atlanta, Georgia (1996)
Waibel, A., Jain, A.M., McNair, A.E., Saito, H., Hauptmann, A.G., Tebelskis, J.: JANUS: A Speech-To-Speech Translation System Using Connectionist and Symbolic Processing Strategies. In: ICASSP 1991, Toronto, Canada, vol. 2, pp. 793–796 (1991)
Wang, Y.-Y., Waibel, A.: Modeling with Structures in Statistical Machine Translation. In: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Montreal, Canada, pp. 1357–1363 (1998)
Wang, Y., Waibel, A.: Decoding Algorithm in Statistical Machine Translation. In: Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics ACL/EACL 1997, Madrid, Spain, pp. 366–372 (1997)
Watanabe, H.: A Method for Extracting Translation Patterns from Translation Examples. In: Proceedings of the 5th International Conference on Theoretical and Methodological Issues in Machine Translation (TMI 1993): MT in the Next Generation, Kyoto, Japan, pp. 292–301 (1993)
Williams, J., Katsaggelos, A.: An HMM-Based Speech-to-Video Synthesizer. IEEE Transactions on Neural Networks 13(4), 900–915 (2002)
Witten, I.H., Moffat, A., Bell, T.C.: Managing Gigabytes: Compressing and Indexing Documents and Images, 2nd edn. Morgan Kaufmann, San Francisco (1999)
Yamamoto, E., Nakamura, S., Shikano, K.: Lip Movement Synthesis from Speech Based on Hidden Markov Models. Speech Communication 26(1–2), 105–115 (1998)
Yi, J., Glass, J.: Information-Theoretic Criteria for Unit Selection Synthesis. In: Proceedings of ICSLP, Denver, Colorado, pp. 2617–2620 (2002)
Yvon, F.: Paradigmatic Cascades: A Linguistically Sound Model of Pronunciation by Analogy. In: Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, Somerset, NJ, pp. 428–435 (1997)

Author information

Authors and Affiliations

CNRS-LTCI, GET-ENST, 46 rue Barrault, 75634, Paris cedex 13, France
Gérard Chollet & Kevin McTait
GET-INT, Institut National des Télécommunications, 9 rue Charles Fourier, F-91011, Evry cedex, France
Dijana Petrovska-Delacrétaz

Editor information

Editors and Affiliations

CNRS LTCI/TSI Paris, 46 rue Barrault, 75634, Paris Cedex 13, France
Gérard Chollet
Department of Psychology, Second University of Naples, and IIASS, Via Pellegrino 19, 84019, Vietri sul Mare, SA, Italy
Anna Esposito
Escola Universitària Politècnica de Mataró, Universitat Politècnica de Catalunya, Barcelona, Spain
Marcos Faundez-Zanuy
Dipartimento di Fisica “E.R. Caianiello”, Università degli Studi di Salerno, Via S. Allende, 84081, Baronissi, SA, Italy
Maria Marinaro

About this paper

Cite this paper

Chollet, G., McTait, K., Petrovska-Delacrétaz, D. (2005). Data Driven Approaches to Speech and Language Processing. In: Chollet, G., Esposito, A., Faundez-Zanuy, M., Marinaro, M. (eds) Nonlinear Speech Modeling and Applications. NN 2004. Lecture Notes in Computer Science, vol 3445. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11520153_8

DOI: https://doi.org/10.1007/11520153_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27441-4
Online ISBN: 978-3-540-31886-6
eBook Packages: Computer Science, Computer Science (R0)
