Abstract
In his LREC 2004 invited talk, given when he was awarded the first-ever Antonio Zampolli Prize for his essential contributions to the use of spoken and written language resources, Frederick Jelinek used the title “Some of My Best Friends Are Linguists”. He did so for many reasons, one being that he wanted to dispel the perception that he dislikes linguists and linguistics, after so many people had cited his famous line from an old presentation at a Natural Language Processing Evaluation workshop in 1988: “Whenever I fire a linguist our system performance improves.”
References
Bahl, L.R., Mercer, R.L.: Part-of-speech assignment by a statistical decision algorithm. In: Proceedings of the IEEE International Symposium on Information Theory, pp. 88–89. IEEE Computer Society Press, Los Alamitos (1976)
Banko, M., Brill, E.: Scaling to Very Very Large Corpora for Natural Language Disambiguation. In: Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics. Toulouse, France (2001)
Banko, M., Brill, E.: Mitigating the Paucity-of-Data Problem: Exploring the Effect of Training Corpus Size on Classifier Performance for Natural Language Processing. In: Proceedings of the First International Conference on Human Language Technology. San Diego, California, pp. 1–5 (2001)
Berger, A.L., Brown, P.F., Della Pietra, S.A., Della Pietra, V.J., Gillett, J.R., Lafferty, J.D., Mercer, R.L., Printz, H., Ureš, L.: The Candide System for Machine Translation. In: Proceedings of the ARPA Conference on Human Language Technology. Plainsboro, New Jersey (1994)
Brants, S., Dipper, S., Hansen, S., Lezius, W., Smith, G.: TIGER treebank. In: Proceedings of the First Workshop on Treebanks and Linguistic Theories (TLT 2002), Sozopol, Bulgaria, pp. 24–42 (2002)
Brill, E.: Paucity Shmaucity–What Can We Do With A Trillion Words? In: Invited talk at EMNLP-NAACL 2001 Conference. Pittsburgh, PA, USA (2001)
Brown, P.F., Cocke, J., Della Pietra, S.A., Della Pietra, V.J., Jelinek, F., Lafferty, J.D., Mercer, R.L., Roossin, P.S.: A Statistical Approach to Machine Translation. Computational Linguistics 16(2), 79–85 (1990)
Charniak, E.: Statistical Language Learning. The MIT Press, Cambridge, MA (1996)
Chomsky, N.: Syntactic Structures. Mouton, The Hague (1957)
Church, K.W.: A Stochastic PARTS Program and Noun Phrase Parser for Unrestricted Text. In: Proceedings of the Second Conference on Applied Natural Language Processing. 26th Annual Meeting of the ACL. Austin, Texas, pp. 136–143 (1988)
Church, K.W.: Speech and Language Processing: Where Have We Been and Where Are We Going? In: Proceedings of the 8th European Conference on Speech Communication and Technology (EUROSPEECH/INTERSPEECH-2003). Geneva, Switzerland (2003)
Czech National Corpus, http://ucnk.ff.cuni.cz
Fillmore, C.J.: The case for case. In: Bach, E., Harms, R. (eds.) Universals in Linguistic Theory. New York, pp. 1–90 (1968)
Francis, N.F.: Standard Corpus of Edited Present-day American English. College English 26, 267–273. Reprinted in: Sampson, G., McCarthy, D. (eds.) Corpus Linguistics: Readings in a Widening Discipline. Continuum 2004, London/New York, pp. 27–34 (1965)
Hajič, J.: Building a syntactically annotated corpus: The Prague Dependency Treebank. In: Issues of Valency and Meaning. Studies in Honour of Jarmila Panevová, Karolinum, pp. 106–132. Charles University Press, Prague, Czech Republic (1998)
Hajič, J., et al.: Prague Dependency Treebank 2.0. CDROM. Cat. No. LDC2006T01. Linguistic Data Consortium, Philadelphia, PA (2006), http://ufal.mff.cuni.cz/pdt2.0 ISBN: 1-58563-370-4
Hajič, J., Panevová, J., Urešová, Z., Bémová, A., Kolářová, V., Pajas, P.: PDT-VALLEX: Creating a Large-Coverage Valency Lexicon for Treebank Annotation. In: Proceedings of the 2nd Treebanks and Linguistic Theories Workshop. Växjö, Sweden, November 14-15, pp. 57–68 (2003)
Hajičová, E.: Old linguists never die, they only get obligatorily deleted. Computational Linguistics 32(4), 457–469 (2006)
Jelinek, F.: Continuous Speech Recognition by Statistical Methods. Proceedings of the IEEE 64(4), 532–536 (1976)
Jelinek, F.: Statistical Methods For Speech Recognition. The MIT Press, Cambridge, MA (1998)
Jelinek, F.: Some of My Best Friends Are Linguists. In: Invited talk at the occasion of the Antonio Zampolli Award presented to Frederick Jelinek at the LREC 2004 conference, Lisbon, Portugal (2004)
Jelinek, F., Bahl, L.R., Mercer, R.L.: Design of a Linguistic Statistical Decoder for the Recognition of Continuous Speech. IEEE Transactions on Information Theory 21(3), 250–256 (1975)
Jurafsky, D., Martin, J.H.: Speech and Language Processing. Prentice-Hall, Englewood Cliffs (2000)
Manning, C.D., Schütze, H.: Foundations of Statistical Natural Language Processing. The MIT Press, Cambridge, MA (2000)
Marcus, M.P., Santorini, B., Marcinkiewicz, M.A.: Building a Large Annotated Corpus of English: The Penn Treebank. Computational Linguistics 19(2), 313–330 (1993)
Meyers, A., Reeves, R., Macleod, C., Szekely, R., Zielinska, V., Young, B., Grishman, R.: The NomBank Project: An Interim Report. In: HLT-NAACL Workshop: Frontiers in Corpus Annotation. Boston, Massachusetts, USA, pp. 24–31 (2004)
http://www.coli.uni-saarland.de/projects/sfb378/NEGRA-en.html
Och, F.J.: Large-scale Machine Translation: Challenges and Opportunities. In: Invited talk at NAACL/HLT 2007, Rochester, NY, USA (April 22-27, 2007)
Palmer, M.S., Gildea, D., Kingsbury, P.: The Proposition Bank: An Annotated Corpus of Semantic Roles. Computational Linguistics 31(1), 71–105 (2005)
Ribarov, K., Bémová, A., Vidová Hladká, B.: When a statistically oriented parser was more efficient than a linguist: a case of treebank conversion. The Prague Bulletin of Mathematical Linguistics 86, 21–38 (2006)
Robinson, J.J.: Case, category and configuration. Journal of Linguistics 6, 57–80 (1969)
Robinson, J.J.: Dependency structures and transformational rules. Language 46, 259–285 (1970)
Sgall, P.: Zur Frage der Ebenen im Sprachsystem. Travaux linguistiques de Prague 1, 95–106 (1964)
Sgall, P.: Generative Beschreibung und die Ebenen des Sprachsystems. In: presented at the Second International Symposium in Magdeburg, Germany. Zeichen und System der Sprache III 1966, Berlin, pp. 225–239 (1964)
Sgall, P., Hajičová, E., Panevová, J.: The Meaning of the Sentence in its Semantic and Pragmatic Aspects. Reidel - Academia, Dordrecht - Prague (1986)
Vidová Hladká, B.: The Czech Academic Corpus version 1.0 has been released. The Prague Bulletin of Mathematical Linguistics 86, 57–58 (2006)
Weaver, W.: Translation. Memorandum. Reprinted. In: Locke, W.N., Booth, A.D. (eds.) Machine Translation of Languages: Fourteen Essays, pp. 15–23. MIT Press, Cambridge (1949)
Žabokrtský, Z., Lopatková, M.: Valency Frames of Czech Verbs in VALLEX 1.0. In: HLT-NAACL Workshop: Frontiers in Corpus Annotation. Boston, Massachusetts, USA, pp. 70–77 (2004)
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hajič, J., Hajičová, E. (2007). Some of Our Best Friends Are Statisticians. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2007. Lecture Notes in Computer Science, vol 4629. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74628-7_2
DOI: https://doi.org/10.1007/978-3-540-74628-7_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74627-0
Online ISBN: 978-3-540-74628-7
eBook Packages: Computer Science, Computer Science (R0)