Data-driven methods for linguistic style control

François Mairesse

doi:10.1017/CBO9780511844492.009

9 - Data-driven methods for linguistic style control

from Part IV - Engagement

Published online by Cambridge University Press: 05 July 2014

Edited by Amanda Stent and Srinivas Bangalore

François Mairesse: University of Sheffield
Amanda Stent: AT&T Research, Florham Park, New Jersey
Srinivas Bangalore: AT&T Research, Florham Park, New Jersey

Summary

Introduction

Modern spoken language interfaces typically ignore a fundamental component of human communication: human speakers tailor their speech and language to their audience, their communicative goal, and their overall personality (Scherer, 1979; Brennan and Clark, 1996; Pickering and Garrod, 2004). They control their linguistic style for many reasons, including social ones (e.g., to communicate social distance to the hearer), rhetorical ones (e.g., to be persuasive), and task-based ones (e.g., to facilitate the assimilation of new material). As a result, a close acquaintance, a politician, and a teacher are expected to communicate differently, even when conveying the same underlying meaning. In contrast, the style of most human–computer interfaces is chosen once and for all at development time, typically resulting in cold, repetitive language, or machinese. This chapter focuses on methods that provide an alternative to machinese by learning to control the linguistic style of computer interfaces from data.

Natural Language Generation (NLG) differs from other areas of natural language processing in that it is an under-constrained problem. Whereas the natural language understanding task requires learning a mapping from linguistic forms to the corresponding meaning representations, NLG systems must learn the reverse one-to-many mapping and choose among all possible realizations of a given input meaning. The criterion to optimize is often unclear, and largely dependent on the target application. Hence it is important to identify relevant control dimensions – i.e., linguistic styles – to optimize the generation process based on the application context and the user.

Type: Chapter
Information: Natural Language Generation in Interactive Systems, pp. 207-226

DOI: https://doi.org/10.1017/CBO9780511844492.009

Publisher: Cambridge University Press

Print publication year: 2014

References

Belz, A. (2009). Prodigy-METEO: Pre-alpha release notes. Technical Report NLTG-09-01, Natural Language Technology Group, CMIS, University of Brighton.

Bilmes, J. and Kirchhoff, K. (2003). Factored language models and generalized parallel backoff. In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL), pages 4-6, Edmonton, Canada. Association for Computational Linguistics.

Brennan, S. E. and Clark, H. H. (1996). Conceptual pacts and lexical choice in conversation. Journal of Experimental Psychology, 22(6):1482-1493.

Campana, E., Tanenhaus, M. K., Allen, J. F., and Remington, R. W. (2004). Evaluating cognitive load in spoken language interfaces using a dual-task paradigm. In Proceedings of the International Conference on Spoken Language Processing (INTERSPEECH), pages 1721-1724, Jeju Island, Korea. International Speech Communication Association.

Coltheart, M. (1981). The MRC psycholinguistic database. Quarterly Journal of Experimental Psychology, 33A:497-505.

Costa, P. T. and McCrae, R. R. (1992). NEO PI-R Professional Manual. Psychological Assessment Resources, Odessa, FL.

DiMarco, C. and Hirst, G. (1993). A computational theory of goal-directed style in syntax. Computational Linguistics, 19(3):451-499.

Furnham, A. (1990). Language and personality. In Giles, H. and Robinson, W. P., editors, Handbook of Language and Social Psychology, pages 73-95. Wiley, New York, NY.

Gales, M. and Woodland, P. (1996). Mean and variance adaptation within the MLLR framework. Computer Speech & Language, 10(4):249-264.

Gill, A. J. and Oberlander, J. (2002). Taking care of the linguistic features of extraversion. In Proceedings of the Annual Conference of the Cognitive Science Society (CogSci), pages 363-368, Fairfax, Virginia. Cognitive Science Society.

Goldberg, L. R. (1990). An alternative “description of personality”: The Big-Five factor structure. Journal of Personality and Social Psychology, 59(6):1216-1229.

Gosling, S. D., Rentfrow, P. J., and Swann, W. B. (2003). A very brief measure of the Big-Five personality domains. Journal of Research in Personality, 37(6):504-528.

Green, S. J. and DiMarco, C. (1996). Stylistic decision-making in natural language generation. In Adorni, G. and Zock, M., editors, Trends in Natural Language Generation: An Artificial Intelligence Perspective, volume 1036, pages 125-143. Springer LNCS, Berlin, Germany.

Hovy, E. (1988). Generating Natural Language under Pragmatic Constraints. Lawrence Erlbaum Associates, Hillsdale, NJ.

Isard, A., Brockmann, C., and Oberlander, J. (2006). Individuality and alignment in generated dialogues. In Proceedings of the International Conference on Natural Language Generation (INLG), pages 25-32, Sydney, Australia. Association for Computational Linguistics.

John, O. P., Donahue, E. M., and Kentle, R. L. (1991). The Big Five inventory: Versions 4a and 54. Technical report, Berkeley: University of California, Institute of Personality and Social Research.

Langkilde, I. and Knight, K. (1998). Generation that exploits corpus-based statistical knowledge. In Proceedings of the Annual Meeting of the Association for Computational Linguistics and the International Conference on Computational Linguistics (COLING-ACL), pages 704-710, Montreal, Canada. Association for Computational Linguistics.

Langkilde-Geary, I. (2002). An empirical verification of coverage and correctness for a general-purpose sentence generator. In Proceedings of the International Conference on Natural Language Generation (INLG), pages 17-24, Arden Conference Center, NY. Association for Computational Linguistics.

Lavoie, B. and Rambow, O. (1997). A fast and portable realizer for text generation systems. In Proceedings of the Applied Natural Language Processing Conference (ANLP), pages 1-7, Washington, DC. Association for Computational Linguistics.

Mairesse, F. (2008). Learning to Adapt in Dialogue Systems: Data-driven Models for Personality Recognition and Generation. PhD thesis, Department of Computer Science, University of Sheffield.

Mairesse, F., Gašić, M., Jurčíček, F., Keizer, S., Thomson, B., Yu, K., and Young, S. (2010). Phrase-based statistical language generation using graphical models and active learning. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pages 1552-1561, Uppsala, Sweden. Association for Computational Linguistics.

Mairesse, F. and Walker, M. A. (2007). PERSONAGE: Personality generation for dialogue. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pages 496-503, Prague, Czech Republic. Association for Computational Linguistics.

Mairesse, F. and Walker, M. A. (2010). Towards personality-based user adaptation: Psychologically informed stylistic language generation. User Modeling and User-Adapted Interaction, 20(3):227-278.

Mann, W. C. and Thompson, S. A. (1988). Rhetorical structure theory: Toward a functional theory of text organization. Text, 8(3):243-281.

Mehl, M. R., Gosling, S. D., and Pennebaker, J. W. (2006). Personality in its natural habitat: Manifestations and implicit folk theories of personality in daily life. Journal of Personality and Social Psychology, 90(5):862-877.

Norman, W. T. (1963). Toward an adequate taxonomy of personality attributes: Replicated factor structure in peer nomination personality rating. Journal of Abnormal and Social Psychology, 66(6):574-583.

Paiva, D. S. and Evans, R. (2005). Empirically-based control of natural language generation. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pages 58-65, Ann Arbor, MI. Association for Computational Linguistics.

Pennebaker, J. W., Francis, M. E., and Booth, R. J. (2001). Linguistic inquiry and word count. Available from http://www.liwc.net/. Accessed on 11/24/2013.

Pennebaker, J. W. and King, L. A. (1999). Linguistic styles: Language use as an individual difference. Journal of Personality and Social Psychology, 77(6):1296-1312.

Pickering, M. and Garrod, S. (2004). Toward a mechanistic psychology of dialogue. Behavioral and Brain Sciences, 27(2):169-226.

Porayska-Pomsta, K. and Mellish, C. (2004). Modelling politeness in natural language generation. In Proceedings of the International Conference on Natural Language Generation (INLG), pages 141-150, Brockenhurst, UK. Springer.

Rabiner, L. R. (1989). Tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257-286.

Reeves, B. and Nass, C. (1996). The Media Equation. University of Chicago Press, Chicago, IL.

Reiter, E. and Dale, R. (2000). Building Natural Language Generation Systems. Cambridge University Press, Cambridge, UK.

Scherer, K. R. (1979). Personality markers in speech. In Scherer, K. R. and Giles, H., editors, Social Markers in Speech, pages 147-209. Cambridge University Press, Cambridge, UK.

Stent, A., Prasad, R., and Walker, M. A. (2004). Trainable sentence planning for complex information presentation in spoken dialog systems. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pages 79-86, Barcelona, Spain. Association for Computational Linguistics.

Stent, A., Walker, M., Whittaker, S., and Maloor, P. (2002). User-tailored generation for spoken dialogue: An experiment. In Proceedings of the International Conference on Spoken Language Processing (INTERSPEECH), pages 1281-1284, Denver, CO. International Speech Communication Association.

Traum, D., Fleischman, M., and Hovy, E. (2003). NL generation for virtual humans in a complex social environment. In Working Papers of the AAAI Spring Symposium on Natural Language Generation in Spoken and Written Dialogue, pages 151-158, Stanford, CA. AAAI Press.

Tsochantaridis, I., Hofmann, T., Joachims, T., and Altun, Y. (2004). Support vector learning for interdependent and structured output spaces. In Proceedings of the International Conference on Machine Learning (ICML), Banff, Canada. The International Machine Learning Society, Association for Computing Machinery.

Walker, M., Whittaker, S., Stent, A., Maloor, P., Moore, J. D., Johnston, M., and Vasireddy, G. (2002). Speech-Plans: Generating evaluative responses in spoken dialogue. In Proceedings of the International Conference on Natural Language Generation (INLG). Association for Computational Linguistics.

Walker, M., Whittaker, S., Stent, A., Maloor, P., Moore, J., Johnston, M., and Vasireddy, G. (2004). Generation and evaluation of user tailored responses in multimodal dialogue. Cognitive Science, 28(5):811-840.

Walker, M. A., Stent, A., Mairesse, F., and Prasad, R. (2007). Individual and domain adaptation in sentence planning for dialogue. Journal of Artificial Intelligence Research, 30(1):413-456.

Wang, N., Johnson, W. L., Mayer, R. E., Rizzo, P., Shaw, E., and Collins, H. (2008). The politeness effect: Pedagogical agents and learning outcomes. International Journal of Human-Computer Studies, 66(2):98-112.
