The Construction-Integration framework: a means to diminish bias in LSA-based call routing

Jorge-Botana, Guillermo; Olmos, Ricardo; Barroso, Alejandro

doi:10.1007/s10772-012-9129-5

Published: 01 February 2012

International Journal of Speech Technology
Volume 15, pages 151–164 (2012)

Abstract

Semantic technology is commonly used for two purposes in the field of IVR (Interactive Voice Response). The first is to correct the output of voice recognition devices based on coherence with a context. The second is to perform what is referred to as “call routing”, requiring technology that categorizes utterances and returns a list of the most credible routes. Our paper focuses on the latter, aiming to use the Latent Semantic Analysis (LSA henceforth) computational model (Deerwester et al. in J. Am. Soc. Inf. Sci. 41:391–407, 1990) together with the Construction-Integration model (C-I henceforth), a psycholinguistically motivated algorithm (Kintsch in Int. J. Psychol. 33(6):411–420, 1998), to interpret, manage and successfully route user requests in an efficient and reliable manner. By efficient we mean that training is unnecessary when the destination model is altered, and exhaustive labeling of all utterances is not required, concentrating instead only on some sample destinations. By reliable we mean that the construction-integration algorithm attenuates the risks from intra-destination variability and word saliency. Technical and theoretical aspects are discussed. In addition, some destination assignment methods are tested and debated.

Notes

Folding-In is specified later.
The matrix of contexts is generically known in Information Retrieval as the matrix of documents. To name the context window accurately, we prefer to call the contexts “utterances”.
Strictly speaking, in LSA training these labels do not need to be exhaustive, so they can originate from different means of classification or even different corpora. In extreme cases, a sufficiently large corpus might even be trained without labels. Testing this last hypothesis is one of the aims of the present study.
The space is set up according to the method followed by Cox and Shahshahani (2001), with a matrix built from terms and utterances, and not from terms and grouped categories, as in Chu-Carroll and Carpenter (1999). The rows of the occurrence matrix were terms (and also labels in the labeled condition) and the columns were utterances.
Besides the absence of labels, this decrease in the size of the matrix is due to single-term utterances (for example “credit”) being excluded from the training, which is undoubtedly a disadvantage of an LSA model without labels.
We chose such a dimensionalization based on the assumptions made in some previous studies. In those studies it has been suggested that the optimal number of dimensions for specific domain corpora does not have to be extremely low, sometimes even approaching the 300 dimensions recommended by Landauer and Dumais (1997) for general domain corpora (see Jorge-Botana et al. 2010b). Some of the most recent studies simply use 300 dimensions (Wild et al. 2011).
These models must cover all the functionality of the service.
Because call utterances are shorter and simpler than propositions in colloquial language, the algorithm used is not exactly the original Construction-Integration algorithm. The integration part proposed by Kintsch is a spreading activation algorithm that iterates until the net is stable (that is, until the change in mean activation between cycles falls below a parameterized value), whereas our algorithm is a “one-shot” mechanism: the activation of each node is calculated from the connections it receives. Another difference from the C-I algorithm as proposed by Kintsch is that we consider only words, not propositions or situations. The original C-I is more complete and fine-grained, but our mechanism is sufficient for our purposes and may be programmed more flexibly, because an OOP (Object Oriented Programming) paradigm has been used, with classes such as net, layer, node, connection, etc., instead of the iterative vector-by-matrix multiplication of the original (see Kintsch and Welsch 1991 for details of the original conception).
The cosines are calculated using the previously trained semantic space; in other words, each of the terms to be compared is represented by a vector in this space. Any term vector can then be compared with any other term vector using the cosine.
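
The last two notes can be illustrated with a minimal sketch: term vectors are compared by cosine, and a “one-shot” pass computes each node's activation from the connections it receives. Everything specific here is an assumption made for illustration. The three-dimensional vectors are invented toy values rather than output of a trained LSA space, the names `TERM_VECTORS`, `cosine` and `one_shot_activation` are hypothetical, and summing the incoming cosines is only one plausible reading of “calculated based on the connections received”; the notes do not give the exact formula.

```python
import math

# Toy term vectors standing in for a trained semantic space.
# The values are invented for illustration; they are not from the paper.
TERM_VECTORS = {
    "balance": [0.9, 0.1, 0.0],
    "credit": [0.8, 0.2, 0.1],
    "roaming": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def one_shot_activation(terms, vectors):
    """Single pass, no iteration: each node's activation is the sum of the
    cosines (connection weights) it receives from the other nodes.
    Summation is an assumption made for this sketch."""
    return {
        term: sum(
            cosine(vectors[term], vectors[other])
            for other in terms
            if other != term
        )
        for term in terms
    }


activation = one_shot_activation(list(TERM_VECTORS), TERM_VECTORS)
for term, value in sorted(activation.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {value:.3f}")
```

In this toy net the terms that are mutually coherent (“balance”, “credit”) reinforce each other, while the weakly connected term (“roaming”) receives little activation. An iterative variant would repeat the update until the change in mean activation falls below a threshold.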

References

Bellegarda, J. R. (2000). Exploiting latent semantic information in statistical language modeling. Proceedings of the IEEE, 88(8), 1279–1296.
Chu-Carroll, J., & Carpenter, B. (1999). Vector-based natural language call routing. Computational Linguistics, 25(3), 361–388.
Cox, S., & Shahshahani, B. (2001). A comparison of some different techniques for vector based call-routing. In Proceedings of 7th European conf. on speech communication and technology, Aalborg.
Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41, 391–407.
Foltz, P. W. (1996). Latent semantic analysis for text-based research. Behavior Research Methods, Instruments, & Computers, 28(2), 197–202.
Haley, D. T., Thomas, P., De Roeck, A., & Petre, M. (2005). A research taxonomy for latent semantic analysis based educational applications. Technical Report no. 2005/09, Open University.
Haley, D. T., Thomas, P., Petre, M., & De Roeck, A. (2007). Seeing the whole picture: comparing computer assisted assessment systems using LSA-based systems as an example. Technical Report no. 2007/07, Open University.
Jones, M. P., & Martin, J. H. (1997). Contextual spelling correction using latent semantic analysis. In Proceedings of the fifth conference on applied natural language processing (pp. 163–176).
Jorge-Botana, G., Olmos, R., & León, J. A. (2009). Using LSA and the predication algorithm to improve extraction of meanings from a diagnostic corpus. Spanish Journal of Psychology, 12(2), 424–440.
Jorge-Botana, G., León, J. A., Olmos, R., & Escudero, I. (2010a). Latent semantic analysis parameters for essay evaluation using small-scale corpora. Journal of Quantitative Linguistics, 17(1), 1–29.
Jorge-Botana, G., León, J. A., Olmos, R., & Hassan-Montero, Y. (2010b). Visualizing polysemy using LSA and the predication algorithm. Journal of the American Society for Information Science and Technology, 61(8), 1706–1724.
Jorge-Botana, G., León, J. A., Olmos, R., & Escudero, I. (2011). The representation of polysemy through vectors: some building blocks for constructing models and applications with LSA. International Journal of Continuing Engineering Education and Life-Long Learning, 21(4).
Kintsch, W. (1998). The representation of knowledge in minds and machines. International Journal of Psychology, 33(6), 411–420.
Kintsch, W. (2000). Metaphor comprehension: a computational theory. Psychonomic Bulletin & Review, 7, 257–266.
Kintsch, W. (2001). Predication. Cognitive Science, 25, 173–202.
Kintsch, W. (2002). On the notions of theme and topic in psychological process models of text comprehension. In M. Louwerse & W. van Peer (Eds.), Thematics: interdisciplinary studies (pp. 157–170). Amsterdam: Benjamins.
Kintsch, W. (2007). Meaning in context. In T. K. Landauer, D. McNamara, S. Dennis, & W. Kintsch (Eds.), Handbook of latent semantic analysis (pp. 89–105). Mahwah: Erlbaum.
Kintsch, W. (2008). Symbol systems and perceptual representations. In M. de Vega, A. M. Glenberg, & A. C. Graesser (Eds.), Symbols and embodiment: debates on meaning and cognition (pp. 145–164). Oxford: Oxford University Press.
Kintsch, W., & Bowles, A. (2002). Metaphor comprehension: what makes a metaphor difficult to understand? Metaphor and Symbol, 17, 249–262.
Kintsch, W., & Welsch, D. (1991). The construction-integration model: a framework for studying memory for text. In W. E. Hockley & S. Lewandowsky (Eds.), Relating theory and data: essays on human memory in honor of Bennet B. Murdock (pp. 367–385). Hillsdale: Erlbaum.
Kintsch, W., Patel, V., & Ericsson, K. A. (1999). The role of long-term working memory in text comprehension. Psychologia, 42, 186–198.
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato’s problem: the latent semantic analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review, 104, 211–240.
Li, L., & Chou, W. (2002). Improving latent semantic indexing based classifier with information gain. In Proceedings of the 7th international conference on spoken language processing, ICSLP-2002, Denver, Colorado, USA, September 16–20, 2002 (pp. 1141–1144).
Lim, B. P., Ma, B., & Li, H. (2005). Using semantic context to improve voice keyword mining. In Proceedings of the international conference on Chinese computing (ICCC 2005), Singapore, 21–23 March 2005.
Louwerse, M. M. (2008). Embodied representations are encoded in language. Psychonomic Bulletin & Review, 15, 838–844.
Manning, C. D., & Schütze, H. (1999). Foundations of statistical natural language processing. Cambridge: MIT Press.
McCauley, L. (unpublished). Using latent semantic analysis to aid speech recognition and understanding. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.10.3694.
Nakov, P., Popova, A., & Mateev, P. (2001). Weight functions impact on LSA performance. In Proceedings of the recent advances in natural language processing conference, RANLP 2001, Tzigov Chark, Bulgaria.
Olmos, R., León, J. A., Jorge-Botana, G., & Escudero, I. (2009). New algorithms assessing short summaries in expository texts using latent semantic analysis. Behavior Research Methods, 41(3), 944–950.
Quesada, J. (2008). Latent problem solving analysis (LPSA): a computational theory of representation in complex, dynamic problem solving tasks. PhD thesis, Psychology, University of Granada.
Salton, G., & McGill, M. J. (1983). Introduction to modern information retrieval. New York: McGraw-Hill.
Serafin, R., & Di Eugenio, B. (2004). FLSA: extending latent semantic analysis with features for dialogue act classification. In Proceedings of ACL04, 42nd annual meeting of the association for computational linguistics, Barcelona, Spain, July.
Shi, Y. (2008). An investigation of linguistic information for speech recognition error detection. PhD thesis, University of Maryland, Baltimore County, Baltimore.
Tyson, N., & Matula, V. C. (2004). Improved LSI-based natural language call routing using speech recognition confidence scores. In Proceedings of EMNLP.
Wild, F., Haley, D., & Bülow, K. (2011). Using latent-semantic analysis and network analysis for monitoring conceptual development. Journal for Language Technology and Computational Linguistics, 26(1), 9–21.

Author information

Authors and Affiliations

Departamento de Psicología Evolutiva y de la Educación, Facultad de Psicología, Universidad Nacional de Educación a Distancia (UNED), C/Juan del Rosal, 10 (Ciudad Universitaria), Madrid, Spain
Guillermo Jorge-Botana
Departamento de Metodología de las Ciencias del Comportamiento, Facultad de Psicología, Universidad Autónoma de Madrid, Campus de Cantoblanco, 28049, Madrid, Spain
Ricardo Olmos
PlusNet Solutions, C/Albarracín, 58. Local 12, 28037, Madrid, Spain
Alejandro Barroso

Corresponding author

Correspondence to Guillermo Jorge-Botana.

Appendix

Destinations: Activate or deactivate a line, Check balance, Voicemail, Coverage, Contract details, Internet tariff, Businesses, Phonebill, Lost or stolen, General information, Internet, Logistics, Migration, Games and Software, Special offers, Call options, Agent, Switching service provider, Top-ups, Complaint, Roaming, Credit Balance, SMS and MMS, Call rates, TV, Phones, Shops, Languages, Incidents.

About this article

Cite this article

Jorge-Botana, G., Olmos, R. & Barroso, A. The Construction-Integration framework: a means to diminish bias in LSA-based call routing. Int J Speech Technol 15, 151–164 (2012). https://doi.org/10.1007/s10772-012-9129-5

Received: 16 September 2011
Accepted: 16 January 2012
Published: 01 February 2012
Issue Date: June 2012
DOI: https://doi.org/10.1007/s10772-012-9129-5
