Abstract
Text mining (TM) and computational linguistics (CL) are computationally intensive fields in which a growing number of tools are available to study large text corpora and exploit them for various purposes. In this chapter we address the problem of building conversational agents, or chatbots, from corpora for domain-specific educational purposes. After addressing some linguistic issues relevant to developing chatbot tools from corpora, we present a methodology for systematically analyzing large text corpora about a limited knowledge domain. Taking the Artificial Intelligence Markup Language (AIML) as the “assembly language” of artificial intelligence conversational agents, we present a way of using text corpora as a seed from which a set of “source files” can be derived. More specifically, we illustrate how to use corpus data to extract relevant keywords, multiword expressions and text patterns, and to build a glossary, in order to construct an AIML knowledge base that can later be used to build interactive conversational systems. The approach we propose does not require deep language-understanding techniques for the analysis of text.
As a case study, we show how to build, from a children's story, the knowledge base of an English conversational agent for educational purposes that can answer questions about the characters, facts and episodes of the story. A discussion of the main linguistic and methodological issues, and of possible further improvements, is offered in the final part of the chapter.
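To make the idea of deriving AIML “source files” from corpus-derived material concrete, the following is a minimal sketch, not the chapter's actual pipeline. It assumes a glossary has already been extracted as a term-to-definition mapping and emits one AIML category per question stem. The function names, the question stems and the sample glossary entry are all hypothetical and serve only as illustration.

```python
from xml.sax.saxutils import escape

# Hypothetical question stems; a real system would derive these from text patterns.
DEFAULT_STEMS = ("WHAT IS", "TELL ME ABOUT")


def normalize_pattern(text: str) -> str:
    """Upper-case the text and replace punctuation with spaces, as AIML patterns expect."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text)
    return " ".join(kept.upper().split())


def glossary_to_aiml(glossary: dict[str, str], stems=DEFAULT_STEMS) -> str:
    """Build an AIML document with one category per (stem, term) pair."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', '<aiml version="1.0.1">']
    for term, definition in glossary.items():
        for stem in stems:
            pattern = normalize_pattern(f"{stem} {term}")
            lines.append("<category>")
            lines.append(f"<pattern>{pattern}</pattern>")
            # Escape &, < and > so the definition is valid XML content.
            lines.append(f"<template>{escape(definition)}</template>")
            lines.append("</category>")
    lines.append("</aiml>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative entry only; not taken from the chapter's case study.
    sample = {"wolf": "A wild animal that lives in the forest."}
    print(glossary_to_aiml(sample))
```

Exact-match patterns such as these are the simplest case. AIML wildcards (`*`) could widen coverage, at the cost of more ambiguous matching.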
Copyright information
© 2013 Springer-Verlag GmbH Berlin Heidelberg
About this chapter
Cite this chapter
De Gasperis, G., Chiari, I., Florio, N. (2013). AIML Knowledge Base Construction from Text Corpora. In: Yang, XS. (eds) Artificial Intelligence, Evolutionary Computing and Metaheuristics. Studies in Computational Intelligence, vol 427. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29694-9_12
DOI: https://doi.org/10.1007/978-3-642-29694-9_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29693-2
Online ISBN: 978-3-642-29694-9
eBook Packages: Engineering, Engineering (R0)