ABSTRACT
This paper presents an SVM (Support Vector Machine) classification system that divides contact-center call transcripts into seven sections: "Greeting", "Question", "Refine", "Research", "Resolution", "Closing" and "Out-of-topic". This call section segmentation is useful for improving search and retrieval functions and for providing more detailed statistics on calls. We use an off-the-shelf automatic speech recognition (ASR) system to generate call transcripts from recorded calls between customers and service representatives.
We first classify each individual utterance into a call section by applying the SVM classifier, and then merge adjacent utterances classified into the same call section. We evaluate the proposed system on 100 automatically transcribed calls. 10-fold cross-validation shows 87.2% classification accuracy. We also compare the proposed algorithm with two other approaches: a method that always predicts the most frequent section, and a maximum entropy-based segmentation. The evaluation shows that our system's accuracy is 12% higher than the first baseline and 6% higher than the second.
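The merge step and the most-frequent-section baseline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `merge_adjacent` and `most_frequent_baseline` are hypothetical, and the per-utterance labels are assumed to come from some already-trained classifier (here they are simply written out by hand).

```python
from collections import Counter
from itertools import groupby


def merge_adjacent(labels):
    """Collapse runs of identical per-utterance labels into segments.

    Returns a list of (section, first_index, last_index) tuples,
    where both indices are inclusive utterance positions.
    """
    segments = []
    start = 0
    for section, run in groupby(labels):
        length = sum(1 for _ in run)
        segments.append((section, start, start + length - 1))
        start += length
    return segments


def most_frequent_baseline(train_labels, n_utterances):
    """Hypothetical baseline: predict the most frequent training label everywhere."""
    top_label = Counter(train_labels).most_common(1)[0][0]
    return [top_label] * n_utterances


# Hand-written labels standing in for classifier output (an assumption).
predicted = [
    "Greeting", "Greeting",
    "Question", "Question", "Question",
    "Research",
    "Resolution",
    "Closing",
]

for section, first, last in merge_adjacent(predicted):
    print(f"{section}: utterances {first}-{last}")
```

Because merging only groups consecutive identical labels, a single misclassified utterance in the middle of a section splits that section into three segments; any smoothing of such cases would be an additional step not shown here.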
Index Terms
- Automatic call section segmentation for contact-center calls