A Data-Driven Design Framework for Customer Service Chatbot

Hwang, Shinhee; Kim, Beomjun; Lee, Keeheon

doi:10.1007/978-3-030-23570-3_17

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11583)

Included in the following conference series:

International Conference on Human-Computer Interaction

Abstract

User experience in customer service is critical because customer service is often the first thing a customer asks of a service, and a service that fails to respond satisfactorily can suffer serious damage. Although businesses adopt chatbots for better responsiveness, customization is still necessary to deliver satisfying customer service. For customization, a designer performs qualitative research such as surveys, self-reports, interviews, and user observation to extract key characteristics and build personas from them. However, small sample sizes and the cognitive limitations of a researcher mean that more data are needed to model personas well. Therefore, in this study, we introduce a data-driven framework for designing a customer service chatbot that uses past customer behavior data from clickstreams and from a customer service chatbot. We apply this framework to a cartoon streaming service, Laftel. As a result, we generate three types of customer service chatbots for three personas: explorer, soft user, and hard user. In the future, we will validate our result by conducting a field experiment.

1 Introduction

It is important for a service provider to increase the satisfaction of all users, especially when a user runs into a problem while using the service. Addressing users’ problems and providing more customer service (CS) than expected improves user satisfaction and the competitiveness of the company [18]. Chatbots can answer customers’ inquiries cheaply, quickly, and in real time. In CS, chatbots are mainly used to answer repeated questions, which frees CS personnel to give higher-value answers to customers and makes the operation more practical and cost-effective [8]. Thus, more and more businesses are choosing chatbots for customer service [19].

In UX design, there are attempts to improve user satisfaction and service completeness by providing services that accommodate multimodal users [22]. Chatbots have also evolved to provide responses optimized for the user [24]. However, users are too varied and change too quickly for the currently popular research methods in customer service design to keep up with their problems and extract the key features that may improve user experience. These methods are surveys, self-reports, interviews, and user observation. They usually take a lot of time, effort, and cost. At the same time, the amount of data collected is restricted by small sample sizes and researchers’ cognitive limitations. It is often not enough to model the actual behavior of responsive users in a way that can be used for customer service customization.

This study is based on the case of Laftel, a cartoon streaming service [11]. Laftel recommends animation and webtoons based on user preference. The service provides interest-based content that requires little expertise to consume. As a result, its churn rate is higher than that of services providing professional content, so its user base changes faster, and its main users are in their teens and twenties. In addition, the service provider is a startup that needs an efficient yet effective approach to customer service. For these reasons, we chose Laftel as our case.

In this study, we introduce a data-driven framework for designing a customer service chatbot that uses past customer behavior data from clickstreams and from a customer service chatbot. We apply cluster analysis to user data and segment users to build personas. To create a list of CS services, we process the company’s CS data with natural language processing (NLP) to derive frequently used words and the words similar to them. As a result, we generate a type of customer service chatbot for each persona.

This study suggests a way to provide corporate customer service efficiently and effectively, and we expect it to contribute to improving corporate value.

2 Literature Review

2.1 Data-Driven Personas

User experience (UX) design aims to elicit positive experiences through designs customized to users. In UX design, personas are often used to understand users. A persona categorizes users based on their behavior, goals, needs, and context. Namely, a persona is an artificial character that represents one of the user types in the population of potential target users. Cooper first applied the persona concept to design and user experience practice [7]. He considered a persona an archetypal user. Pruitt and Grudin argued that personas help us understand users and their needs because we can perceive a user closely, as a person [21]. Norman argued that, in UX design, one can design the experience a person will have by fully empathizing with that person through a persona [17].

Traditionally, in user-oriented design, personas are built from data collected through surveys, self-reports, interviews, and user observation. But this process generates a limited amount of data relative to its costs in labor, budget, and time. In addition, there is a gap between users’ actual behavior and users’ perception of their own behavior. Furthermore, traditional user research methods are insufficient to support flexible services in industries that are changing fast because of the fourth industrial revolution and responsive users.

2.2 Telemetry and Click Stream

A clickstream is the digital path of a user through a website. A series of web pages requested by a visitor in a single visit is referred to as a session. Clickstream data includes click path information that shows the goal of service use, along with associated information such as timestamp, IP address, URL, status, number of transferred bytes, referrer, user agent, and cookie data, recorded in real time. Thus, collecting and analyzing clickstream data is an effective and efficient way to obtain user behavior data compared to traditional methods.
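As an illustration of the session notion above, the following minimal Python sketch groups raw page requests into sessions. The record layout (visitor id, timestamp in seconds, URL) and the 30-minute inactivity timeout are assumptions made for this example; the chapter does not specify how visits are delimited.

```python
from collections import defaultdict

SESSION_GAP_SECONDS = 30 * 60  # assumed inactivity timeout; not specified in the chapter


def sessionize(records):
    """Group (visitor_id, timestamp_seconds, url) records into per-visit sessions."""
    by_visitor = defaultdict(list)
    for visitor_id, ts, url in records:
        by_visitor[visitor_id].append((ts, url))

    sessions = []
    for visitor_id, hits in by_visitor.items():
        hits.sort()  # order each visitor's requests by time
        current = [hits[0][1]]
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > SESSION_GAP_SECONDS:
                # A long pause closes the current session and starts a new one.
                sessions.append((visitor_id, current))
                current = []
            current.append(url)
        sessions.append((visitor_id, current))
    return sessions


# Hypothetical records for two visitors.
records = [
    ("u1", 0, "/home"),
    ("u1", 60, "/search"),
    ("u1", 4000, "/home"),  # more than 30 minutes after the previous hit
    ("u2", 10, "/about"),
]
sessions = sessionize(records)
```

Each resulting session is the ordered list of pages requested in one visit, which is the unit the later clustering steps would operate on.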

We can predict a user’s needs and behavior by analyzing clickstream data. In UX research, clickstream data is used to understand the users of a website and improve the quality of service [4]. Singh and Cancel used clickstream data to show that users of a website have different needs for services and functions [26]. They also showed that service outcomes improve when web designs and product offerings are personalized based on a user’s path. Mobasher collected and analyzed clickstream data to design a personalized web page [16]. Zhang, Brown, and Shankar built personas from clickstream data and UX design methodology and showed that these personas reflect the actual behavior of users [28].

2.3 Chatbot

A chatbot is a computer program that combines artificial intelligence with messenger functions to perform certain tasks by communicating with humans through text messages. Gartner predicted that by 2021, more than 50% of companies would be managing AI-based chatbots within their apps [14]. Chatbots are suitable for answering simple questions and can answer in real time. Therefore, using chatbots in CS can reduce labor costs and improve users’ satisfaction with CS, because CS staff can be reassigned to more productive work [9].

There are two types of chatbots: closed and open. Closed chatbots are mainly used when functions are limited or when there are not many datasets. This type of chatbot restricts the user’s questions so that answers are more accurate, but it gives the user less of a sense of interaction. Open chatbots provide a relatively comprehensive service and are used when there are many datasets. This type of chatbot gives the user a high degree of freedom in asking questions, but the accuracy of its answer to a specific question is lower. However, it has the advantage of giving users a sense of interacting with the service. Recently, mixed chatbots that borrow from each form to combine the advantages of the closed and open types have become common (Tables 1 and 2).

Table 1. Comparison of closed chatbot and open chatbot

Table 2. Chatbot by input method

In UX design, there are attempts to improve user satisfaction and service completeness by providing services that accommodate multimodal users [22]. Chatbots have also evolved to provide responses optimized for the user [24]. Makar and Allen studied algorithms that deliver different sentences to each persona in a chatbot service [1, 15]. Liu classified user types based on users’ postings and studied a chatbot that provides different sentences for each user [13].

3 A Data-Driven Design Framework for Customer Service Chatbot

First, we collect clickstreams as data on non-verbal user behavior and cluster them into several groups that segment users. Second, we collect users’ conversations with CS chatbots as data on verbal user behavior and classify them into a fixed number of labels that follow a predefined category system. In this case, the system consists of the services that a business provides. Lastly, each user group is defined by a combination of services, so the relationship between user groups and services is one-to-many (Fig. 1).

3.1 Identify User Groups (Personas)

We build personas using hierarchical clustering. Hierarchical clustering is a method of grouping targets based on their similarity, measured here by Euclidean distance, and is especially useful when the total number of clusters is unknown. The process of constructing personas with hierarchical clustering includes the following steps.

3.2 Identify Service Types

We analyze users’ conversations with the CS chatbot in Laftel to define the types of service provided by the CS chatbot. In this study, the 20 most frequent nouns are defined as ‘key words’, and for each key word the 10 words with the closest word vector values are defined as ‘related words’. The procedure for defining the service types is as follows. After completing step 1 and step 2, compile the list shown in Table 3. The main contents of the table are the key words, the number of times each key word is used, and the related words of each key word.
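The key-word selection described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the conversations have already been reduced to lists of nouns by a morphological analyzer; the function name and the sample data are hypothetical.

```python
from collections import Counter


def top_key_words(noun_lists, n=20):
    """Return the n most frequent nouns across all conversations with their counts."""
    counts = Counter()
    for nouns in noun_lists:
        counts.update(nouns)
    return counts.most_common(n)


# Hypothetical, already-tokenized conversations (nouns only).
conversations = [
    ["payment", "refund"],
    ["payment", "point"],
    ["refund", "payment", "monthly"],
]
key_words = top_key_words(conversations, n=2)
```

With `n=20`, the same call would yield the kind of (key word, count) pairs that the text lists for Table 3.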

Table 3. Examples of key words and related words

3.3 Distance Between Clickstreams

Users reached Laftel through ten routes. We extracted twenty groups of clickstream data in total: the clickstreams of new visitors and of returning visitors for each of the ten routes (Table 4). However, eight groups with fewer than fifty page views (PVs) were excluded because they did not contain enough data to analyze. The remaining 12 groups of clickstreams were labeled as shown in Table 7.

Table 4. Example of correcting related words by word

Second, make a square matrix that holds the number of related words shared between each pair of key words. Table b is an example of the square matrix for Table a. In Table 3, word1 and word2 have two identical related words, ‘Inquiry’ and ‘(Monthly) fee’, so the value of (2,1) is 2. Finally, the square matrix is classified into n groups by hierarchical clustering (H-clustering), and a representative keyword for each group is selected as shown in Table 5.
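The square matrix of shared related words can be illustrated with the short Python sketch below. Only the two shared words, ‘Inquiry’ and ‘(Monthly) fee’, come from the example in the text; the remaining related words and the function name are made up for illustration.

```python
def shared_related_matrix(related):
    """Build a square matrix whose (i, j) entry counts related words shared by key words i and j."""
    words = list(related)
    sets = [set(related[w]) for w in words]
    matrix = [[len(a & b) for b in sets] for a in sets]
    return words, matrix


# Hypothetical related-word lists; only the two shared words follow the text's example.
related = {
    "word1": ["Inquiry", "(Monthly) fee", "Card"],
    "word2": ["Inquiry", "(Monthly) fee", "Period"],
    "word3": ["Playback", "Video", "Error"],
}
words, matrix = shared_related_matrix(related)
```

Here the entry for word2 and word1 is 2, matching the example, and the matrix is symmetric. In this sketch each diagonal entry simply equals the size of that key word's own related-word list.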

Table 5. Example of selecting representative keyword

3.4 Matching Service List with Persona

The service types defined in Sect. 3.2 and the personas defined in Sect. 3.1 are matched as shown in Table 6.

Table 6. Example of service type matching with persona

4 Result

4.1 Collecting Clickstream Data

We used Beusable to track users visiting Laftel. Beusable provides basic statistics such as page views, average residence time, dropout rate, device statistics, monitor resolution distribution, and access routes, as well as clickstream data by user type (new visit and re-visit). We concentrated on three weeks of access route and clickstream data. During the three weeks, 30,000 page views and 15,000 unique views were collected.

4.2 Calculating the Distances Between Clickstream

Users reached Laftel through ten routes. We extracted twenty groups of clickstream data in total: the clickstreams of new visitors and of returning visitors for each of the ten routes. However, eight groups with fewer than fifty page views (PVs) were excluded because they did not contain enough data to analyze. The remaining 12 groups of clickstreams were labeled as shown in Table 7.
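The exclusion rule can be sketched in Python as follows. The threshold of fifty page views comes from the text; the record layout, the route names, and the order in which the labels S1, S2, ... are assigned are illustrative assumptions.

```python
MIN_PAGE_VIEWS = 50  # groups with fewer page views are excluded, as stated in the text


def keep_analyzable(groups, min_pv=MIN_PAGE_VIEWS):
    """Drop groups below the page-view threshold and label the rest S1, S2, ..."""
    kept = [g for g in groups if g["pv"] >= min_pv]
    return {f"S{i}": g for i, g in enumerate(kept, start=1)}


# Hypothetical groups: one per (access route, user type) pair.
groups = [
    {"route": "route_a", "user_type": "new", "pv": 1200},
    {"route": "route_a", "user_type": "return", "pv": 49},
    {"route": "route_b", "user_type": "new", "pv": 50},
]
labeled = keep_analyzable(groups)
```

Applied to the twenty route and user-type groups, a filter of this kind would leave the twelve labeled groups reported in the text.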

Table 7. Page View for each type of access routes and user types

4.3 Selecting Representative Clickstreams Using H-Clustering

We computed the distances between the twelve clickstreams using Euclidean distance and generated an n-by-n matrix, shown in Table 8. The element in the i-th row and j-th column represents the distance between the i-th clickstream and the j-th clickstream.
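A pairwise Euclidean distance matrix of this kind can be computed as in the sketch below. The chapter does not say how a clickstream is turned into numbers, so the example assumes each clickstream is already summarized as a fixed-length numeric feature vector; the vectors shown are hypothetical.

```python
import math


def distance_matrix(vectors):
    """Return the n-by-n matrix of Euclidean distances between equal-length feature vectors."""
    return [[math.dist(a, b) for b in vectors] for a in vectors]


# Hypothetical 2-D feature vectors standing in for three clickstreams.
vectors = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
matrix = distance_matrix(vectors)
```

The result is symmetric with zeros on the diagonal, so only the upper or lower triangle carries information.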

Table 8. The distance between clickstreams

Figure 2 shows the result of hierarchical clustering of the matrix in Table 8. We found six clusters of clickstreams from the result. We regarded the clickstream with the highest PV in a cluster as the representative of the cluster. The access routes and user types of the selected representatives are S3 (Direct-New), S4 (Direct-Return), S7 (Search-New), S8 (Search-Return), S10 (about.laftel.net-New), and S11 (msn.com/sprtan/ntp-Return).
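The two steps above, clustering from a distance matrix and picking the highest-PV member of each cluster, can be sketched in plain Python. The chapter does not state which linkage criterion was used, so average linkage is an assumption here, and the distances and page views are hypothetical.

```python
def agglomerate(dist, k):
    """Merge the two closest clusters (average linkage, assumed) until k clusters remain."""
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Average pairwise distance between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        pairs = [(linkage(clusters[x], clusters[y]), x, y)
                 for x in range(len(clusters)) for y in range(x + 1, len(clusters))]
        _, x, y = min(pairs)
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def representatives(clusters, page_views):
    """Pick, for each cluster, the member index with the highest page-view count."""
    return [max(c, key=lambda i: page_views[i]) for c in clusters]


# Hypothetical distances and page views for four clickstreams.
dist = [
    [0.0, 1.0, 9.0, 8.0],
    [1.0, 0.0, 8.5, 9.0],
    [9.0, 8.5, 0.0, 2.0],
    [8.0, 9.0, 2.0, 0.0],
]
page_views = [120, 300, 80, 60]
clusters = agglomerate(dist, k=2)
reps = representatives(clusters, page_views)
```

In practice the number of clusters would be read off the dendrogram rather than fixed in advance; passing `k` directly keeps the sketch short.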

4.4 Mapping Clickstreams to Common Workflows

We mapped the coordinates where a certain number of users of the selected clickstreams stayed to the functional items of the Laftel website. We also recorded the time of stay at each coordinate. We then compared the trends of the six clustered clickstreams with each other. We discovered three personas, service explorers, soft users, and hard users, as shown in Table 9.

Table 9. The data-driven personas

Explorers.

These people visited the website through the corporate introduction page. They traversed the web pages as if exploring the service. They checked whether the animations and cartoons they were interested in were provided. They also tried to find out whether the animations and cartoons could be purchased.

Soft Users.

These people came to the web page through a search engine. They tended to consume the animation and webtoons they had already been consuming. They also searched for other content that could be consumed along with their current animation and webtoons. Their visits tended to be short. They spent 50% of the visit consuming content and the rest watching commercials.

Hard Users.

These people visited the web page directly through its URL. They visited to see content they had consumed before. They tended to stay on the web page for a relatively long time. They spent 70% of the time on content consumption and the rest on commercials and search.

4.5 Extract Key Words and Related Words

Laftel is a Korean-language service. We used KoNLPy and Kkma, which are Korean natural language processing tools, to find word frequencies, and Word2Vec, a tool that assigns word vector values, to measure the similarity between words. As a result of the NLP analysis, the 20 most frequently used words were taken as ‘key words’, such as Payment (1910), Refund (1103), Point (792), Monthly (714), Purchase (637), Possibility (561), work (539), possibility (405), playback (325), video (278), animation (271), free (of charge) (266), Cancellation (256), Advertisement (243), Viewing (237), Confirm (234), Authentication (228), Cancel (227) and Publication right (225). Table 10 shows the related words for each key word.
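To make the notion of related words concrete, the sketch below ranks words by the cosine similarity of their vectors to a key word's vector. Cosine similarity is the usual measure for Word2Vec embeddings, but the chapter does not name the measure, so it is an assumption here. The tiny 2-D vectors are hypothetical stand-ins for trained embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def related_words(key_word, vectors, topn=10):
    """Return the topn words whose vectors are most similar to key_word's vector."""
    target = vectors[key_word]
    scored = [(cosine(target, vec), word) for word, vec in vectors.items() if word != key_word]
    scored.sort(reverse=True)
    return [word for _, word in scored[:topn]]


# Hypothetical 2-D vectors; real Word2Vec embeddings have many more dimensions.
vectors = {
    "payment": (1.0, 0.0),
    "refund": (0.9, 0.1),
    "video": (0.0, 1.0),
    "monthly": (0.7, 0.7),
}
top2 = related_words("payment", vectors, topn=2)
```

With `topn=10` and trained embeddings, the same ranking would give the ten related words per key word that the framework calls for.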

Table 10. The key words and related words in CS List

4.6 Service Classification and Typing

We grouped the key words by their similarity to each other and defined the service types. Table 11 shows the similarity between words. Each value is the number of related words shared by the key word in row x and the key word in column y.
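Hierarchical clustering normally works on distances, while the values described above are similarities (more shared words means closer). One possible conversion, shown below, subtracts each count from the largest off-diagonal count. This particular conversion is an assumption for illustration and is not stated in the chapter; the counts are hypothetical.

```python
def similarity_to_distance(sim):
    """Turn shared-word counts into dissimilarities: more shared words means a smaller distance."""
    n = len(sim)
    top = max(sim[i][j] for i in range(n) for j in range(n) if i != j)
    return [[0 if i == j else top - sim[i][j] for j in range(n)] for i in range(n)]


# Hypothetical shared-related-word counts for three key words.
sim = [
    [10, 2, 0],
    [2, 10, 1],
    [0, 1, 10],
]
dist = similarity_to_distance(sim)
```

The resulting matrix can then be fed to any distance-based hierarchical clustering routine.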

Table 11. Similarity between keywords

The result of H-clustering the table is shown in Fig. 3. Based on the results, the key words of each service type were selected as shown in Table 12. There are two major categories, ‘Content’ and ‘Account’. Each major category has three service types, for six service types in total: ‘Content advertisement’, ‘Content consumption’, ‘Content etc.’, ‘Account membership’, ‘Account authentication’, and ‘Account benefit’. The key words of the first service type, ‘Content advertisement’, are ‘Advertisement’ and ‘Confirm’. The key words of the second service type, ‘Content consumption’, are ‘Animation’, ‘Free (of charge)’, ‘Cancellation’, ‘Viewing’, ‘Authentication’, and ‘Cancel’. The key word of the third service type, ‘Content etc.’, is ‘Publication right’. The key words of the fourth service type, ‘Account membership’, are ‘Payment’, ‘Refund’, ‘Monthly fee’, ‘Purchase’, ‘Inquiry’, and ‘Consumption’. The key words of the fifth service type, ‘Account authentication’, are ‘Possibility’ and ‘Video’. The key words of the last service type, ‘Account benefit’, are ‘Point’ and ‘Work’.

Table 12. The list of service type classifications and representative keywords selected

4.7 Matching Service List to Persona

Table 13 shows the service types and key words classified according to the needs of each persona. The explorer, a new user of the Laftel service, is matched with ‘Content consumption’, the service related to information about the content Laftel provides, and ‘Account membership’, the service for membership information. The soft user, an existing user with relatively low service utilization, is matched with ‘Content consumption’, ‘Content advertisement’, ‘Account membership’, and ‘Account authentication’, the authentication-related service required at the initial stage of an account. The hard user, who uses the service frequently and for a long time, is matched with ‘Content consumption’, ‘Account membership’, and ‘Account benefit’, an additional reward service for each account.
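The one-to-many relationship between personas and service types can be written down as a simple lookup table. The sketch below transcribes the matching described in this section; the function names and the idea of using the mapping to build a chatbot menu are illustrative assumptions, not the authors' implementation.

```python
# Service types per persona, transcribed from the matching described in the text.
PERSONA_SERVICES = {
    "explorer": ["Content consumption", "Account membership"],
    "soft user": ["Content consumption", "Content advertisement",
                  "Account membership", "Account authentication"],
    "hard user": ["Content consumption", "Account membership", "Account benefit"],
}


def chatbot_menu(persona):
    """Return the service types a persona-specific chatbot would offer (hypothetical helper)."""
    return PERSONA_SERVICES[persona]


def personas_for(service_type):
    """Invert the mapping: which personas are matched with a given service type."""
    return [p for p, services in PERSONA_SERVICES.items() if service_type in services]


explorer_menu = chatbot_menu("explorer")
benefit_personas = personas_for("Account benefit")
```

Inverting the mapping shows that ‘Content consumption’ and ‘Account membership’ are shared by all three personas, while the remaining service types distinguish them.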

Table 13. Matching service type by persona

5 Conclusion

In this study, we introduced a data-driven framework for designing a customer service chatbot. First, we used Beusable to collect clickstream data from Laftel and used hierarchical clustering to generate three personas: explorer, soft user, and hard user. Explorers visit the website to see whether it offers animations and webtoons that interest them and whether these can be purchased. Soft users stay on the website for a short time; 50% of that time is used for content consumption and the rest for commercials. Hard users spend a long time on the website; 70% of that time is used for content consumption and the rest for commercials and content search. Second, we defined the CS service types by applying NLP to the company’s CS data. We extracted frequently used key words and, for each key word, the related words closest to it in vector distance. We defined the closeness of two key words by the number of related words they share and clustered the key words by H-clustering on these counts. We grouped the key words into six service types and grouped the six types into two categories, ‘Content’ and ‘Account’. The first category, Content, has three service types: ‘Advertisement’, ‘Consumption’, and ‘etc.’. The second category, Account, also has three service types: ‘Membership’, ‘Authentication’, and ‘Benefit’. As a result, we generated three types of customer service chatbots, one for each persona. Following the matching in Sect. 4.7, the explorer chatbot offers the ‘Content consumption’ and ‘Account membership’ services; the soft user chatbot offers ‘Content consumption’, ‘Content advertisement’, ‘Account membership’, and ‘Account authentication’; and the hard user chatbot offers ‘Content consumption’, ‘Account membership’, and ‘Account benefit’.

Through the literature review, we confirmed that personas can be built from data. Along with the study of Zhang, Brown, and Shankar [28], this study showed a way to make personas using users’ clickstream data. However, not every service can collect every single clickstream. More often, a collection of anonymous clickstreams is accessible and retrievable using tools such as Beusable. Moreover, few streaming service users have been analyzed through clickstream data. Because Laftel is a popular streaming service used by the public, our study can show the potential of data-driven design for streaming services in general. We also confirmed that a CS chatbot can be customized by persona. Previous studies have focused on giving the same answer in different sentences [1, 13, 15]. In contrast, this study aims to recognize the services each type of user mainly uses and to provide an optimized service for each persona.

This study is meaningful in that all of the methodologies used are data-based and data processing is applied to UX methodology. It means that we quantify the usefulness of a design based on user behavior data. On the basis of our result, Laftel can modify its CS service and validate the usefulness of our approach using A/B testing. We may also increase the size of the data and determine the minimum amount of data that gives a service provider a meaningful result.

All of the data used in the study are anonymous, so there is no problem in protecting users’ privacy.

References

1. Allen, C.O., Freed, A.R.: Persona-based conversation. U.S. Patent Application No. 14/557,618 (2016)
2. Allo Homepage. https://allo.google.com/. Accessed 15 Feb 2019
3. Astro Homepage. https://www.astro.ai/. Accessed 15 Feb 2019
4. Au, I., et al.: User experience at Google: focus on the user and all else will follow. In: CHI’08 Extended Abstracts on Human Factors in Computing Systems, pp. 3681–3686. ACM, New York (2008)
5. Babylon Health Homepage. https://www.babylonhealth.com/. Accessed 15 Feb 2019
6. Beusable Homepage. https://www.beusable.net. Accessed 15 Feb 2019
7. Cooper, A.: The Inmates Are Running the Asylum: Why High-Tech Products Drive Us Crazy and How to Restore the Sanity. Sams, Indianapolis (2004)
8. Cui, L., Huang, S., Wei, F., Tan, C., Duan, C., Zhou, M.: Superagent: a customer service chatbot for e-commerce websites. In: Proceedings of ACL 2017, System Demonstrations, pp. 97–102. ACL, Pennsylvania (2017)
9. Erik, D.: The 2018 State of Chatbots Report: How Chatbots Are Reshaping Online Experiences. https://www.drift.com/blog/chatbots-report/. Accessed 15 Feb 2019
10. Jobpal Homepage. https://jobpal.ai/en/. Accessed 15 Feb 2019
11. Laftel Homepage. http://www.laftel.net. Accessed 15 Feb 2019
12. Leena Homepage. https://www.leena.ai/. Accessed 15 Feb 2019
13. Liu, B., et al.: Content-oriented user modeling for personalized response ranking in chatbots. IEEE/ACM Trans. Audio Speech Lang. Process. (TASLP) 26(1), 122–133 (2018)
14. Louis, C.: Gartner’s Top 10 Predictions For IT In 2018 And Beyond, Gartner. https://www.forbes.com/sites/louiscolumbus/2017/10/03/gartners-top-10-predictions-for-it-in-2018-and-beyond/#2b07633345bb. Accessed 15 Feb 2019
15. Makar, M.G., Tindall, T.A.: Automatic message selection with a chatbot. U.S. Patent No. 8,738,739 (2014)
16. Mobasher, B.: Data mining for web personalization. In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.) The Adaptive Web. LNCS, vol. 4321, pp. 90–135. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-72079-9_3
17. Norman, D.: Human-centered design considered harmful. Interactions 12(4), 14–19 (2005)
18. Parasuraman, A., Berry, L.L., Zeithaml, V.A.: Understanding customer expectations of service. Sloan Manag. Rev. 32(3), 39–48 (1991)
19. Piccardi, T., Convertino, G., Zancanaro, M., Wang, J., Archambeau, C.: Towards crowd-based customer service: a mixed-initiative tool for managing Q&A sites. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 2725–2734. ACM, New York (2014)
20. Pittman, R.J.: Say “Hello” to eBay ShopBot Beta. https://www.ebayinc.com/stories/news/say-hello-to-ebay-shopbot-beta/. Accessed 15 Feb 2019
21. Pruitt, J., Grudin, J.: Personas: practice and theory. In: Proceedings of the 2003 Conference on Designing for User Experiences, pp. 1–15. ACM, New York (2003)
22. Reeves, L.M., Lai, J., Larson, J.A., Oviatt, S., Balaji, T.S., Buisine, S.: Guidelines for multimodal user interface design. Commun. ACM 47(1), 57–59 (2004)
23. Replika Homepage. https://replika.ai/. Accessed 15 Feb 2019
24. Ritter, A., Cherry, C., Dolan, W.B.: Data-driven response generation in social media. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Edinburgh, United Kingdom, pp. 583–593 (2011)
25. Say hello to Kai. https://kasisto.com/blog/say-hello-to-kai/. Accessed 15 Feb 2019
26. Singh, M.J., Cancel, D.: U.S. Patent No. 8,095,589. U.S. Patent and Trademark Office, Washington, DC (2012)
27. Woebot Homepage. https://woebot.io/. Accessed 15 Feb 2019
28. Zhang, X., Brown, H., Shankar, A.: Data-driven personas: constructing archetypal users with clickstreams and user telemetry. In: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pp. 5350–5359. ACM, New York (2016)

Acknowledgements

This research was supported by Korea Institute for Advancement of Technology (KIAT) Grant funded by the Korea Government (MOTIE) (N0001436, The Competency Development Program for Industry Specialist).

Author information

Authors and Affiliations

Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, Republic of Korea
Shinhee Hwang & Keeheon Lee
Laftel, 14, World Cup-ro 1-gil, Mapo-gu, Seoul, Republic of Korea
Beomjun Kim

Corresponding author

Correspondence to Shinhee Hwang.

Editor information

Editors and Affiliations

Aaron Marcus and Associates, Berkeley, CA, USA
Aaron Marcus
Zuoyebang, K12 education, Beijing, China
Wentao Wang

About this paper

Cite this paper

Hwang, S., Kim, B., Lee, K. (2019). A Data-Driven Design Framework for Customer Service Chatbot. In: Marcus, A., Wang, W. (eds) Design, User Experience, and Usability. Design Philosophy and Theory. HCII 2019. Lecture Notes in Computer Science, vol 11583. Springer, Cham. https://doi.org/10.1007/978-3-030-23570-3_17

DOI: https://doi.org/10.1007/978-3-030-23570-3_17
Published: 03 July 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23569-7
Online ISBN: 978-3-030-23570-3
eBook Packages: Computer Science, Computer Science (R0)
