Abstract
Quality and market acceptance of software products are strongly influenced by responsiveness to customer requests. Once a customer request is received, a decision must be made on whether to escalate it to the development team. Once escalated, the ticket must be formulated as a development task and assigned to a developer. To make this process more efficient and to reduce the time between receiving and escalating a customer request, we aim to automate the complete customer request management process. We propose a holistic method called ESSMArT. The method performs text summarization, predicts ticket escalation, creates the ticket’s title and content, and ultimately assigns the ticket to an available developer. We first evaluated the method internally on 4114 customer tickets from Brightsquid’s secure health care communication platform Secure-Mail. We then conducted an external evaluation of the usefulness of the approach. We found that: i) supervised learning based on context-specific data performs best for extractive summarization; ii) Random Forest trained on a combination of the conversation and its extractive summary works best for predicting ticket escalation, with the highest precision (of 0.9) and recall (of 0.55). Through the external evaluation, we furthermore found that ESSMArT provides suggestions that are 71% aligned with human ones. Applying the prototype implementation to 315 customer requests resulted in an average time reduction of 9.2 min per request. We conclude that ESSMArT not only expedites ticket management but also reduces the effort required from human experts. ESSMArT can help Brightsquid to (i) minimize the impact of staff turnover and (ii) shorten the cycle from an issue being reported to a developer being assigned to fix it.
Notes
ESSMArT: EScalation and SuMmarization AuTomation
HIPAA: Health Insurance Portability and Accountability Act of 1996
Among all systems developed by Brightsquid, we focused on these four products
Acknowledgments
This research was partially supported by the Natural Sciences and Engineering Research Council of Canada, NSERC Discovery Grant RGPIN-2017-03948 and the NSERC Collaborative Research and Development project NSERC-CRD-486636-2015. We appreciate the discussion with and suggestions made by Lloyd Montgomery and acknowledge all the comments and suggestions made by the anonymous reviewers.
Additional information
Communicated by: Alexander Serebrenik
Appendices
Appendix 1 - Illustrative Example for Comparison of Processes
In this appendix we provide an anonymized Brightsquid customer support example to compare the traditional process of managing customer requests with the ESSMArT process.
1.1 Traditional Process of Managing Customer Requests at Brightsquid
Customers report issues to Brightsquid via telephone, chat or email, and issues are recorded in Zendesk. In this example, illustrated in Figure 13 below, the customer (Bob) contacts the CRM team, concerned about the management of secure messages in a team environment. The CRM agent (Alice) elaborates on the problem and provides a summary for the customer’s approval. Once Bob approves the description, Alice decides whether to escalate the ticket or resolve it on her own. If unable to determine a solution, Alice escalates the issue and informs Bob. The CRM manager (Carol), who is responsible for resolving or further escalating issues to the development project manager, reads through the ticket, expands upon the description as necessary, and escalates the issue to the project manager. The project manager (Erin) defines the ticket’s priority and assigns it to the most appropriate developer for resolution.
1.2 Managing Customer Requests with ESSMArT
When Bob’s request is received, Alice elaborates the description further with the customer, as above. ESSMArT then summarizes the conversation as follows:
Zendesk ticket (summary of the customer request):

> John emailed me and wanted a copy of a message note faxed to him. I sent John back a note, with the message note attached. He sends me back a thank you. Once I have deleted my note on Mr. Smith, everyone else on the Yellow Team still has that email. Unless I specifically tell everyone to delete it or unless the rest of the Yellow Team goes into the EMR and looks into John’s chart if it’s done, no one has any way of knowing the task has been dealt with. Is there a way to delete that task from everyone’s inbox?
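The extractive-summarization step behind such a summary can be sketched with simple word-frequency sentence scoring. This is a minimal illustration, not ESSMArT’s trained supervised summarizer; the stopword list and the example conversation are made up for the sketch:

```python
import re
from collections import Counter

# Tiny illustrative stopword list (a real system would use a fuller one).
STOPWORDS = {"a", "an", "the", "is", "are", "to", "for", "of", "that",
             "this", "my", "me", "not", "do", "does", "so", "have",
             "there", "you"}

def extractive_summary(text, k=2):
    # Split into sentences, score each by the average corpus frequency
    # of its content words, and keep the top-k in original order.
    sentences = re.split(r"(?<=[.?!])\s+", text.strip())
    words = [w for w in re.findall(r"[a-z]{2,}", text.lower())
             if w not in STOPWORDS]
    freq = Counter(words)

    def score(sentence):
        tokens = [w for w in re.findall(r"[a-z]{2,}", sentence.lower())
                  if w not in STOPWORDS]
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    top = set(sorted(sentences, key=score, reverse=True)[:k])
    return [s for s in sentences if s in top]

conversation = ("Is there a way to delete that task from everyone's inbox? "
                "Deleting my note does not delete the task for the rest of the team. "
                "Thank you so much, have a great day.")
print(extractive_summary(conversation, k=2))
```

The two sentences about deleting the task score highest because “delete” and “task” recur, while the closing pleasantry is dropped.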
ESSMArT then escalates the ticket to Carol, who receives a notification comprising the ESSMArT summary and the escalation recommendation. If Carol agrees, ESSMArT creates a Jira development ticket by generating a title through abstractive summarization (see Figure 4 for a detailed example), assigning a priority, assigning a specific developer, and notifying Erin.
Jira ticket (development ticket):

> Title: Delete everyone task in the EMR inbox
>
> Content: In the EMR system; a doctor had emailed a doctor and want a copy of a message note faxed to him. Staff member send back a note, with the message attached. The doctor sends back a thank you. Once doctor delete his note on patient, everyone else on the team still has that email. Unless the doctor specifically tell everyone to delete it or unless the rest of the team goes into the EMR and looks into the doctor’s chart to see if it’s done, no one has any way of knowing the task has been dealt with. is there a way to delete that task from everyone’s inbox?
>
> Priority: Major
>
> Assignee: Jane Doe
Once Erin agrees, the ESSMArT ticket is added to the Jira backlog. Screenshots of this example in ESSMArT were previously presented in Figure 9.
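The end-to-end flow in this example — summarize the conversation, decide on escalation, then create and assign a Jira ticket — can be sketched as a small pipeline. All function bodies, names, and rules below are placeholders; the actual system uses trained models and the Zendesk/Jira integrations:

```python
from dataclasses import dataclass

@dataclass
class JiraTicket:
    title: str
    content: str
    priority: str
    assignee: str

def summarize(conversation):
    # Stand-in for the extractive-summarization step.
    return " ".join(conversation)

def should_escalate(summary):
    # Stand-in for the Random Forest escalation predictor.
    return "delete" in summary.lower()

def create_ticket(summary):
    # Stand-ins for abstractive title generation, priority
    # prediction, and developer assignment.
    title = summary.split("?")[0][:60]
    return JiraTicket(title=title, content=summary,
                      priority="Major", assignee="Jane Doe")

conversation = [
    "Once I have deleted my note, everyone else still has that email.",
    "Is there a way to delete that task from everyone's inbox?",
]
summary = summarize(conversation)
if should_escalate(summary):       # Carol confirms the recommendation
    ticket = create_ticket(summary)
    print(ticket.priority, ticket.assignee)   # -> Major Jane Doe
```

In the real process, Carol’s and Erin’s confirmations sit between the prediction steps, so each model output remains a suggestion rather than an automatic action.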
Appendix 2 - Confusion Matrices
In Tables 5, 7, and 8, we presented the precision, recall, and F1-score of three state-of-the-art classifiers (Naive Bayes, SVM, and Random Forest) for predicting ticket escalation, prioritizing escalated tickets, and assigning them to a developer, respectively. Those results are averages over ten runs of cross-validation. In addition, we provide below the aggregated confusion matrix of the best-performing classifier for each of the three prediction tasks.
Confusion matrix for ticket escalation

|            | Predicted Yes | Predicted No |
|------------|---------------|--------------|
| Actual Yes | 218           | 103          |
| Actual No  | 24            | 297          |
The false positives are tickets that the classifier predicted as escalated although they were in fact not escalated. The false negatives are tickets that should have been escalated but were missed by the classifier. False positives may add extra effort for the development team, while false negatives may result in customer dissatisfaction because problems are not properly addressed.
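From the aggregated escalation matrix above — reading rows as actual labels and columns as predicted labels, i.e., 218 true positives, 103 false negatives, and 24 false positives — the reported metrics follow directly:

```python
def prf(tp, fp, fn):
    # Precision, recall, and F1-score from confusion-matrix counts.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Escalation counts under the row-as-actual reading of the matrix.
p, r, f1 = prf(tp=218, fp=24, fn=103)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
# -> precision=0.90 recall=0.68 f1=0.77
```

Note that these aggregated-matrix values need not coincide with the per-run cross-validation averages reported in Table 5.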
Confusion matrix for ticket assignment

*Aggregated across eight different classes

|            | Predicted Yes | Predicted No |
|------------|---------------|--------------|
| Actual Yes | 218           | 26           |
| Actual No  | 40            | 41           |
The false positives may result in more work for developers, as a ticket may be assigned to a developer without sufficient expertise. The false negatives may result in more time and effort to fix a ticket, as a developer with less expertise would end up handling it.
Confusion matrix for ticket prioritization

*Aggregated across five different classes

|            | Predicted Yes | Predicted No |
|------------|---------------|--------------|
| Actual Yes | 133           | 44           |
| Actual No  | 49            | 101          |
Both false positives and false negatives would delay fixes for some important customer concerns, as those tickets would be prioritized too low or stuck behind lower-priority tickets.
Appendix 3 - Survey Questions
The user study to evaluate ESSMArT was conducted in two parts: first, participants used the ESSMArT prototype on Brightsquid data for evaluation purposes; second, we asked questions to understand participants’ perception of the usability of ESSMArT in practice. In this appendix, we present the five questions asked to capture users’ perception of ESSMArT.
Figure 11 shows the results of this survey, i.e., the perception of CRM experts and project managers. The sample questions and the screenshots of the prototype were presented in Figs. 8 and 9, respectively.
1. How understandable did you find the results of ESSMArT?

   5 - Very understandable; 4 - Understandable; 3 - Somewhat understandable but slightly ambiguous; 2 - Somewhat understandable but mostly ambiguous; 1 - Not understandable

2. How likely would you be to use ESSMArT in practice?

   5 - Definitely; 4 - Very likely; 3 - Maybe; 2 - Not likely; 1 - Definitely not

3. To what extent do you trust and rely on the ESSMArT results?

   5 - Totally trust it; 4 - Trust it; 3 - Neutral; 2 - Do not trust it; 1 - Do not trust it at all

4. To what extent do you agree that ESSMArT reduces the time for deciding on a change request?

   5 - Strongly agree; 4 - Agree; 3 - Neutral; 2 - Disagree; 1 - Strongly disagree

5. To what extent do you agree that ESSMArT reduces the time needed for CRM/PM tasks?

   5 - Strongly agree; 4 - Agree; 3 - Neutral; 2 - Disagree; 1 - Strongly disagree
Nayebi, M., Dicke, L., Ittyipe, R. et al. ESSMArT way to manage customer requests. Empir Software Eng 24, 3755–3789 (2019). https://doi.org/10.1007/s10664-019-09721-w