Product backlog optimization technique in agile software development using clustering algorithm

Sharma, Sarika; Kumar, Deepak

doi:10.1007/s11042-023-15406-w

Published: 02 May 2023

Volume 82, pages 46695–46715 (2023)

Multimedia Tools and Applications

Abstract

Context

Recent research highlights that multiple stakeholders are involved in requirement gathering in agile software development. This leads to an increased number of duplicate user stories in the agile product backlog.

Objective

The objective of this paper is to evaluate the existing techniques for identifying and eliminating duplicate user stories from the agile product backlog, and to address their gaps with a newly proposed clustering algorithm.

Method

An agile user story is expressed as a function of input and output parameters. Multiple user stories that share a similar set of input parameters are therefore likely to be duplicates and a source of redundancy. The proposed algorithm clusters user stories with similar sets of input parameters over several iterations and then removes the identified duplicates from the agile product backlog. This paper also introduces the concept of mass clustering, which means clustering a number of user stories in a single run.
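The grouping idea in this paragraph can be sketched in Python as follows. This is a minimal illustration and not the authors' algorithm: the story identifiers, the Jaccard similarity measure, the 0.8 threshold, and the single-pass greedy grouping are all assumptions made for the example, since the abstract does not specify them.

```python
from typing import Dict, FrozenSet, List


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity of two input-parameter sets (an assumed measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_stories(
    stories: Dict[str, FrozenSet[str]], threshold: float = 0.8
) -> List[List[str]]:
    """Greedily group stories whose inputs resemble a cluster's first member."""
    clusters: List[List[str]] = []
    for story_id, inputs in stories.items():
        for cluster in clusters:
            representative = stories[cluster[0]]
            if jaccard(inputs, representative) >= threshold:
                cluster.append(story_id)
                break
        else:
            # No existing cluster is similar enough, so start a new one.
            clusters.append([story_id])
    return clusters


def deduplicate(
    stories: Dict[str, FrozenSet[str]], threshold: float = 0.8
) -> List[str]:
    """Keep the first story of every cluster and drop the rest as duplicates."""
    return [cluster[0] for cluster in cluster_stories(stories, threshold)]


# Hypothetical backlog: story id -> set of input parameters.
backlog = {
    "US-1": frozenset({"username", "password"}),
    "US-2": frozenset({"password", "username"}),
    "US-3": frozenset({"order_id", "address"}),
    "US-4": frozenset({"username", "password", "otp"}),
}

clusters = cluster_stories(backlog)
kept = deduplicate(backlog)
```

With this hypothetical data, US-1 and US-2 share an identical input set and fall into one cluster, while US-4 adds a third parameter and stays separate at the assumed threshold. Lowering the threshold would merge more stories, so in practice the cut-off would need to be validated against stories that reviewers have confirmed as duplicates.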

Results

Experimental results show that the proposed model handles small and large releases, ranging from 100 to 1000 user stories, with similar efficiency. The proposed clustering algorithm outperformed the other clustering algorithms it was compared with and yielded a 37% decrease in the agile product backlog by eliminating redundant duplicate user stories. The experimental results were obtained from the logs of the MATLAB tool. However, the algorithm is generic and can be implemented using R, Python, or SAS. It employs proven matrix operations.
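The abstract does not say which matrix operations are used. One plausible reading, shown here purely as an assumption, encodes each story as a row of a binary story-by-parameter matrix A, so that the product of A with its transpose counts the parameters shared by every pair of stories. The matrix below is hypothetical data, and the resulting percentage only illustrates the arithmetic; it does not reproduce the reported figure.

```python
from typing import List


def shared_parameter_counts(incidence: List[List[int]]) -> List[List[int]]:
    """Return A * A^T for a binary story-by-parameter matrix A."""
    return [
        [sum(x * y for x, y in zip(row_i, row_j)) for row_j in incidence]
        for row_i in incidence
    ]


def duplicate_rows(incidence: List[List[int]]) -> List[int]:
    """Indices of stories whose parameter set exactly matches an earlier story."""
    counts = shared_parameter_counts(incidence)
    duplicates: List[int] = []
    for j in range(len(incidence)):
        for i in range(j):
            # Two binary rows are equal exactly when their shared count
            # equals both of their own parameter counts (diagonal entries).
            if counts[i][j] == counts[i][i] == counts[j][j]:
                duplicates.append(j)
                break
    return duplicates


def reduction_percent(total: int, removed: int) -> float:
    """Backlog reduction as a percentage of the original story count."""
    return 100.0 * removed / total


# Hypothetical columns: username, password, order_id, address.
A = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
]

dups = duplicate_rows(A)
reduction = reduction_percent(len(A), len(dups))
```

The same product is a single expression in MATLAB, R, or NumPy, which is consistent with the claim that the approach is tool-independent. This sketch flags only exact matches; detecting stories with merely similar inputs would require a similarity threshold on top of the shared counts.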

Conclusion

The proposed clustering algorithm overcomes the limitations of existing user story management methods and clearly outperforms the other clustering algorithms it was compared with. Finally, this paper gives recommendations on using the proposed clustering algorithm during agile release planning to eliminate duplicate user stories from the agile product backlog.

Data availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Author information

Authors and Affiliations

Amity Institute of Information Technology, Amity University, Sector-125, Noida, U.P., India
Sarika Sharma & Deepak Kumar

Corresponding author

Correspondence to Sarika Sharma.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sharma, S., Kumar, D. Product backlog optimization technique in agile software development using clustering algorithm. Multimed Tools Appl 82, 46695–46715 (2023). https://doi.org/10.1007/s11042-023-15406-w

Received: 26 September 2022
Revised: 09 March 2023
Accepted: 18 April 2023
Published: 02 May 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s11042-023-15406-w
