Product backlog optimization technique in agile software development using clustering algorithm

Multimedia Tools and Applications

Abstract

Context

Recent research has highlighted that multiple stakeholders are involved in requirement gathering in agile software development. This leads to an increased number of duplicate user stories in the agile product backlog.

Objective

The objective of this paper is to evaluate the existing techniques for identifying and eliminating duplicate user stories from the agile product backlog, and to address their gaps with a newly proposed clustering algorithm.

Method

An agile user story is expressed as a function of input and output parameters. Multiple user stories that share a similar set of input parameters are therefore likely to be duplicates, which causes redundancy. The newly proposed algorithm clusters user stories with similar sets of input parameters over several iterations and then removes the identified duplicates from the agile product backlog. This paper also introduces the concept of mass clustering, which means clustering a number of user stories in a single run.
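The idea of grouping user stories by their input parameters can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which is not specified in this abstract. It assumes that each story is reduced to a set of input parameter names, that similarity is measured with the Jaccard index, and that a greedy single pass compares each story with the first member of every existing cluster. The function names, the story identifiers, and the threshold of 0.75 are all hypothetical.

```python
from typing import Dict, FrozenSet, List


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard index of two parameter sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_stories(
    stories: Dict[str, FrozenSet[str]], threshold: float
) -> List[List[str]]:
    """Greedy single-pass clustering (an assumption, not the paper's method).

    A story joins the first cluster whose representative (its first member)
    is at least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters: List[List[str]] = []
    for story_id, params in stories.items():
        for cluster in clusters:
            representative = stories[cluster[0]]
            if jaccard(params, representative) >= threshold:
                cluster.append(story_id)
                break
        else:
            clusters.append([story_id])
    return clusters


def deduplicate(stories: Dict[str, FrozenSet[str]], threshold: float) -> List[str]:
    """Keep one story per cluster and treat the rest as duplicates."""
    return [cluster[0] for cluster in cluster_stories(stories, threshold)]


# Hypothetical backlog: story id -> set of input parameter names.
backlog = {
    "US1": frozenset({"username", "password"}),
    "US2": frozenset({"username", "password"}),
    "US3": frozenset({"order_id", "amount", "currency"}),
    "US4": frozenset({"order_id", "amount", "currency", "coupon"}),
    "US5": frozenset({"email"}),
}

clusters = cluster_stories(backlog, threshold=0.75)
kept = deduplicate(backlog, threshold=0.75)
```

With this hypothetical backlog, the two login stories fall into one cluster and the two order stories into another, so three of the five stories are kept. In practice a flagged duplicate would probably be reviewed by the product owner before removal, since similar inputs do not guarantee identical intent.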

Results

Experimental results show that the proposed model handles small and large releases, ranging from 100 to 1000 user stories, with similar efficiency. The proposed clustering algorithm outperformed the other clustering algorithms it was compared with and reduced the agile product backlog by 37% by eliminating redundant duplicate user stories. The experimental results are obtained from the logs of the MATLAB tool. However, the algorithm is generic in nature and can be implemented using R, Python or SAS programming tools. It employs proven matrix operations.

Conclusion

The proposed clustering algorithm overcomes the limitations of existing user story management methods and clearly outperforms the other clustering algorithms it was compared with. Finally, this paper gives recommendations on using the proposed clustering algorithm during agile release planning to eliminate duplicate user stories from the agile product backlog.

Data availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Author information

Corresponding author

Correspondence to Sarika Sharma.

About this article

Cite this article

Sharma, S., Kumar, D. Product backlog optimization technique in agile software development using clustering algorithm. Multimed Tools Appl 82, 46695–46715 (2023). https://doi.org/10.1007/s11042-023-15406-w

  • DOI: https://doi.org/10.1007/s11042-023-15406-w
