Coverage-Based Fault Localization in Haskell

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Fault localization aims to identify faulty program elements. Among the many fault localization approaches in the literature, coverage-based fault localization, especially spectrum-based fault localization (SBFL), has been intensively studied because it is effective and lightweight. Despite this rich literature, almost all existing fault localization approaches and studies target imperative programming languages such as Java and C, leaving a gap for other programming paradigms. In this paper, we study fault localization approaches for the functional programming paradigm, using Haskell as a representative language. We build what is, to the best of our knowledge, the first dataset on real Haskell projects, including both real and seeded faults. The dataset enables research on fault localization for functional languages. With it, we explore fault localization techniques for Haskell. In particular, as is typical for SBFL approaches, we study methods for coverage collection and formulae for suspiciousness score computation, and carefully adapt these two components to Haskell's language features and characteristics, resulting in a series of adaptation approaches. Moreover, we design a learning-based approach and a transfer-learning-based approach to take advantage of data from imperative languages. Both approaches are evaluated on our dataset to demonstrate the promise of this direction.
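
To make the two SBFL components named above concrete, the following is a minimal sketch of suspiciousness scoring from a coverage spectrum. It uses the standard Ochiai and Tarantula formulae from the SBFL literature, not the Haskell-specific adaptations this paper proposes. The function names, the `rank` helper, and the element identifiers (`e1`, `e2`, `e3`) are assumptions made for illustration only.

```python
import math


def ochiai(ef, ep, nf):
    """Ochiai score: ef / sqrt((ef + nf) * (ef + ep)).

    ef/ep: failing/passing tests that cover the element.
    nf:    failing tests that do not cover the element.
    """
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0


def tarantula(ef, ep, nf, np_):
    """Tarantula score: (ef/F) / (ef/F + ep/P), with F, P the totals."""
    total_f, total_p = ef + nf, ep + np_
    fail_ratio = ef / total_f if total_f else 0.0
    pass_ratio = ep / total_p if total_p else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else 0.0


def rank(coverage, failing):
    """Rank program elements by Ochiai score, most suspicious first.

    coverage: dict mapping a test name to the set of elements it covers.
    failing:  set of names of the failing tests.
    """
    elements = set().union(*coverage.values())
    total_failing = sum(1 for t in coverage if t in failing)
    scores = {}
    for e in elements:
        ef = sum(1 for t, cov in coverage.items() if e in cov and t in failing)
        ep = sum(1 for t, cov in coverage.items() if e in cov and t not in failing)
        scores[e] = ochiai(ef, ep, total_failing - ef)
    # Sort by descending score; break ties by element name for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical spectrum: three tests, of which only t3 fails.
coverage = {"t1": {"e1", "e2"}, "t2": {"e1"}, "t3": {"e1", "e3"}}
ranking = rank(coverage, failing={"t3"})
```

In this toy spectrum, `e3` is covered only by the failing test and is ranked first, while `e1`, covered by every test, receives a lower score. How elements are defined and how coverage is collected for a lazy functional language is the part the paper adapts, and the sketch does not model it.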

Author information

Corresponding author

Correspondence to Dan Hao (郝 丹).

Ethics declarations

Conflict of Interest: The authors declare that they have no conflict of interest.

Additional information

A preliminary version of the paper was published in the Proceedings of Internetware 2022.

This work was supported by the National Natural Science Foundation of China under Grant No. 61872008.

Feng Li received his B.S. degree in computer science and technology from Peking University, Beijing, in 2018. He received his Ph.D. degree from the School of Computer Science, Peking University, Beijing, in 2023. His research interests include software testing, software analysis, and program comprehension.

Guo-Qing Wang received his B.S. degree in software engineering from Harbin Institute of Technology, Harbin, in 2022. He is currently a Ph.D. student at the School of Computer Science, Peking University, Beijing. His research interests include software testing, software analysis, and program comprehension.

Meng Wang is a reader (associate professor) at the University of Bristol, Bristol, after faculty-level appointments at the University of Kent, Canterbury, and Chalmers University of Technology, Göteborg. He heads the Programming Languages research group and publishes broadly in programming languages (POPL, ICFP, OOPSLA, JFP) and software engineering (TSE and TOSEM). Most of his research focuses on software correctness, including specialized programming language designs, software testing, and program synthesis.

Dan Hao is a professor at the School of Computer Science, Peking University, Beijing. She received her Ph.D. degree in computer science from Peking University, Beijing, in 2008, and her B.S. degree in computer science from the Harbin Institute of Technology, Harbin, in 2002. Her current research interests include software testing and debugging, program comprehension, and software maintenance.

About this article

Cite this article

Li, F., Wang, GQ., Wang, M. et al. Coverage-Based Fault Localization in Haskell. J. Comput. Sci. Technol. 40, 158–177 (2025). https://doi.org/10.1007/s11390-024-2967-1

  • DOI: https://doi.org/10.1007/s11390-024-2967-1
