Analyzing Empirical Data in Requirements Engineering Techniques

  • Conference paper
Information Systems Development

Abstract

Getting meaningful information from empirical data is a challenging task in software engineering (SE). It requires an in-depth analysis of the research problem and the data obtained, the selection of the most suitable data analysis methods, and an evaluation of the validity of the analysis results. This chapter reports research in which three data analysis methods were used to analyze a set of empirical data on requirements engineering (RE) techniques. One of the major findings is that better analysis results can be obtained when several data analysis methods are combined. Ways to examine the validity of the results are also explored.

Notes

  1. As RE techniques are a subset of SE techniques, we infer that the research results derived from RE techniques analysis will be applicable to SE techniques analysis.

  2. We acknowledge the differences between the two terms “method” and “technique” as used in the SE research community and the disparities among the definitions given for these two terms in academia. The term “method” is deliberately used in this chapter to refer to any one or more algorithms and/or methods created for data clustering and data analysis. The purpose of adopting this terminology (in this chapter only) is to differentiate the two terms, with “technique” referring to SE techniques or methods.

  3. A sufficient statistic is a statistic that has the property of sufficiency with respect to a statistical model and its associated unknown parameter θ (Hogg and Craig 1978), i.e., no other statistic that can be calculated from the same data set provides any additional information about the value of θ.
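
    The definition in Note 3 can be made concrete with the standard factorization criterion. The following is a textbook illustration, not taken from the chapter: the symbols T, g, h and the Bernoulli example are assumptions introduced here for exposition only.

```latex
% Factorization criterion: T(X) is sufficient for \theta if and only if
% the joint density (or mass function) factors as
\[
  f(x_1,\dots,x_n;\theta) \;=\; g\bigl(T(x_1,\dots,x_n);\theta\bigr)\, h(x_1,\dots,x_n),
\]
% where h does not depend on \theta.
%
% Illustrative example (assumed, not from the chapter):
% X_1,\dots,X_n i.i.d. Bernoulli(\theta), with T(x) = \sum_{i=1}^{n} x_i.
\[
  f(x_1,\dots,x_n;\theta)
  \;=\; \prod_{i=1}^{n} \theta^{x_i}(1-\theta)^{1-x_i}
  \;=\; \theta^{T(x)}\,(1-\theta)^{\,n-T(x)},
\]
% so g(t;\theta) = \theta^{t}(1-\theta)^{n-t} and h(x) = 1:
% the number of successes carries all the information about \theta.
```

    In this example, once the total number of successes is known, the order in which they occurred adds nothing further about θ, which is the sense of "no additional information" in the note.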

References

  • Antón AI (2003) Successful software projects need requirements planning. IEEE Softw 20(3):44–46

  • Baraldi A, Blonda P (1999) A survey of fuzzy clustering algorithms for pattern recognition – Part I. IEEE Trans Syst Man Cybern Part B Cybern 29(6):778–785

  • Bezdek JC (1974) Cluster validity with fuzzy sets. J Cybern 3(3):58–71

  • Bezdek JC (1981) Pattern recognition with fuzzy objective function algorithms. Plenum Press, New York

  • Brooks F (1987) No silver bullet-essence and accident in software engineering. IEEE Comput 20(4):10–19

  • Burridge J (2003) Information preserving statistical obfuscation. Statist Comput 13(4):321–327

  • Carreira-Perpinan MA (1997) A review of dimension reduction techniques. Technical report CS-96-09, Department of Computer Science, University of Sheffield

  • Chambers LD (2001) The practical handbook of genetic algorithms applications. Chapman & Hall/CRC, Boca Raton

  • Cordon O (2001) Ten years of genetic fuzzy systems: current framework and new trends. In: Proceedings joint 9th IFSA world congress and 20th NAFIPS international conference (Cat. No. 01TH8569), p 1241. 0-7803-7078-3, 978-0-7803-7078-4

  • Dekker D, Krackhardt D et al (2007) Sensitivity of MRQAP tests to collinearity and autocorrelation conditions. Psychometrika 72(4):563

  • Dickinson W, Leon D, Podgurski A (2001) Finding failures by cluster analysis of execution profiles. In: Proceedings of the international conference on software engineering (ICSE), Toronto, ON, Canada, pp 339–348

  • Dunn J (1974) A fuzzy relative of the ISODATA process and its use in detecting compact, well-separated clusters. J Cybern 3(3):32–57

  • Emam KE, Birk A (2000) Validating the ISO/IEC 15504 measure of software requirements analysis process capability. IEEE Trans Softw Eng 26(6):119–149

  • Gao XB, Ji HB, Li J (2002) An advanced cluster analysis method based on statistical test. IEEE ICSP, pp 1100–1103

  • Gen M, Cheng R (1997) Genetic algorithms and engineering design. Wiley, New York

  • Glass RL (2004) Matching methodology to problem domain. Commun ACM 47(5):19–21

  • Goel AL, Shin M (1997) Software engineering data analysis techniques (tutorial). In: Proceedings of the 19th international conference on software engineering, Boston, Massachusetts, United States, pp 667–668

  • Hastie TJ, Stuetzle W (1989) Principal curves. J Am Stat Assoc 84:502–516

  • Hogg RV, Craig AT (1978) Introduction to mathematical statistics. Macmillan, New York

  • Jiang L (2005) A framework for requirements engineering process development. Ph.D. thesis, University of Calgary, Canada

  • Jiang L, Eberlein A (2006) Clustering requirements engineering techniques. In: The 10th IASTED international conference on software engineering and applications, Dallas, TX, USA, 13–15 November

  • Jiang SY, Song XY, Wang H et al (2006) A clustering-based method for unsupervised intrusion detections. Pattern Recog Lett 27(7):802–810

  • Jiang L, Eberlein A, Far BH, Mousavi M (2008) A methodology for the selection of requirements engineering techniques. J Softw Syst Model 7(3):303–328

  • Jiang L, Eberlein A, Krishna A (2011) Analyzing empirical data in software engineering techniques. Technical report (1), Jan 2011. School of Computer Science, The University of Adelaide, Australia. http://cs.adelaide.edu.au/~ljiang/research/publicationsList/TechicalReport_REtechniquesClustering.pdf

  • Jolliffe IT (1986) Principal component analysis, Springer series in statistics. Springer, Berlin

  • Jones MC (1983) The projection pursuit algorithm for exploratory data analysis. Ph.D. thesis, University of Bath

  • Jones C (2008) Applied software measurement: global analysis of productivity and quality, 3rd edn. McGraw-Hill, New York

  • Khoshgoftaar TM, Allen EB (1999) Modeling software quality with classification trees. In: Pham H (ed) Recent advances in reliability and quality engineering. World Scientific, Singapore

  • Krackhardt D (1987) QAP partialling as a test of spuriousness. Soc Netw 9(2):171

  • Lee MA, Takagi H (1993) Dynamic control of genetic algorithms using fuzzy logic techniques. In: Proceedings of international conference on genetic algorithms, Urbana-Champaign, IL, July 1993, pp 76–83

  • Lehmann EL, Casella G (1998) Theory of point estimation. Springer, New York

  • Liu K, Kargupta H, Ryan J (2006) Random projection-based multiplicative data perturbation for privacy preserving distributed data mining. IEEE Trans Knowl Data Eng 18(1):92–106

  • Mendonca M, Sunderhaft NL (1999) Mining software engineering data: a survey. A DACS state-of-the-art report, Data & Analysis Center for Software, Rome, NY

  • Naur P, Randell B et al (1969) Software engineering: report on a conference sponsored by the NATO SCIENCE COMMITTEE, Garmisch, Germany, 7–11 Oct 1968, Scientific Affairs Division, NATO

  • Neill CJ, Laplante PA (2003) Requirements engineering: the state of the practice. IEEE Softw 20(6):40–45

  • Shin M, Goel AL (2000) Empirical data modeling in software engineering using radial basis functions. IEEE Trans Softw Eng (0098-5589) 26(6):567

  • Zhao L, Tsujimura Y, Gen M (1996) Genetic algorithm for fuzzy clustering. In: Proceedings of IEEE international conference on evolutionary computation, p 716. 0-7803-2902-3, 978-0-7803-2902-7

  • Zhong S, Khoshgoftaar TM, Seliya N (2004) Analyzing software measurement data with clustering techniques. IEEE Intell Syst 19(2):20–27

  • Zowghi D, Damian D, Offen R (2001) Field studies of requirements engineering in a multi-site software development organization. In: Proceedings of the Australian workshop on requirements engineering, University of New South Wales

Author information

Corresponding author

Correspondence to Li Jiang.

Copyright information

© 2013 Springer Science+Business Media New York

About this paper

Cite this paper

Jiang, L., Eberlein, A., Krishna, A. (2013). Analyzing Empirical Data in Requirements Engineering Techniques. In: Pooley, R., Coady, J., Schneider, C., Linger, H., Barry, C., Lang, M. (eds) Information Systems Development. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4951-5_29

  • DOI: https://doi.org/10.1007/978-1-4614-4951-5_29

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-4950-8

  • Online ISBN: 978-1-4614-4951-5

  • eBook Packages: Computer Science, Computer Science (R0)
