On fuzzy repetitions detection in documentation reuse

Programming and Computer Software

Abstract

The growing complexity of software documentation places additional demands on document maintenance. Documentation reuse can contribute considerably to meeting them. This paper presents a method for finding fuzzy repetitions in software documentation; the method is based on software clone detection, and its search results are used for document refactoring. The paper also presents the Documentation Refactoring Toolkit, which implements the proposed method and is integrated with the DocLine project. The approach is evaluated on the documentation packages of several open-source projects: Linux Kernel, Zend Framework, Subversion, and DocBook.
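To make the notion of a fuzzy repetition concrete, the following minimal sketch reports pairs of documentation fragments that are nearly, but not exactly, identical. It is an illustration only and not the clone-detection method of the paper: it swaps in Python's standard `difflib.SequenceMatcher` for token-level comparison, and the names `tokenize`, `find_fuzzy_repetitions`, and the `threshold` value are assumptions introduced here.

```python
from difflib import SequenceMatcher
from itertools import combinations


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real tokenization)."""
    return text.lower().split()


def find_fuzzy_repetitions(fragments, threshold=0.8):
    """Return (i, j, ratio) for every fragment pair whose token-level
    similarity ratio is at least `threshold` (hypothetical default)."""
    tokens = [tokenize(fragment) for fragment in fragments]
    pairs = []
    for i, j in combinations(range(len(fragments)), 2):
        # Compare token lists rather than characters; autojunk is disabled
        # so that frequent tokens are not silently ignored.
        matcher = SequenceMatcher(None, tokens[i], tokens[j], autojunk=False)
        ratio = matcher.ratio()
        if ratio >= threshold:
            pairs.append((i, j, ratio))
    return pairs


fragments = [
    "Returns the name of the current user as a string.",
    "Returns the name of the current group as a string.",
    "Opens the configuration file for writing.",
]
pairs = find_fuzzy_repetitions(fragments)
for i, j, ratio in pairs:
    print(f"fragments {i} and {j} are similar (ratio {ratio:.2f})")
```

Here the first two fragments differ in a single word and are reported as a candidate for reuse, while the third is unrelated. The pairwise comparison is quadratic in the number of fragments, so it suits only small inputs.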



Author information


Correspondence to D. V. Luciv.

Additional information

Original Russian Text © D.V. Luciv, D.V. Koznov, H.A. Basit, A.N. Terekhov, 2016, published in Programmirovanie, 2016, Vol. 42, No. 4.


About this article


Cite this article

Luciv, D.V., Koznov, D.V., Basit, H.A. et al. On fuzzy repetitions detection in documentation reuse. Program Comput Soft 42, 216–224 (2016). https://doi.org/10.1134/S0361768816040046


  • DOI: https://doi.org/10.1134/S0361768816040046
