DPWord2Vec: Better Representation of Design Patterns in Semantics


Abstract:

Plain-text descriptions of design patterns help developers learn and understand the definitions and usage scenarios of those patterns. To support automatic use of these descriptions, e.g., recommending design patterns from free-text queries, design patterns and natural language must be adequately associated. Existing studies usually use the text of design pattern books to represent design patterns and calculate similarities with the queries. This approach is problematic: much information about design patterns may be absent from these books, and many words are out of vocabulary because of the books' limited content. To overcome these issues, a more comprehensive method is needed to estimate the relatedness between design patterns and natural language words. Motivated by Word2Vec, we propose DPWord2Vec, which embeds design patterns and natural language words into vectors simultaneously. We first build a corpus of more than 400 thousand documents extracted from design pattern books, Wikipedia, and Stack Overflow. Next, we redefine the concept of the context window to associate design patterns with words. The design pattern and word vector representations are then learned with an advanced word embedding method. The learned design pattern and word vectors can be used universally in design pattern tasks based on textual descriptions. An evaluation shows that DPWord2Vec outperforms the baseline algorithms by 24.2-120.9 percent in measuring the similarities between design patterns and words, in terms of Spearman's rank correlation coefficient. Moreover, we apply DPWord2Vec to two typical design pattern tasks. In the design pattern tag recommendation task, the DPWord2Vec-based method outperforms two state-of-the-art algorithms by 6.6 and 32.7 percent, respectively, in terms of Recall@10. In the design pattern selection task, DPWord2Vec improves the existing methods by 6.5-70.7 ...
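
As a rough illustration of how design pattern and word vectors in one shared space could serve a free-text query, the sketch below ranks patterns by cosine similarity and scores the ranking with Recall@k. It is not the DPWord2Vec implementation: the toy vectors, the helper names (`query_vector`, `rank_patterns`, `recall_at_k`), and the choice to average word vectors into a query vector are all assumptions made for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def query_vector(words, word_vecs):
    """Average the vectors of known words (an assumed, simple query model)."""
    known = [word_vecs[w] for w in words if w in word_vecs]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def rank_patterns(query, word_vecs, pattern_vecs):
    """Return pattern names sorted by decreasing similarity to the query."""
    q = query_vector(query.lower().split(), word_vecs)
    if q is None:
        return []
    scored = [(cosine(q, vec), name) for name, vec in pattern_vecs.items()]
    return [name for _, name in sorted(scored, reverse=True)]

def recall_at_k(ranked, relevant, k):
    """Fraction of relevant items that appear in the top-k of the ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for name in ranked[:k] if name in relevant)
    return hits / len(relevant)

# Hypothetical 3-dimensional vectors, invented purely for illustration.
WORD_VECS = {
    "notify": [0.9, 0.1, 0.0],
    "subscribers": [0.8, 0.2, 0.1],
    "create": [0.1, 0.9, 0.0],
    "objects": [0.2, 0.8, 0.1],
}
PATTERN_VECS = {
    "Observer": [1.0, 0.1, 0.0],
    "Factory Method": [0.1, 1.0, 0.0],
    "Singleton": [0.0, 0.1, 1.0],
}

ranked = rank_patterns("notify subscribers", WORD_VECS, PATTERN_VECS)
print(ranked)
print(recall_at_k(ranked, {"Observer"}, k=1))
```

In practice the vectors would come from a trained embedding model rather than hand-written lists, and words missing from the vocabulary are simply skipped here, which is the out-of-vocabulary problem the abstract raises for book-only text.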
Published in: IEEE Transactions on Software Engineering (Volume: 48, Issue: 4, 01 April 2022)
Page(s): 1228 - 1248
Date of Publication: 18 August 2020
