API tutorials are important learning resources because they explain how to use certain APIs in a given programming context. An API tutorial can be split into a number of units, and consecutive units that describe the same topic form a tutorial fragment. We refer to an API explained by a tutorial fragment as an API tag. Generating API tags for a tutorial fragment helps readers understand, navigate, and retrieve the fragment. Existing approaches to API tag generation often perform poorly: they require high manual effort and achieve low accuracy. Like API tutorials, Stack Overflow (SO) is an important learning resource that provides explanations of APIs, so SO posts also contain API tags. Moreover, the API tags of SO posts are abundant and easy to extract. In this paper, we propose a novel approach, ATTACK (API Tag for Tutorial frAgments using Crowd Knowledge), which automatically generates API tags for tutorial fragments from SO posts. ATTACK first constructs \(\langle \text{Q\&A pair}, \text{tag set} \rangle\) pairs by extracting the API tags of SO posts. It then trains a deep neural network with an attention mechanism to learn the semantic relatedness between Q&A pairs and their associated API tags, taking into account both the textual descriptions and the code in a Q&A pair. Finally, the trained model is used to generate API tags for tutorial fragments. We evaluate ATTACK on public Java and Android datasets containing 43,132 \(\langle \text{Q\&A pair}, \text{tag set} \rangle\) pairs. Experimental results show that ATTACK is effective and outperforms the state-of-the-art approaches in terms of F-Measure. Our user study further confirms the effectiveness of ATTACK in generating API tags for tutorial fragments. We also apply ATTACK to document linking, and the results confirm the usefulness of the API tags it generates.
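To make the evaluation metric concrete, the following is a minimal sketch of a set-based F-Measure for a single fragment, comparing generated API tags against ground-truth tags. The function name `f_measure` and the per-fragment formulation are assumptions for illustration; the abstract does not state how the paper aggregates scores across fragments.

```python
def f_measure(predicted, actual):
    """Return (precision, recall, F-measure) for one fragment.

    Both arguments are collections of API names; duplicates are ignored.
    This per-fragment, set-based definition is an assumption, not
    necessarily the exact formulation used in the paper.
    """
    predicted, actual = set(predicted), set(actual)
    if not predicted or not actual:
        return 0.0, 0.0, 0.0

    hits = len(predicted & actual)          # correctly generated tags
    precision = hits / len(predicted)
    recall = hits / len(actual)
    if hits == 0:
        return precision, recall, 0.0

    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f


# Hypothetical example: two tags generated, one of them correct.
p, r, f = f_measure({"DateTime", "DateTimeFormat"}, {"DateTime"})
```

In this hypothetical case precision is 0.5 and recall is 1.0, so the F-Measure is their harmonic mean, 2/3.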
A positive instance refers to a Q&A pair/fragment paired with one of its API tags, while a negative instance refers to a Q&A pair/fragment paired with a supporting API.
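The distinction between positive and negative instances can be sketched as follows. This is an illustrative construction only: `Instance` and `build_instances` are hypothetical names, and a "supporting API" is assumed here to be an API that appears in the Q&A pair/fragment without being one of its API tags.

```python
from typing import Iterable, List, NamedTuple


class Instance(NamedTuple):
    text: str   # the Q&A pair or tutorial fragment
    api: str    # candidate API name
    label: int  # 1 = API tag (positive), 0 = supporting API (negative)


def build_instances(text: str,
                    mentioned_apis: Iterable[str],
                    api_tags: Iterable[str]) -> List[Instance]:
    """Pair one text with every candidate API and label each pair.

    Assumption: every API tag yields a positive instance, and every other
    mentioned API is treated as a supporting API (negative instance).
    """
    tags = set(api_tags)
    candidates = sorted(set(mentioned_apis) | tags)
    return [Instance(text, api, 1 if api in tags else 0) for api in candidates]


# Hypothetical Q&A pair that explains DateTime and merely uses DateTimeFormat.
instances = build_instances(
    "How do I parse a date string with Joda-Time?",
    mentioned_apis=["DateTime", "DateTimeFormat"],
    api_tags=["DateTime"],
)
```

Here `DateTime` produces a positive instance and `DateTimeFormat` a negative one, mirroring the definition above.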
An API tag refers to an API explained by a tutorial fragment/Q&A pair, while an SO tag is a question-related keyword such as a programming language (e.g., java, android), a library (e.g., jodatime, graphics), or an API (e.g., DateTime, Canvas).
(2017) Stack Overflow’s public data dump. https://archive.org/download/stackexchange
(2018) Android specification. https://developer.android.com/reference/packages
(2018) The document of the API DateTimeFormat. https://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html
(2018) Eclipse’s Java parser. https://mvnrepository.com/artifact/org.eclipse.jdt/org.eclipse.jdt.core
(2018a) An example of SO Q&A pair. https://stackoverflow.com/questions/5663671/
(2018b) An example of tutorial fragment. https://stuff.mit.edu/afs/sipb/project/android/docs/guide/topics/graphics/2d-graphics.html
(2018) Java SE specification. https://www.oracle.com/technetwork/java/javase/documentation/index.html
(2018a) Jodatime specification. https://www.joda.org/joda-time/apidocs/index.html
(2018b) Jodatime tutorial. https://www.joda.org/joda-time/userguide.html
(2018) Math specification. http://commons.apache.org/proper/commons-math/javadocs/
(2018) Smack specification. http://download.igniterealtime.org/smack/docs/
(2018) TensorFlow framework. https://www.tensorflow.org
(2019) The details of WE+ATTACK. https://sites.google.com/site/attackapitags/we-attack
This work was supported by the NSFC-Key Project of General Technology Fundamental Research United Fund under Grant No. U1736211, No. 61933013, and No. 62032016, the Natural Science Foundation of Guangdong Province under Grant No. 2019A1515011076, the Innovation Group of Guangdong Education Department under Grant No. 2020KCXTD014 and No. 2018KCXTD019, and the Key Project of Natural Science Foundation of Hubei Province under Grant No. 2018CFA024.
Wu, D., Jing, XY., Zhang, H. et al. Generating API tags for tutorial fragments from Stack Overflow. Empir Software Eng 26, 66 (2021). https://doi.org/10.1007/s10664-021-09962-8
