Learning natural ordering of tags in domain-specific Q&A sites

特定领域问答网站中的标签自然顺序研究

Frontiers of Information Technology & Electronic Engineering

Abstract

Tagging is a defining characteristic of Web 2.0. It allows users of social computing systems, such as question-and-answer (Q&A) sites, to annotate content with free-form terms. But is tagging really a free action? Existing work has shown that users in an online community can develop an implicit consensus about which tags best describe the content. However, no work has studied the regularities in how users order tags while tagging. In this paper, we focus on the natural ordering of tags in domain-specific Q&A sites. We study the tag sequences of millions of questions from four Q&A sites: CodeProject, SegmentFault, Biostars, and CareerCup. Our results show that users of these sites can develop an implicit consensus about the order in which they assign tags to questions. We also study the relationships between tags that can explain the emergence of this natural ordering. Our study opens a path to improving tag recommendation and Q&A site navigation by leveraging the natural ordering of tags.

摘要

标注是Web 2.0的一个重要特征. 它使得社会计算系统 (如问答网站) 的用户们可以自由地标记内容. 然而, 标注真的是自由不受限的吗? 现有工作表明, 用户们常常可以隐性地就哪种标签最能描述在线社区的内容达成共识. 然而, 目前还没有针对用户在标注过程中对标签排序的规律性开展研究. 本文专注于研究特定领域问答网站中的标签自然排序, 并对CodeProject, SegmentFault, Biostars以及CareerCup 4个问答网站上数以百万计的问题中的标签序列进行研究. 结果表明, 这些问答网站的用户可以就问题标签的排序达成隐性共识. 研究了标签之间的关系, 这些关系可以解释标签自然顺序的出现. 该研究为利用标签的自然顺序提升现有标签推荐以及问答站点导航提供了可能.
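
To make the idea of an implicit consensus on tag order concrete, the following is a minimal sketch of one possible way to quantify it. The abstract does not state how ordering regularity is measured, so this is not the authors' method: the function name `pairwise_order_agreement`, the `min_count` threshold, and the sample tag sequences are all assumptions introduced for illustration. For every pair of tags that co-occur, the sketch counts how often each tag precedes the other and reports the share of co-occurrences that follow the majority order.

```python
from collections import Counter
from itertools import combinations


def pairwise_order_agreement(tag_sequences, min_count=2):
    """Share of tag-pair co-occurrences that follow each pair's majority order.

    Hypothetical helper for illustration; not taken from the paper.
    Returns None when no pair co-occurs at least `min_count` times.
    """
    # ordered_counts[(a, b)] = number of sequences in which tag a precedes tag b
    ordered_counts = Counter()
    for tags in tag_sequences:
        for a, b in combinations(tags, 2):
            if a != b:
                ordered_counts[(a, b)] += 1

    majority = 0
    total = 0
    seen = set()
    for (a, b), forward in ordered_counts.items():
        pair = frozenset((a, b))
        if pair in seen:
            continue  # this unordered pair was already handled
        seen.add(pair)
        backward = ordered_counts.get((b, a), 0)
        if forward + backward < min_count:
            continue  # too few observations to say anything about this pair
        majority += max(forward, backward)
        total += forward + backward
    return majority / total if total else None


# Invented tag sequences, for illustration only
questions = [
    ["c#", "asp.net", "mvc"],
    ["c#", "asp.net"],
    ["asp.net", "c#"],
    ["c#", "mvc"],
]
print(pairwise_order_agreement(questions))
```

Under this assumed measure, a value near 0.5 would indicate that users order a pair of tags more or less arbitrarily, while a value near 1.0 would indicate that most users place co-occurring tags in the same relative order.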

Author information

Contributions

Junfang JIA implemented the system, processed the data, carried out the experiments, and drafted the paper. Guoqiang LI supervised the research, identified suitable datasets, and revised and finalized the paper.

Corresponding author

Correspondence to Guoqiang Li (李国强).

Ethics declarations

Junfang JIA and Guoqiang LI declare that they have no conflict of interest.

Additional information

Project supported by the Shanxi Datong University Project (No. 2012k6) and the Shanxi Datong University Educational Reform Project (No. xjg2015202).

About this article

Cite this article

Jia, J., Li, G. Learning natural ordering of tags in domain-specific Q&A sites. Front Inform Technol Electron Eng 22, 170–184 (2021). https://doi.org/10.1631/FITEE.1900645

  • DOI: https://doi.org/10.1631/FITEE.1900645
