Abstract
Folksonomies, often known as tagging systems and used on popular websites such as Delicious and Flickr, rely on a very simple Knowledge Organisation System. Users have thus been quick to adopt them and to create extensive annotations on the Web. However, because the folksonomy model is so simple, the semantics of the tags are not explicit and can only be inferred from their context of use. This is a barrier to the automatic use of such Knowledge Organisation Systems by computers, and new techniques have been developed to extract the semantics of the tags. In this article we discuss the drawbacks of some of these approaches and propose a generalization of the different approaches to detecting new senses of terms in a folksonomy. Another weak point of the current state of the art is the lack of a formal evaluation methodology; we therefore propose a novel evaluation framework, introducing a dataset and an evaluation methodology that enable the comparison of results between different approaches to sense induction in folksonomies. Finally, we discuss the performance of different approaches to the tasks of homonymous/polysemous tag detection and synonym identification.

Notes
Which is a kind of cheese.
Which are provided by thesaurus such as WordNet.
Linked Open Data cloud diagram, 19 September 2011, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/.
Or \(c_i\), as they are the same.
Acknowledgments
This work has been supported by the INSEMTIVES project (FP7-231181, see http://www.insemtives.eu). The authors would like to thank Ilya Zaihrayeu for his valuable contributions and feedback on this work. The dataset annotation system and evaluation framework code are freely available at http://sourceforge.net/projects/tags2con/.
Cite this article
Andrews, P., Pane, J. Sense induction in folksonomies: a review. Artif Intell Rev 40, 147–174 (2013). https://doi.org/10.1007/s10462-012-9382-7
DOI: https://doi.org/10.1007/s10462-012-9382-7