Automatic Document Topic Identification Using Social Knowledge Network

  • Reference work entry
Encyclopedia of Social Network Analysis and Mining

Synonyms

Automatic document topic identification; Clustering; Ontology; Social knowledge network; Wikipedia

Glossary

ADTI:

Stands for automatic document topic identification

Ontology:

“A model for describing the world, that consists of a set of types (concepts), properties, and relationship types” (Garshol 2004)

SKN:

Stands for social knowledge network

WHO:

Stands for Wikipedia Hierarchical Ontology

TF-IDF:

A term weighting methodology that is commonly used in text mining and in information retrieval. It stands for term frequency-inverse document frequency
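
The weighting idea can be illustrated with a minimal sketch. It assumes one common textbook variant (term frequency as count divided by document length, inverse document frequency as ln(N/df)); the entry itself does not specify which variant is used, and the function name `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant (not taken from the entry):
    tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, already tokenized.
docs = [
    "wikipedia ontology topic".split(),
    "topic clustering document".split(),
    "document topic identification".split(),
]
weights = tf_idf(docs)
```

Under this variant, a term that occurs in every document (here "topic") receives weight zero, while a term confined to a single document (here "wikipedia") receives the highest weight in that document.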

hi5:

An online social networking website

RDF:

Stands for Resource Description Framework. It is a method of representing information to facilitate data interchange on the Web

ASR:

Stands for automatic speech recognition

NMI:

Stands for normalized mutual information. It is a well-known document clustering performance measure
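
As a rough illustration of the measure, the sketch below computes NMI between a reference labeling and a clustering. It assumes the normalization by the geometric mean of the two entropies, which is one of several conventions in the literature; the entry does not state which one is intended, and the function name `nmi` is hypothetical.

```python
import math
from collections import Counter


def nmi(labels_true, labels_pred):
    """NMI = I(U;V) / sqrt(H(U) * H(V)), using natural logarithms.

    The geometric-mean normalization is an assumption; other
    definitions divide by the arithmetic mean or the maximum entropy.
    """
    n = len(labels_true)
    count_u = Counter(labels_true)
    count_v = Counter(labels_pred)
    count_uv = Counter(zip(labels_true, labels_pred))
    # Mutual information between the two labelings.
    mi = sum(
        (c / n) * math.log(c * n / (count_u[u] * count_v[v]))
        for (u, v), c in count_uv.items()
    )
    # Entropy of each labeling.
    h_u = -sum((c / n) * math.log(c / n) for c in count_u.values())
    h_v = -sum((c / n) * math.log(c / n) for c in count_v.values())
    if h_u == 0.0 or h_v == 0.0:
        return 0.0  # convention chosen here for a single-cluster labeling
    return mi / math.sqrt(h_u * h_v)


same_partition = nmi([0, 0, 1, 1], [1, 1, 0, 0])   # clusters match up to renaming
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])      # clustering ignores the classes
```

The measure does not depend on how clusters are named, so a clustering that reproduces the reference partition with swapped labels still scores (up to rounding) 1, while a clustering that is statistically independent of the reference scores 0.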

NMF:

Stands for nonnegative matrix factorization. Nonnegative matrix factorization is a family of algorithms...

References

  • Auer S, Lehmann J (2007) What have Innsbruck and Leipzig in common? Extracting semantics from wiki content. In: Franconi E, Kifer M, May W (eds) The semantic web: research and applications. Springer, Berlin/New York, pp 503–517

  • Coursey K, Mihalcea R (2009) Topic identification using Wikipedia graph centrality. In: Proceedings of human language technologies: the 2009 annual conference of the North American chapter of the Association for Computational Linguistics, companion volume: short papers, Association for Computational Linguistics, Boulder, pp 117–120

  • Coursey K, Mihalcea R, Moen W (2009) Using encyclopedic knowledge for automatic topic identification. In: Proceedings of the thirteenth conference on computational natural language learning, Association for Computational Linguistics, Boulder, pp 210–218

  • European Travel Commission (2013) Social networking and UGC. http://www.newmediatrendwatch.com/world-overview/137-social-networking-and-ugc, June 2013. Online; Accessed 25 Oct 2013

  • Garshol L (2004) Metadata? Thesauri? Taxonomies? Topic maps! Making sense of it all. J Inf Sci 30(4):378

  • Giles J (2005) Internet encyclopaedias go head to head. Nature 438(7070):900–901

  • Hassan M (2013) Automatic document topic identification using hierarchical ontology extracted from human background knowledge. PhD dissertation, University of Waterloo

  • Huynh D, Cao T, Pham P, Hoang T (2009) Using hyperlink texts to improve quality of identifying document topics based on Wikipedia. In: International conference on knowledge and systems engineering, 2009 (KSE’09), IEEE, Hanoi, pp 249–254

  • Janik M, Kochut K (2008a) Training-less ontology-based text categorization. In: Workshop on exploiting semantic annotations in information retrieval (ESAIR 2008) at the 30th European conference on information retrieval, ECIR

  • Janik M, Kochut K (2008b) Wikipedia in action: ontological knowledge in text categorization. In: IEEE international conference on semantic computing, 2008, IEEE, Santa Clara, pp 268–275

  • Korfiatis NT, Poulos M, Bokos G (2006) Evaluating authoritative sources using social networks: an insight from Wikipedia. Online Inf Rev 30(3):252–262

  • Kuhn HW (2005) The Hungarian method for the assignment problem. Nav Res Logist 52(1):7–21

  • Lloyd S (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28(2):129–137

  • Medelyan O (2009) Human-competitive automatic topic indexing. PhD dissertation, The University of Waikato

  • Medelyan O, Witten I, Milne D (2008) Topic indexing with Wikipedia. In: Proceedings of AAAI workshop on Wikipedia and artificial intelligence: an evolving synergy, AAAI, Chicago, pp 19–24

  • Ng A, Jordan M, Weiss Y et al (2002) On spectral clustering: analysis and an algorithm. Adv Neural Inf Process Syst 2:849–856

  • Popescul A, Ungar LH (2000) Automatic labeling of document clusters. http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.33.141&rep=rep1&type=pdf

  • Schönhofen P (2009) Identifying document topics using the Wikipedia category network. Web Intell Agent Syst 7(2):195–207

  • Xu W, Gong Y (2004) Document clustering by concept factorization. In: Proceedings of the 27th annual international ACM SIGIR conference on research and development in information retrieval, ACM, Sheffield, pp 202–209

  • Xu W, Liu X, Gong Y (2003) Document clustering based on non-negative matrix factorization. In: Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval, ACM, Toronto, pp 267–273

  • Zhao Y, Karypis G, Fayyad U (2005) Hierarchical clustering algorithms for document datasets. Data Min Knowl Discov 10(2):141–168

Author information

Corresponding author

Correspondence to Mostafa M. Hassan.

Copyright information

© 2018 Springer Science+Business Media LLC, part of Springer Nature

About this entry

Cite this entry

Hassan, M.M., Karray, F., Kamel, M.S. (2018). Automatic Document Topic Identification Using Social Knowledge Network. In: Alhajj, R., Rokne, J. (eds) Encyclopedia of Social Network Analysis and Mining. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-7131-2_352
