Applying Latent Semantic Analysis to Optimize Second-order Co-occurrence Vectors for Semantic Relatedness Measurement

  • Conference paper
Mining Intelligence and Knowledge Exploration

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8284)

Abstract

Measures of semantic relatedness are widely applicable to intelligent tasks in NLP and bioinformatics. This paper attempts to improve the Second-order Co-occurrence Vector semantic relatedness measure so that it estimates the relatedness between two given concepts more effectively. Typically, after constructing concept definitions (glosses) from a thesaurus, this measure takes the cosine of the angle between the concepts’ gloss vectors as the degree of relatedness. However, the computed gloss vectors are noisy and rather large, which hinders the expected performance of the measure. By employing latent semantic analysis (LSA), we eliminate insignificant features to some degree and generate more economical gloss vectors. Applying both approaches to the biomedical domain, using MEDLINE as the corpus, UMLS as the thesaurus, and a reference standard of biomedical concept pairs manually rated for relatedness, we show that LSA has a positive impact on both performance and efficiency.
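
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy word-by-feature counts, the glosses, the helper names (`gloss_vector`, `lsa_project`, `cosine`) and the choice of k = 2 are all hypothetical, and LSA is assumed here to mean a truncated SVD of the first-order co-occurrence matrix. A gloss vector is built by summing the co-occurrence rows of the words in a concept's gloss, and relatedness is the cosine between two gloss vectors, computed once in the full space and once in the reduced space.

```python
import numpy as np

def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

def gloss_vector(gloss_words, vocab, word_vectors):
    """Second-order vector: sum of the first-order vectors of the gloss words."""
    rows = [word_vectors[vocab[w]] for w in gloss_words if w in vocab]
    return np.sum(rows, axis=0) if rows else np.zeros(word_vectors.shape[1])

def lsa_project(cooc, k):
    """Truncated SVD: keep the k strongest latent dimensions of each word row."""
    U, s, _ = np.linalg.svd(cooc, full_matrices=False)
    return U[:, :k] * s[:k]

# Hypothetical toy data: rows are vocabulary words, columns are context
# features. A real setup would count co-occurrences in a corpus such as MEDLINE.
vocab = {"heart": 0, "cardiac": 1, "muscle": 2, "bone": 3, "fracture": 4}
cooc = np.array([
    [9, 7, 3, 0, 0],
    [8, 6, 2, 0, 1],
    [4, 2, 6, 1, 1],
    [0, 0, 2, 9, 7],
    [0, 1, 1, 8, 5],
], dtype=float)

# Hypothetical glosses; a real setup would take definitions from a thesaurus.
glosses = {
    "myocardium": ["heart", "muscle"],
    "cardiomyopathy": ["cardiac", "muscle"],
    "osteoporosis": ["bone", "fracture"],
}

# Gloss vectors in the full feature space and in the reduced LSA space.
full = {c: gloss_vector(g, vocab, cooc) for c, g in glosses.items()}
reduced_words = lsa_project(cooc, k=2)
reduced = {c: gloss_vector(g, vocab, reduced_words) for c, g in glosses.items()}

for name, space in (("full", full), ("LSA k=2", reduced)):
    related = cosine(space["myocardium"], space["cardiomyopathy"])
    unrelated = cosine(space["myocardium"], space["osteoporosis"])
    print(f"{name}: related={related:.3f} unrelated={unrelated:.3f}")
```

In this toy setting the reduced vectors have two dimensions instead of five, which is the kind of saving the abstract refers to as more economical gloss vectors; whether relatedness scores improve on real data depends on the corpus and on the number of dimensions kept.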

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Pesaranghader, A., Pesaranghader, A., Rezaei, A. (2013). Applying Latent Semantic Analysis to Optimize Second-order Co-occurrence Vectors for Semantic Relatedness Measurement. In: Prasath, R., Kathirvalavakumar, T. (eds) Mining Intelligence and Knowledge Exploration. Lecture Notes in Computer Science, vol 8284. Springer, Cham. https://doi.org/10.1007/978-3-319-03844-5_58

  • DOI: https://doi.org/10.1007/978-3-319-03844-5_58

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-03843-8

  • Online ISBN: 978-3-319-03844-5

  • eBook Packages: Computer Science, Computer Science (R0)
