Heuristics for Semantic Path Search in Wikipedia

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8584)

Abstract

This paper presents an approach based on Heuristic Semantic Walk (HSW), in which semantic proximity measures among concepts are used as heuristics to guide the concept chain search in the collaborative network of Wikipedia, encoding problem-specific knowledge in a problem-independent way. Collaborative information and multimedia repositories on the Web are a domain of increasing relevance, since users cooperatively add tags, labels, comments and hyperlinks to objects, and these annotations reflect the semantic relationships among the objects, with or without an underlying structure. As with so-called Big Data, path finding in collaborative web repositories must cope with the large size, high connectivity degree and dynamic evolution of online networks, which make classical search approaches ineffective. Experiments conducted on a range of semantic measures show that HSW leads to better results than state-of-the-art search methods, and point out the relevant features of suitable proximity measures for the Wikipedia concept network. The extracted semantic paths have many relevant applications, such as query expansion, synthesis of explanatory arguments, and simulation of user navigation.
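
The idea of using a proximity measure as a search heuristic can be illustrated with a minimal sketch. The code below is not the HSW algorithm of the paper, whose details are not given in the abstract; it is a generic greedy best-first search over a small hand-made link graph, assuming a toy proximity measure (Jaccard distance between link sets). The names TOY_LINKS, jaccard_distance and heuristic_path are hypothetical and introduced only for illustration.

```python
import heapq
from typing import Callable, Dict, List, Optional, Set

Graph = Dict[str, Set[str]]

# Hypothetical toy link graph standing in for Wikipedia's hyperlink network.
TOY_LINKS: Graph = {
    "Cat": {"Mammal", "Pet"},
    "Pet": {"Dog", "Cat"},
    "Mammal": {"Animal", "Dog"},
    "Dog": {"Mammal", "Pet", "Wolf"},
    "Wolf": {"Mammal", "Dog"},
    "Animal": {"Mammal"},
    "Rock": set(),  # isolated concept: nothing links to it
}


def jaccard_distance(graph: Graph, a: str, b: str) -> float:
    """Assumed stand-in proximity: 1 - Jaccard overlap of the two link sets.

    Each concept is added to its own link set so the value is always defined.
    """
    links_a = graph.get(a, set()) | {a}
    links_b = graph.get(b, set()) | {b}
    return 1.0 - len(links_a & links_b) / len(links_a | links_b)


def heuristic_path(
    graph: Graph,
    start: str,
    goal: str,
    distance: Callable[[str, str], float],
) -> Optional[List[str]]:
    """Greedy best-first search: always expand the frontier concept that the
    proximity measure rates closest to the goal. Returns a chain of concepts
    from start to goal, or None if the goal is unreachable."""
    frontier = [(distance(start, goal), start, [start])]
    visited = {start}
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        for nxt in sorted(graph.get(node, ())):
            if nxt not in visited:
                visited.add(nxt)
                heapq.heappush(frontier, (distance(nxt, goal), nxt, path + [nxt]))
    return None


def toy_distance(a: str, b: str) -> float:
    # Bind the toy graph so the search only sees a two-argument measure.
    return jaccard_distance(TOY_LINKS, a, b)


print(heuristic_path(TOY_LINKS, "Cat", "Wolf", toy_distance))
```

Because the proximity measure is passed in as a plain function, any other measure (for instance one based on search-engine co-occurrence counts) could be swapped in without changing the search routine, which is the sense in which problem-specific knowledge is kept separate from the search procedure in this sketch.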

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Franzoni, V., Mencacci, M., Mengoni, P., Milani, A. (2014). Heuristics for Semantic Path Search in Wikipedia. In: Murgante, B., et al. Computational Science and Its Applications – ICCSA 2014. ICCSA 2014. Lecture Notes in Computer Science, vol 8584. Springer, Cham. https://doi.org/10.1007/978-3-319-09153-2_25

  • DOI: https://doi.org/10.1007/978-3-319-09153-2_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09152-5

  • Online ISBN: 978-3-319-09153-2

  • eBook Packages: Computer Science, Computer Science (R0)
