Ranking Web Page with Path Trust Knowledge Graph

  • Conference paper
Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques (IScIDE 2015)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9243)

Abstract

Finding and discovering useful information on the Internet is a real challenge for information retrieval (IR) and search engines (SE). In this paper, we propose and construct the Path Trust Knowledge Graph (PTKG) model for assigning priority values to unvisited web pages. For a given user-specific topic t, building its PTKG involves five parts: (1) constructing the context graph \(G(t)=(V, E)\), where V is the set of previously crawled web pages and E includes the set of hyperlinks among those pages; (2) retrieving the knowledge implied in the paths among these web pages and finding the path lengths; (3) building the trust degrees among the web pages; (4) constructing a topic-specific language model and a general language model using the trust degrees; and (5) assigning priority values to the web pages in order to rank them. Finally, we experimentally compare the proposed PTKG approach with the classic LCG and RCG approaches. In this comparison, PTKG outperforms both LCG and RCG.
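As a rough illustration of parts (1) to (3) and (5), the sketch below builds a small context graph from hyperlink pairs, finds shortest path lengths with breadth-first search, and turns them into priority values. The abstract does not give the paper's formulas, so the `decay ** length` trust degree, the function names, and the toy link data are all assumptions made for illustration; the language models of part (4) are omitted.

```python
from collections import deque


def build_context_graph(links):
    """Build adjacency sets from (source, target) hyperlink pairs."""
    graph = {}
    for src, dst in links:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    return graph


def path_lengths(graph, start):
    """Shortest path length, in links, from start to each reachable page (BFS)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in graph[page]:
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return dist


def trust_degrees(dist, decay=0.5):
    """Assumed trust measure: decays geometrically with path length.

    This is a placeholder, not the trust degree defined in the paper.
    """
    return {page: decay ** d for page, d in dist.items()}


# Toy crawl history (hypothetical pages "a".."e").
links = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "c"), ("c", "e")]
graph = build_context_graph(links)
dist = path_lengths(graph, "a")
trust = trust_degrees(dist)

# Rank pages by descending trust, breaking ties by page name.
ranked = sorted(trust, key=lambda p: (-trust[p], p))
print(ranked)
```

Here pages closer to the seed page "a" receive higher priority; any monotonically decreasing function of path length could be substituted for the assumed geometric decay.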

Y. Du: Project supported by the National Natural Science Foundation of China (No. 61271413, 61472329).

Author information

Corresponding author

Correspondence to YaJun Du.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Du, Y., Hu, Q., Li, X., Chen, X., Li, C. (2015). Ranking Web Page with Path Trust Knowledge Graph. In: He, X., et al. Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques. IScIDE 2015. Lecture Notes in Computer Science, vol 9243. Springer, Cham. https://doi.org/10.1007/978-3-319-23862-3_7

  • DOI: https://doi.org/10.1007/978-3-319-23862-3_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23861-6

  • Online ISBN: 978-3-319-23862-3

  • eBook Packages: Computer Science, Computer Science (R0)