Abstract
Finding useful information on the Internet remains a real challenge for information retrieval (IR) and search engines (SE). In this paper, we propose and construct the Path Trust Knowledge Graph (PTKG) model for assigning priority values to unvisited web pages. For a given user-specific topic t, the PTKG is built in five parts: (1) constructing the context graph \(G(t)=(V, E)\), where V is the set of previously crawled web pages and E contains the hyperlinks among them; (2) retrieving the knowledge implied in the paths among these web pages and computing the path lengths; (3) building the trust degrees among the web pages; (4) constructing a topic-specific language model and a general language model using the trust degrees; (5) assigning priority values to the web pages in order to rank them. Finally, we experimentally compare the proposed PTKG approach with the classic LCG and RCG approaches. The results show that our method outperforms both LCG and RCG.
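To make the pipeline above concrete, the following is a minimal sketch of parts (1), (2), (3) and (5) on a toy graph. It is not the formulation used in the paper: the abstract gives no formulas, so the exponential trust decay (`decay ** path_length`), the summation of trust over relevant pages, and all function names are assumptions chosen only for illustration. Part (4), the language models, is omitted.

```python
from collections import deque


def path_lengths(graph, source):
    """Shortest hyperlink-path length from `source` to each reachable page (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        page = queue.popleft()
        for nxt in graph.get(page, ()):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return dist


def trust_degrees(graph, source, decay=0.5):
    """Assumed trust measure: decay ** path_length (hypothetical, not the paper's)."""
    return {page: decay ** d for page, d in path_lengths(graph, source).items()}


def rank_unvisited(graph, relevant_pages, visited, decay=0.5):
    """Score each unvisited page by the trust it accumulates from relevant pages."""
    scores = {}
    for seed in relevant_pages:
        for page, trust in trust_degrees(graph, seed, decay).items():
            if page not in visited:
                scores[page] = scores.get(page, 0.0) + trust
    # Highest priority first; ties are broken by page name for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy context graph: keys are crawled pages, values are their outgoing links.
graph = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "e"],
    "d": [],
    "e": [],
}
ranking = rank_unvisited(graph, relevant_pages=["a", "b"], visited={"a", "b", "c"})
print(ranking)
```

In this toy setting, page `d` is reachable from both relevant pages and over a shorter path from `b`, so it receives a higher priority than `e`.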
Y. Du: This project is supported by the National Natural Science Foundation of China (Nos. 61271413, 61472329).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Du, Y., Hu, Q., Li, X., Chen, X., Li, C. (2015). Ranking Web Page with Path Trust Knowledge Graph. In: He, X., et al. Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques. IScIDE 2015. Lecture Notes in Computer Science, vol 9243. Springer, Cham. https://doi.org/10.1007/978-3-319-23862-3_7
DOI: https://doi.org/10.1007/978-3-319-23862-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23861-6
Online ISBN: 978-3-319-23862-3
eBook Packages: Computer Science, Computer Science (R0)