Abstract
With the growth of Web information, traditional search engines, which are built on text-based search technology, are unable to meet users' demands for Web search. Because many queries are time-related and most Web pages contain time information, developing time-aware Web search engines has become an important issue. Based on this view, in this paper we study indexing mechanisms for the temporal information in Web pages. Our work is based on the assumption that each Web page has only one primary time, which is used in time-based Web search. We present a new index structure called the BT+-tree, which is based on the MAP21-tree. Unlike the MAP21-tree, which uses a double-tree structure, the BT+-tree uses a single tree. Furthermore, the BT+-tree handles duplicate keys effectively, whereas the MAP21-tree gives little consideration to them. After discussing the index structure and the manipulation algorithms of the BT+-tree, we design a testing program to measure its performance. The experimental results show that the BT+-tree is effective for indexing temporal information in Web pages.
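The abstract does not give the node layout or the algorithms of the BT+-tree, so the following Python sketch is only an illustration of the two ideas it names: one ordered structure keyed by each page's primary time, and explicit handling of duplicate keys. It is not the BT+-tree itself. A sorted list of distinct keys stands in for the tree, and the names `PrimaryTimeIndex`, `insert`, and `range_query` are hypothetical, as is the choice of calendar dates as the time granularity.

```python
from bisect import bisect_left, bisect_right
from datetime import date


class PrimaryTimeIndex:
    """Illustrative stand-in for a single-structure time index.

    Assumption: each page has exactly one primary time. A sorted list of
    distinct keys replaces the tree; this is NOT the BT+-tree from the paper.
    """

    def __init__(self):
        self._keys = []     # distinct primary times, kept in sorted order
        self._buckets = {}  # primary time -> list of page ids sharing it

    def insert(self, primary_time, page_id):
        bucket = self._buckets.get(primary_time)
        if bucket is None:
            # First page with this time: add the key at its sorted position.
            position = bisect_left(self._keys, primary_time)
            self._keys.insert(position, primary_time)
            self._buckets[primary_time] = [page_id]
        else:
            # Duplicate key: append to the existing bucket, key list unchanged.
            bucket.append(page_id)

    def range_query(self, start, end):
        """Return page ids whose primary time lies in [start, end]."""
        low = bisect_left(self._keys, start)
        high = bisect_right(self._keys, end)
        pages = []
        for key in self._keys[low:high]:
            pages.extend(self._buckets[key])
        return pages


index = PrimaryTimeIndex()
index.insert(date(2010, 5, 1), "page-a")
index.insert(date(2010, 5, 1), "page-b")  # same primary time as page-a
index.insert(date(2009, 1, 15), "page-c")
pages_in_2010 = index.range_query(date(2010, 1, 1), date(2010, 12, 31))
```

Storing each distinct time once and grouping the pages that share it keeps the ordered key set small when many pages carry the same date, which is the situation the abstract points to when it mentions duplicate keys.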
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chen, H., Li, Q., Jin, P. (2010). BT+-tree: A New Index for Temporal Information in Web Pages. In: Zhang, Y., Cuzzocrea, A., Ma, J., Chung, Ki., Arslan, T., Song, X. (eds) Database Theory and Application, Bio-Science and Bio-Technology. BSBT DTA 2010 2010. Communications in Computer and Information Science, vol 118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17622-7_8
DOI: https://doi.org/10.1007/978-3-642-17622-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17621-0
Online ISBN: 978-3-642-17622-7
eBook Packages: Computer Science, Computer Science (R0)