Elsevier

Information Systems

Volume 33, Issue 1, March 2008, Pages 18-35
Effective protocols for kNN search on broadcast multi-dimensional index trees

https://doi.org/10.1016/j.is.2007.04.002

Abstract

In a wireless mobile environment, data broadcasting provides an efficient way to disseminate data. Via data broadcasting, a server can provide location-based services to a large client population. Among different location-based services, the k nearest neighbors (kNN) search is important and is used to find the k closest objects to a given point. However, the kNN search in a broadcast environment is particularly challenging because the data on a broadcast channel can only be accessed sequentially. We propose protocols for the kNN search on a broadcast R-tree, a popular multi-dimensional index tree, that are efficient in terms of latency, tuning time, and memory usage. We investigate how a server schedules the broadcast and provide the corresponding kNN search algorithms at the mobile clients. One of our kNN search protocols further allows a kNN search to start at an arbitrary time instance, so the client can skip the wait for the beginning of a broadcast cycle, thereby reducing the latency. The experimental results validate that our mechanisms achieve these objectives.

Introduction

Advanced technologies in communications, positioning systems, and networking make it possible for mobile clients to ubiquitously access different kinds of information services, such as traffic information, stock-price information, and electronic news. However, the bandwidth in such a wireless mobile environment is asymmetric [1], [2], [3]. In other words, the downlink (server-to-client) bandwidth is much greater than the uplink (client-to-server) bandwidth. The conventional client–server model is hence a poor match for the wireless mobile environment when the group of mobile clients is large, because the uplink becomes a bottleneck. Instead, data broadcasting provides an effective approach for a server to disseminate data to a large pool of clients.

Via data broadcasting, a mobile client can access the information and execute a query by tuning into the broadcast. A mobile client executing a query experiences a latency, which is the time elapsed between the issuing and the termination of the query, and a tuning time, which is the amount of time spent listening to the broadcast. The latency indicates the Quality of Service (QoS) provided by the system and the tuning time represents the power consumption of mobile clients. In general, when the broadcast consists of only data, these two cost measures are equivalent. Broadcasting data with an index structure was introduced [4], [5] in order to further alleviate energy consumption. The index allows a mobile client to selectively tune into a broadcast according to the index [6], [7], [8], [9], [10], [11], [12], [13], [14]. This leads to a reduction in tuning time and therefore distinguishes these two cost measures. Moreover, since the memory on a mobile device is also limited, the amount of memory used when a mobile client executes a query by listening to the broadcast should be considered [6], [8].
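
As a rough illustration of the two cost measures, the sketch below counts both in packets. It assumes a hypothetical broadcast whose packets occupy numbered slots; the function name `cost_measures` and the slot lists are illustrative and do not come from the paper.

```python
def cost_measures(issue_slot, listened_slots):
    """Return (latency, tuning_time), both counted in packets.

    issue_slot: slot at which the query is issued (assumed unit).
    listened_slots: slots the client actually listens to.
    """
    slots = set(listened_slots)
    if not slots:
        return 0, 0
    # Latency spans from issuing the query to the last packet read.
    latency = max(slots) - issue_slot + 1
    # Tuning time counts only the packets the client was awake for.
    tuning_time = len(slots)
    return latency, tuning_time


# Data-only broadcast: the client must listen to every packet.
data_only = cost_measures(0, range(10))
# Indexed broadcast: the client dozes between the packets it needs.
indexed = cost_measures(0, [0, 3, 9])
```

In this toy case both clients finish at the same slot, so the latency is identical, while the indexed client listens to far fewer packets.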

The k nearest neighbors (kNN) search is one of the important and classic problems in computer science. Given a query point p, the kNN search is to find the k closest objects to p in a multi-dimensional space. Using the kNN search, a mobile client can issue a kNN query such as “please give me the 5 nearest hotels” or “please find the 3 nearest gas stations”. The shaded nodes in Fig. 1(a) are the 3NN of the query point p. In [15], the authors proposed a kNN search algorithm using an R-tree [16] as the index structure. However, such an algorithm does not fit well in a wireless broadcast environment due to the sequential access nature of the broadcast. By adapting the algorithm, Gedik et al. [17] provided a kNN search algorithm on a broadcast R-tree.
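
To make the problem statement concrete, here is a minimal brute-force sketch of a kNN query under Euclidean distance. It scans every object and uses no index, so it is only a reference definition and not one of the algorithms discussed in this paper; the name `knn_brute_force` and the sample coordinates are assumptions.

```python
import heapq
import math


def knn_brute_force(p, objects, k):
    """Return the k objects closest to query point p, nearest first."""
    return heapq.nsmallest(k, objects, key=lambda o: math.dist(p, o))


# Hypothetical hotel locations for a "3 nearest hotels" query.
hotels = [(0, 0), (5, 5), (1, 1), (2, 2), (10, 0)]
nearest = knn_brute_force((0, 0), hotels, 3)
```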

This paper considers the kNN search on a broadcast index tree that is an R-tree or one of its variations in wireless mobile environments. We use an R-tree as the index structure for the following reasons. First, the R-tree has been well studied and recognized as an efficient index structure [18]. Second, many types of queries have been studied on broadcast R-trees in recent years, including point query and range query [6], [7], [8], [11]. This allows us to avoid designing a new index structure, for which all kinds of queries should be reconsidered. Our work considers one broadcast R-tree for different kinds of queries.

In order to simultaneously optimize the latency, tuning time, and memory usage at the client, we investigate how a server schedules the index tree for broadcast and how a client processes a query against the corresponding broadcast. Almost all previous papers on the kNN search in a broadcast environment assumed that the execution of a search starts from the root (i.e., the beginning of a broadcast cycle). Such a mechanism forces the clients to wait for the beginning of the broadcast cycle before starting the search, and therefore leads to a longer latency. In this paper, we further provide a mechanism that allows the execution of a kNN search to start at an arbitrary time instance, with each search execution completing within one broadcast cycle length.

In the rest of this paper, we first overview the related work and present the preliminaries in Sections 2 and 3, respectively. In Section 4, we present and discuss the broadcast schedules. Scheduling the index tree for broadcast involves determining the order in which the index nodes are sent out and adding extra entries to the index nodes to improve performance. A mobile client tunes into the broadcast and operates independently according to the broadcast schedule. An algorithm for executing a kNN search on the corresponding broadcast R-tree is proposed and analyzed in Section 5. In Section 6, by adapting our technique from Section 5, we propose a mechanism that allows a kNN search to start at any time instance and thus reduces the latency further. The experimental evaluation is discussed in Section 7. We use R*-trees [19] as the index trees on point data sets and compare our approach on the tuning time, latency, and memory usage with the mechanisms in [17]. Our experimental results show that our proposed algorithms achieve a shorter latency and smaller tuning time with less memory usage on the mobile client side. Section 8 concludes this paper.

Section snippets

Related work and background

R-trees and their variations are widely used to index multi-dimensional points or rectangles [18], [19]. Each index node of an R-tree is represented by a minimum bounding rectangle (MBR) that encloses the MBRs of its children, and it stores information about those children, including their MBRs. A leaf node in an R-tree contains only the MBRs of data objects. Fig. 1 shows a 16-node R-tree and the corresponding MBRs of the index nodes. The R-tree has been extensively studied and
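
The enclosing property of an MBR can be sketched in a few lines. The example assumes two-dimensional rectangles stored as `(xmin, ymin, xmax, ymax)` tuples; this representation and the name `enclosing_mbr` are illustrative choices, not taken from the paper.

```python
def enclosing_mbr(rects):
    """Return the smallest axis-aligned rectangle covering all rects.

    Each rectangle is (xmin, ymin, xmax, ymax); a point (x, y) can be
    passed as the degenerate rectangle (x, y, x, y).
    """
    xmins, ymins, xmaxs, ymaxs = zip(*rects)
    return (min(xmins), min(ymins), max(xmaxs), max(ymaxs))


# MBR of an index node with two children.
parent_mbr = enclosing_mbr([(0, 0, 2, 2), (1, 1, 4, 3)])
```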

Preliminaries and assumptions

The first NN search algorithm on an R-tree [15] followed the branch-and-bound algorithmic design pattern and could be generalized to kNN search (k>1). The algorithm searches the R-tree in a depth-first fashion and prunes the irrelevant nodes using two distance metrics, mindist and minmaxdist, defined below. Suppose a query point p is given. Then, for a node v,

  • mindist(v) is the minimum distance from p to v's MBR; and

  • minmaxdist(v) is the minimum distance of the maximum distances from p to each
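
The first metric, mindist, can be sketched as follows for the two-dimensional case. The sketch assumes Euclidean distance and an MBR stored as `(xmin, ymin, xmax, ymax)`; both are assumptions made for illustration. Only mindist is shown, since it is the metric fully stated above.

```python
import math


def mindist(p, mbr):
    """Minimum Euclidean distance from point p to an axis-aligned MBR.

    Returns 0.0 when p lies inside or on the boundary of the MBR.
    """
    px, py = p
    xmin, ymin, xmax, ymax = mbr
    # Per-axis gap between p and the rectangle; zero if p is within range.
    dx = max(xmin - px, 0.0, px - xmax)
    dy = max(ymin - py, 0.0, py - ymax)
    return math.hypot(dx, dy)
```

A node whose mindist exceeds the distance to the current k-th best candidate cannot contain a closer object, which is what makes the metric usable for pruning.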

Data broadcast scheduling

There are two aspects which should be considered when designing a data broadcasting protocol in a wireless environment. One is the broadcast schedule at the server and the other is the corresponding query process at the mobile clients. In this section, we discuss the broadcast schedules at the server. A broadcast consists of a sequence of packets (nodes). The schedules based on the breadth-first traversal (BFS) can achieve a better tuning time for kNN search [17], but they result in a large
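
To illustrate what a traversal-based schedule looks like, the sketch below linearizes a small index tree into a packet sequence in breadth-first order and, for contrast, in plain preorder depth-first order. The tree, the node names, and both function names are hypothetical; the depth-first variant is a generic preorder and is not claimed to match any specific schedule evaluated in the paper.

```python
from collections import deque


def bfs_order(tree, root):
    """Broadcast order that sends the tree level by level."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree.get(node, []))
    return order


def dfs_order(tree, root):
    """Broadcast order that sends each subtree right after its root."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children reversed so the leftmost child is sent first.
        stack.extend(reversed(tree.get(node, [])))
    return order


# Hypothetical index tree: node -> list of children.
tree = {"r": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
bfs_schedule = bfs_order(tree, "r")
dfs_schedule = dfs_order(tree, "r")
```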

Exact kNN search algorithm

This section introduces our exact kNN search algorithm, w-disk, on a broadcast R-tree. Algorithm w-disk follows the branch-and-bound algorithmic design pattern and starts from the beginning of the broadcast cycle (i.e., the root of the broadcast R-tree). We will show that algorithm w-disk can find the exact kNN efficiently and analyze the time complexity for exploring a node in a broadcast R-tree.
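
The branch-and-bound pattern on a sequential broadcast can be sketched generically as below. This is not algorithm w-disk: it uses only mindist pruning against the current k-th best distance and omits the additional entries the paper adds to index nodes. The packet layout (dictionaries with `id`, `children`, and `points` keys), the assumption that every node is broadcast after its parent, and all function names are illustrative.

```python
import heapq
import math


def mindist(p, mbr):
    """Minimum distance from p to rectangle (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = mbr
    dx = max(xmin - p[0], 0.0, p[0] - xmax)
    dy = max(ymin - p[1], 0.0, p[1] - ymax)
    return math.hypot(dx, dy)


def knn_on_broadcast(packets, p, k):
    """Scan packets in broadcast order; return (nearest points, packets read)."""
    wanted = {packets[0]["id"]: None}  # the root comes first and is always read
    best = []                          # max-heap of (-distance, point), size <= k
    tuned = 0
    for pkt in packets:
        if pkt["id"] not in wanted:
            continue                   # doze: no ancestor pointed at this node
        bound = -best[0][0] if len(best) == k else math.inf
        mbr = wanted[pkt["id"]]
        if mbr is not None and mindist(p, mbr) > bound:
            continue                   # pruned: cannot hold a closer object
        tuned += 1
        if "points" in pkt:            # leaf node: update the candidates
            for q in pkt["points"]:
                d = math.dist(p, q)
                if len(best) < k:
                    heapq.heappush(best, (-d, q))
                elif d < -best[0][0]:
                    heapq.heapreplace(best, (-d, q))
        else:                          # index node: remember its children
            for child_id, child_mbr in pkt["children"]:
                wanted[child_id] = child_mbr
    nearest = [q for _, q in sorted(best, reverse=True)]
    return nearest, tuned


# Hypothetical three-packet broadcast: a root and two leaves.
packets = [
    {"id": "r", "children": [("A", (0, 0, 2, 2)), ("B", (10, 10, 12, 12))]},
    {"id": "A", "points": [(0, 0), (1, 1), (2, 2)]},
    {"id": "B", "points": [(10, 10), (12, 12)]},
]
result, packets_read = knn_on_broadcast(packets, (0, 0), 2)
```

In this toy run the client reads the root and leaf A, then skips leaf B because its mindist already exceeds the distance to the second-best candidate.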

Suppose that a kNN query is issued at query point p. In order to prune the nodes irrelevant to the

Search in the middle of a broadcast cycle

Most of the related papers have assumed that the query process always starts with the root of the broadcast R-tree (i.e., at the beginning of broadcast cycle). This assumption increases the latency because clients must wait for the root to appear in the broadcast after the queries are issued. In order to avoid waiting for the root in the broadcast, in this section we adapt algorithm w-disk and propose a kNN search mechanism, w-disk*, which allows (1) the kNN search process to start right after

Experimental results

In this section, we present our experimental results. We first compare our kNN search algorithm w-disk with the revised conventional approach w-conv and the improved algorithm w-opt from [17]. The cost measures include the tuning time, latency, and memory usage. As mentioned in Section 2, we use the node-based metric where the number of nodes is counted in the experiments. Then, we discuss the impact on the performance resulting from different broadcast schedules. In addition to pDFS and wDFS,

Conclusions

We considered effective protocols for kNN search on broadcasted multi-dimensional index trees, focusing on the search algorithms executed by the clients and the broadcast schedules generated by the server. By adding extra entries, called l-entries, to the broadcasted R-tree nodes, our proposed kNN search algorithm, called w-disk, can find pruning circles that let it avoid exploring irrelevant nodes during the search, which yields good performance in terms of tuning time. We also

Acknowledgments

We thank the referees for numerous constructive comments and suggestions.

References (32)

  • S. Acharya, R. Alonso, M. Franklin, S. Zdonik, Broadcast disks: data management for asymmetric communication...
  • S. Acharya, M. Franklin, S. Zdonik, Balancing push and pull for data broadcast, in: Proceedings of the 1997 ACM SIGMOD...
  • J. Jing et al., Client–server computing in mobile environments, ACM Comput. Surv. (1999)
  • T. Imielinski, S. Viswanathan, B.R. Badrinath, Energy efficient indexing on air, in: Proceedings of the 1994 ACM SIGMOD...
  • N. Shivakumar et al., Efficient indexing for broadcast based wireless systems, Mobile Networks Appl. (1996)
  • S. Hambrusch et al., Efficient query execution on broadcasted index tree structures, Data Knowledge Eng. (2007)
  • S. Hambrusch et al., Broadcasting and querying multi-dimensional index trees in a multi-channel environment, Inf. Syst. (2006)
  • S.E. Hambrusch, C.-M. Liu, W.G. Aref, S. Prabhakar, Query processing in broadcasted spatial index trees, in:...
  • T. Imieliński et al., Data on air: organization and access, IEEE Trans. Knowledge Data Eng. (1997)
  • S. Khanna et al., On indexed data broadcast, J. Comput. Syst. Sci. (2000)
  • C.-M. Liu, Broadcasting and blocking large data sets with an index tree, Ph.D. Thesis, Purdue University, West...
  • J.X. Yu et al., An analysis of selective tuning schemes for nonuniform broadcast, Data Knowledge Eng. (1997)
  • J. Zhang, L. Gruenwald, Optimizing data placement over wireless broadcast channel for multi-dimensional range query...
  • B. Zheng et al., Spatial queries in wireless broadcast systems, Wireless Networks (2004)
  • N. Roussopoulos, S. Kelley, F. Vincent, Nearest neighbor queries, in: Proceedings of the 1995 ACM SIGMOD International...
  • A. Guttman, R-trees: a dynamic index structure for spatial searching, in: Proceedings of the 1984 ACM SIGMOD...