Efficient Processing of Nearest Neighbor Queries in Parallel Multimedia Databases

  • Conference paper
Database and Expert Systems Applications (DEXA 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5181)

Abstract

This paper addresses the performance of nearest neighbor queries in voluminous multimedia databases. We propose a data allocation method that achieves \(O(\sqrt{n})\) query processing time in parallel settings. Our proposal is based on a complexity analysis of content-based retrieval when a clustering method is used. We derive a valid range of values for the number of clusters that should be obtained from the database. Then, to process nearest neighbor queries efficiently, we derive the optimal number of nodes to maximize the use of parallel resources. We validated our method through experiments with different high-dimensional databases and implemented a query processing algorithm for full k nearest neighbors in a shared-nothing cluster.
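The square-root bound can be motivated with a simplified, textbook-style cost model. The sketch below is not the paper's own analysis: it assumes balanced clusters, assumes that a query compares against every cluster centroid and then scans exactly one cluster, and ignores parallelism entirely. The function names `sequential_cost` and `best_cluster_count` are hypothetical, introduced only for illustration.

```python
import math


def sequential_cost(n: int, k: int) -> float:
    """Distance computations for one query under the assumed model:
    k centroid comparisons plus a scan of one cluster of about n / k objects."""
    return k + n / k


def best_cluster_count(n: int) -> int:
    """Brute-force the cluster count in [1, n] that minimizes the assumed cost."""
    return min(range(1, n + 1), key=lambda k: sequential_cost(n, k))


if __name__ == "__main__":
    n = 10_000
    k_star = best_cluster_count(n)
    # Compare the brute-force minimum with the analytic value 2 * sqrt(n).
    print(k_star, sequential_cost(n, k_star), 2 * math.sqrt(n))
```

Under these assumptions the cost \(k + n/k\) is minimized near \(k = \sqrt{n}\), where it equals \(2\sqrt{n}\). The valid range for the number of clusters and the optimal number of nodes that the abstract mentions come from the paper's own analysis and are not reproduced here.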

Editor information

Sourav S. Bhowmick, Josef Küng, Roland Wagner

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Manjarrez-Sanchez, J., Martinez, J., Valduriez, P. (2008). Efficient Processing of Nearest Neighbor Queries in Parallel Multimedia Databases. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2008. Lecture Notes in Computer Science, vol 5181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85654-2_31

  • DOI: https://doi.org/10.1007/978-3-540-85654-2_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85653-5

  • Online ISBN: 978-3-540-85654-2

  • eBook Packages: Computer Science, Computer Science (R0)
