Abstract
This paper presents the use of the Low Memory Locality Sensitive Hashing (LMLSH) technique, operating in Euclidean space, to build a data structure for the Defense Meteorological Satellite Program (DMSP) satellite imagery database. The LMLSH technique finds satellite image matches in sublinear search time. The texture feature vectors of the images are extracted using a pyramid-structured wavelet transform coupled with a Gaussian central moment technique. These feature vectors, together with families of hash functions drawn randomly and independently from a Gaussian distribution, are used to build hash tables. Given a query, the hash tables are used to retrieve the best matches to that query in sublinear time. In our tests, the algorithm was approximately twenty-six times faster than the Linear Search (LS) algorithm. In addition, the randomized LMLSH algorithm examines only about two percent of the entire database to find the possible matches to any given query, without loss of accuracy relative to the exact best matches returned by its LS counterpart.
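The hash-table construction and lookup summarized above can be illustrated with a minimal sketch of generic Euclidean LSH based on Gaussian random projections. It is not the LMLSH implementation itself: the function names, the bucket width `w`, the number of projections per table `k`, the unit-variance Gaussian, and the way the quantized projections are folded into one of `m` bins are all assumptions made for illustration.

```python
import math
import random


def make_hash_family(d, k, w, rng):
    """Draw k functions h(p) = floor((a . p + b) / w), with a ~ N(0, 1)^d and b ~ U[0, w)."""
    return [([rng.gauss(0.0, 1.0) for _ in range(d)], rng.uniform(0.0, w))
            for _ in range(k)]


def bin_index(p, family, w, m):
    """Quantize the k projections of p and fold them into one of m bins."""
    key = tuple(
        math.floor((sum(ai * pi for ai, pi in zip(a, p)) + b) / w)
        for a, b in family
    )
    return hash(key) % m  # non-negative for positive m in Python


def build_tables(P, L, k, w, m, seed=0):
    """Build L hash tables over the feature vectors in P (hypothetical helper)."""
    rng = random.Random(seed)
    d = len(P[0])
    families = [make_hash_family(d, k, w, rng) for _ in range(L)]
    tables = [dict() for _ in range(L)]
    for idx, p in enumerate(P):
        for family, table in zip(families, tables):
            table.setdefault(bin_index(p, family, w, m), []).append(idx)
    return families, tables


def query(q, P, families, tables, w, m, top=1):
    """Collect candidates from q's bin in every table, then rank them by Euclidean distance."""
    candidates = set()
    for family, table in zip(families, tables):
        candidates.update(table.get(bin_index(q, family, w, m), []))
    ranked = sorted(candidates, key=lambda i: math.dist(q, P[i]))
    return ranked[:top], len(candidates)
```

Only the vectors that share a bin with the query in at least one table are compared exactly, which is where the saving over a linear scan comes from. The second return value of `query` is the number of candidates examined, so it can be compared against the size of the database.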
Abbreviations
- P: Data set containing the texture feature vectors of the images
- N: Number of texture feature vectors in P
- d: Dimensionality of the texture feature vector
- p: Any vector belonging to P
- ℜ^d: A d-dimensional vector space such that if p ∈ ℜ^d then p is a d-dimensional vector
- q: A query vector such that q ∈ ℜ^d
- L: Number of tables
- m: A prime number representing the number of bins per table
- N_gd(μ, σ²): A Gaussian distribution with mean μ and variance σ²
- H_Gj: A family of hash functions drawn randomly from N_gd(0, d²) for table j
- α: Load factor, i.e. expected number of texture feature vectors that project into the same bin
- β(q, R): A sphere of radius R centered at q
- δ: The probability that a true nearest neighbor is not reported for a given q
- Pr: Probability
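The definitions of N, m and α above suggest a simple relationship that the following sketch makes concrete. It assumes the vectors spread uniformly over the bins, so that the expected occupancy of a bin is α = N / m, and it picks the smallest prime m that keeps the occupancy at or below a target α. The rule and the helper names `is_prime` and `bins_per_table` are illustrative assumptions. This section does not state how m is actually chosen.

```python
import math


def is_prime(n):
    """Trial-division primality test, adequate for table sizes."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return False
        f += 2
    return True


def bins_per_table(N, alpha):
    """Smallest prime m with m >= N / alpha, so that N / m <= alpha (assumed rule)."""
    m = max(2, math.ceil(N / alpha))
    while not is_prime(m):
        m += 1
    return m
```

For example, `bins_per_table(1000, 10)` searches upward from 100 for the next prime, giving an expected occupancy slightly below the target of 10 vectors per bin.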
Acknowledgement
We are grateful to the National Oceanic and Atmospheric Administration (NOAA)/National Geophysical Data Center in Boulder, Colorado, for providing us with the DMSP satellite imagery database. This work is partially supported by the NOAA/National Center for Atmospheric Research Educational Program under Cooperative Agreement No. NA060AR4810187.
Additional information
Communicated by: H. A. Babaie
Cite this article
Buaba, R., Homaifar, A., Gebril, M. et al. Satellite image retrieval using low memory locality sensitive hashing in Euclidean space. Earth Sci Inform 4, 17–28 (2011). https://doi.org/10.1007/s12145-010-0076-x
DOI: https://doi.org/10.1007/s12145-010-0076-x