Satellite image retrieval using low memory locality sensitive hashing in Euclidean space

  • Research Article
  • Earth Science Informatics

Abstract

This paper presents the use of the Low Memory Locality Sensitive Hashing (LMLSH) technique, operating in Euclidean space, to build a data structure for the Defense Meteorological Satellite Program (DMSP) satellite imagery database. The LMLSH technique finds satellite image matches in sublinear search time. Texture feature vectors are extracted from the images using a pyramid-structured wavelet transform coupled with a Gaussian central moment technique. These feature vectors, together with families of hash functions drawn randomly and independently from a Gaussian distribution, are used to build hash tables. Given a query, the hash tables return the best matches to that query in sublinear search time. In our tests, the algorithm was approximately twenty-six times faster than the Linear Search (LS) algorithm. In addition, the LMLSH algorithm searches only about two percent of the entire database, chosen randomly, to find the possible matches to a given query, without loss of accuracy compared with the absolute best matches returned by its LS counterpart.
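The pipeline summarized in the abstract (Gaussian hash functions, several hash tables, candidate lookup for a query) can be illustrated with a minimal sketch. This is not the authors' LMLSH implementation: it uses a generic Gaussian random-projection scheme with standard normal projections, and it does not reproduce the low-memory aspect. The function names and the parameters `k` (hash functions per table) and `w` (quantization width) are assumptions introduced only for illustration.

```python
import math
import random
from collections import defaultdict


def make_hash_family(dim, k, w, rng):
    """Draw k Gaussian projection vectors and uniform offsets for one table."""
    return [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, w))
            for _ in range(k)]


def bucket_key(vec, family, w, num_bins):
    """Quantize each projection, then fold the k integers into one bin index."""
    key = 0
    for a, b in family:
        proj = sum(ai * vi for ai, vi in zip(a, vec))
        key = (key * 31 + math.floor((proj + b) / w)) % num_bins
    return key


def build_tables(data, num_tables, k, w, num_bins, seed=0):
    """Hash every feature vector into each of the num_tables tables."""
    rng = random.Random(seed)
    dim = len(data[0])
    families = [make_hash_family(dim, k, w, rng) for _ in range(num_tables)]
    tables = [defaultdict(list) for _ in range(num_tables)]
    for idx, vec in enumerate(data):
        for family, table in zip(families, tables):
            table[bucket_key(vec, family, w, num_bins)].append(idx)
    return families, tables


def query(q, data, families, tables, w, num_bins, top=1):
    """Collect vectors sharing a bin with q, then rank them by Euclidean distance."""
    candidates = set()
    for family, table in zip(families, tables):
        candidates.update(table.get(bucket_key(q, family, w, num_bins), []))
    ranked = sorted(candidates, key=lambda i: math.dist(q, data[i]))
    return ranked[:top], len(candidates)


# Hypothetical data: 200 random 8-dimensional "feature vectors".
rng = random.Random(1)
data = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
families, tables = build_tables(data, num_tables=4, k=3, w=4.0, num_bins=101)
best, examined = query(data[17], data, families, tables, w=4.0, num_bins=101)
```

Only the candidates that share a bin with the query are compared exactly, which is where the saving over a linear scan comes from; `examined` reports how many vectors were actually ranked.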

Abbreviations

  • P: Data set containing the texture feature vectors of the images
  • N: Number of texture feature vectors in P
  • d: Dimensionality of the texture feature vector
  • p: Any vector belonging to P
  • $\Re^d$: A d-dimensional vector space such that if $p \in \Re^d$ then p is a d-dimensional vector
  • q: A query vector such that $q \in \Re^d$
  • L: Number of tables
  • m: A prime number representing the number of bins per table
  • $N_{gd}(\mu, \sigma^2)$: A Gaussian distribution with mean $\mu$ and variance $\sigma^2$
  • $H_{Gj}$: A family of hash functions drawn randomly from $N_{gd}(0, d^2)$ for table j
  • α: Load factor, i.e. expected number of texture feature vectors that project into the same bin
  • β(q, R): A sphere of radius R centered at q
  • δ: The probability that a true nearest neighbor is not reported to a given q
  • Pr: Probability
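The load factor α links the number of bins m to the data set size N. As a rough illustration, assuming α is taken to be N/m (the usual expected bin occupancy when vectors spread uniformly over the bins; the article's own rule for choosing m is not shown here), the smallest prime m that meets a target α could be found as follows. The helper names and the example values are hypothetical.

```python
import math


def is_prime(n):
    """Trial division; adequate for table sizes of this scale."""
    if n < 2:
        return False
    for f in range(2, math.isqrt(n) + 1):
        if n % f == 0:
            return False
    return True


def bins_for_load_factor(num_vectors, target_alpha):
    """Smallest prime m such that num_vectors / m <= target_alpha."""
    m = max(2, math.ceil(num_vectors / target_alpha))
    while not is_prime(m):
        m += 1
    return m


# Hypothetical values: N = 10,000 feature vectors, target load factor 4.
m = bins_for_load_factor(10_000, 4.0)
alpha = 10_000 / m  # resulting load factor, at most the target
```

A smaller α means fewer vectors per bin to compare at query time, at the cost of more bins per table.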

Acknowledgement

We are grateful to the National Oceanic and Atmospheric Administration (NOAA)/National Geophysical Data Center in Boulder, Colorado, for providing us with the DMSP satellite imagery database. This work is partially supported by the NOAA/National Center for Atmospheric Research Educational Program under Cooperative Agreement No: NA060AR4810187.

Author information

Corresponding author

Correspondence to Abdollah Homaifar.

Additional information

Communicated by: H. A. Babaie

About this article

Cite this article

Buaba, R., Homaifar, A., Gebril, M. et al. Satellite image retrieval using low memory locality sensitive hashing in Euclidean space. Earth Sci Inform 4, 17–28 (2011). https://doi.org/10.1007/s12145-010-0076-x

  • DOI: https://doi.org/10.1007/s12145-010-0076-x
