Skyline-based dissimilarity of images

Journal of Intelligent Information Systems

Abstract

Large image collections are used in many modern applications. In this paper, we aim to capture the intrinsic dissimilarities of image descriptors in large image collections, i.e., to detect dissimilar (or diverse) images without defining an explicit similarity or distance measure. Towards this goal, we adopt skyline processing techniques for large image databases, operating on their high-dimensional descriptor vectors. The novelty of the proposed methodology lies in combining skyline techniques with state-of-the-art hashing schemes to enable effective data partitioning and indexing in secondary memory, and thus to support large image databases. The proposed approach is evaluated experimentally on three real-world image datasets. Performance evaluation results demonstrate that images lying on the skyline have significantly different characteristics, which depend on the type of descriptor. Thus, these skyline items may be used as seeds for clustering in large image databases. In addition, we observe that skyline processing using hash-based indexing structures is significantly faster than index-free skyline computation and also more efficient than skyline computation with hierarchical indexing structures. Based on our results, the proposed approach is both efficient (regarding runtime) and effective (with respect to image diversity) and can therefore be used as a basis for more complex data mining tasks such as clustering.
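To make the notion of a skyline over descriptor vectors concrete, the following is a minimal sketch of a skyline computation in the style of a block-nested-loop scan. It is not the hash-based method evaluated in the paper. The function names, the toy two-dimensional descriptors, and the convention that smaller values are preferred in every dimension are all assumptions made for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dominates(a: Vector, b: Vector) -> bool:
    """Return True if a dominates b.

    Assumption: smaller is preferred in every dimension, so a dominates b
    when a is no larger than b everywhere and strictly smaller somewhere.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Vector]) -> List[int]:
    """Return the indices of points that no other point dominates."""
    window: List[int] = []  # indices of the current skyline candidates
    for i, p in enumerate(points):
        # Skip p if some current candidate already dominates it.
        if any(dominates(points[j], p) for j in window):
            continue
        # Otherwise p evicts every candidate it dominates, then joins the window.
        window = [j for j in window if not dominates(p, points[j])]
        window.append(i)
    return window


# Hypothetical two-dimensional descriptors; real descriptors are high-dimensional.
descriptors = [(0.1, 0.9), (0.8, 0.2), (0.9, 0.95), (0.4, 0.4), (0.5, 0.5)]
print(skyline(descriptors))  # indices of the mutually non-dominated descriptors
```

In this toy input, the descriptors at indices 0, 1, and 3 are mutually non-dominated, which is the sense in which skyline items are diverse without any explicit distance measure. The scan is quadratic in the worst case, which is why indexing and partitioning matter for large collections.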

Notes

  1. http://arma.sourceforge.net/

  2. http://www.boost.org/

  3. https://www.mathworks.com/products/matlab.html

  4. https://www.gnu.org/software/octave/

  5. http://imageclef.org/wikidata

  6. http://corpus-texmex.irisa.fr/

  7. http://lear.inrialpes.fr/jegou/data.php#holidays

Author information

Corresponding author

Correspondence to Nikolaos Georgiadis.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Georgiadis, N., Tiakas, E., Manolopoulos, Y. et al. Skyline-based dissimilarity of images. J Intell Inf Syst 53, 509–545 (2019). https://doi.org/10.1007/s10844-019-00571-y

  • DOI: https://doi.org/10.1007/s10844-019-00571-y
