Indexing Multiple-Instance Objects

  • Conference paper
Database and Expert Systems Applications (DEXA 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10439)

  • 1093 Accesses

Abstract

Multiple-Instance Learning (MIL) is an actively investigated topic in machine learning, with many proposed solutions spanning supervised and unsupervised methods. We introduce an indexing technique that supports efficient queries on Multiple-Instance (MI) objects. The technique has a dynamic structure that supports efficient insertions and deletions, and it is based on an effective similarity measure for MI objects. Some MIL approaches have proposed their own similarity measures for MI objects, but these either do not use all available information or are time-consuming to compute. In this paper, we use two joint Gaussian based measures for MIL: Joint Gaussian Similarity (JGS) and Joint Gaussian Distance (JGD). Both are based on intuitive definitions and take all the information into account while being robust to noise. For JGS, we propose the Instance based Index for querying MI objects. For JGD, metric trees can be used directly as the index because JGD satisfies the metric properties. Extensive experimental evaluations on various synthetic and real-world data sets demonstrate the effectiveness and efficiency of the similarity measures and the performance of the corresponding index structures.
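The remark that a metric distance can be indexed directly rests on the triangle inequality, which lets an index discard candidates without evaluating the distance to them. The following is a minimal sketch of that pruning idea only, not the paper's index structure. Everything in it is an assumption made for illustration: the class name `PivotIndex`, the single-pivot layout, and the Euclidean stand-in `euclidean`, which is used in place of JGD because the abstract does not define JGD.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Stand-in metric for illustration; this is NOT the paper's JGD."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class PivotIndex:
    """Hypothetical single-pivot index: stores d(pivot, o) for every object."""

    def __init__(self, objects: List[Point], pivot: Point,
                 dist: Callable[[Point, Point], float]) -> None:
        self.dist = dist
        self.pivot = pivot
        # Distances to the pivot are computed once, at build time.
        self.entries = [(dist(pivot, o), o) for o in objects]

    def range_query(self, query: Point, radius: float) -> Tuple[List[Point], int]:
        """Return objects within `radius` of `query` and the number of
        distance evaluations performed against stored objects."""
        d_qp = self.dist(query, self.pivot)
        hits: List[Point] = []
        evaluated = 0
        for d_po, obj in self.entries:
            # Triangle inequality: d(q, o) >= |d(q, p) - d(p, o)|.
            # If that lower bound already exceeds the radius, skip the object.
            if abs(d_qp - d_po) > radius:
                continue
            evaluated += 1
            if self.dist(query, obj) <= radius:
                hits.append(obj)
        return hits, evaluated


objects = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
index = PivotIndex(objects, pivot=(0.0, 0.0), dist=euclidean)
hits, evaluated = index.range_query((1.0, 1.0), radius=1.5)
print(hits, evaluated)
```

In this toy run the two distant objects are rejected from their stored pivot distances alone, so only two of the four candidates need a full distance computation. The pruning is only sound when the distance function satisfies the triangle inequality, which is why a similarity measure without metric properties needs a different index design.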


Notes

  1. https://drive.google.com/open?id=0B3LRCuPdnX1BMFViblpaS1VKZmM

  2. https://drive.google.com/open?id=0B3LRCuPdnX1BVHFjeWpiLWF3M2M

  3. https://archive.ics.uci.edu/ml/machine-learning-databases/musk/

  4. http://www.miproblems.org/datasets/foxtigerelephant/

  5. https://drive.google.com/open?id=0B3LRCuPdnX1BYXpGUzlxYVdsSDA

  6. The time cost in this paper refers to the CPU time.

Correspondence to Christian Böhm.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Zhou, L., Ye, W., Wang, Z., Plant, C., Böhm, C. (2017). Indexing Multiple-Instance Objects. In: Benslimane, D., Damiani, E., Grosky, W., Hameurlain, A., Sheth, A., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2017. Lecture Notes in Computer Science, vol 10439. Springer, Cham. https://doi.org/10.1007/978-3-319-64471-4_13

  • DOI: https://doi.org/10.1007/978-3-319-64471-4_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-64470-7

  • Online ISBN: 978-3-319-64471-4

  • eBook Packages: Computer Science, Computer Science (R0)
