MEGA---the maximizing expected generalization algorithm for learning complex query concepts

Published: 01 October 2003

Abstract

Specifying exact query concepts has become increasingly challenging to end-users. This is because many query concepts (e.g., those for looking up a multimedia object) can be hard to articulate, and articulation can be subjective. In this study, we propose a query-concept learner that learns query criteria through an intelligent sampling process. Our concept learner aims to fulfill two primary design objectives: (1) it has to be expressive in order to model most practical query concepts and (2) it must learn a concept quickly and with a small number of labeled data since online users tend to be too impatient to provide much feedback. To fulfill the first goal, we model query concepts in k-CNF, which can express almost all practical query concepts. To fulfill the second design goal, we propose our maximizing expected generalization algorithm (MEGA), which converges to target concepts quickly by its two complementary steps: sample selection and concept refinement. We also propose a divide-and-conquer method that divides the concept-learning task into G subtasks to achieve speedup. We notice that a task must be divided carefully, or search accuracy may suffer. Through analysis and mining results, we observe that organizing image features in a multiresolution manner, and minimizing intragroup feature correlation, can speed up query-concept learning substantially while maintaining high search accuracy. Through examples, analysis, experiments, and a prototype implementation, we show that MEGA converges to query concepts significantly faster than traditional methods.
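To make the k-CNF representation concrete, the sketch below models a query concept as a conjunction of clauses, each a disjunction of at most k Boolean literals, and refines it with the classic clause-elimination step from PAC-learning theory: every clause that a positively labeled example violates is dropped. This is an illustration under stated assumptions, not the MEGA algorithm itself. It omits MEGA's sample-selection step entirely, and the function names, the Boolean feature encoding, and the target concept are hypothetical.

```python
from itertools import combinations


def all_clauses(n_features, k):
    """Enumerate every disjunction of at most k literals over Boolean features.

    A literal is (feature_index, expected_value); a clause is a frozenset of
    literals. Clauses that mention the same feature twice are skipped.
    """
    literals = [(i, v) for i in range(n_features) for v in (True, False)]
    clauses = set()
    for size in range(1, k + 1):
        for combo in combinations(literals, size):
            if len({i for i, _ in combo}) == size:
                clauses.add(frozenset(combo))
    return clauses


def clause_holds(clause, x):
    """A disjunction holds if at least one of its literals matches x."""
    return any(x[i] == v for i, v in clause)


def satisfies(concept, x):
    """A k-CNF concept holds if every one of its clauses holds."""
    return all(clause_holds(c, x) for c in concept)


def refine(concept, positive_example):
    """Generalize the concept: drop each clause the positive example violates."""
    return {c for c in concept if clause_holds(c, positive_example)}


# Hypothetical target concept: (x0 OR x1) AND x2.
# Start from the most specific 2-CNF and generalize with positive examples.
concept = all_clauses(n_features=3, k=2)
for x in [(True, False, True), (False, True, True), (True, True, True)]:
    concept = refine(concept, x)
```

After the three positive examples, the surviving clauses are logically equivalent to the assumed target: `satisfies(concept, (True, False, True))` is true, while `satisfies(concept, (False, False, True))` is false. Which examples to ask the user to label next is the part that the sample-selection step described in the abstract addresses and that this sketch leaves out.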


Published in

ACM Transactions on Information Systems, Volume 21, Issue 4
October 2003
179 pages
ISSN: 1046-8188
EISSN: 1558-2868
DOI: 10.1145/944012

Copyright © 2003 ACM


Publisher

Association for Computing Machinery, New York, NY, United States
