Unsupervised Classifier Selection Based on Two-Sample Test

  • Conference paper
Discovery Science (DS 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5255)

Abstract

We propose a well-founded method for ranking a pool of m trained classifiers by their suitability for the current input of n instances. It can be used both for dynamically selecting a single classifier and for weighting the base classifiers in an ensemble. No classifier is executed during the process, so the n instances on which the selection is based may just as well be unlabeled; this is rare in previous work. The method works by comparing each classifier's training distribution with the input distribution. Hence, the feasibility of unsupervised selection comes at the price of maintaining a small sample of the training data for each classifier in the pool.
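
The abstract does not spell out which two-sample statistic is used, but the setup (a stored training sample of size t compared against an input batch of size n, quadratic cost, special handling for Gaussian and polynomial kernels) is consistent with a kernel two-sample statistic such as the maximum mean discrepancy (MMD). The sketch below is only an illustration under that assumption, not the paper's implementation; the names gaussian_kernel, mmd_squared, and rank_classifiers are hypothetical.

    # Minimal sketch: rank classifiers by a kernel two-sample statistic (assumed MMD).
    import numpy as np

    def gaussian_kernel(X, Y, sigma=1.0):
        """Gaussian (RBF) kernel matrix between rows of X and rows of Y."""
        sq_dists = (
            np.sum(X**2, axis=1)[:, None]
            + np.sum(Y**2, axis=1)[None, :]
            - 2.0 * X @ Y.T
        )
        return np.exp(-sq_dists / (2.0 * sigma**2))

    def mmd_squared(X, Y, sigma=1.0):
        """Biased estimate of squared MMD between samples X (t x d) and Y (n x d).
        Cost: O((t + n)^2) kernel evaluations."""
        k_xx = gaussian_kernel(X, X, sigma).mean()
        k_yy = gaussian_kernel(Y, Y, sigma).mean()
        k_xy = gaussian_kernel(X, Y, sigma).mean()
        return k_xx + k_yy - 2.0 * k_xy

    def rank_classifiers(training_samples, inputs, sigma=1.0):
        """Rank the pool by how close each stored training sample is to the
        current (possibly unlabeled) input batch; smaller discrepancy ranks first."""
        scores = [mmd_squared(S, inputs, sigma) for S in training_samples]
        return np.argsort(scores)

    # Usage: three classifiers with stored training samples, one unlabeled input batch.
    rng = np.random.default_rng(0)
    pool = [rng.normal(loc=mu, size=(50, 2)) for mu in (0.0, 1.0, 3.0)]
    batch = rng.normal(loc=1.1, size=(40, 2))
    print(rank_classifiers(pool, batch))  # the classifier trained near loc=1.0 ranks first

Note that class labels never enter the computation and no classifier is executed, which is what makes the selection applicable to unlabeled input.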

In the general case our method takes time \(O\!\left(m(t+n)^2\right)\) and space \(O\!\left(mt+n\right)\), where t is the size of the stored sample from the training distribution for each classifier. However, for commonly used Gaussian and polynomial kernel functions we can execute the method more efficiently. In our experiments the proposed method was found to be accurate.
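
As a sanity check on these bounds, assuming a quadratic-time pairwise kernel statistic as above: one comparison touches every pair of points in the combined sample of size t + n and is repeated once per classifier, while memory only needs to hold the m stored samples and the single input batch:

\[
m \cdot O\!\left((t+n)^2\right) = O\!\left(m(t+n)^2\right) \ \text{time}, \qquad m\,t + n = O\!\left(mt+n\right) \ \text{space}.
\]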

Copyright information

© 2008 Springer Berlin Heidelberg

About this paper

Cite this paper

Aho, T., Elomaa, T., Kujala, J. (2008). Unsupervised Classifier Selection Based on Two-Sample Test. In: Boulicaut, J.F., Berthold, M.R., Horváth, T. (eds) Discovery Science. DS 2008. Lecture Notes in Computer Science (LNAI), vol 5255. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88411-8_6

  • DOI: https://doi.org/10.1007/978-3-540-88411-8_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-88410-1

  • Online ISBN: 978-3-540-88411-8

  • eBook Packages: Computer Science, Computer Science (R0)
