Abstract
Supervised learning accounts for a large share of research activity in machine learning, and many supervised learning techniques have found application in the processing of multimedia content. The defining characteristic of supervised learning is the availability of annotated training data. The name invokes the idea of a ‘supervisor’ that instructs the learning system on the labels to associate with training examples; typically these labels are class labels in classification problems. Supervised learning algorithms induce models from these training data, and the models can then be used to classify other unlabelled data. In this chapter we ground our analysis of supervised learning in the theory of risk minimization. We provide an overview of support vector machines and nearest neighbour classifiers – probably the two most popular supervised learning techniques employed in multimedia research.
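To make the classification setting concrete, the following is a minimal sketch of the nearest neighbour rule discussed in the chapter: given labelled training examples, a query is assigned the class of its closest training point. The toy data and function names here are purely illustrative, not from the chapter itself.

```python
import math

# Hypothetical labelled training set: (feature vector, class label).
train = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((5.0, 5.0), "B"),
    ((6.0, 4.5), "B"),
]

def nearest_neighbour(query, training):
    """Return the label of the training example nearest to `query`
    under Euclidean distance (the 1-NN rule)."""
    closest = min(training, key=lambda ex: math.dist(query, ex[0]))
    return closest[1]

print(nearest_neighbour((1.2, 1.4), train))  # near the "A" cluster -> A
print(nearest_neighbour((5.5, 5.0), train))  # near the "B" cluster -> B
```

In practice a k-nearest-neighbour variant (voting over the k closest examples) is usually preferred, since it is less sensitive to noisy training points.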
© 2008 Springer-Verlag Berlin Heidelberg
Cunningham, P., Cord, M., Delany, S.J. (2008). Supervised Learning. In: Cord, M., Cunningham, P. (eds) Machine Learning Techniques for Multimedia. Cognitive Technologies. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75171-7_2
Print ISBN: 978-3-540-75170-0
Online ISBN: 978-3-540-75171-7