Making a Low-Dimensional Representation Suitable for Diverse Tasks

Intrator, Nathan; Edelman, Shimon

doi:10.1007/978-1-4615-5529-2_6


Abstract

We introduce a new approach to the training of classifiers for performance on multiple tasks. The proposed hybrid training method leads to improved generalization via a better low-dimensional representation of the problem space. The quality of the representation is assessed by embedding it in a 2D space using multidimensional scaling, allowing a direct visualization of the results. The performance of the approach is demonstrated on a highly nonlinear image classification task.
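The visualization step mentioned in the abstract, embedding a representation in 2D with multidimensional scaling, can be illustrated with a minimal sketch. The abstract does not say which MDS variant the chapter uses, so the code below assumes classical (Torgerson) MDS on Euclidean distances, and it uses random vectors as a hypothetical stand-in for a learned representation. The names `classical_mds`, `pairwise_distances`, and `hidden` are illustrative and do not come from the chapter.

```python
import numpy as np


def pairwise_distances(x):
    """Euclidean distance between every pair of rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))


def classical_mds(points, n_components=2):
    """Embed the rows of `points` in n_components dimensions (classical MDS)."""
    points = np.asarray(points, dtype=float)
    d2 = pairwise_distances(points) ** 2
    n = d2.shape[0]
    # Double centering turns squared distances into a Gram (inner product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    # eigh returns eigenvalues in ascending order, so pick the largest ones.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip tiny negative eigenvalues caused by floating-point error.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


rng = np.random.default_rng(0)
# Hypothetical stand-in for a learned representation: 30 items, 5 features.
hidden = rng.normal(size=(30, 5))
embedding = classical_mds(hidden, n_components=2)
```

Each row of `embedding` is a 2D point that could be drawn in a scatter plot and colored by class label to inspect how well the representation separates the classes. When the input points already lie in a 2D subspace, this procedure preserves their pairwise distances; otherwise the 2D picture is only an approximation of the original geometry.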




Editor information

Editors and Affiliations

Sebastian Thrun (Carnegie Mellon University, USA)
Lorien Pratt (Evolving Systems, Inc., USA)


About this chapter

Cite this chapter

Intrator, N., Edelman, S. (1996). Making a Low-Dimensional Representation Suitable for Diverse Tasks. In: Thrun, S., Pratt, L. (eds) Learning to Learn. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5529-2_6


DOI: https://doi.org/10.1007/978-1-4615-5529-2_6
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7527-2
Online ISBN: 978-1-4615-5529-2
eBook Packages: Springer Book Archive
