Learning Machines

Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 207)

Abstract

Learning from data can be a very complex task. To solve a variety of problems satisfactorily, many different types of algorithms may need to be combined. Feature extraction algorithms are valuable tools that prepare data for other learning methods. To estimate their usefulness, one must examine the whole complex processes of which they are a part.
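
The last point, that a feature extractor should be judged by the performance of the complete learning process it feeds, can be illustrated with a minimal sketch. The example below is not taken from the chapter; it assumes a Python environment with scikit-learn, uses PCA as a stand-in feature extraction step feeding an SVM classifier, and cross-validates the whole pipeline as a single unit.

    # Hypothetical sketch: evaluate a feature extraction step only as part of
    # the complete process it belongs to, never in isolation.
    from sklearn.datasets import load_iris
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)

    # The extractor (PCA) and the learner (SVM) are combined into one process;
    # cross-validation scores the combination, not the extractor alone.
    pipeline = Pipeline([
        ("scale", StandardScaler()),
        ("extract", PCA(n_components=2)),      # feature extraction step
        ("classify", SVC(kernel="rbf", C=1.0)),
    ])

    scores = cross_val_score(pipeline, X, y, cv=5)
    print("accuracy of the whole process: %.3f +/- %.3f" % (scores.mean(), scores.std()))

Re-running the same evaluation with a different number of extracted components shows that the value of the extraction step depends entirely on the downstream learner and on the data, which is the point the abstract makes.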

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Jankowski, N., Grabczewski, K. (2006). Learning Machines. In: Guyon, I., Nikravesh, M., Gunn, S., Zadeh, L.A. (eds) Feature Extraction. Studies in Fuzziness and Soft Computing, vol 207. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-35488-8_2

  • DOI: https://doi.org/10.1007/978-3-540-35488-8_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-35487-1

  • Online ISBN: 978-3-540-35488-8

  • eBook Packages: Engineering, Engineering (R0)
