Abstract
The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimensional feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data.
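As a reader's orientation, the non-separable (soft-margin) construction the abstract refers to can be sketched in conventional notation; the symbols w, b, ξ_i, C and the feature map Φ below are the standard ones and are not quoted verbatim from the paper:

```latex
% Soft-margin primal problem, in conventional notation (a sketch; the
% paper penalizes the slack variables through an equivalent cost term):
\begin{align*}
\min_{w,\,b,\,\xi}\;\; & \tfrac{1}{2}\,\lVert w\rVert^{2}
      + C \sum_{i=1}^{\ell} \xi_i \\
\text{subject to}\;\; & y_i \bigl( w \cdot \Phi(x_i) + b \bigr) \ge 1 - \xi_i,
      \qquad \xi_i \ge 0,\;\; i = 1,\dots,\ell .
\end{align*}
% The resulting decision rule depends on the data only through dot
% products in feature space, which a kernel K evaluates implicitly;
% the polynomial kernel shown is the form used for polynomial classifiers:
\[
  f(x) = \operatorname{sign}\Bigl( \sum_{i=1}^{\ell} \alpha_i\, y_i\,
         K(x_i, x) + b \Bigr),
  \qquad K(u, v) = (u \cdot v + 1)^{d}.
\]
```

Only the training vectors with non-zero coefficients α_i (the support vectors) enter the decision rule, which is what gives the machine its name.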
High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
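By way of illustration only, a soft-margin classifier with a polynomial kernel can be trained in a few lines with a modern library. The sketch below uses scikit-learn (which post-dates the paper); the 8x8 digits dataset, degree d = 3, and cost C = 10 are illustrative assumptions, not the paper's experimental setup.

```python
# A minimal sketch of a soft-margin support-vector classifier with a
# polynomial kernel, in the spirit of (but not reproducing) the paper's
# OCR experiments. scikit-learn, the digits dataset, degree=3 and C=10.0
# are illustrative assumptions, not the paper's setup.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)  # 1797 8x8 digit images, 10 classes
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# kernel="poly" computes K(u, v) = (gamma * u.v + coef0) ** degree;
# coef0=1.0 gives the (u.v + 1)^d form used for polynomial classifiers.
clf = SVC(kernel="poly", degree=3, coef0=1.0, C=10.0)
clf.fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
print("support vectors per class:", clf.n_support_)
```

Note that scikit-learn reduces the ten-class digit task to pairwise problems, so each underlying machine is the two-group classifier the abstract describes.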