Artif Intell Rev (2006) 26:159–190 DOI 10.1007/s10462-007-9052-3
Machine learning: a review of classification and combining techniques
S. B. Kotsiantis · I. D. Zaharakis · P. E. Pintelas
Published online: 10 November 2007 © Springer Science+Business Media B.V. 2007
Abstract Supervised classification is one of the tasks most frequently carried out by so-called Intelligent Systems. Thus, a large number of techniques have been developed based on Artificial Intelligence (Logic-based techniques, Perceptron-based techniques) and Statistics (Bayesian Networks, Instance-based techniques). The goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features. The resulting classifier is then used to assign class labels to the testing instances where the values of the predictor features are known, but the value of the class label is unknown. This paper describes various classification algorithms and a recent approach to improving classification accuracy: ensembles of classifiers.
Keywords Classifiers · Data mining techniques · Intelligent data analysis · Learning algorithms
1 Introduction
There are several applications for Machine Learning (ML), the most significant of which is predictive data mining. Every instance in any dataset used by machine learning algorithms is represented using the same set of features. The features may be continuous, categorical or binary. If instances are given with known labels (the corresponding correct outputs) then the learning is called supervised, in contrast to unsupervised learning, where instances are unlabeled (Jain et al. 1999).
S. B. Kotsiantis (B) · P. E. Pintelas, Department of Computer Science and Technology, University of Peloponnese, Peloponnese, Greece, e-mail: [email protected]
S. B. Kotsiantis · P. E. Pintelas, Educational Software Development Laboratory, Department of Mathematics, University of Patras, P. O. Box 1399, Patras, Greece
P. E. Pintelas, e-mail: [email protected]
I. D. Zaharakis, Computer Technology Institute, Patras, Greece, e-mail: [email protected]
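To make the terminology of the opening paragraph concrete, the following is a minimal sketch of how a dataset of instances sharing one feature set might be represented, and how the supervised and unsupervised settings differ only in whether labels are present. The `Instance` class, its feature names (`age`, `color`, `smoker`), the label values and the helper `is_supervised` are all hypothetical and chosen for illustration; they do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    """One instance; every instance uses the same set of features (hypothetical names)."""
    age: float                   # continuous feature
    color: str                   # categorical feature
    smoker: bool                 # binary feature
    label: Optional[str] = None  # known class label, or None when unlabeled


def is_supervised(dataset: List[Instance]) -> bool:
    """Return True if the dataset is non-empty and every instance has a known label."""
    return bool(dataset) and all(inst.label is not None for inst in dataset)


# Labeled instances: suitable for supervised learning.
labeled = [
    Instance(age=34.5, color="red", smoker=True, label="high_risk"),
    Instance(age=22.0, color="blue", smoker=False, label="low_risk"),
]

# The same feature values without labels: the unsupervised setting.
unlabeled = [
    Instance(age=34.5, color="red", smoker=True),
    Instance(age=22.0, color="blue", smoker=False),
]

print(is_supervised(labeled))
print(is_supervised(unlabeled))
```

In this sketch the label is stored alongside the features only for convenience; a classifier would be trained on the labeled instances and then asked to fill in the missing `label` for instances like those in `unlabeled`.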
Numerous ML applications involve tasks that can be set up as supervised. In the present paper, we concentrate on the techniques necessary to do this. In particular, this work is concerned with classification problems in which the output of instances admits only discrete, unordered values. We have limited our references to recent refereed journals, published books and conferences. A brief review of what ML includes can be found in Dutton and Conroy (1996). De Mantaras and Armengol (1998) also presented a historical survey of logic...