Abstract

Data mining algorithms discover knowledge from data. This knowledge is commonly expressed as dependency relationships in various forms, such as rules, decision trees, and Bayesian Networks (BNs). Moreover, many real-world problems are multi-class problems, in which more than one variable in the data set is treated as a class. However, most available rule learners were proposed for single-class problems only and would produce cyclic rules if applied to multi-class ones. In addition, most of them produce rules with conflicts, i.e., more than one rule classifies the same data items and different rules make different predictions. Similarly, existing decision tree learners cannot handle multi-class problems, and thus cannot detect and avoid cycles. In contrast, BNs represent acyclic dependency relationships among variables, but they can handle discrete values only. They cannot manage continuous, interval, and ordinal values and cannot represent higher-order relationships. Consequently, BNs have higher network complexity and lower understandability when they are used for such problems.

This thesis studies in depth the discovery of dependency relationships in various forms by Evolutionary Computation (EC). By analyzing the causes of the disadvantages of rules, decision trees, and BNs, and of their learners, we propose a sequence of evolutionary algorithms (EAs), a novel functional dependency network (FDN), and two techniques for dependency relationship learning and for multi-class problems. The EAs are a multi-population Genetic Programming (GP) using a backward chaining procedure and a GP employing a co-operating scoring stage, both for learning acyclic rules, and the flexible and robust MDLGP for learning the novel dependency network and BNs. The FDN, a dependency network with functions, can manage all kinds of values and represent any kind of relationship among variables. Based on the FDN, we further develop two techniques, one to learn rules without conflicts and one to learn acyclic decision trees for multi-class problems. A new self-organizing map (SOM) with an expanding force, for clustering and data visualization in data preprocessing, is also given in the appendix.
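
As an illustration of what "cyclic rules" means in a multi-class setting, the following is a minimal sketch of checking whether a set of rules induces a cyclic dependency among variables. It is not the thesis's GP or FDN method. The rule representation (a set of antecedent variables plus one predicted variable) and the names `Rule`, `build_graph`, and `has_cycle` are assumptions made only for this example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical rule representation: (antecedent variables, predicted variable).
Rule = Tuple[Set[str], str]


def build_graph(rules: List[Rule]) -> Dict[str, Set[str]]:
    """Map each variable to the set of variables it is used to predict."""
    graph: Dict[str, Set[str]] = {}
    for antecedents, target in rules:
        graph.setdefault(target, set())
        for var in antecedents:
            graph.setdefault(var, set()).add(target)
    return graph


def has_cycle(rules: List[Rule]) -> bool:
    """Return True if the rules induce a cyclic dependency among variables."""
    graph = build_graph(rules)
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {v: WHITE for v in graph}

    def visit(v: str) -> bool:
        colour[v] = GREY
        for w in graph[v]:
            if colour[w] == GREY:  # back edge: w is already on the path
                return True
            if colour[w] == WHITE and visit(w):
                return True
        colour[v] = BLACK
        return False

    return any(colour[v] == WHITE and visit(v) for v in graph)


# With two class variables A and B, rules that predict each from the
# other form a cycle; a layered rule set does not.
cyclic = [({"A"}, "B"), ({"B"}, "A")]
acyclic = [({"A"}, "B"), ({"A", "B"}, "C")]
print(has_cycle(cyclic))
print(has_cycle(acyclic))
```

A depth-first search is used here only because it is a standard way to test a directed graph for cycles; the abstract does not state how the proposed learners detect or avoid them.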

Details

Title
Discovering acyclic dependency relationships by Evolutionary Computation
Author
Shum, Wing Ho
Year
2007
Publisher
ProQuest Dissertations & Theses
ISBN
978-0-549-40099-8
Source type
Dissertation or Thesis
Language of publication
English
ProQuest document ID
304718914
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.