Abstract
Data collection and storage have become fast and cheap. As a result, it has become challenging to discover, from large amounts of data, useful knowledge that aids business decisions. The goal of data mining is to discover hidden knowledge automatically from large amounts of data. Discriminative pattern mining is a branch of data mining that searches for patterns whose statistical properties differ among groups in a data set. Various algorithms have been proposed for this problem, but most of them rely on tree structures, which are difficult to parallelize and are limited by the available memory. New approaches to discriminative pattern mining are therefore needed. This thesis develops algorithms for discriminative pattern mining based on the Self-Organizing Map (SOM). The potential benefits of the new mechanisms are greater flexibility, more control over memory usage, and parallelizability. These advantages make the approach a good candidate for the Big Data environment, with potential impact on domains such as stock markets and macroeconomics. The new algorithm will be tested, and its performance will be compared against an existing algorithm using real data in a simulated environment. Two potential extensions will also be described that may improve precision and time performance.