Abstract

A fundamental task in data analysis is understanding the differences between several contrasting groups. These groups can represent different classes of objects, such as male and female students, or the same group over time, e.g., freshman students in 1993 through 1998. We present the problem of mining contrast sets: conjunctions of attributes and values that differ meaningfully in their distribution across groups. We provide a search algorithm for mining contrast sets, with pruning rules that drastically reduce the computational complexity. Once the contrast sets are found, we post-process the results to present a subset that is surprising to the user given what we have already shown. We explicitly control the probability of Type I error (false positives) and guarantee a maximum error rate for the entire analysis by using Bonferroni corrections.
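
As a rough illustration of the idea in the abstract, the sketch below checks whether a single candidate contrast set differs in support between two groups. It is not the authors' search algorithm. It assumes two groups only, a Pearson chi-square test on the 2x2 table of matches, a plain Bonferroni threshold of `alpha / n_tests`, and a hypothetical minimum support difference `min_diff`. The function and parameter names are invented for this example.

```python
import math

def count_matches(rows, itemset):
    """Count rows (dicts) that match every attribute=value pair in itemset."""
    return sum(all(r.get(a) == v for a, v in itemset.items()) for r in rows)

def chi2_two_groups(hits_a, n_a, hits_b, n_b):
    """Pearson chi-square statistic and p-value for a 2x2 table (df = 1)."""
    table = [[hits_a, n_a - hits_a], [hits_b, n_b - hits_b]]
    row_totals = [n_a, n_b]
    col_totals = [hits_a + hits_b, (n_a - hits_a) + (n_b - hits_b)]
    total = n_a + n_b
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                stat += (table[i][j] - expected) ** 2 / expected
    # For one degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

def is_contrast_set(group_a, group_b, itemset,
                    alpha=0.05, n_tests=1, min_diff=0.1):
    """True if supports differ by at least min_diff (assumed threshold)
    and the difference is significant at the Bonferroni-adjusted level."""
    n_a, n_b = len(group_a), len(group_b)
    hits_a = count_matches(group_a, itemset)
    hits_b = count_matches(group_b, itemset)
    large = abs(hits_a / n_a - hits_b / n_b) >= min_diff
    _, p_value = chi2_two_groups(hits_a, n_a, hits_b, n_b)
    significant = p_value < alpha / n_tests  # plain Bonferroni correction
    return large and significant

# Toy data (made up): 60% of group A and 20% of group B match the itemset.
group_a = [{"major": "CS"}] * 60 + [{"major": "Math"}] * 40
group_b = [{"major": "CS"}] * 20 + [{"major": "Math"}] * 80
print(is_contrast_set(group_a, group_b, {"major": "CS"}, n_tests=10))
```

Dividing `alpha` by the number of candidate sets tested keeps the chance of any false positive across the whole analysis at or below `alpha`, which is the kind of guarantee the abstract describes.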

Details

Title
Detecting Group Differences: Mining Contrast Sets
Author
Bay, Stephen D; Pazzani, Michael J
Pages
213-246
Publication year
2001
Publication date
Jul 2001
Publisher
Springer Nature B.V.
ISSN
13845810
e-ISSN
1573756X
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
230119610
Copyright
Kluwer Academic Publishers 2001