Identification of interaction patterns and classification with applications to microarray data
Anne-Laure Boulesteix; Gerhard Tutz
Emerging patterns represent a class of interaction structures which has been recently proposed as a tool in data mining. In this paper, a new and more general definition refering to underlying probabilities is proposed. The defined interaction patterns carry information about the relevance of combinations of variables for distinguishing between classes. Since they are formally quite similar to the leaves of a classification tree, we propose a fast and simple method which is based on the CART algorithm to find the corresponding empirical patterns in data sets. In simulations, it can be shown that the method is quite effective in identifying patterns. In addition, the detected patterns can be used to define new variables for classification. Thus, we propose a simple scheme to use the patterns to improve the performance of classification procedures. The method may also be seen as a scheme to improve the performance of CARTs concerning the identification of interaction patterns as well as the accuracy of prediction.