Information Theoretic Model Selection for Pattern Analysis
Exploratory data analysis requires (i) defining a set of patterns
hypothesized to exist in the data, (ii) specifying a suitable
quantification principle or cost function to rank these patterns, and
(iii) validating the inferred patterns. For data clustering, the
patterns are object partitionings into $k$ groups; for PCA or
truncated SVD, the patterns are orthogonal transformations with
projections to a low-dimensional space. We propose an
information-theoretic principle for model selection and model-order
selection. Our principle ranks competing pattern cost functions
according to their ability to extract context-sensitive information
from noisy data with respect to the chosen hypothesis class. Sets of
approximate solutions serve as a basis for a communication
protocol. Analogous to [1], inferred models maximize
the so-called approximation capacity, which is the mutual information
between coarsened training data patterns and coarsened test data
patterns.
We demonstrate how to apply our validation framework using the well-known Gaussian mixture model.
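To make the notion of mutual information between training and test patterns concrete, the following is a minimal sketch that estimates the empirical mutual information between two hard label assignments of the same objects. It is a simplified proxy under stated assumptions, not the approximation capacity defined in [1]: the function name `mutual_information` and the use of hard cluster labels on paired objects are illustrative choices that do not come from the text above.

```python
import math
from collections import Counter


def mutual_information(labels_a, labels_b):
    """Empirical mutual information (in bits) between two label sequences.

    Assumption: labels_a[i] and labels_b[i] are the cluster labels that
    object i receives from the training and the test data, respectively.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have equal length")
    n = len(labels_a)
    joint = Counter(zip(labels_a, labels_b))  # joint label counts
    marg_a = Counter(labels_a)                # marginal counts, first labeling
    marg_b = Counter(labels_b)                # marginal counts, second labeling
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) simplifies to count * n / (marg_a[a] * marg_b[b])
        mi += p_ab * math.log2(count * n / (marg_a[a] * marg_b[b]))
    return mi


# Hypothetical labelings of four objects under training and test data.
train_labels = [0, 0, 1, 1]
test_labels = [0, 0, 1, 1]
print(mutual_information(train_labels, test_labels))  # 1.0
```

Identical partitions into two equal groups share one bit, whereas statistically independent partitions share none. In this simplified view, a model whose patterns are reproduced on test data retains more information than one whose patterns change with the noise.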
[1]
Joachim M. Buhmann. Information theoretic model validation for clustering.
In International Symposium on Information Theory, Austin, Texas,
pages 1398-1402. IEEE, 2010.
doi: 10.1109/ISIT.2010.5513616.