Information Theoretic Model Selection for Pattern Analysis

  Exploratory data analysis requires (i) defining a set of patterns
  hypothesized to exist in the data, (ii) specifying a suitable
  quantification principle or cost function to rank these patterns, and
  (iii) validating the inferred patterns. For data clustering, the
  patterns are object partitionings into $k$ groups; for PCA or
  truncated SVD, the patterns are orthogonal transformations with
  projections to a low-dimensional space. We propose an
  information-theoretic principle for model selection and model-order
  selection. Our principle ranks competing pattern cost functions
  according to their ability to extract context-sensitive information
  from noisy data with respect to the chosen hypothesis class. Sets of
  approximate solutions serve as a basis for a communication
  protocol. Analogous to [1], inferred models maximize
  the so-called approximation capacity, which is the mutual information
  between coarsened training data patterns and coarsened test data
  patterns.
  We demonstrate how to apply our validation framework using the well-known Gaussian mixture model.
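
  As a rough illustration of the kind of quantity involved, the sketch below estimates the mutual information between two discrete pattern assignments (for example, cluster labels inferred on training data and on test data for the same objects) from their joint counts. This is a generic plug-in estimator, not the approximation capacity of [1]; the function name `mutual_information` and the toy label lists are assumptions made for this example.

```python
from collections import Counter
from math import log2


def mutual_information(labels_a, labels_b):
    """Plug-in estimate of the mutual information (in bits) between two
    equally long sequences of discrete labels.

    Hypothetical helper for illustration only; it is not the
    approximation capacity defined in the cited work.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n == 0:
        raise ValueError("label sequences must be non-empty")

    joint = Counter(zip(labels_a, labels_b))  # counts of label pairs (a, b)
    marg_a = Counter(labels_a)                # counts of a
    marg_b = Counter(labels_b)                # counts of b

    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) p(b)) ), written in terms of counts
        mi += (count / n) * log2(count * n / (marg_a[a] * marg_b[b]))
    return mi


# Toy usage: identical partitions share all their information,
# while these two balanced partitions are empirically independent.
train_labels = [0, 0, 1, 1]
same_labels = [0, 0, 1, 1]
unrelated_labels = [0, 1, 0, 1]
mi_same = mutual_information(train_labels, same_labels)
mi_unrelated = mutual_information(train_labels, unrelated_labels)
```

  Under this estimator, a model order whose training and test patterns agree closely yields a higher value than one whose patterns are dominated by noise, which is the intuition behind ranking models by shared information.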
  
  
  
[1] 
Joachim M. Buhmann. Information theoretic model validation for clustering.
In International Symposium on Information Theory, Austin, Texas,
pages 1398-1402. IEEE, 2010.
doi: 10.1109/ISIT.2010.5513616.