Kernel-independent component analysis


In statistics, kernel-independent component analysis (kernel ICA) is an efficient algorithm for independent component analysis that estimates source components by optimizing a generalized variance contrast function based on representations in a reproducing kernel Hilbert space. Such contrast functions use the notion of mutual information as a measure of statistical independence.

Main idea

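Kernel ICA is based on the idea that correlations between two random variables can be represented in a reproducing kernel Hilbert space (RKHS), denoted by $\mathcal{F}$, associated with a feature map $L_x \colon \mathcal{F} \to \mathbb{R}$ defined for a fixed $x \in \mathbb{R}$. The $\mathcal{F}$-correlation between two random variables $X$ and $Y$ is defined as
$$\rho_{\mathcal{F}}(X, Y) = \max_{f, g \in \mathcal{F}} \operatorname{corr}\left(\langle L_X, f \rangle, \langle L_Y, g \rangle\right)$$
where the functions $f, g \colon \mathbb{R} \to \mathbb{R}$ range over $\mathcal{F}$ and
$$\operatorname{corr}\left(\langle L_X, f \rangle, \langle L_Y, g \rangle\right) := \frac{\operatorname{cov}(f(X), g(Y))}{\operatorname{var}(f(X))^{1/2} \operatorname{var}(g(Y))^{1/2}}$$
for fixed $f, g \in \mathcal{F}$. Note that the reproducing property implies that $f(x) = \langle L_x, f \rangle$ for fixed $x \in \mathbb{R}$ and $f \in \mathcal{F}$. It follows then that the $\mathcal{F}$-correlation between two independent random variables is zero.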
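The definition above can be illustrated with a finite-sample estimate. The following is a minimal sketch, not a reference implementation: it assumes a Gaussian kernel and uses a regularized kernel canonical correlation, taking the largest singular value of the product of two ridge-smoothed, centered Gram matrices. The function names `centered_gram` and `f_correlation`, the kernel width `sigma`, and the regularization constant `kappa` are illustrative choices rather than values taken from the text.

```python
import numpy as np


def centered_gram(x, sigma=1.0):
    """Centered Gaussian-kernel Gram matrix of a 1-D sample (kernel is an assumption)."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    K = np.exp(-((x - x.T) ** 2) / (2.0 * sigma**2))
    n = x.shape[0]
    H = np.eye(n) - np.full((n, n), 1.0 / n)  # centering matrix
    return H @ K @ H


def f_correlation(x, y, sigma=1.0, kappa=2e-2):
    """Regularized finite-sample estimate of the F-correlation between x and y.

    sigma and kappa are hypothetical defaults chosen for illustration.
    """
    n = len(x)
    Kx, Ky = centered_gram(x, sigma), centered_gram(y, sigma)
    ridge = 0.5 * n * kappa * np.eye(n)
    # R = (K + ridge)^{-1} K; its eigenvalues lie in [0, 1).
    Rx = np.linalg.solve(Kx + ridge, Kx)
    Ry = np.linalg.solve(Ky + ridge, Ky)
    # The largest singular value of Rx @ Ry is the first regularized
    # kernel canonical correlation.
    return float(np.linalg.norm(Rx @ Ry, 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(200)
    print("independent:", f_correlation(x, rng.standard_normal(200)))
    print("y = x**2   :", f_correlation(x, x**2))
```

In this sketch the pair `(x, x**2)` is linearly uncorrelated but strongly dependent, so its estimate should come out noticeably larger than the estimate for the independent pair. The estimate for independent samples is small but not exactly zero, because it is computed from a finite sample.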
This notion of $\mathcal{F}$-correlations is used for defining contrast functions that are optimized in the kernel ICA algorithm. Specifically, if $\mathbf{X} := (x_{ij}) \in \mathbb{R}^{n \times m}$ is a prewhitened data matrix, that is, the sample mean of each column is zero and the sample covariance of the rows is the $m \times m$ identity matrix, kernel ICA estimates an $m \times m$ orthogonal matrix $\mathbf{A}$ so as to minimize finite-sample $\mathcal{F}$-correlations between the columns of $\mathbf{S} := \mathbf{X} \mathbf{A}^{\prime}$.
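The prewhitening condition and the role of the orthogonal matrix can be sketched as follows. This is an illustrative example under stated assumptions: it whitens with the symmetric inverse square root of the sample covariance, uses the 1/n covariance convention, and restricts to two components so that the orthogonal matrix can be written as a plane rotation. The helper names `prewhiten` and `rotation` are hypothetical.

```python
import numpy as np


def prewhiten(X):
    """Return a version of X (n samples by m components) with zero column means
    and identity sample covariance, using the 1/n convention (an assumption)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / Xc.shape[0]
    evals, evecs = np.linalg.eigh(cov)
    # Symmetric inverse square root of the covariance; requires full rank.
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return Xc @ W


def rotation(theta):
    """2-by-2 orthogonal matrix parameterized by an angle (two-component case only)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    sources = rng.uniform(-1.0, 1.0, size=(500, 2))  # independent sources
    mixing = np.array([[1.0, 0.5], [0.3, 2.0]])      # arbitrary mixing matrix
    X = prewhiten(sources @ mixing.T)
    S = X @ rotation(0.7).T                          # candidate components
    print(np.round(S.T @ S / S.shape[0], 6))         # stays close to the identity
```

Because the whitened data already have identity covariance, any orthogonal transformation preserves that property. The search therefore only has to range over orthogonal matrices, and a contrast function based on finite-sample correlations in the RKHS would be evaluated on the columns of `S` for each candidate.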