Fisher's linear discriminant. Fisher defined the separation between two
distributions to be the ratio of the variance between the classes to
the variance within the classes, which is, in some sense, a measure
of the signal-to-noise ratio for the class labeling. FLD finds a linear
combination of features which maximizes the separation after the projection.
The resulting combination may be used for dimensionality reduction
before later classification.
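The two-class case can be sketched in a few lines of NumPy. This is an illustrative sketch, not a reference implementation: the function names `fisher_direction` and `fisher_ratio` and the synthetic data are assumptions made for the example. It uses the standard closed-form result that the ratio of between-class to within-class scatter is maximized by a direction proportional to S_w^{-1} (m1 - m0), where S_w is the within-class scatter matrix and m0, m1 are the class means.

```python
import numpy as np

def fisher_direction(X0, X1):
    """Unit vector w proportional to S_w^{-1} (m1 - m0) for two classes."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the scatter matrices of the two classes.
    S_w = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(S_w, m1 - m0)
    return w / np.linalg.norm(w)

def fisher_ratio(w, X0, X1):
    """Between-class over within-class scatter of the projections onto w."""
    p0, p1 = X0 @ w, X1 @ w
    between = (p1.mean() - p0.mean()) ** 2
    within = ((p0 - p0.mean()) ** 2).sum() + ((p1 - p1.mean()) ** 2).sum()
    return between / within

# Synthetic two-class data (illustrative only).
rng = np.random.default_rng(0)
X0 = rng.normal(size=(100, 3))
X1 = rng.normal(size=(100, 3)) + np.array([1.0, 0.5, 0.0])

w = fisher_direction(X0, X1)
# One-dimensional features that a later classifier could consume.
projected0, projected1 = X0 @ w, X1 @ w
```

Projecting onto `w` reduces each sample to a single number, which is the dimensionality reduction step described above. Any other direction should give a `fisher_ratio` no larger than the one obtained for `w`.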
The terms Fisher's linear discriminant and LDA are often used
interchangeably, although FLD actually describes a slightly different
discriminant, which does not make some of the assumptions of LDA such
as normally distributed classes or equal class covariances.
When the assumptions of LDA are satisfied, FLD is equivalent to LDA.
FLD is also closely related to principal component analysis (PCA), which also
looks for linear combinations of variables which best explain the data.
As a supervised method, FLD explicitly attempts to model the
difference between the classes of data. On the other hand, PCA is an
unsupervised method and does not take into account any difference in class.
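The contrast can be made concrete with a small synthetic example. The sketch below assumes two classes that are widely spread along the first axis but separated only along the second; the data, the variable names, and the use of an eigendecomposition for PCA are choices made for illustration. PCA ignores the labels and follows the direction of largest overall variance, while FLD follows the direction that separates the classes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
spread = np.array([5.0, 0.3])  # large variance along x, small along y

# Two classes that differ only in their mean along the y axis.
X0 = rng.normal(size=(n, 2)) * spread + np.array([0.0, -1.0])
X1 = rng.normal(size=(n, 2)) * spread + np.array([0.0, 1.0])
X = np.vstack([X0, X1])

# PCA: leading eigenvector of the covariance of the pooled data (labels unused).
# np.linalg.eigh returns eigenvalues in ascending order, so take the last column.
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
pca_dir = eigvecs[:, -1]

# FLD: direction proportional to S_w^{-1} (m1 - m0), which uses the labels.
m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
S_w = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
fld_dir = np.linalg.solve(S_w, m1 - m0)
fld_dir /= np.linalg.norm(fld_dir)
```

With this data, `pca_dir` lies close to the x axis, where the variance is largest, and `fld_dir` lies close to the y axis, where the classes actually differ.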
One complication in applying FLD (and LDA) to real data
occurs when the number of variables/features exceeds
the number of samples. In this case, the covariance estimates do not have
full rank, and so cannot be inverted. This is known as the small sample size
problem.
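The rank deficiency is easy to reproduce. The sketch below uses synthetic data with 10 samples and 50 features and checks the rank of the within-class scatter matrix. The last step adds a small multiple of the identity before solving; this is one common workaround, it is not taken from the text above, and the value of `lam` is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_class, p = 5, 50  # 10 samples in total, 50 features

X0 = rng.normal(size=(n_per_class, p))
X1 = rng.normal(size=(n_per_class, p)) + 1.0

m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
S_w = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)

# Each centered class contributes rank at most n_per_class - 1,
# so the p x p scatter matrix is singular whenever that total is below p.
rank = np.linalg.matrix_rank(S_w)

# Assumed workaround (not from the text): add a ridge term so the matrix
# becomes invertible. The scale of lam here is arbitrary.
lam = 1e-2 * np.trace(S_w) / p
S_reg = S_w + lam * np.eye(p)
w = np.linalg.solve(S_reg, m1 - m0)
```

Here `rank` is far below `p`, so `S_w` has no inverse, while `S_reg` has full rank and yields a usable direction `w`. How strongly to regularize is a modeling decision that this sketch does not address.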