Implements John Platt's sequential minimal optimization algorithm for training a support vector classifier.
This implementation globally replaces all missing values and transforms nominal attributes into binary ones. It also normalizes all attributes by default. (In that case, the coefficients in the output are based on the normalized data, not the original data; this matters when interpreting the classifier.)
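To illustrate why the scale of the reported coefficients matters, the sketch below rescales attributes to [0, 1] and then maps a linear model's weights back to the original attribute scale. It assumes that normalization means min-max scaling per attribute; the class name NormalizedWeights and all of its methods are hypothetical and are not part of Weka.

```java
/** Sketch: min-max normalization and mapping linear weights back to the original scale. */
final class NormalizedWeights {

    /** Scales each column of data to [0, 1]; fills mins and ranges with per-attribute statistics. */
    static double[][] normalize(double[][] data, double[] mins, double[] ranges) {
        int n = data.length, d = data[0].length;
        for (int j = 0; j < d; j++) {
            double lo = Double.POSITIVE_INFINITY, hi = Double.NEGATIVE_INFINITY;
            for (double[] row : data) {
                lo = Math.min(lo, row[j]);
                hi = Math.max(hi, row[j]);
            }
            mins[j] = lo;
            ranges[j] = hi - lo;
        }
        double[][] out = new double[n][d];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < d; j++) {
                // A constant attribute has range 0 and is mapped to 0.
                out[i][j] = ranges[j] == 0 ? 0 : (data[i][j] - mins[j]) / ranges[j];
            }
        }
        return out;
    }

    /**
     * Converts weights w and bias b, learned on normalized data, to the original scale.
     * Returns an array holding the rescaled weights followed by the rescaled bias.
     */
    static double[] toOriginalScale(double[] w, double b, double[] mins, double[] ranges) {
        double[] result = new double[w.length + 1];
        double bias = b;
        for (int j = 0; j < w.length; j++) {
            if (ranges[j] == 0) continue; // constant attribute contributes nothing
            result[j] = w[j] / ranges[j];
            bias -= w[j] * mins[j] / ranges[j];
        }
        result[w.length] = bias;
        return result;
    }

    /** Linear decision value w . x + b. */
    static double decision(double[] w, double b, double[] x) {
        double sum = b;
        for (int j = 0; j < w.length; j++) sum += w[j] * x[j];
        return sum;
    }
}
```

Under these assumptions, a weight of 2.0 on an attribute whose values span 10 to 30 corresponds to a weight of 0.1 per original unit, so the magnitudes printed for normalized data cannot be compared directly with the raw attribute values.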
Multi-class problems are solved using pairwise classification (aka 1-vs-1).
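As a rough illustration of pairwise classification, the sketch below trains nothing itself: it takes one binary model per pair of classes and predicts by majority vote. The BinaryModel interface, the toy nearestCenterModels helper, and the tie-breaking rule (lowest class index wins) are assumptions made for this example, not a description of how this implementation resolves votes.

```java
/** Sketch: 1-vs-1 prediction by majority vote over all class pairs. */
final class PairwiseVoting {

    /** A binary model for the pair (i, j) with i < j: true means class i is preferred. */
    interface BinaryModel {
        boolean prefersFirst(double[] x);
    }

    /** Number of binary models needed for numClasses classes. */
    static int numPairs(int numClasses) {
        return numClasses * (numClasses - 1) / 2;
    }

    /** Collects one vote per pair and returns the class with the most votes. */
    static int predict(BinaryModel[][] models, double[] x) {
        int k = models.length;
        int[] votes = new int[k];
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                if (models[i][j].prefersFirst(x)) votes[i]++;
                else votes[j]++;
            }
        }
        int best = 0;
        for (int c = 1; c < k; c++) {
            if (votes[c] > votes[best]) best = c; // ties keep the lower index
        }
        return best;
    }

    /** Toy stand-in for trained models: each pair prefers the class whose center is nearer to x[0]. */
    static BinaryModel[][] nearestCenterModels(double[] centers) {
        int k = centers.length;
        BinaryModel[][] models = new BinaryModel[k][k];
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                final double ci = centers[i], cj = centers[j];
                models[i][j] = x -> Math.abs(x[0] - ci) <= Math.abs(x[0] - cj);
            }
        }
        return models;
    }

    public static void main(String[] args) {
        BinaryModel[][] models = nearestCenterModels(new double[] {0.0, 5.0, 10.0});
        System.out.println("pairs: " + numPairs(3));
        System.out.println("predicted class: " + predict(models, new double[] {4.2}));
    }
}
```

The number of binary models grows quadratically with the number of classes (45 models for 10 classes), but each one is trained only on the instances of its two classes.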
To obtain proper probability estimates, use the option that fits calibration models to the outputs of the support vector machine. In the multi-class case, the predicted probabilities are coupled using Hastie and Tibshirani's pairwise coupling method.
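The coupling step can be sketched as the iterative procedure from Hastie and Tibshirani's paper: given pairwise estimates r[i][j] of P(class i | class i or j), it searches for class probabilities p whose implied pairwise ratios p[i] / (p[i] + p[j]) match them. The class below is a minimal standalone sketch, not this implementation's code; the names PairwiseCoupling and couple, the uniform starting point, and the stopping rule are assumptions.

```java
import java.util.Arrays;

/** Sketch: pairwise coupling of binary probability estimates into class probabilities. */
final class PairwiseCoupling {

    /**
     * @param r r[i][j] estimates P(class i | class i or j), with r[j][i] = 1 - r[i][j]
     * @param n n[i][j] is the weight (for example, training-set size) of the pair (i, j)
     * @return coupled class probabilities that sum to 1
     */
    static double[] couple(double[][] r, double[][] n, int maxIter, double tol) {
        int k = r.length;
        double[] p = new double[k];
        Arrays.fill(p, 1.0 / k); // start from the uniform distribution
        for (int iter = 0; iter < maxIter; iter++) {
            double change = 0;
            for (int i = 0; i < k; i++) {
                double num = 0, den = 0;
                for (int j = 0; j < k; j++) {
                    if (j == i) continue;
                    num += n[i][j] * r[i][j];               // observed pairwise evidence
                    den += n[i][j] * p[i] / (p[i] + p[j]);  // evidence implied by current p
                }
                if (den == 0) continue; // p[i] has collapsed to 0; leave it unchanged
                double updated = p[i] * num / den;
                change += Math.abs(updated - p[i]);
                p[i] = updated;
                // Renormalize so the estimates remain a probability distribution.
                double sum = 0;
                for (double v : p) sum += v;
                for (int c = 0; c < k; c++) p[c] /= sum;
            }
            if (change < tol) break;
        }
        return p;
    }

    public static void main(String[] args) {
        double[][] r = {{0, 0.625, 0.7}, {0.375, 0, 0.6}, {0.3, 0.4, 0}};
        double[][] n = {{0, 1, 1}, {1, 0, 1}, {1, 1, 0}};
        System.out.println(Arrays.toString(couple(r, n, 1000, 1e-12)));
    }
}
```

When the pairwise estimates are mutually consistent, the procedure recovers the underlying class probabilities; when they conflict, it settles on a compromise that weights each pair by n[i][j].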
Note: for improved speed, normalization should be turned off when operating on SparseInstances.
For more information on the SMO algorithm, see:
J. Platt (1998). Fast Training of Support Vector Machines using Sequential Minimal Optimization. In: B. Schoelkopf, C. Burges, and A. Smola (eds.), Advances in Kernel Methods - Support Vector Learning. MIT Press.
S.S. Keerthi, S.K. Shevade, C. Bhattacharyya, K.R.K. Murthy (2001). Improvements to Platt's SMO Algorithm for SVM Classifier Design. Neural Computation 13(3):637-649.
Trevor Hastie, Robert Tibshirani (1998). Classification by Pairwise Coupling. In: Advances in Neural Information Processing Systems 10. MIT Press.
BibTeX:
@incollection{Platt1998,
author = {J. Platt},
booktitle = {Advances in Kernel Methods - Support Vector Learning},
editor = {B. Schoelkopf and C. Burges and A. Smola},
publisher = {MIT Press},
title = {Fast Training of Support Vector Machines using Sequential Minimal Optimization},
year = {1998},
URL = {http://research.microsoft.com/\~jplatt/smo.html},
PS = {http://research.microsoft.com/\~jplatt/smo-book.ps.gz},
PDF = {http://research.microsoft.com/\~jplatt/smo-book.pdf}
}
@article{Keerthi2001,
author = {S.S. Keerthi and S.K. Shevade and C. Bhattacharyya and K.R.K. Murthy},
journal = {Neural Computation},
number = {3},
pages = {637-649},
title = {Improvements to Platt's SMO Algorithm for SVM Classifier Design},
volume = {13},
year = {2001},
PS = {http://guppy.mpe.nus.edu.sg/\~mpessk/svm/smo_mod_nc.ps.gz}
}
@inproceedings{Hastie1998,
author = {Trevor Hastie and Robert Tibshirani},
booktitle = {Advances in Neural Information Processing Systems},
editor = {Michael I. Jordan and Michael J. Kearns and Sara A. Solla},
publisher = {MIT Press},
title = {Classification by Pairwise Coupling},
volume = {10},
year = {1998},
PS = {http://www-stat.stanford.edu/\~hastie/Papers/2class.ps}
}
Valid options are:
-no-checks
Turns off all checks - use with caution!
Turning them off assumes that data is purely numeric, doesn't
contain any missing values, and has a nominal class. Turning them
off also means that no header information will be stored if the
machine is linear. Finally, it also assumes that no instance has
a weight equal to 0.
(default: checks on)
-C <double>
The complexity constant C. (default 1)
-N <int>
Attribute scaling to apply: 0=normalize, 1=standardize, 2=neither. (default 0=normalize)
-L <double>
The tolerance parameter. (default 1.0e-3)
-P <double>
The epsilon for round-off error. (default 1.0e-12)
-M
Fit calibration models to SVM outputs.
-V <double>
The number of folds for the internal
cross-validation. (default -1, use training data)
-W <double>
The random number seed. (default 1)
-K <classname and parameters>
The Kernel to use.
(default: weka.classifiers.functions.supportVector.PolyKernel)
-calibrator <scheme specification>
Full name of calibration model, followed by options.
(default: "weka.classifiers.functions.Logistic")
-output-debug-info
If set, classifier is run in debug mode and
may output additional info to the console
-do-not-check-capabilities
If set, classifier capabilities are not checked before classifier is built
(use with caution).
-num-decimal-places
The number of decimal places for the output of numbers in the model (default 2).
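The options above can be assembled into an argument array as sketched below. Kernel and calibrator options travel inside the single argument that follows -K or -calibrator, which is why the -C and -L flags can carry a different meaning at each level. The helper class SmoOptionsSketch and its parameters are hypothetical; with Weka on the classpath the resulting array would presumably be handed to the classifier's setOptions(String[]) method, which this standalone sketch does not exercise.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: building an option array for the classifier without depending on Weka itself. */
final class SmoOptionsSketch {

    static String[] buildOptions(double complexity, int filterType,
                                 double exponent, boolean fitCalibrationModels) {
        List<String> opts = new ArrayList<>();
        opts.add("-C");
        opts.add(Double.toString(complexity));   // classifier level: complexity constant
        opts.add("-N");
        opts.add(Integer.toString(filterType));  // 0=normalize, 1=standardize, 2=neither
        if (fitCalibrationModels) {
            opts.add("-M");                       // fit calibration models to SVM outputs
        }
        opts.add("-K");
        // One argument holds the kernel class and its own options;
        // here -C is the kernel cache size, not the complexity constant.
        opts.add("weka.classifiers.functions.supportVector.PolyKernel -E "
                + exponent + " -C 250007");
        opts.add("-calibrator");
        // Likewise, -M here is the calibrator's iteration limit.
        opts.add("weka.classifiers.functions.Logistic -M -1");
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Prints the array elements separated by spaces (nested specifications are not re-quoted).
        System.out.println(String.join(" ", buildOptions(1.0, 0, 2.0, true)));
    }
}
```

On a command line, the nested kernel and calibrator specifications would need to be quoted so that each reaches the classifier as a single argument.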
Options specific to kernel weka.classifiers.functions.supportVector.PolyKernel:
-E <num>
The Exponent to use.
(default: 1.0)
-L
Use lower-order terms.
(default: no)
-C <num>
The size of the cache (a prime number), 0 for full cache and
-1 to turn it off.
(default: 250007)
-output-debug-info
Enables debugging output (if available) to be printed.
(default: off)
-no-checks
Turns off all checks - use with caution!
(default: checks on)
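For reference, the exponent and lower-order options correspond to the usual polynomial kernel forms (x . y)^E and (x . y + 1)^E. The standalone sketch below evaluates both on dense vectors; the class name PolyKernelSketch is hypothetical, and the sketch leaves out the caching controlled by the -C option.

```java
/** Sketch: polynomial kernel with optional lower-order terms. */
final class PolyKernelSketch {

    /**
     * @param exponent   the exponent E
     * @param lowerOrder if true, computes (x . y + 1)^E instead of (x . y)^E
     */
    static double evaluate(double[] x, double[] y, double exponent, boolean lowerOrder) {
        double dot = 0;
        for (int i = 0; i < x.length; i++) dot += x[i] * y[i];
        if (lowerOrder) dot += 1.0;
        // With exponent 1 the kernel is the plain (optionally shifted) dot product.
        return exponent == 1.0 ? dot : Math.pow(dot, exponent);
    }

    public static void main(String[] args) {
        double[] x = {1, 2}, y = {3, 4};
        System.out.println(evaluate(x, y, 1.0, false)); // linear kernel
        System.out.println(evaluate(x, y, 2.0, false)); // quadratic, no lower-order terms
        System.out.println(evaluate(x, y, 2.0, true));  // quadratic with lower-order terms
    }
}
```

With the default exponent of 1.0 and no lower-order terms the kernel is linear, which is the case in which the learned machine can be written as one weight per attribute.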
Options specific to calibrator weka.classifiers.functions.Logistic:
-C
Use conjugate gradient descent rather than BFGS updates.
-R <ridge>
Set the ridge in the log-likelihood.
-M <number>
Set the maximum number of iterations (default -1, until convergence).
-output-debug-info
If set, classifier is run in debug mode and
may output additional info to the console
-do-not-check-capabilities
If set, classifier capabilities are not checked before classifier is built
(use with caution).
-num-decimal-places
The number of decimal places for the output of numbers in the model (default 2).
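To show what fitting a logistic calibration model to SVM outputs amounts to, the sketch below fits a one-dimensional sigmoid P(y=1 | f) = 1 / (1 + exp(-(a*f + b))) to raw decision values f, with a ridge penalty on the slope a. It deliberately swaps in plain gradient descent with a fixed iteration count in place of the BFGS or conjugate gradient optimizers mentioned above; the class SigmoidCalibration, the learning rate, and the exact form of the ridge term are assumptions made for illustration.

```java
/** Sketch: ridge-penalized logistic calibration of SVM decision values via gradient descent. */
final class SigmoidCalibration {

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /**
     * Fits slope a and intercept b by minimizing the mean negative log-likelihood plus ridge * a^2.
     *
     * @param f     raw SVM outputs
     * @param y     class labels, 0 or 1
     * @param ridge ridge penalty on the slope (assumed form)
     * @return the array {a, b}
     */
    static double[] fit(double[] f, int[] y, double ridge, int iterations, double learningRate) {
        double a = 0, b = 0;
        for (int it = 0; it < iterations; it++) {
            double gradA = 0, gradB = 0;
            for (int i = 0; i < f.length; i++) {
                double err = sigmoid(a * f[i] + b) - y[i];
                gradA += err * f[i];
                gradB += err;
            }
            gradA = gradA / f.length + 2 * ridge * a; // the penalty acts on the slope only
            gradB /= f.length;
            a -= learningRate * gradA;
            b -= learningRate * gradB;
        }
        return new double[] {a, b};
    }

    /** Calibrated probability of class 1 for a raw SVM output. */
    static double probability(double[] params, double f) {
        return sigmoid(params[0] * f + params[1]);
    }

    public static void main(String[] args) {
        double[] f = {-2, -1, -0.5, 0.5, 1, 2};
        int[] y = {0, 0, 1, 0, 1, 1};
        double[] params = fit(f, y, 1e-3, 5000, 0.5);
        System.out.println(probability(params, 2.0));
        System.out.println(probability(params, -2.0));
    }
}
```

The ridge term keeps the slope finite even when the outputs separate the two classes perfectly, a situation in which an unpenalized fit would push the probabilities toward exactly 0 and 1.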