🔗 LinearSVM
The `LinearSVM` class implements an L2-regularized support vector machine for
numerical data that can train using any ensmallen optimizer. The class offers
standard classification functionality. Linear SVM is useful for multi-class
classification (i.e. classes are `0`, `1`, `2`, etc.).
Simple usage example:
```c++
// Train a linear SVM classifier on random data and predict labels:

// All data and labels are uniform random; 5 dimensional data, 4 classes.
// Replace with a data::Load() call or similar for a real application.
arma::mat dataset(5, 1000, arma::fill::randu); // 1000 points.
arma::Row<size_t> labels =
    arma::randi<arma::Row<size_t>>(1000, arma::distr_param(0, 3));
arma::mat testDataset(5, 500, arma::fill::randu); // 500 test points.

mlpack::LinearSVM svm;                  // Step 1: create model.
svm.Train(dataset, labels, 4);          // Step 2: train model.
arma::Row<size_t> predictions;
svm.Classify(testDataset, predictions); // Step 3: classify points.

// Print some information about the test predictions.
std::cout << arma::accu(predictions == 1) << " test points classified as class "
    << "1." << std::endl;
```
Quick links:
- Constructors: create `LinearSVM` objects.
- `Train()`: train model.
- `Classify()`: classify with a trained model.
- Other functionality for loading, saving, and inspecting.
- Examples of simple usage and links to detailed example projects.
- Template parameters for using different element types for a model.
🔗 Constructors
`svm = LinearSVM()`
- Initialize the parameters of the model without training.
- You will need to call `Train()` later to train the model before calling `Classify()`.

`svm = LinearSVM(dimensionality, numClasses, lambda=0.0001, delta=1.0, fitIntercept=false)`
- Initialize the model without training, to default weights.
- `Classify()` can immediately be called and `Parameters()` returns valid weights, but the model is otherwise untrained.
- The model should be trained with `Train()` before calling `Classify()`.

`svm = LinearSVM(data, labels, numClasses, lambda=0.0001, delta=1.0, fitIntercept=false, [callbacks...])`
- Train model, optionally specifying ensmallen callbacks for use during optimization.

`svm = LinearSVM(data, labels, numClasses, optimizer, lambda=0.0001, delta=1.0, fitIntercept=false, [callbacks...])`
- Train model with a custom ensmallen optimizer, optionally specifying callbacks for use during optimization.
Constructor Parameters:
| name | type | description | default |
|------|------|-------------|---------|
| `data` | `arma::mat` | Column-major training matrix. | _(N/A)_ |
| `labels` | `arma::Row<size_t>` | Training labels, between `0` and `numClasses - 1` (inclusive). Should have length `data.n_cols`. | _(N/A)_ |
| `dimensionality` | `size_t` | Dimension of input data (if `data` is not specified). Should be equal to `data.n_rows`. | _(N/A)_ |
| `numClasses` | `size_t` | Number of classes in the dataset. | _(N/A)_ |
| `optimizer` | any ensmallen optimizer | Instantiated ensmallen optimizer for differentiable functions or differentiable separable functions. | `ens::L_BFGS()` |
| `lambda` | `double` | L2 regularization penalty parameter. Must be nonnegative. | `0.0001` |
| `delta` | `double` | Margin of difference between correct class and other classes. | `1.0` |
| `fitIntercept` | `bool` | If `true`, then an intercept term is fitted to the model. | `false` |
| `callbacks...` | any set of ensmallen callbacks | Optional callbacks for the ensmallen optimizer, such as `ens::ProgressBar()`, `ens::Report()`, or others. | _(N/A)_ |
As an alternative to passing `lambda`, `delta`, or `fitIntercept`, these can be
set with a standalone method. The following functions can be used before
calling `Train()`:

- `svm.Lambda() = lambda;` will set the L2 regularization penalty parameter to `lambda`.
- `svm.Delta() = delta;` will set the margin of difference to `delta`.
- `svm.FitIntercept() = fitIntercept;` will set whether the model fits an intercept to `fitIntercept`.
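For instance, these setters could be used as in the minimal sketch below, which assumes random data in the spirit of the simple usage example at the top of this page:

```c++
// Random 5-dimensional data with 4 classes (as in the simple usage example).
arma::mat dataset(5, 1000, arma::fill::randu);
arma::Row<size_t> labels =
    arma::randi<arma::Row<size_t>>(1000, arma::distr_param(0, 3));

mlpack::LinearSVM svm;
svm.Lambda() = 0.01;        // Stronger L2 penalty than the default of 0.0001.
svm.Delta() = 2.0;          // Wider margin than the default of 1.0.
svm.FitIntercept() = true;  // Fit an intercept term for each class.

// Train() without hyperparameter arguments uses the values set above.
svm.Train(dataset, labels, 4);
```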
🔗 Training
If training is not done as part of the constructor call, it can be done with the
`Train()` function:

`svm.Train(data, labels, numClasses, [callbacks...])`
`svm.Train(data, labels, numClasses, optimizer, [callbacks...])`
- Train model without changing any hyperparameters, optionally using a custom ensmallen optimizer and specifying callbacks for use during optimization.

`svm.Train(data, labels, numClasses, lambda=0.0001, delta=1.0, fitIntercept=false, [callbacks...])`
`svm.Train(data, labels, numClasses, optimizer, lambda=0.0001, delta=1.0, fitIntercept=false, [callbacks...])`
- Train model on the given data, specifying hyperparameters and optionally also a custom ensmallen optimizer and callbacks for use during optimization.

Types of each argument are the same as in the table for constructors above.

Note: Training is not incremental. Successive calls to `Train()` will
train entirely new models.
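As a quick illustration of the overloads above and of the non-incremental behavior, this sketch (on random data) trains a model and then replaces it entirely with a second call to `Train()`:

```c++
// Two random 5-dimensional datasets with 3 classes each.
arma::mat firstData(5, 1000, arma::fill::randu);
arma::Row<size_t> firstLabels =
    arma::randi<arma::Row<size_t>>(1000, arma::distr_param(0, 2));
arma::mat secondData(5, 2000, arma::fill::randu);
arma::Row<size_t> secondLabels =
    arma::randi<arma::Row<size_t>>(2000, arma::distr_param(0, 2));

mlpack::LinearSVM svm;

// First training run, specifying hyperparameters directly.
svm.Train(firstData, firstLabels, 3, 0.001 /* lambda */, 1.0 /* delta */,
    true /* fitIntercept */);

// Second training run: the previous model is discarded and an entirely new
// model is trained on secondData.
svm.Train(secondData, secondLabels, 3);
```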
🔗 Classification
Once a `LinearSVM` model is trained, the `Classify()` member function
can be used to make class predictions for new data.
`size_t predictedClass = svm.Classify(point)`
- (Single-point)
- Classify a single point, returning the predicted class (`0` through `numClasses - 1`, inclusive).

`svm.Classify(point, prediction, probabilitiesVec)`
- (Single-point)
- Classify a single point and compute class probabilities.
- The predicted class is stored in `prediction`.
- The probability of class `i` can be accessed with `probabilitiesVec[i]`.

`svm.Classify(data, predictions)`
- (Multi-point)
- Classify a set of points.
- The prediction for data point `i` can be accessed with `predictions[i]`.

`svm.Classify(data, predictions, probabilities)`
- (Multi-point)
- Classify a set of points and compute class probabilities.
- The prediction for data point `i` can be accessed with `predictions[i]`.
- The probability of class `j` for data point `i` can be accessed with `probabilities(j, i)`.
Classification Parameters:
| usage | name | type | description |
|-------|------|------|-------------|
| single-point | `point` | `arma::vec` | Single point for classification. |
| single-point | `prediction` | `size_t&` | `size_t` to store class prediction into. |
| single-point | `probabilitiesVec` | `arma::vec&` | `arma::vec&` to store class probabilities into; will have length `numClasses`. |
| | | | |
| multi-point | `data` | `arma::mat` | Set of column-major points for classification. |
| multi-point | `predictions` | `arma::Row<size_t>&` | Vector of `size_t`s to store class predictions into; will be set to length `data.n_cols`. |
| multi-point | `probabilities` | `arma::mat&` | Matrix to store class probabilities into (number of rows will be equal to `numClasses`; number of columns will be equal to `data.n_cols`). |
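Under the same assumptions as the earlier random-data examples, a minimal sketch of each `Classify()` form might look like this:

```c++
// Train a model on random 5-dimensional data with 4 classes.
arma::mat dataset(5, 1000, arma::fill::randu);
arma::Row<size_t> labels =
    arma::randi<arma::Row<size_t>>(1000, arma::distr_param(0, 3));
mlpack::LinearSVM svm(dataset, labels, 4);

// Single-point classification.
arma::vec point(5, arma::fill::randu);
const size_t predictedClass = svm.Classify(point);
std::cout << "Single-point prediction: " << predictedClass << "." << std::endl;

// Single-point classification with class probabilities.
size_t prediction;
arma::vec probabilitiesVec;
svm.Classify(point, prediction, probabilitiesVec); // Length-4 probabilities.

// Multi-point classification with class probabilities.
arma::mat testData(5, 100, arma::fill::randu);
arma::Row<size_t> predictions;
arma::mat probabilities;
svm.Classify(testData, predictions, probabilities); // 4 x 100 probabilities.

std::cout << "Probability of class 0 for the first test point: "
    << probabilities(0, 0) << "." << std::endl;
```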
🔗 Other Functionality
- A `LinearSVM` model can be serialized with `data::Save()` and `data::Load()`.
- `svm.Parameters()` will return the parameters of the model as an `arma::mat` with either `data.n_rows` rows (if `FitIntercept()` is `false`) or `data.n_rows + 1` rows (if `FitIntercept()` is `true`), and `numClasses` columns. The weight for dimension `i` for class `j` can be accessed with `svm.Parameters()(i, j)`. If `FitIntercept()` is `true`, the last row of `svm.Parameters()` represents the bias parameters for each class.
- `svm.FeatureSize()` will return the number of features in the model. This is equivalent to `data.n_rows` when the model was trained. The output is only valid if the model has been trained.
- `svm.ComputeAccuracy(data, labels)` will return the accuracy of the model on the given `data` with the given `labels`. The returned accuracy is between 0 and 100.
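As a minimal sketch of the serialization and inspection functions above (the filename `svm-model.bin` here is arbitrary):

```c++
// Train a model on random data, save it, and load it back.
arma::mat dataset(5, 1000, arma::fill::randu);
arma::Row<size_t> labels =
    arma::randi<arma::Row<size_t>>(1000, arma::distr_param(0, 3));
mlpack::LinearSVM svm(dataset, labels, 4);

// Save the model to disk under the name "svm".
mlpack::data::Save("svm-model.bin", "svm", svm, true);

// Load it back into a fresh object and inspect it.
mlpack::LinearSVM svm2;
mlpack::data::Load("svm-model.bin", "svm", svm2, true);

std::cout << "Loaded model has " << svm2.FeatureSize() << " features; training "
    << "set accuracy is " << svm2.ComputeAccuracy(dataset, labels) << "%."
    << std::endl;
```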
🔗 Simple Examples
See also the simple usage example above for a trivial usage
of the `LinearSVM` class.
Train a linear SVM using a custom SGD-like optimizer with callbacks.
```c++
// See https://datasets.mlpack.org/satellite.train.csv.
arma::mat dataset;
mlpack::data::Load("satellite.train.csv", dataset, true);
// See https://datasets.mlpack.org/satellite.train.labels.csv.
arma::Row<size_t> labels;
mlpack::data::Load("satellite.train.labels.csv", labels, true);

mlpack::LinearSVM svm;
svm.Lambda() = 0.1;

// Create AMSGrad optimizer with custom step size and batch size.
ens::AMSGrad optimizer(0.01 /* step size */, 16 /* batch size */);
optimizer.MaxIterations() = 100 * dataset.n_cols; // Allow 100 epochs.

// Print a progress bar and an optimization report when training is finished.
svm.Train(dataset, labels, 2, optimizer, ens::ProgressBar(), ens::Report());

// Now predict on test labels and compute accuracy.
// See https://datasets.mlpack.org/satellite.test.csv.
arma::mat testDataset;
mlpack::data::Load("satellite.test.csv", testDataset, true);
// See https://datasets.mlpack.org/satellite.test.labels.csv.
arma::Row<size_t> testLabels;
mlpack::data::Load("satellite.test.labels.csv", testLabels, true);

std::cout << std::endl;
std::cout << "Accuracy on training set: "
    << svm.ComputeAccuracy(dataset, labels) << "%." << std::endl;
std::cout << "Accuracy on test set: "
    << svm.ComputeAccuracy(testDataset, testLabels) << "%." << std::endl;
```
Train a linear SVM with SGD and save the model every epoch using a custom ensmallen callback:
```c++
// This callback saves the model into "model-<epoch>.bin" after every epoch.
class ModelCheckpoint
{
 public:
  ModelCheckpoint(mlpack::LinearSVM<>& model) : model(model) { }

  template<typename OptimizerType, typename FunctionType, typename MatType>
  bool EndEpoch(OptimizerType& /* optimizer */,
                FunctionType& /* function */,
                const MatType& /* coordinates */,
                const size_t epoch,
                const double /* objective */)
  {
    const std::string filename = "model-" + std::to_string(epoch) + ".bin";
    mlpack::data::Save(filename, "svm", model, true);
    return false; // Do not terminate the optimization.
  }

 private:
  mlpack::LinearSVM<>& model;
};
```
With that callback available, the code to train the model is below:
```c++
// See https://datasets.mlpack.org/satellite.train.csv.
arma::mat dataset;
mlpack::data::Load("satellite.train.csv", dataset, true);
// See https://datasets.mlpack.org/satellite.train.labels.csv.
arma::Row<size_t> labels;
mlpack::data::Load("satellite.train.labels.csv", labels, true);

mlpack::LinearSVM svm;

// Create AdaDelta optimizer with a small step size and batch size of 1.
ens::AdaDelta adaDelta(0.001, 1);
adaDelta.MaxIterations() = 100 * dataset.n_cols; // 100 epochs maximum.

// Use the custom callback and an L2 penalty parameter of 0.01, with default
// delta and fitting an intercept.
svm.Train(dataset, labels, 2, adaDelta, 0.01, 1.0, true, ModelCheckpoint(svm),
    ens::ProgressBar());

// Now files like model-1.bin, model-2.bin, etc. should be saved on disk.
```
Load a linear SVM from disk and print some information about it.
```c++
mlpack::LinearSVM svm;

// This assumes that a model called "svm" has been saved to the file
// "model-1.bin" (as in the previous example).
mlpack::data::Load("model-1.bin", "svm", svm, true);

// Print the dimensionality of the model and some other statistics.
std::cout << "The dimensionality of the model in model-1.bin is "
    << svm.FeatureSize() << "." << std::endl;

if (svm.FitIntercept())
{
  std::cout << "Intercept values for each class: " << std::endl;
  for (size_t i = 0; i < svm.Parameters().n_cols; ++i)
  {
    std::cout << "  - Class " << i << ": "
        << svm.Parameters()(svm.Parameters().n_rows - 1, i) << "." << std::endl;
  }
}
else
{
  std::cout << "The model does not have an intercept fitted." << std::endl;
}

std::cout << "The L2 regularization penalty parameter is: " << svm.Lambda()
    << "." << std::endl;
std::cout << "Weights for the first dimension are: "
    << svm.Parameters().row(0);
```
🔗 Advanced Functionality: Different Element Types
The `LinearSVM` class has one template parameter that can be used to
control the element type of the model. The full signature of the class is:

```c++
LinearSVM<ModelMatType>
```

`ModelMatType` specifies the type of matrix used for training data and internal
representation of model parameters.

- Any matrix type that implements the Armadillo API can be used.
- `Train()` and `Classify()` functions themselves are templatized and can allow any matrix type that has the same element type. So, for instance, a `LinearSVM<arma::mat>` can accept an `arma::sp_mat` for training.
The example below trains a linear SVM on sparse 32-bit floating point data, but uses dense 32-bit floating point matrices to store the model itself.
```c++
// Create random, sparse 100-dimensional data, with 3 classes.
arma::sp_fmat dataset;
dataset.sprandu(100, 5000, 0.3);
arma::Row<size_t> labels =
    arma::randi<arma::Row<size_t>>(5000, arma::distr_param(0, 2));

mlpack::LinearSVM<arma::fmat> svm(dataset, labels, 3);

// Now classify a test point.
arma::sp_fvec point;
point.sprandu(100, 1, 0.3);

size_t prediction;
arma::fvec probabilitiesVec;
svm.Classify(point, prediction, probabilitiesVec);

std::cout << "Prediction for random test point: " << prediction << "."
    << std::endl;
std::cout << "Class probabilities for random test point: "
    << probabilitiesVec.t();
```
Note: dense objects should be used for `ModelMatType`, since in general
L2-regularized models are fully dense.