🔗 NMF
The NMF class implements non-negative matrix factorization, a technique to
decompose a large (potentially sparse) matrix V into two smaller matrices W
and H, such that V ~= W * H, and W and H only contain nonnegative
elements. This technique may be used for dimensionality reduction, or as part
of a recommender system.
The NMF class allows fully configurable behavior via template
parameters. For more general
matrix factorization strategies, see the
AMF (alternating matrix factorization)
class documentation.
Simple usage example:
// Create a random sparse matrix (V) of size 10x100, with 15% nonzeros.
arma::sp_mat V;
V.sprandu(100, 100, 0.15);
// W and H will be low-rank matrices of size 100x10 and 10x100.
arma::mat W, H;
mlpack::NMF nmf; // Step 1: create object.
double residue = nmf.Apply(V, 10, W, H); // Step 2: apply NMF to decompose V.
// Now print some information about the factorized matrices.
std::cout << "W has size: " << W.n_rows << " x " << W.n_cols << "."
<< std::endl;
std::cout << "H has size: " << H.n_rows << " x " << H.n_cols << "."
<< std::endl;
std::cout << "RMSE of reconstructed matrix: "
<< arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem) << "." << std::endl;
Quick links:
- Constructors: create
NMFobjects. Apply(): applyNMFdecomposition to data.- Examples of simple usage and links to detailed example projects.
- Template parameters for using different update rules, initialization strategies, and termination criteria.
- Advanced template examples of use with custom template parameters.
See also:
- mlpack transformations
- Non-negative matrix factorization on Wikipedia
- Learning the parts of objects by non-negative matrix factorization (original NMF paper, pdf)
🔗 Constructors
nmf = NMF()- Create an
NMFobject. - The rank of the decomposition is specified in the call to
Apply().
- Create an
nmf = NMF(SimpleResidueTermination(minResidue=1e-5, maxIterations=10000))- Create an NMF object with custom termination parameters.
minResidue(adouble) specifies the minimum difference of the norm ofW * Hbetween iterations for termination.maxIterationsspecifies the maximum number of iterations before decomposition terminates.
🔗 Applying Decompositions
double residue = nmf.Apply(V, rank, W, H)- Decompose the matrix
Vinto two non-negative matricesWandHwith rankrank. Wwill be set to sizeV.n_rowsxrank.Hwill be set to sizerankxV.n_cols.WandHare initialized randomly using the Acol initialization strategy; i.e., each column ofWis an average of 5 random columns ofV, andHis initialized uniformly randomly.- The residue (change in the norm of
W * Hbetween iterations) is returned.
- Decompose the matrix
Notes:
-
Low values of
rankwill give smaller matricesWandH, but the decomposition will be less accurate. Larger values ofrankwill give more accurate decompositions, but will take longer to compute. Every problem is different, sorankmust be specified manually. -
The expression
W * Hcan be used to reconstruct the matrixV. -
Custom behavior, such as custom initialization of
WandH, different or custom termination rules, and different update rules are discussed in the advanced functionality section.
Apply() Parameters:
| name | type | description |
|---|---|---|
V |
arma::sp_mat or arma::mat |
Input matrix to be factorized. |
rank |
size_t |
Rank of decomposition; lower is smaller, higher is more accurate. |
W |
arma::mat |
Output matrix in which W will be stored. |
H |
arma::mat |
Output matrix in which H will be stored. |
Note: Matrices with different element types can be used for V, W, and
H; e.g., arma::fmat. While V can be sparse or dense, W and H must be
dense matrices.
🔗 Simple Examples
See also the simple usage example for a trivial use of
NMF.
Decompose a dense matrix with custom termination parameters.
// Create a low-rank V matrix by multiplying together two random matrices.
arma::mat V = arma::randu<arma::mat>(500, 25) *
arma::randn<arma::mat>(25, 5000);
// Create the NMF object with a looser tolerance of 1e-3 and a maximum of 100
// iterations only.
mlpack::NMF nmf(mlpack::SimpleResidueTermination(1e-3, 500));
arma::mat W, H;
// Decompose with a rank of 25.
// W will have size 500 x 25, and H will have size 25 x 5000.
const double residue = nmf.Apply(V, 25, W, H);
std::cout << "Residue of decomposition: " << residue << "." << std::endl;
// Compute RMSE of decomposition.
const double rmse = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);
std::cout << "RMSE of decomposition: " << rmse << "." << std::endl;
Decompose the sparse MovieLens dataset using a rank-12 decomposition and float
element type.
// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_fmat V;
mlpack::data::Load("movielens-100k.csv", V, true);
// Create the NMF object.
mlpack::NMF nmf;
arma::fmat W, H;
// Decompose the Movielens dataset with rank 12.
const double residue = nmf.Apply(V, 12, W, H);
std::cout << "Residue of MovieLens decomposition: " << residue << "."
<< std::endl;
// Compute RMSE of decomposition.
const double rmse = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);
std::cout << "RMSE of decomposition: " << rmse << "." << std::endl;
Compare quality of decompositions of MovieLens with different ranks.
// See https://datasets.mlpack.org/movielens-100k.csv
arma::sp_mat V;
mlpack::data::Load("movielens-100k.csv", V, true);
// Create the NMF object.
mlpack::NMF nmf;
arma::mat W, H;
for (size_t rank = 10; rank <= 100; rank += 15)
{
// Decompose with the given rank.
const double residue = nmf.Apply(V, rank, W, H);
const double rmse = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);
std::cout << "RMSE for rank-" << rank << " decomposition: " << rmse << "."
<< std::endl;
}
🔗 Advanced Functionality: Template Parameters
The NMF class has three template parameters that can be used for custom
behavior. The full signature of the class is:
NMF<TerminationPolicyType, InitializationRuleType, UpdateRuleType>
TerminationPolicyType: the strategy used to choose when to terminate NMF.InitializationRuleType: the strategy used to choose the initialWandHmatrices.UpdateRuleType: the update rules used for NMF:NMFMultiplicativeDistanceUpdate: update rule that ensure the Frobenius norm of the reconstruction error is decreasing at each iteration.NMFMultiplicativeDivergenceUpdate: update rules that ensure Kullback-Leibler divergence is decreasing at each iteration.NMFALSUpdate: alternating least-squares projections forWandH.- For custom update rules, use the more general
AMFclass.
🔗 Advanced Functionality: TerminationPolicyType
- Specifies the strategy to use to choose when to stop the NMF algorithm.
- An instantiated
TerminationPolicyTypecan be passed to the NMF constructor. - The following choices are available for drop-in usage:
SimpleResidueTermination (default):
- Terminates when a maximum number of iterations is reached, or when the
residue (change in norm of
W * Hbetween iterations) is sufficiently small. - Constructor:
SimpleResidueTermination(minResidue=1e-5, maxIterations=10000)minResidue(adouble) specifies the sufficiently small residue for termination.maxIterations(asize_t) specifies the maximum number of iterations.
nmf.Apply()will return the residue of the last iteration.
MaxIterationTermination:
- Terminates when the maximum number of iterations is reached.
- No other condition is checked.
- Constructor:
MaxIterationTermination(maxIterations=1000) nmf.Apply()will return the number of iterations performed.
SimpleToleranceTermination<MatType, WHMatType>:
- Terminates when the nonzero residual decreases a sufficiently small relative
amount between iterations (e.g.
(lastNonzeroResidual - nonzeroResidual) / lastNonzeroResidualis below a threshold), or when the maximum number of iterations is reached. - The residual must remain below the threshold for a specified number of iterations.
- The nonzero residual is defined as the root of the sum of squared elements in
the reconstruction error matrix
(V - WH), limited to locations whereVis nonzero. - Constructor:
SimpleToleranceTermination<MatType, WHMatType>(tol=1e-5, maxIter=10000, extraSteps=3)MatTypeshould be set to the type ofV(seeApply()Parameters).WHMatType(defaultarma::mat) should be set to the type ofWandH(seeApply()Parameters).tol(adouble) specifies the relative nonzero residual tolerance for convergence.maxIter(asize_t) specifies the maximum number of iterations before termination.extraSteps(asize_t) specifies the number of iterations where the relative nonzero residual must be below the tolerance for convergence.
- The best
WandHmatrices (according to the nonzero residual) from the finalextraStepsiterations are returned bynmf.Apply(). nmf.Apply()will return the nonzero residue of the iteration corresponding to the bestWandHmatrices.
ValidationRMSETermination<MatType>:
- Holds out a validation set of nonzero elements from
V, and terminates when the RMSE (root mean squared error) on this validation set is sufficiently small between iterations. - The validation RMSE must remain below the threshold for a specified number of iterations.
MatTypeshould be set to the type ofV(seeApply()Parameters).- Constructor:
ValidationRMSETermination<MatType>(V, numValPoints, tol=1e-5, maxIter=10000, extraSteps=3)Vis the matrix to be decomposed byApply(). This will be modified (validation elements will be removed).numValPoints(asize_t) specifies number of test points fromVto be held out.tol(adouble) specifies the relative tolerance for the validation RMSE for termination.maxIter(asize_t) specifies the maximum number of iterations before termination.extraSteps(asize_t) specifies the number of iterations where the validation RMSE must be below the tolerance for convergence.
- The best
WandHmatrices (according to the validation RMSE) from the finalextraStepsiterations are returned bynmf.Apply(). nmf.Apply()will return the best validation RMSE.
Custom policies:
- A custom class for termination behavior must implement the following functions.
// You can use this as a starting point for implementation.
class CustomTerminationPolicy
{
public:
// Initialize the termination policy for the given matrix V. (It is okay to
// do nothing.) This function is called at the beginning of Apply().
//
// If the termination policy requires V to compute convergence, store a
// reference or pointer to it in this function.
template<typename MatType>
void Initialize(const MatType& V);
// Check if convergence has occurred for the given W and H matrices. Return
// `true` if so.
//
// Note that W and H may have different types than V (i.e. V may be sparse,
// and W and H must be dense.)
template<typename WHMatType>
bool IsConverged(const MatType& H, const MatType& W);
// Return the value that should be returned for the `nmf.Apply()` function
// when convergence has been reached. This is called at the end of
// `nmf.Apply()`.
const double Index();
// Return the number of iterations that have been completed. This is called
// at the end of `nmf.Apply()`.
const size_t Iteration();
};
🔗 Advanced Functionality: InitializationRuleType
- Specifies the strategy to use to initialize
WandHat the beginning of the NMF algorithm. - An initialization
InitializationRuleTypecan be passed to the following constructor:nmf = NMF(terminationPolicy, initializationRule)
- The following choices are available for drop-in usage:
RandomAcolInitialization<N> (default):
- Initialize
Wby averagingNrandomly chosen columns ofV. - Initialize
Has uniform random in the range[0, 1]. - The default value for
Nis 5. - See also the paper describing the strategy.
NoInitialization:
- When
nmf.Apply(V, rank, W, H), the existing values ofWandHwill be used. - If
Wis not of sizeV.n_rowsxrank, or ifHis not of sizerankxV.n_cols, astd::invalid_argumentexception will be thrown.
GivenInitialization<MatType>:
- Set
Wand/orHto the given matrices whenApply()is called. MatTypeshould be set to the type ofWorH(defaultarma::mat); seeApply()Parameters.- Constructors:
GivenInitialization<MatType>(W, H)- Specify both initial
WandHmatrices.
- Specify both initial
GivenInitialization<MatType>(M, isW=true)- If
isWistrue, then set initialWtoM. - If
isWisfalse, then set initialHtoM. - This constructor is meant to only be used with
MergeInitialization(below).
- If
RandomInit:
- Initialize
WandHas uniform random in the range[0, 1].
AverageInitialization:
- Initialize each element of
WandHto the square root of the average value ofV, adding uniform random noise in the range[0, 1].
MergeInitialization<WRule, HRule>:
- Use two different initialization rules, one for
W(WRule) and one forH(HRule). - Constructors:
MergeInitialization<WRule, HRule>()- Create the merge initialization with default-constructed rules for
WandH.
- Create the merge initialization with default-constructed rules for
MergeInitialization<WRule, HRule>(wRule, hRule)- Create the merge initialization with instantiated rules for
WandH. wRuleandhRulewill be copied.
- Create the merge initialization with instantiated rules for
- Any
WRuleandHRuleclasses must implement theInitializeOne()function.
Custom rules:
- A custom class for initializing
WandHmust implement the following functions.
// You can use this as a starting point for implementation.
class CustomInitialization
{
public:
// Initialize the W and H matrices, given V and the rank of the decomposition.
// This is called at the start of `Apply()`.
//
// Note that `MatType` may be different from `WHMatType`; e.g., `V` could be
// sparse, but `W` and `H` must be dense.
template<typename MatType, typename WHMatType>
void Initialize(const MatType& V,
const size_t rank,
WHMatType& W,
WHMatType& H);
// Initialize one of the W or H matrices, given V and the rank of the
// decomposition.
//
// If `isW` is `true`, then `M` should be treated as though it is `W`;
// if `isW` is `false`, then `M` should be treated as thought it is `H`.
//
// This function only needs to be implemented if it is intended to use the
// custom initialization strategy with `MergeInitialization`.
template<typename MatType, typename WHMatType>
void InitializeOne(const MatType& V,
const size_t rank,
WHMatType& M,
const bool isW);
};
🔗 Advanced Functionality Examples
Use a pre-specified initialization for W and H.
// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_mat V;
mlpack::data::Load("movielens-100k.csv", V, true);
arma::mat W, H;
// Pre-initialize W and H.
// W will be filled with random values from a normal distribution.
// H will be filled with 1s.
W.randn(V.n_rows, 15);
H.set_size(15, V.n_cols);
H.fill(0.2);
mlpack::NMF<mlpack::SimpleResidueTermination, mlpack::NoInitialization> nmf;
const double residue = nmf.Apply(V, 15, W, H);
const double rmse = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);
std::cout << "RMSE of NMF decomposition with pre-specified W and H: " << rmse
<< "." << std::endl;
Use ValidationRMSETermination to decompose the MovieLens dataset until the
RMSE of the held-out validation set is sufficiently low.
// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_mat V;
mlpack::data::Load("movielens-100k.csv", V, true);
arma::mat W, H;
// Create a ValidationRMSETermination class that will hold out 3k points from V.
// This will remove 3000 nonzero entries from V.
mlpack::ValidationRMSETermination<arma::sp_mat> t(V, 3000);
// Create the NMF object with the instantiated termination policy.
mlpack::NMF<mlpack::ValidationRMSETermination<arma::sp_mat>> nmf(t);
// Perform NMF with a rank of 20.
// Note the RMSE returned here is the RMSE on the validation set.
const double rmse = nmf.Apply(V, 20, W, H);
const double rmseTrain = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);
std::cout << "Training RMSE: " << rmseTrain << "." << std::endl;
std::cout << "Validation RMSE: " << rmse << "." << std::endl;
Use all three sets of NMF update rules and compare the RMSE on a held-out validation set.
// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_mat V;
mlpack::data::Load("movielens-100k.csv", V, true);
arma::mat W1, W2, W3;
arma::mat H1, H2, H3;
// Create a ValidationRMSETermination class that will hold out 3k points from V.
// This will remove 3000 nonzero entries from V.
mlpack::ValidationRMSETermination<arma::sp_mat> t(V, 3000);
// Multiplicative distance update rule.
mlpack::NMF<mlpack::ValidationRMSETermination<arma::sp_mat>,
mlpack::RandomAcolInitialization<5>,
mlpack::NMFMultiplicativeDistanceUpdate> nmf1(t);
// Multiplicative divergence update rule.
mlpack::NMF<mlpack::ValidationRMSETermination<arma::sp_mat>,
mlpack::RandomAcolInitialization<5>,
mlpack::NMFMultiplicativeDivergenceUpdate> nmf2(t);
// Alternating least squares update rule.
mlpack::NMF<mlpack::ValidationRMSETermination<arma::sp_mat>,
mlpack::RandomAcolInitialization<5>,
mlpack::NMFALSUpdate> nmf3(t);
const double rmse1 = nmf1.Apply(V, 15, W1, H1);
const double rmse2 = nmf2.Apply(V, 15, W2, H2);
const double rmse3 = nmf3.Apply(V, 15, W3, H3);
// Print the RMSEs.
std::cout << "Mult. dist. update RMSE: " << rmse1 << "." << std::endl;
std::cout << "Mult. div. update RMSE: " << rmse2 << "." << std::endl;
std::cout << "ALS update RMSE: " << rmse3 << "." << std::endl;
Use a custom termination policy that sets a limit on how long NMF is allowed to take. First, we define the termination policy:
class CustomTimeTermination
{
public:
CustomTimeTermination(const double totalAllowedTime) :
totalAllowedTime(totalAllowedTime) { }
template<typename MatType>
void Initialize(const MatType& /* V */)
{
totalTime = 0.0;
iteration = 0;
c.tic();
}
template<typename WHMatType>
bool IsConverged(const WHMatType& /* W */, const WHMatType& /* H */)
{
totalTime += c.toc();
c.tic();
++iteration;
return (totalTime > totalAllowedTime);
}
const double Index() const { return totalTime; }
const size_t Iteration() const { return iteration; }
private:
double totalAllowedTime;
double totalTime;
size_t iteration;
arma::wall_clock c; // used for convenient timing
};
Then we can use it in the test program:
// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_fmat V;
mlpack::data::Load("movielens-100k.csv", V, true);
CustomTimeTermination t(5 /* seconds */);
mlpack::NMF<CustomTimeTermination> nmf(t);
arma::fmat W, H;
const double actualTime = nmf.Apply(V, 10, W, H);
const double rmse = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);
std::cout << "Actual time used for decomposition: " << actualTime << "."
<< std::endl;
std::cout << "RMSE after ~5 seconds: " << rmse << "." << std::endl;