Machine Learning Algorithms, by Giuseppe Bonaccorso
Non-negative matrix factorization
When the dataset is made up of non-negative elements, it's possible to use non-negative matrix factorization (NNMF) instead of standard PCA. The algorithm optimizes a loss function based on the Frobenius norm, alternating between W and H:
$$L(W, H) = \frac{1}{2} \lVert X - WH \rVert_{Fro}^{2} \quad \text{with } W \ge 0,\ H \ge 0$$
If dim(X) = n x m, then dim(W) = n x p and dim(H) = p x m with p equal to the number of requested components (the n_components parameter), which is normally smaller than the original dimensions n and m.
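Before moving to scikit-learn, a minimal NumPy sketch may help clarify the alternating optimization. It uses the classic Lee-Seung multiplicative updates for the Frobenius loss (scikit-learn's solver is more refined, so the function name and defaults here are purely illustrative):

import numpy as np

def nmf_frobenius(X, p, n_iter=200, eps=1e-9, seed=0):
    # Illustrative NMF via Lee-Seung multiplicative updates (Frobenius loss)
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, p))  # dim(W) = n x p
    H = rng.random((p, m))  # dim(H) = p x m
    for _ in range(n_iter):
        # Alternate the two updates; each step keeps the factors non-negative
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

X = np.random.default_rng(0).random((150, 4))
W, H = nmf_frobenius(X, 3)
print(np.linalg.norm(X - W @ H))  # Frobenius reconstruction error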
The final reconstruction is purely additive, and it has been shown to be particularly efficient for images or text, where the elements are naturally non-negative. In the following snippet, there's an example using the Iris dataset (which is non-negative). The init parameter can assume different values (see the documentation) which determine how the data matrix is initially processed. With init='random', W and H are initialized with random non-negative values that are only scaled (no SVD is performed):
>>> from sklearn.datasets import load_iris
>>> from sklearn.decomposition import NMF
>>> iris = load_iris()
>>> iris.data.shape
(150, 4)
>>> nmf = NMF(n_components=3, init='random', l1_ratio=0.1)
>>> Xt = nmf.fit_transform(iris.data)
>>> nmf.reconstruction_err_
1.8819327624141866
>>> iris.data[0]
array([ 5.1,  3.5,  1.4,  0.2])
>>> Xt[0]
array([ 0.20668461,  1.09973772,  0.0098996 ])
>>> # inverse_transform() expects a 2D array, so the single sample is reshaped
>>> nmf.inverse_transform(Xt[0].reshape(1, -1))
array([[ 5.10401653,  3.49666967,  1.3965409 ,  0.20610779]])
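Here, Xt plays the role of W, while H is exposed through the fitted model's components_ attribute; their product yields the same approximate reconstruction returned by inverse_transform():

>>> import numpy as np
>>> np.dot(Xt[0], nmf.components_)
array([ 5.10401653,  3.49666967,  1.3965409 ,  0.20610779])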
NNMF, together with other factorization methods, will be very useful for more advanced techniques, such as recommendation systems and topic modeling.
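As a small preview of topic modeling, the following sketch factorizes a TF-IDF matrix (non-negative by construction) into document-topic and topic-term weights. The toy corpus is purely illustrative, and get_feature_names_out() requires a recent scikit-learn version:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF

# Toy corpus (illustrative only)
docs = ["the cat sat on the mat",
        "dogs and cats are common pets",
        "stock markets fell sharply",
        "investors fear rising interest rates"]

tfidf = TfidfVectorizer(stop_words='english')
X = tfidf.fit_transform(docs)

nmf = NMF(n_components=2, init='nndsvd', random_state=1000)
W = nmf.fit_transform(X)  # document-topic weights
H = nmf.components_       # topic-term weights

terms = tfidf.get_feature_names_out()
for i, topic in enumerate(H):
    top = topic.argsort()[::-1][:3]
    print('Topic %d:' % i, [terms[j] for j in top])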