
TensorForest Estimator

TensorForest is a highly scalable implementation of random forests, built by combining a variety of online Hoeffding tree algorithms with the extremely randomized approach.
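As background on the Hoeffding tree idea mentioned above, the following minimal sketch shows the standard Hoeffding bound that such online tree learners generally use to decide when enough examples have been seen to commit to a split. The function names hoeffding_bound and can_split are illustrative only and are not part of TensorForest; the criteria TensorForest itself uses are described in the paper.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, the true mean of a
    variable with the given range lies within this distance of the mean
    observed over n independent examples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_split(best_score, second_score, value_range, delta, n):
    """Illustrative rule: commit to the best candidate split once its lead
    over the runner-up exceeds the Hoeffding bound."""
    return (best_score - second_score) > hoeffding_bound(value_range, delta, n)
```

The bound shrinks as n grows, so a leaf that keeps receiving examples eventually becomes confident enough to split even when the two best candidates score close to each other.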

Google published the details of the TensorForest implementation in the paper TensorForest: Scalable Random Forests on TensorFlow by Thomas Colthurst, D. Sculley, Gilbert Hendry, and Zack Nado, presented at the Machine Learning Systems Workshop at the Conference on Neural Information Processing Systems (NIPS) 2016. The paper is available at the following link: https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxtbHN5c25pcHMyMDE2fGd4OjFlNTRiOWU2OGM2YzA4MjE.

TensorForest estimators implement the following algorithm:

Initialize the variables and sets:
    Tree = [root]
    Fertile = {root}
    Stats(root) = 0
    Splits[root] = []

Divide the training data into batches.
Repeat:
    For each batch of training data:
        Compute the leaf assignment for each feature vector
        Update the leaf stats in Stats
        For each leaf in the Fertile set:
            if |Splits[leaf]| < max_splits
                then add a split on a randomly selected feature to Splits[leaf]
            else if the leaf is fertile and |Splits[leaf]| = max_splits
                then update the split stats for the leaf
        Calculate the fertile leaves that are finished.
        For every non-stale finished leaf:
            turn the leaf into an internal node with its best scoring split
            remove the leaf from Fertile
            add the leaf's two children to Tree as leaves
        If |Fertile| < max_fertile
            then add the max_fertile - |Fertile| leaves with
            the highest weighted leaf scores to Fertile and
            initialize their Splits and split statistics.
Until |Tree| = max_nodes or |Tree| stays the same for max_batches_to_grow batches
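
To make the control flow of the training loop above more concrete, here is a minimal single-process sketch in plain Python. It is not the TensorForest implementation and does not use the TensorFlow API. Several details are assumptions made for illustration only: a leaf counts as finished after a fixed number of examples (the hypothetical finish_after parameter), candidate thresholds are taken from the example that triggers the split, splits and leaves are scored with count-weighted Gini impurity, children inherit the class counts of the winning split, and the non-stale check is omitted because nothing runs concurrently here.

```python
import itertools
import random
from collections import Counter


def weighted(counts):
    """Count-weighted Gini impurity (assumed score for splits and leaves)."""
    n = sum(counts.values())
    return n * (1.0 - sum((c / n) ** 2 for c in counts.values())) if n else 0.0


class Node:
    def __init__(self, stats=None):
        self.stats = Counter(stats or {})  # Stats(leaf): class counts
        self.splits = []  # Splits[leaf]: (feature, threshold, left, right)
        self.seen = 0     # examples seen while the leaf was fertile
        self.feature = self.threshold = self.left = self.right = None


def leaf_of(tree, x):
    """Index of the leaf that feature vector x is assigned to."""
    i = 0
    while tree[i].left is not None:
        node = tree[i]
        i = node.left if x[node.feature] <= node.threshold else node.right
    return i


def train_tree(batches, num_features, max_nodes=15, max_fertile=2,
               max_splits=3, finish_after=4, max_batches_to_grow=5, seed=0):
    rng = random.Random(seed)
    tree, fertile, unchanged = [Node()], {0}, 0
    for batch in itertools.cycle(batches):
        size_before = len(tree)
        for x, y in batch:
            i = leaf_of(tree, x)
            leaf = tree[i]
            leaf.stats[y] += 1                      # update leaf stats
            if i not in fertile:
                continue
            leaf.seen += 1
            if len(leaf.splits) < max_splits:       # add a random candidate
                f = rng.randrange(num_features)
                leaf.splits.append((f, x[f], Counter(), Counter()))
            else:                                   # update split stats
                for f, t, left, right in leaf.splits:
                    (left if x[f] <= t else right)[y] += 1
        # Turn finished fertile leaves into internal nodes.
        for i in [j for j in sorted(fertile) if tree[j].seen >= finish_after]:
            usable = [s for s in tree[i].splits if s[2] and s[3]]
            if not usable or len(tree) + 2 > max_nodes:
                continue
            f, t, left, right = min(
                usable, key=lambda s: weighted(s[2]) + weighted(s[3]))
            node = tree[i]
            node.feature, node.threshold = f, t
            node.left, node.right = len(tree), len(tree) + 1
            tree += [Node(left), Node(right)]
            fertile.discard(i)
        # Refill Fertile with the highest-scoring remaining leaves.
        need = max_fertile - len(fertile)
        if need > 0:
            idle = [j for j, n in enumerate(tree)
                    if n.left is None and j not in fertile]
            idle.sort(key=lambda j: weighted(tree[j].stats), reverse=True)
            for j in idle[:need]:
                tree[j].splits, tree[j].seen = [], 0
                fertile.add(j)
        unchanged = unchanged + 1 if len(tree) == size_before else 0
        if len(tree) >= max_nodes or unchanged >= max_batches_to_grow:
            break
    return tree


def predict(tree, x):
    """Majority class of the leaf that x falls into (None if no stats)."""
    stats = tree[leaf_of(tree, x)].stats
    return stats.most_common(1)[0][0] if stats else None


# Tiny usage example: one feature, class 0 for small values, class 1 otherwise.
batch = [([0.4], 0), ([0.9], 1), ([0.1], 0), ([0.6], 1), ([0.3], 0), ([0.8], 1)]
tree = train_tree([batch], num_features=1, max_nodes=3, max_splits=2)
```

Calling predict(tree, x) on a feature vector walks it down to a leaf and returns that leaf's majority class. A real forest would train many such trees and combine their predictions.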

More details of this algorithm implementation can be found in the TensorForest paper.
