
Setting parameters in Random Forests

The Random Forest implementation in scikit-learn is called RandomForestClassifier, and it has a number of parameters. As Random Forests use many instances of DecisionTreeClassifier, they share many of the same parameters such as the criterion (Gini Impurity or Entropy/information gain), max_features, and min_samples_split.
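As a minimal sketch (the parameter values here are illustrative, not taken from the text), these shared tree-level parameters are passed directly to the forest and apply to every tree it builds:

from sklearn.ensemble import RandomForestClassifier

# Tree-level parameters are set on the forest itself and are forwarded
# to each DecisionTreeClassifier in the ensemble.
clf = RandomForestClassifier(criterion='entropy',    # or 'gini', the default
                             max_features='sqrt',    # features considered at each split
                             min_samples_split=4)    # minimum samples needed to split a node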

There are some new parameters that are used in the ensemble process:

  • n_estimators: This dictates how many decision trees should be built. A higher value will take longer to run, but will (probably) result in a higher accuracy.
  • oob_score: If True, the model is evaluated on the out-of-bag samples, that is, the samples that weren't included in the random subsamples chosen to train each decision tree.
  • n_jobs: This specifies the number of cores to use when training the decision trees in parallel.

The scikit-learn package uses a library called Joblib for built-in parallelization, and this parameter dictates how many cores to use. By default, only a single core is used. If you have more cores, you can increase this value, or set it to -1 to use all available cores.
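A minimal sketch putting these ensemble parameters together; the synthetic dataset and the specific values are illustrative assumptions, not part of the original example:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Illustrative synthetic dataset (assumption, for demonstration only)
X, y = make_classification(n_samples=1000, n_features=20, random_state=14)

clf = RandomForestClassifier(n_estimators=100,  # number of decision trees to build
                             oob_score=True,    # evaluate on out-of-bag samples
                             n_jobs=-1)         # use all available cores
clf.fit(X, y)

# With oob_score=True, the out-of-bag accuracy estimate is stored after fitting
print(clf.oob_score_)

Raising n_estimators generally improves accuracy at the cost of longer training time, while n_jobs only affects speed, not the fitted model.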
