
Setting parameters in Random Forests

The Random Forest implementation in scikit-learn is called RandomForestClassifier, and it has a number of parameters. As Random Forests use many instances of DecisionTreeClassifier, they share many of the same parameters such as the criterion (Gini Impurity or Entropy/information gain), max_features, and min_samples_split.

There are some new parameters that are used in the ensemble process:

  • n_estimators: This dictates how many decision trees should be built. A higher value will take longer to run, but will (probably) result in a higher accuracy.
  • oob_score: If true, the model is evaluated on the out-of-bag samples, that is, the samples that weren't included in the random subsample chosen to train each decision tree.
  • n_jobs: This specifies the number of cores to use when training the decision trees in parallel.

The scikit-learn package uses a library called Joblib for built-in parallelization. This parameter dictates how many cores to use. By default, only a single core is used; if you have more cores, you can increase this, or set it to -1 to use all cores.
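
As a minimal sketch of how these options fit together (the synthetic dataset and specific parameter values below are chosen purely for illustration), the ensemble-level parameters can be combined with the tree-level ones like this:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# A small synthetic dataset, used here only to make the example runnable
X, y = make_classification(n_samples=500, n_features=20, random_state=14)

# Tree-level parameters (criterion, max_features, min_samples_split) sit
# alongside the ensemble-level ones discussed above
clf = RandomForestClassifier(n_estimators=100,      # number of trees to build
                             criterion="entropy",   # or "gini"
                             max_features="sqrt",
                             min_samples_split=4,
                             oob_score=True,        # evaluate on out-of-bag samples
                             n_jobs=-1,             # use all available cores
                             random_state=14)
clf.fit(X, y)

# With oob_score=True, the out-of-bag accuracy estimate is available after fitting
print("Out-of-bag accuracy: {:.3f}".format(clf.oob_score_))

Setting oob_score=True gives a cheap estimate of generalization accuracy without a separate validation split, since each tree is tested only on the samples it never saw during training.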
