Learning Data Mining with Python (Second Edition), by Robert Layton
Setting parameters in Random Forests
The Random Forest implementation in scikit-learn is called RandomForestClassifier, and it has a number of parameters. Because Random Forests use many instances of DecisionTreeClassifier, they share many of the same parameters, such as criterion (Gini impurity or entropy/information gain), max_features, and min_samples_split.
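As a minimal sketch, those shared tree-level parameters can be passed straight to the ensemble class; the specific values below are illustrative, not recommendations:

```python
from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier(
    criterion="entropy",   # split quality measure: "gini" or "entropy"
    max_features="sqrt",   # number of features considered at each split
    min_samples_split=4,   # minimum samples needed to split a node
)
```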
There are some new parameters that are used in the ensemble process:
- n_estimators: This dictates how many decision trees should be built. A higher value will take longer to run, but will (probably) result in a higher accuracy.
- oob_score: If true, the model is also evaluated on the out-of-bag samples, that is, the samples that aren't in the random subsamples chosen for training each decision tree.
- n_jobs: This specifies the number of cores to use when training the decision trees in parallel.
The scikit-learn package uses a library called Joblib for inbuilt parallelization. This parameter dictates how many cores to use. By default, only a single core is used; if you have more cores, you can increase this, or set it to -1 to use all cores.
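Putting the three ensemble parameters together, here is a short, self-contained sketch; the dataset is synthetic, generated only so the snippet runs on its own, and the values are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data so the example is self-contained.
X, y = make_classification(n_samples=500, n_features=10, random_state=14)

clf = RandomForestClassifier(
    n_estimators=100,  # build 100 decision trees
    oob_score=True,    # score on the out-of-bag samples after fitting
    n_jobs=-1,         # let Joblib use all available cores
    random_state=14,
)
clf.fit(X, y)
print(clf.oob_score_)  # accuracy estimated from the out-of-bag samples
```

Note that with oob_score=True, the fitted estimator exposes an oob_score_ attribute, which gives an accuracy estimate without needing a separate validation split.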