- Hands-On Ensemble Learning with R
- Prabhanjan Narayanachar Tattar
- 2021-07-23 19:10:50
Pima Indians Diabetes
Diabetes is a mostly incurable health condition, and patients who are diagnosed with it have to adjust their lifestyles to manage it. Based on variables such as pregnant, glucose, pressure, triceps, insulin, mass, pedigree, and age, the problem here is to classify a person as diabetic or not. Here, we have 768 observations. This dataset is drawn from the mlbench package:
> library(mlbench)
> data("PimaIndiansDiabetes")
> set.seed(12345)
> Train_Test <- sample(c("Train","Test"),nrow(PimaIndiansDiabetes),replace = TRUE,
+   prob = c(0.7,0.3))
> head(Train_Test)
[1] "Test"  "Test"  "Test"  "Test"  "Train" "Train"
> PimaIndiansDiabetes_Train <- PimaIndiansDiabetes[Train_Test=="Train",]
> PimaIndiansDiabetes_TestX <- within(PimaIndiansDiabetes[Train_Test=="Test",],
+   rm(diabetes))
> PimaIndiansDiabetes_TestY <- PimaIndiansDiabetes[Train_Test=="Test","diabetes"]
> PID_Formula <- as.formula("diabetes~.")
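Because each row is labelled independently with probabilities 0.7 and 0.3, the training and test partitions are only approximately 70% and 30% of the 768 observations. The following is a minimal sketch in base R that contrasts this labelling approach with a fixed-size split. The fixed-size alternative and the names n, labels_random, n_train, train_idx, and labels_exact are illustrative assumptions, not part of the book's code.

```r
# Number of rows in PimaIndiansDiabetes, as stated in the text
n <- 768

# Approach used in the text: label each row independently, so the
# Train/Test sizes are random and only roughly 70/30.
set.seed(12345)
labels_random <- sample(c("Train", "Test"), n, replace = TRUE,
                        prob = c(0.7, 0.3))
random_sizes <- table(labels_random)
print(random_sizes)
print(round(prop.table(random_sizes), 3))

# Alternative (an assumption, not from the text): draw a fixed number of
# training row indices without replacement, giving an exact split size.
set.seed(12345)
n_train <- floor(0.7 * n)
train_idx <- sample(seq_len(n), size = n_train)
labels_exact <- ifelse(seq_len(n) %in% train_idx, "Train", "Test")
exact_sizes <- table(labels_exact)
print(exact_sizes)
```

Either labelling vector can be used to subset the data frame in the same way as Train_Test above. The fixed-size variant may be preferable when the partition sizes need to be reproducible across datasets of the same length.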
The five datasets described up to this point are all classification problems. Next, we look at one example each of regression, time series, survival, clustering, and outlier detection problems.