- Hands-On Ensemble Learning with R
- Prabhanjan Narayanachar Tattar
- 2021-07-23 19:10:51
Primary Biliary Cirrhosis
The pbc dataset from the survival package is a benchmark dataset in the domain of clinical trials. The Mayo Clinic collected the data, which concerns primary biliary cirrhosis (PBC) of the liver, in a study conducted between 1974 and 1984. More details can be found by running library(survival), followed by ?pbc, on the R terminal. Here, the time to the event of interest is the number of days between registration and the earliest of death, transplantation, or the study analysis in July 1986, and this is captured in the time variable. As in any survival study, the events might be censored, and the indicator is in the status column. The time to event needs to be understood in relation to variables such as trt, age, sex, ascites, hepato, spiders, edema, bili, chol, albumin, copper, alk.phos, ast, trig, platelet, protime, and stage.
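As a minimal sketch of how the dataset described above might be inspected, the following R code loads pbc and builds a survival response from the time and status columns. Treating death (status equal to 2) as the event and everything else as censored is an assumption made here for illustration; the text does not fix a particular coding, and the object names covariates, pbc_surv, and km_fit are hypothetical.

```r
# Load the survival package; the pbc data frame is lazily available after this
library(survival)

# Covariates named in the text
covariates <- c("trt", "age", "sex", "ascites", "hepato", "spiders",
                "edema", "bili", "chol", "albumin", "copper", "alk.phos",
                "ast", "trig", "platelet", "protime", "stage")

# Quick look at the time-to-event column and the censoring indicator
summary(pbc$time)
table(pbc$status, useNA = "ifany")

# Assumption for illustration: death (status == 2) is the event of interest,
# and all other outcomes are treated as censored
pbc_surv <- with(pbc, Surv(time, status == 2))

# Kaplan-Meier estimate of the overall survival curve, ignoring covariates
km_fit <- survfit(pbc_surv ~ 1)
print(km_fit)

# Count missing values per covariate before any modelling
colSums(is.na(pbc[, covariates]))
```

Several covariates contain missing values, so any model fitted on these columns needs an explicit strategy for handling them.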
The eight datasets discussed up until this point have a target variable, also called a regressand or dependent variable, and are examples of the supervised learning problem. On the other hand, there are practical cases in which we simply attempt to understand the data and find useful patterns and groups, or clusters, in it. The purpose of clustering is to find groups of similar observations and to give each group a sensible label. For instance, if we are trying to group cars based on characteristics such as length, width, horsepower, engine cubic capacity, and so on, one clustering solution might yield groups that can be labeled as hatch, sedan, and saloon classes, while another might yield groups labeled as basic, premium, and sports variants. The two main problems posed in clustering are the choice of the number of groups and the formation of robust clusters. We consider a simple dataset from the factoextra R package.
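The two problems named above, choosing the number of groups and forming robust clusters, can be sketched with base R alone. This example is only illustrative: it uses the built-in mtcars data as a stand-in for the car example (not the factoextra dataset the text refers to), k-means as the clustering method, and candidate values of k from 2 to 6, all of which are assumptions rather than choices made in the text.

```r
# Stand-in data: a few car characteristics from the built-in mtcars dataset
# (horsepower, displacement, weight, quarter-mile time), standardized so that
# no single variable dominates the distance calculation
cars_scaled <- scale(mtcars[, c("hp", "disp", "wt", "qsec")])

set.seed(123)  # k-means starts from random centers

# Problem 1: choice of the number of groups.
# Compare the total within-cluster sum of squares across candidate k values
ks <- 2:6
wss <- sapply(ks, function(k) {
  kmeans(cars_scaled, centers = k, nstart = 25)$tot.withinss
})
names(wss) <- ks
print(round(wss, 2))

# Problem 2: robustness of the clusters.
# nstart = 25 reruns the algorithm from 25 random starts and keeps the best,
# which reduces the dependence on a single unlucky initialization
fit3 <- kmeans(cars_scaled, centers = 3, nstart = 25)
table(fit3$cluster)

# Inspect which cars fall into each group before attaching any label
split(rownames(mtcars), fit3$cluster)
```

Whether the resulting groups deserve labels such as basic, premium, or sports is a judgment made after inspecting the members of each cluster; the algorithm itself only returns group indices.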