- R for Data Science Cookbook
- Yu Wei Chiu (David Chiu)
- 2021-07-14 10:51:28
Renaming the data variable
A data frame lets the user select and filter data by row names and column names. Because not all imported datasets come with meaningful row names and column names, we may need to assign them ourselves with R's built-in naming functions.
Getting ready
In this recipe, you need to prepare your environment with R installed and a computer that can access the Internet.
How to do it…
Perform the following steps to rename data:
- First, download employees.csv from the GitHub link https://github.com/ywchiu/rcookbook/raw/master/chapter3/employees.csv:
  > download.file("https://github.com/ywchiu/rcookbook/raw/master/chapter3/employees.csv", "employees.csv")
- Additionally, download salaries.csv from the GitHub link https://github.com/ywchiu/rcookbook/raw/master/chapter3/salaries.csv:
  > download.file("https://github.com/ywchiu/rcookbook/raw/master/chapter3/salaries.csv", "salaries.csv")
- Next, read both files into an R session with the read.csv function, setting header=FALSE because the files have no header row:
  > employees <- read.csv('employees.csv', header=FALSE)
  > salaries <- read.csv('salaries.csv', header=FALSE)
- Use the names function to view the column names of each dataset:
  > names(employees)
  [1] "V1" "V2" "V3" "V4" "V5" "V6"
  > names(salaries)
  [1] "V1" "V2" "V3" "V4"
- Next, rename the columns by assigning a character vector of new names with the names function:
  > names(employees) <- c("emp_no", "birth_date", "first_name", "last_name", "gender", "hire_date")
  > names(employees)
  [1] "emp_no" "birth_date" "first_name" "last_name"
  [5] "gender" "hire_date"
- Besides using names, you can also rename columns with the colnames function:
  > colnames(salaries) <- c("emp_no", "salary", "from_date", "to_date")
  > colnames(salaries)
  [1] "emp_no" "salary" "from_date" "to_date"
- In addition to revising the column names, we can also revise row names with the rownames function:
  > rownames(salaries) <- salaries$emp_no
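The steps above can be tried without downloading anything. The following is a minimal sketch, assuming a small invented stand-in for salaries.csv (the variable names `csv_text` and `toy` and all values are hypothetical); it reads header-less CSV text, shows the default names, and then assigns column and row names.

```r
# Toy stand-in for salaries.csv; these values are invented for illustration.
csv_text <- "10001,60117,1986-06-26,1987-06-26
10002,65828,1996-08-03,1997-08-03
10003,40006,1995-12-03,1996-12-02"

# header = FALSE tells read.csv that the first line is data, not column names.
toy <- read.csv(text = csv_text, header = FALSE)
names(toy)              # default names: "V1" "V2" "V3" "V4"

# Assign descriptive column names, then use one column as row names.
names(toy) <- c("emp_no", "salary", "from_date", "to_date")
rownames(toy) <- toy$emp_no

# Rows and columns can now be selected by name.
toy["10002", "salary"]
```

Row names of a data frame must be unique, so the rownames assignment only succeeds when each emp_no value appears once.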
How it works…
In this recipe, we demonstrated how to rename the columns and rows of a dataset. First, we used the download.file function to download both salaries.csv and employees.csv from GitHub, and read them in with read.csv. Then, we used the names function to examine the column names of these two datasets. To revise the column names, we assigned a character vector of new names through the names function; the colnames function can be used in the same way. Finally, we set the row names of the salaries dataset to the values of emp_no with the rownames function.
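Because names and colnames both return an ordinary character vector for a data frame, a single column can also be renamed by indexing into that vector. This is a small sketch on a hypothetical two-column data frame `df`, not part of the original recipe.

```r
# Hypothetical data frame with the default names that read.csv assigns.
df <- data.frame(V1 = c(10001, 10002), V2 = c(60117, 65828))

# For a data frame, names() and colnames() return the same character vector.
identical(names(df), colnames(df))

# Rename one column by matching its current name.
names(df)[names(df) == "V2"] <- "salary"

# Rename one column by position.
colnames(df)[1] <- "emp_no"

names(df)
```

Matching on the current name is usually the safer choice when the column order might change.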
There's more…
To avoid having to specify column names and row names separately with the colnames and rownames functions, we can use the dimnames function to specify both column names and row names in one operation:
> dimnames(employees) <- list(c(1,2,3,4,5,6,7,8,9,10), c("emp_no", "birth_date", "first_name", "last_name", "gender", "hire_date"))
In this code, the first vector within the list specifies the row names, and the second vector specifies the column names.
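As a self-contained illustration of the same idea, the sketch below applies dimnames to a hypothetical three-row data frame (the data and the row labels "a", "b", "c" are invented). It assumes, as holds for data frames, that each vector's length matches the corresponding dimension.

```r
# Hypothetical three-row data frame; values are invented for illustration.
df <- data.frame(V1 = c(10001, 10002, 10003),
                 V2 = c("Georgi", "Bezalel", "Parto"))

# The first element holds the row names, the second the column names.
# Each must match the corresponding dimension of the data frame.
dimnames(df) <- list(c("a", "b", "c"), c("emp_no", "first_name"))

dimnames(df)            # a list: row names first, then column names
df["b", "first_name"]
```

Calling dimnames without an assignment returns the current row and column names as a two-element list, which is a convenient way to inspect both at once.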