- R for Data Science Cookbook
- Yu Wei Chiu (David Chiu)
- 514 words
- 2021-07-14 10:51:29
Adding new records
If you are familiar with databases, you may already know how to use an insert operation to append a new record to a dataset, or an alter operation to add a new column (attribute) to a table. In R, you can perform the equivalent of insert and alter operations much more easily. This recipe introduces the rbind and cbind functions, which let you append a new record or a new attribute to the current dataset.
Getting ready
Refer to the Converting data types recipe and convert each attribute of the imported data into the proper data type. Also, rename the columns of the employees and salaries datasets by following the steps from the Renaming the data variable recipe.
How to do it…
Perform the following steps to add a new record or new variable into the dataset:
- First, use rbind to insert a new record into employees. Without an assignment, the combined result is only printed:
  > rbind(employees, c(10011, '1960-01-01', 'Jhon', 'Doe', 'M', '1988-01-01'))
- We can then assign the combined result of the employees data frame and the new record back to employees:
  > employees <- rbind(employees, c(10011, '1960-01-01', 'Jhon', 'Doe', 'M', '1988-01-01'))
- Besides adding a new record to the original dataset, we can add a new position attribute with NA as the default value:
  > cbind(employees, position = NA)
- Furthermore, we can add a new age attribute, based on a calculation using the current date and the birth_date of each employee (the date functions come from the lubridate package):
  > library(lubridate)
  > span <- interval(ymd(employees$birth_date), now())
  > time_period <- as.period(span)
  > employees$age <- year(time_period)
- Alternatively, we can use the transform function to add multiple variables at once:
  > transform(employees, age = year(time_period), position = "RD", marital = NA)
How it works…
As with database operations, a new record added to a data frame must follow the schema of the dataset (the number of attributes and the data type of each attribute). Here, we first introduced how to use the rbind function to add a new record to a data frame. As the employees dataset consists of six columns, we can add a record with six values to it with rbind. The first column, emp_no, is in integer format, so we do not have to wrap the input value in single quotes. For the first_name and last_name attributes, we can input any character string as a value because we already converted them to character type. For the gender attribute, which is a factor, we can only input either M or F as a value. Keep in mind that c() coerces its arguments to a common type, so the mixed values in this record are passed to rbind as character strings.
In addition to adding a new record to a target dataset, we can add a new variable with the cbind function by passing the variable name and a default value in the call. Here, we use NA as the default value for a new position variable. We can also assign a value calculated from other columns to the new variable. In this demonstration, we first computed each employee's age as the period between their birth date and the current date. Then, we used the dollar sign to assign the computed value to a new attribute, age. Besides using the dollar sign to assign a new variable, we can use the transform function to create the age, position, and marital variables in the employees dataset. Note that cbind and transform return a new data frame, so the result must be assigned back to employees to keep the new variables.
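The column additions described above can be sketched on a small toy data frame. This sketch makes two assumptions that differ from the recipe: it computes age with base R date fields instead of lubridate so that it runs without extra packages, and it uses a fixed ref_date (a hypothetical name) instead of the current date so that the result is reproducible.

```r
# Toy stand-in for the employees data (made-up rows)
employees <- data.frame(
  emp_no     = c(10001L, 10002L),
  birth_date = as.Date(c("1953-09-02", "1964-06-02")),
  stringsAsFactors = FALSE
)

# cbind returns a new data frame; assign it back to keep the column
employees <- cbind(employees, position = NA)

# Age in completed years at a fixed reference date (base R, not lubridate)
ref_date <- as.Date("2021-07-14")
birth <- as.POSIXlt(employees$birth_date)
ref   <- as.POSIXlt(ref_date)
had_birthday <- (ref$mon > birth$mon) |
  (ref$mon == birth$mon & ref$mday >= birth$mday)
age <- (ref$year - birth$year) - ifelse(had_birthday, 0L, 1L)

# transform overwrites the existing position column and appends age and marital
employees <- transform(employees, age = age, position = "RD", marital = NA)
print(employees)
```

With these toy rows, the first employee has not yet reached their birthday at the reference date, so their age is one less than the plain difference in years.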
There's more…
Besides using the dollar sign and the transform function, we can use the with function to create new variables:
> with(employees, year(birth_date))
[1] 1953 1964 1959 1954 1955 1953 1957 1958 1952 1963
> employees$birth_year <- with(employees, year(birth_date))
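The with function is most convenient when an expression refers to several columns, because the column names resolve against the data frame without repeating employees$. The following is a small sketch under stated assumptions: the rows are made-up toy data, the year is extracted with base R's format instead of lubridate's year, and hire_year_gap is a hypothetical variable name used only for illustration.

```r
# Toy stand-in for the employees data (made-up rows)
employees <- data.frame(
  emp_no     = c(10001L, 10002L),
  birth_date = as.Date(c("1953-09-02", "1964-06-02")),
  hire_date  = as.Date(c("1986-06-26", "1985-11-21"))
)

# Inside with(), birth_date refers to employees$birth_date
employees$birth_year <- with(employees, as.integer(format(birth_date, "%Y")))

# An expression that uses two columns at once
employees$hire_year_gap <- with(
  employees,
  as.integer(format(hire_date, "%Y")) - birth_year
)
print(employees)
```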