
Data types in R

In the previous section, we explored the various ways to read data into R. Let's now look at the data types that R supports. Before going into the details of the larger data structures, we will first explore the basic variable data types in R.

Variable data types

The common variable data types in R are numeric, integer, character, and logical. We will explore each of them in turn.

Numeric is the default variable type for all the variables holding numerical values in R:

a <- 10
class(a)
[1] "numeric"

In the preceding code, we assigned a whole number to the variable a, but R still stores it as a numeric (double-precision) value rather than as an integer.

We can convert this numeric variable into an integer using the as.integer function:

a <- as.integer(a)
class(a)
[1] "integer"
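
The distinction between numeric and integer can also be seen without a conversion step. The following is a minimal sketch, assuming the illustrative variable names b and d (they are not part of the original example); it uses the L suffix, which marks an integer literal in R:

```r
# b and d are illustrative names, not part of the original example
b <- 10L                 # the L suffix creates an integer directly
class(b)                 # "integer"
typeof(10)               # "double": plain number literals are stored as doubles
d <- as.integer(3.9)     # as.integer() truncates toward zero, it does not round
d                        # 3
```

Note that as.integer drops the fractional part, so it is worth checking for non-whole values before converting.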

Similarly, here are variables of the character and logical types:

# Character Type
name <- "Sharan"
class(name)
[1] "character"

# Logical Type
flag <- TRUE
class(flag)
[1] "logical"
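
Besides class, R offers is.* functions to test a type and as.* functions to convert between types. The following is a small sketch; the variables x and y are hypothetical and only serve the illustration:

```r
# x is a hypothetical value that arrived as text
x <- "42"
is.character(x)          # TRUE
y <- as.numeric(x)       # convert the text to a number
class(y)                 # "numeric"
is.logical(TRUE)         # TRUE
as.numeric(TRUE)         # 1: TRUE converts to 1 and FALSE to 0
```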

Having explored the variable data types, we will now move up the hierarchy and explore the data structures built from them: vector, matrix, list, factor, and data frame.

A vector is a sequence of elements of the same basic data type, for example a sequence of numeric, character, or logical values. A vector cannot hold elements of different data types. The following are examples of numeric, character, and logical vectors:

v1 <- c(12, 34, -21, 34.5, 100) # numeric vector
class(v1)
[1] "numeric"
v2 <- c("sam", "paul", "steve", "mark") # character vector
class(v2)
[1] "character"
v3 <- c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE) # logical vector
class(v3)
[1] "logical"

Now, let's consider the v1 numeric vector and the v2 character vector, combine these two, and see the resulting vector:

newV <- c(v1,v2)
class(newV)
[1] "character"

We can see that the resulting vector is a character vector. To see what happened to the numeric elements of the first vector, we print newV. In the following output, the numeric elements have been converted into character strings, which are shown in double quotes; the elements of a numeric vector would be printed without quotes:

newV
[1] "12"    "34"    "-21"   "34.5"  "100"   "sam"   "paul"  "steve" "mark"
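
This conversion follows a general rule: when types are mixed in c(), R coerces all elements to the most flexible type present, in the order logical, integer, numeric, character. The following sketch illustrates the rule and shows what happens when converting back; the variable nums is an illustrative name:

```r
class(c(TRUE, 2L))       # "integer": the logical value becomes 1L
class(c(2L, 3.5))        # "numeric": the integer becomes a double
class(c(3.5, "a"))       # "character": the number becomes the string "3.5"

# Converting back only works for strings that look like numbers;
# anything else becomes NA (R also emits a warning, suppressed here).
nums <- suppressWarnings(as.numeric(c("12", "34.5", "sam")))
nums                     # 12.0, 34.5, and NA
```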

A matrix is a collection of elements that has a two-dimensional representation, that is, columns and rows. A matrix can contain elements of the same data type only. We can create a matrix using the following code. First, we pass the intended row names and column names to the rnames and cnames variables, then using the matrix function, we will create the matrix. We specify the row names and column names using the dimnames parameter:

rnames <- c("R1", "R2", "R3", "R4", "R5")
cnames <- c("C1", "C2", "C3", "C4", "C5")
matdata <- matrix(1:25, nrow = 5, ncol = 5, dimnames = list(rnames, cnames))
class(matdata) # R 4.0.0 and later print "matrix" "array" here
[1] "matrix"
typeof(matdata)
[1] "integer"
matdata
   C1 C2 C3 C4 C5
R1  1  6 11 16 21
R2  2  7 12 17 22
R3  3  8 13 18 23
R4  4  9 14 19 24
R5  5 10 15 20 25
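
The output shows that matrix fills its values column by column by default. The following sketch rebuilds the same matrix under the illustrative name m, shows how individual elements, rows, and columns can be selected by name or position, and uses the byrow argument of matrix to fill by row instead (byrow_m is also an illustrative name):

```r
# m is an illustrative copy of the matrix built above
m <- matrix(1:25, nrow = 5, ncol = 5,
            dimnames = list(paste0("R", 1:5), paste0("C", 1:5)))
dim(m)                   # 5 5
m["R2", "C3"]            # 12: select a single element by row and column name
m[2, ]                   # row R2: 2 7 12 17 22
m[, "C1"]                # column C1: 1 2 3 4 5

# byrow = TRUE fills the matrix row by row instead of column by column
byrow_m <- matrix(1:25, nrow = 5, byrow = TRUE)
byrow_m[1, ]             # 1 2 3 4 5
```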

A list is a sequence of data elements similar to a vector, but it can hold elements of different data types. We will combine the variables that we created in the vector section, which hold numeric, character, and logical vectors. Using the list function, we combine them into one object, and each element keeps its own data type:

l1 <- list(v1, v2, v3)
typeof(l1)
[1] "list"
l1
[[1]]
[1]  12.0  34.0 -21.0  34.5 100.0

[[2]]
[1] "sam"   "paul"  "steve" "mark"

[[3]]
[1]  TRUE FALSE  TRUE FALSE  TRUE FALSE
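
The elements of a list can be accessed by position or, if the list is named, by name. The following is a minimal sketch; named_l and its element names (scores, people, flags) are hypothetical and chosen only for illustration:

```r
v1 <- c(12, 34, -21, 34.5, 100)
v2 <- c("sam", "paul", "steve", "mark")
v3 <- c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)

# named_l is a hypothetical named version of the list above
named_l <- list(scores = v1, people = v2, flags = v3)

class(named_l[[1]])      # "numeric": double brackets return the element itself
class(named_l[1])        # "list": single brackets return a one-element list
named_l$people[2]        # "paul": access by name, then index into the vector
sapply(named_l, class)   # the class of each element
length(named_l)          # 3
```

The difference between single and double brackets matters in practice: functions that expect a vector need the element extracted with double brackets or the $ operator.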

Factors are categorical variables in R, which means that they take values from a limited, known set of values called levels. For a factor variable, R internally stores an integer code for each value and maps that code to the corresponding character string.
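
The mapping between integer codes and labels can be inspected directly. The following is a small sketch, assuming a hypothetical sizes variable with three levels given in an explicit order:

```r
# sizes is a hypothetical categorical variable with an explicit level order
sizes <- factor(c("small", "large", "medium", "small"),
                levels = c("small", "medium", "large"))
class(sizes)             # "factor"
levels(sizes)            # "small" "medium" "large"
as.integer(sizes)        # 1 3 2 1: the integer codes stored internally
typeof(sizes)            # "integer"
table(sizes)             # counts per level
```

If the levels argument is omitted, R sorts the distinct values alphabetically to build the levels, so the codes would differ from those shown here.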

A data frame is similar to a matrix, but its columns can hold data elements of different types, while all values within one column share the same type. The data frame will be the most commonly used data type for most of the analysis, because a typical dataset records several attributes for each observation, and each attribute can be of a different type. R comes with a good number of built-in datasets, such as mtcars. As we use sample datasets to cover the various examples in the coming chapters, you will get a better understanding of the data types discussed so far.
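
As a brief illustration, the following sketch builds a small data frame by hand. The name df, its columns, and its values are made up for this example; stringsAsFactors = FALSE is set explicitly so that the character column is not converted to a factor on R versions before 4.0.0:

```r
# df is a made-up example with one column per data type
df <- data.frame(
  name   = c("sam", "paul", "steve"),
  score  = c(12, 34.5, 100),
  passed = c(FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)
class(df)                # "data.frame"
sapply(df, class)        # name: "character", score: "numeric", passed: "logical"
dim(df)                  # 3 3 (rows, columns)
df$score[2]              # 34.5

# The built-in mtcars dataset is a data frame as well
class(mtcars)            # "data.frame"
```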
