  • Learning Shiny
  • Hernán G. Resnizky
  • 1227 words
  • 2021-07-09 21:46:10

Element selection

Let's now examine how elements can be selected from objects of the different classes.

Selecting elements from vectors

At this point in the chapter, you probably already suspect that in order to select specific items from a vector, the selection condition must be enclosed in [].

There are basically three ways of selecting elements from vectors in R. They are as follows:

  1. By index: A set of integers that indicate the position of the elements to select:
    > LETTERS[c(1,5,6)]
    [1] "A" "E" "F"
    

    Note

    LETTERS is a built-in character vector in R that contains the entire alphabet in upper case. For lower case, use letters.

    Using negative subscripts removes specific elements from an object (unlike in languages such as Python, where a negative index counts from the end of the sequence):

    > LETTERS[-c(1,5,6)]
    [1] "B" "C" "D" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" "T" "U" "V" "W" "X"
    [22] "Y" "Z"
    

    Note

    In R, indexing starts at 1 and not at 0.

    The reverse order of the vector can be obtained with the rev() function:

    > rev(LETTERS)
    [1] "Z" "Y" "X" "W" "V" "U" "T" "S" "R" "Q" "P" "O" "N" "M" "L" "K" "J" "I" "H" "G" "F"
    [22] "E" "D" "C" "B" "A"
    
  2. By name: The elements in a vector can be named. The names are stored in the names attribute of the vector, which is a character vector of the same length as the vector itself. When fewer names are assigned than there are elements, R fills the missing names with NA:
    > aaa <- 1:10
    > names(aaa) <- LETTERS[1:5]
    > names(aaa)
    [1] "A" "B" "C" "D" "E" NA NA NA NA NA
    

    When a vector has names associated with its elements, they can also be accessed by name:

    > aaa <- 1:10
    > names(aaa) <- LETTERS[1:10]
    > aaa[c("A","C","D")]
    A C D
    1 3 4
    

    When a vector has names associated with its values, the names are also printed by default.

  3. By logical vector: By passing a logical vector, elements matching TRUE are selected and elements matching FALSE are not:
    > aaa <- 1:5
    > aaa[c(T,F,F,T,T)]
    [1] 1 4 5
    

    Tip

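    T and F are shortcuts for TRUE and FALSE. Unlike TRUE and FALSE, they are ordinary variables that can be overwritten, so the full names are safer in scripts.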
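The three selection methods can be compared side by side on a single object. The following is a minimal sketch; the vector scores and its names are hypothetical and serve only as an illustration:

```r
# Hypothetical named vector used only for illustration
scores <- c(a = 10, b = 20, c = 30, d = 40)

by_index   <- scores[c(1, 3)]                      # positions 1 and 3
by_exclude <- scores[-c(1, 3)]                     # everything except positions 1 and 3
by_name    <- scores[c("a", "c")]                  # elements named "a" and "c"
by_logical <- scores[c(TRUE, FALSE, TRUE, FALSE)]  # keep the positions marked TRUE

# A condition on the vector produces a logical vector, which can be used directly
above_15 <- scores[scores > 15]

print(by_index)
print(by_exclude)
print(above_15)
```

In this sketch, by_index, by_name, and by_logical all hold the same two elements, which shows that the choice of method is mostly a matter of convenience.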

In the case of logical vectors, if the vector passed is shorter than the one being selected from, the logical vector is recycled. This means that it is repeated sequentially until it covers the whole length of the vector. If that length is not a multiple of the length of the logical vector, the last repetition is cut short and only the elements needed are used:

> aaa[c(T,F)]
[1] 1 3 5

The T,F vector is recycled as T,F,T,F,T. In the case of c(F,T), the result is 2,4 because the logical vector results in F,T,F,T,F:

> aaa[c(F,T)]
[1] 2 4

Vector recycling is not exclusive to logical vectors. When comparing two vectors, for example, if they differ in length, the shorter one will be recycled in the same way. However, among the three methods of selecting elements from a vector mentioned here, this is the only one where this occurs.
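As a rough illustration of recycling outside of selection, the sketch below applies it to arithmetic and to a comparison. The vector values is a hypothetical example, and its length is deliberately a multiple of the shorter operand so that R does not emit a warning:

```r
values <- 1:6

# Selection: the mask c(FALSE, TRUE) is recycled to F,T,F,T,F,T
every_second <- values[c(FALSE, TRUE)]

# Arithmetic: c(10, 20) is recycled to 10,20,10,20,10,20
sums <- values + c(10, 20)

# Comparison: c(1, 0) is recycled to 1,0,1,0,1,0
matches <- values == c(1, 0)

print(every_second)  # 2 4 6
print(sums)          # 11 22 13 24 15 26
print(matches)       # TRUE FALSE FALSE FALSE FALSE FALSE
```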

Note

As you may have already realized, in order to select the elements of a vector, another vector is passed inside the brackets.

Lastly, it is worth mentioning that when a non-existing element is selected (when the index number is greater than the length of the vector, when a non-existing name is used, or when the logical vector is longer than the selected vector), NAs are returned. NA denotes a missing value.

When the index number is greater than the length of the vector, you get the following:

> aaa <- LETTERS
> aaa[50]
[1] NA

When a non-existing name is used, you get the following:

> aaa <- 1:10
> names(aaa) <- LETTERS[1:10]
> aaa["Z"]
<NA>
NA

Note

In this case, neither the value nor the name exists, so R returns NA both for the name of the element (enclosed in <>) and for the value itself.
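Because a missing element produces NA rather than an error, it can be useful to check for it explicitly. The following sketch, which uses a hypothetical vector x, shows the three situations described above together with one possible way of guarding against a missing name:

```r
x <- c(a = 1, b = 2, c = 3)

beyond_length <- x[5]                           # index greater than length(x)
unknown_name  <- x["z"]                         # name that does not exist
long_mask     <- x[c(TRUE, FALSE, TRUE, TRUE)]  # mask longer than x

print(is.na(beyond_length))  # TRUE
print(is.na(unknown_name))   # TRUE
print(long_mask)             # a, c, and one NA for the non-existing fourth position

# One possible guard: test whether the name exists before selecting
if ("z" %in% names(x)) {
  print(x["z"])
} else {
  print("no element named z")
}
```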

Selecting elements from arrays

The way to select elements from arrays is the same as the preceding method (that is, by index number, by name, and by logical vector), with the sole difference that the selections over the different dimensions of the array are separated by commas. In the case of arrays, each selection does not refer to individual elements but to whole slices along its dimension (for a matrix, entire rows or entire columns).

For example, if the object is a matrix (that is, a two-dimensional array), then [c(1,3),c(2,4)] selects the first and third rows and the second and fourth columns, which yields four values. Consider the following matrix of 16 numbers:

aaa<-matrix(1:16,4,4)
aaa
##      [,1] [,2] [,3] [,4]
## [1,]    1    5    9   13
## [2,]    2    6   10   14
## [3,]    3    7   11   15
## [4,]    4    8   12   16

Applying the selection to this matrix gives the following:

aaa[c(1,3),c(2,4)]
##      [,1] [,2]
## [1,]    5   13
## [2,]    7   15

The result contains the values found in the first and third rows of the second and fourth columns.

However, as the underlying structure of a matrix is a vector, elements can still be selected in the same way as previously explained. The values are stored column by column, so the fifth element is the first value of the second column:

> aaa[5]
[1] 5
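The relation between the two ways of indexing can be sketched as follows. Since a matrix is filled column by column, the position of the element at a given row and column in the underlying vector is (column - 1) * number of rows + row. The variable names below are hypothetical:

```r
m <- matrix(1:16, nrow = 4, ncol = 4)

row_idx <- 3
col_idx <- 2

by_position <- m[row_idx, col_idx]

# Equivalent position in the underlying (column-major) vector
linear_idx <- (col_idx - 1) * nrow(m) + row_idx
by_linear <- m[linear_idx]

print(linear_idx)   # 7
print(by_position)  # 7
print(by_linear)    # 7

# A logical condition on the matrix also selects from the underlying vector
print(m[m > 12])    # 13 14 15 16
```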

The assignment of names in arrays is slightly different. As explained previously, to assign names to the elements of a vector, the attribute to modify is names. This is done either via names(object) or attr(object, "names"). In the case of arrays, the attribute that describes the names along each dimension is dimnames. Unlike names, which is a character vector, dimnames is a list with one element per dimension. For a matrix, this means a character vector for the rows and a character vector for the columns.

For this reason, assigning a character vector directly to dimnames does not work:

> dimnames(aaa) <- LETTERS[1:4]
Error in dimnames(aaa) <- LETTERS[1:4] : 'dimnames' must be a list

R throws an error when you try to pass a vector to dimnames.

As already mentioned, double brackets are needed to access or modify the elements of a list. So in this case, to add names to the rows, the solution would be this:

> dimnames(aaa)[[1]] <- LETTERS[1:4]

When the result is printed, the names appear on the left, as shown here:

aaa
##   [,1] [,2] [,3] [,4]
## A    1    5    9   13
## B    2    6   10   14
## C    3    7   11   15
## D    4    8   12   16

The process is exactly the same for the column names, except that the index number is 2:

> dimnames(aaa)[[2]] <- LETTERS[5:8]
aaa
##   E F  G  H
## A 1 5  9 13
## B 2 6 10 14
## C 3 7 11 15
## D 4 8 12 16
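Since dimnames expects a list, both dimensions can also be named in a single assignment by passing a list with one character vector per dimension. The following is a minimal sketch of that alternative; the object name m is hypothetical:

```r
m <- matrix(1:16, nrow = 4, ncol = 4)

# First list element: row names; second list element: column names
dimnames(m) <- list(LETTERS[1:4], LETTERS[5:8])

print(dimnames(m)[[1]])  # "A" "B" "C" "D"
print(dimnames(m)[[2]])  # "E" "F" "G" "H"
print(m["B", "G"])       # 10
```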

In the case of matrices, there are simpler functions to get or assign row and column names: row.names() and colnames() (rownames() can be used in place of row.names()). They are used in the same way:

> row.names(aaa) <- LETTERS[1:4]
> colnames(aaa) <- LETTERS[5:8]

These two alternatives are equivalent to the ones previously explained. With a named array, elements are selected by name in the same way as with vectors, using one vector of names per dimension:

aaa[c("A","C"), c("E","F")]
##   E F
## A 1 5
## C 3 7

Lastly, to select all the elements from one of the dimensions, the selection for that dimension must be kept empty, but the comma must be maintained:

aaa[1:2,]
##   E F  G  H
## A 1 5  9 13
## B 2 6 10 14

The preceding code will select the first two rows and all the columns.
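The same idea applies to columns. The sketch below also illustrates a detail not covered in the text above: when the selection leaves a single row or column, R simplifies the result to a vector unless drop = FALSE is passed. The object names are hypothetical:

```r
m <- matrix(1:16, nrow = 4, ncol = 4,
            dimnames = list(LETTERS[1:4], LETTERS[5:8]))

first_two_rows <- m[1:2, ]               # rows A and B, all columns
column_g       <- m[, "G"]               # all rows of column G, simplified to a vector
row_a_matrix   <- m[1, , drop = FALSE]   # row A kept as a 1 x 4 matrix

print(dim(first_two_rows))  # 2 4
print(column_g)             # named vector with values 9 10 11 12
print(dim(row_a_matrix))    # 1 4
```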

Selecting elements from lists

As explained earlier in this chapter (see the Lists section), a list is an object whose elements can be objects of any type. For this reason, the notation needs to distinguish between selecting parts of the list (sublists) and accessing the element contained in the list itself. In this sense, Hadley Wickham gives a perfect explanation:

"[ selects sub-lists. It always returns a list; if you use it with a single positive integer, it returns a list of length one. [[ selects an element within a list."

You can get more information at http://adv-r.had.co.nz/Subsetting.html. Have a look at the following snippet:

> #List
> list.ex <- list(a=c(1,2,3),b=c("a","b","c"), c = list(var1="a",var2="b"))
> 
> #List of length one
> class(list.ex[2])
[1] "list"
> 
> #What is inside the second element of the list
> class(list.ex[[2]])
[1] "character"

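Because [[ returns the element itself rather than a sublist, it cannot be used to extract several elements at once. For instance, list.ex[[1:3]] raises an error, since R treats a vector inside double brackets as a path for recursive indexing and not as a set of positions. To select several elements, use single brackets, which return a sublist. Analogously, the elements within a list can be accessed by name in double brackets: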

> #List
> list.ex <- list(a=c(1,2,3),b=c("a","b","c"), c=list(var1="a",var2="b"))
> 
> #Access per name
> list.ex[["b"]]
[1] "a" "b" "c"

Another way of selecting items over lists when the items are named is using the $ operator. In RStudio, in fact, you can see the named elements of the list by pressing Tab after the $ operator:

[Screenshot: Selecting elements from lists]
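The different notations for lists can be put side by side as follows. This is a small sketch that reuses the list.ex object defined above; the other variable names are hypothetical:

```r
list.ex <- list(a = c(1, 2, 3), b = c("a", "b", "c"),
                c = list(var1 = "a", var2 = "b"))

# Single brackets return a sublist, so several elements can be selected
sub_list <- list.ex[c("a", "b")]

# Double brackets and $ return the element itself and can be chained
nested_brackets <- list.ex[["c"]][["var2"]]
nested_dollar   <- list.ex$c$var2

# A vector inside [[ is read as a path: element 3, then element 2 inside it
nested_path <- list.ex[[c(3, 2)]]

print(class(sub_list))   # "list"
print(length(sub_list))  # 2
print(nested_brackets)   # "b"
print(nested_dollar)     # "b"
print(nested_path)       # "b"
```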

Selecting elements from data frames

As previously explained, a data frame is a special type of list in which all elements have the same length. As such, all the ways of selecting elements from lists can also be used with data frames:

> test.data.frame <-data.frame(Variable1=1:10,Variable2=LETTERS[1:10])
> test.data.frame$Variable1
 [1] 1 2 3 4 5 6 7 8 9 10
> test.data.frame[["Variable1"]]
 [1] 1 2 3 4 5 6 7 8 9 10
> test.data.frame[[1]]
 [1] 1 2 3 4 5 6 7 8 9 10

However, due to its matrix-like structure, R also provides the possibility of matrix-like indexing, as shown here:

> test.data.frame[5,1]
[1] 5
> test.data.frame[5,"Variable1"]
[1] 5
> test.data.frame[5,c(T,F)]
[1] 5
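Which notation is used determines whether the result is still a data frame or a plain vector. The following sketch, based on the same test.data.frame, illustrates the difference; the names of the result variables are hypothetical:

```r
test.data.frame <- data.frame(Variable1 = 1:10, Variable2 = LETTERS[1:10])

one_row       <- test.data.frame[5, ]              # matrix-like: a one-row data frame
column_as_df  <- test.data.frame["Variable1"]      # list-like single bracket: a data frame
column_as_vec <- test.data.frame[, "Variable1"]    # matrix-like single column: a vector

# Rows can also be selected with a logical vector built from a column
last_rows <- test.data.frame[test.data.frame$Variable1 >= 8, ]

print(class(one_row))       # "data.frame"
print(class(column_as_df))  # "data.frame"
print(column_as_vec)        # 1 2 3 4 5 6 7 8 9 10
print(nrow(last_rows))      # 3
```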

Finally, it is also possible to select elements over data frames with subset(). This function provides the possibility of selecting observations in the data frame based on conditions related to the variables in it:

> subset(test.data.frame, Variable1 >= 8)
   Variable1 Variable2
8          8         H
9          9         I
10        10         J

The subset() function can also select variables from the data frame through the select argument, which specifies either the variables to keep or the variables to eliminate. In the latter case, the variables must be preceded by a minus sign. The following example illustrates the function's use:

> test.data.frame <- data.frame(Variable1=1:10,Variable2=LETTERS[1:10], Variable3 = LETTERS[11:20])
> subset(test.data.frame, Variable1 >= 8, select = c(Variable1, Variable3))
   Variable1 Variable3
8          8         R
9          9         S
10        10         T
> subset(test.data.frame, Variable1 >= 8, select = -Variable2)
   Variable1 Variable3
8          8         R
9          9         S
10        10         T
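For comparison, the same selection can be written with the matrix-like indexing shown earlier. The sketch below assumes that there are no missing values in Variable1 (subset() drops rows where the condition is NA, while bracket indexing does not); the names of the result variables are hypothetical:

```r
test.data.frame <- data.frame(Variable1 = 1:10,
                              Variable2 = LETTERS[1:10],
                              Variable3 = LETTERS[11:20])

via_subset <- subset(test.data.frame, Variable1 >= 8,
                     select = c(Variable1, Variable3))

# Rows chosen with a logical vector, columns chosen by name
via_brackets <- test.data.frame[test.data.frame$Variable1 >= 8,
                                c("Variable1", "Variable3")]

print(via_subset)
print(via_brackets)
```

The bracket form is more verbose, because the data frame name has to be repeated inside the condition, but it relies only on the selection rules described in this section.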