
Introducing a scatter plot

Scatter plots are used primarily for a quick analysis of the relationship between variables in our data. Each observation is drawn as a point, with one variable on the x-axis and the other on the y-axis. Scatter plots help us detect whether two variables have a positive relationship, a negative relationship, or no relationship at all. In this recipe, we will study the basics of plotting in R using scatter plots. The following screenshot is an example of a scatter plot:

[Screenshot: example of a scatter plot]

Getting ready

To implement a basic scatter plot in R, we will use the Carseats data available in the ISLR package.

How to do it…

We start this recipe by installing the necessary package using the install.packages() function and loading it in R using the library() function:

install.packages("ISLR")
library(ISLR)

Next, we need to load the data in R. Many R packages ship with datasets, and these become available once the package has been loaded with library(). To see the datasets available in the currently loaded packages, along with the package each one belongs to, type data() in the R console window. The attach() function attaches the data to our R session, which allows us to access the variables (columns) of a data frame directly by name:

attach(Carseats)
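
The dataset listing mentioned above can also be narrowed to a single package. The following is a minimal sketch, assuming we only want to inspect one package at a time; it uses the datasets package, which ships with base R, so that it runs without ISLR installed. The variable names listing and available are illustrative only.

```r
# List the datasets bundled with one package rather than every loaded package.
# "datasets" ships with base R; substitute "ISLR" once that package is installed.
listing <- data(package = "datasets")

# The listing stores one row per dataset in a character matrix named `results`.
available <- listing$results[, c("Item", "Title")]
head(available)

# Check whether a particular dataset name appears in the listing.
"mtcars" %in% available[, "Item"]
```

Replacing "datasets" with "ISLR" should show Carseats among the listed items once the package is installed.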

Once we attach the data, it is good practice to inspect it using head(Carseats). The head() function displays the first six rows of the dataset and shows us the exact column headings of the data:

head(Carseats)

The data can be plotted in R by calling the plot() function. The plot() function comes with a variety of options, and the best way to learn about them is to type ?plot in the R console window:

plot(Income, Sales, col = c(Urban), pch = 20, main = "Sales of Child Car Seats", xlab = "Income (000's of Dollars)", ylab = "Unit Sales (in 000's)")

This particular plot needs a legend because the points are drawn in two different colors. In R, we can add a legend using the legend() function:

legend("topright", cex = 0.6, fill = c("red", "black"), legend = c("Yes", "No"))

How it works…

Readers who are new to R should definitely read the recipe Installing packages and getting help in R in Chapter 1, A Simple Guide to R. The install.packages() and library() functions are used in most of the recipes in this book.
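
Since install.packages() downloads the package every time it is called, it can be convenient to install only when the package is missing. The following is a minimal sketch of that idea; load_or_install() is a hypothetical helper written for illustration and is not part of the book's code or of base R.

```r
# Hypothetical helper (not from the book): install a package only when it is
# missing, then attach it to the session.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# "stats" ships with R, so this call never triggers a download.
load_or_install("stats")

# For this recipe the call would be:
# load_or_install("ISLR")
```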

The attach() function is a convenient way to reference the data, as it allows us to avoid typing the $ notation. The $ notation is another way to reference columns in data and is discussed in the next recipe. Once we attach the data, it is good practice to inspect it using head(Carseats). The head() function takes the data as its first argument and, optionally, the number of rows as its second; to view fewer rows in the R console window, we can type head(Carseats, 3). The tail(Carseats) function displays the last six rows of the dataset.
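
The alternatives to attach() mentioned above can be sketched as follows. This example assumes a small made-up data frame named toy that only borrows three Carseats column names; its values are invented for illustration and are not taken from the real dataset.

```r
# Made-up rows that only borrow three Carseats column names for illustration;
# replace `toy` with Carseats after library(ISLR).
toy <- data.frame(
  Sales  = c(9, 11, 10, 7, 4),
  Income = c(70, 50, 35, 100, 65),
  Urban  = factor(c("Yes", "No", "Yes", "Yes", "No"))
)

# $ notation: name the data frame explicitly each time a column is used.
mean_sales <- mean(toy$Sales)

# with(): evaluate an expression inside the data frame without attaching it.
mean_income <- with(toy, mean(Income))

# head() and tail() accept the number of rows as an optional second argument.
first_three <- head(toy, 3)
last_two <- tail(toy, 2)
```

Both $ and with() leave the search path untouched, so a column such as Sales cannot be confused with another object of the same name in the workspace.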

The data can be plotted in R by calling the plot() function. The first two arguments in the plot() function refer to the data to be displayed on the x-axis (Income) and y-axis (Sales). The col argument allows us to assign color to our data points. In this case, we use a qualitative data column (Urban) to color our points: Urban is a factor with the levels No and Yes, which R stores as the integer codes 1 and 2, and these codes select the first two colors of the default palette (black, then red). If col is omitted, all points are drawn in black; to draw them all in another single color, we can pass a color name such as col = "blue". Please refer to the code file to learn about various other options. The pch argument selects the plotting symbol; the value 20 plots small filled circles. To view all the available pch values, please type ?par or ?points in the R console window. The main argument sets the heading of the plot, for example main = "Sales". The xlab and ylab arguments are used to label the x and y axes.
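
To make the mapping from the Urban column to point colors explicit, the colors can be looked up by level name instead of relying on the factor's integer codes. The following is a minimal sketch under stated assumptions: toy is a small made-up data frame that only borrows Carseats column names, and urban_cols is a hypothetical lookup vector, neither of which appears in the recipe's code.

```r
# Made-up rows for illustration only; replace `toy` with Carseats after
# library(ISLR).
toy <- data.frame(
  Sales  = c(9, 11, 10, 7, 4),
  Income = c(70, 50, 35, 100, 65),
  Urban  = factor(c("Yes", "No", "Yes", "Yes", "No"))
)

# A factor is stored as integer codes; levels sort alphabetically,
# so "No" is 1 and "Yes" is 2.
codes <- as.integer(toy$Urban)

# Named lookup vector: indexing by level name spells out which color
# belongs to which group.
urban_cols <- c(No = "black", Yes = "red")
point_cols <- unname(urban_cols[as.character(toy$Urban)])

plot(toy$Income, toy$Sales, col = point_cols, pch = 20,
     main = "Sales of Child Car Seats",
     xlab = "Income (000's of Dollars)",
     ylab = "Unit Sales (in 000's)")
```

Naming the colors directly also keeps the plot independent of the active palette, whose default colors differ between R versions.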

A legend is necessary for this scatter plot because we would like to differentiate between sales in urban and rural areas. The first argument in the legend() function specifies the position of the legend. The cex argument scales the size of the text; its default value is 1. The fill argument fills the boxes with the specified colors, and the legend argument supplies the label for each box, in the same order as the colors.
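
Because the legend labels and colors are typed by hand, they can drift out of step with the plot. One way to avoid this is to build both from a single lookup vector. The following is a minimal sketch, again assuming a made-up data frame toy and a hypothetical vector urban_cols; it also uses pch in the legend instead of fill so that the key shows the same symbol as the plot, which is a design choice rather than the recipe's method.

```r
# Made-up rows for illustration only; replace `toy` with Carseats after
# library(ISLR).
toy <- data.frame(
  Sales  = c(9, 11, 10, 7, 4),
  Income = c(70, 50, 35, 100, 65),
  Urban  = factor(c("Yes", "No", "Yes", "Yes", "No"))
)

# One lookup vector drives both the point colors and the legend.
urban_cols <- c(No = "black", Yes = "red")

plot(toy$Income, toy$Sales,
     col = urban_cols[as.character(toy$Urban)], pch = 20,
     xlab = "Income (000's of Dollars)",
     ylab = "Unit Sales (in 000's)")

# Labels come from the vector's names and colors from its values,
# so the two always stay in the same order.
legend("topright", legend = names(urban_cols), col = urban_cols,
       pch = 20, cex = 0.6, title = "Urban")
```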
