Creating heat maps

Heat maps are color-coded images that summarize a large amount of data by highlighting hotspots or key trends.

How to do it...

There are a few different ways to make heat maps in R. The simplest is to use the heatmap() function from the stats package, which is loaded by default in a standard R installation:

heatmap(as.matrix(mtcars),
        Rowv = NA,
        Colv = NA,
        col = heat.colors(256),
        scale = "column",
        margins = c(2, 8),
        main = "Car characteristics by Model")

How it works...

The example code has a lot of arguments, so it might look difficult at first sight. However, if we consider each argument in turn, we can understand how it works. The first argument to the heatmap() function is the dataset. We are using the built-in dataset mtcars, which holds data such as fuel efficiency (mpg), number of cylinders (cyl), weight (wt), and so on for different models of cars. The data needs to be in a matrix format, so we use the as.matrix() function. Rowv and Colv specify whether and how dendrograms should be displayed to the left and top of the heat map.

Note

See help(dendrogram) and http://en.wikipedia.org/wiki/Dendrogram for details on dendrograms.

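In our example, we suppress them by setting both arguments to NA, which is R's logical indicator of a missing value. The col argument sets the color palette, here 256 colors generated by heat.colors(). The scale argument tells R whether the values should be centered and scaled by row or by column before colors are assigned. We have set it to column, so the gradient is calculated separately for each column, which lets variables measured in different units share one color range. The margins argument reserves space for the column and row labels, and main sets the plot title.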
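To see what per-column scaling means in practice, the following sketch reproduces the transformation outside the plot. It assumes that scale="column" centers each column and divides it by its standard deviation, as the heatmap() help page describes; the names cars and cars_scaled are illustrative and do not come from the recipe.

```r
# Convert the built-in mtcars data frame to a numeric matrix
cars <- as.matrix(mtcars)

# Center and scale each column, mirroring scale = "column" in heatmap()
cars_scaled <- scale(cars)

# After scaling, every column has mean 0 and standard deviation 1,
# so variables measured in different units share one color range
round(colMeans(cars_scaled), 10)
apply(cars_scaled, 2, sd)

# For comparison: without scaling, large-valued columns such as
# disp and hp dominate the color gradient
heatmap(cars, Rowv = NA, Colv = NA, col = heat.colors(256),
        scale = "none", margins = c(2, 8),
        main = "Car characteristics (unscaled)")
```

Comparing this unscaled plot with the one from the recipe shows why column scaling is a sensible choice when the variables have very different ranges.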

There's more...

Heat maps are also useful for examining correlations between variables in a large dataset. For example, in bioinformatics, heat maps are often used to study the correlations between groups of genes.

Let's look at an example that uses the genes.csv example data file. First, we load the file:

genes <- read.csv("genes.csv", header = TRUE)

Next, we use the image() function to create a correlation heat map:

rownames(genes) <- colnames(genes)

image(x = 1:ncol(genes),
      y = 1:nrow(genes),
      z = t(as.matrix(genes)),
      axes = FALSE,
      xlab = "",
      ylab = "",
      main = "Gene Correlation Matrix")

axis(1, at = 1:ncol(genes), labels = colnames(genes), col = "white",
     las = 2, cex.axis = 0.8)
axis(2, at = 1:nrow(genes), labels = rownames(genes), col = "white",
     las = 1, cex.axis = 0.8)

We used a few new commands and arguments in the previous example, especially to format the axes. We discuss these in detail starting in Chapter 3, Beyond the Basics – Adjusting Key Parameters, and with more examples in later chapters.
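Because genes.csv is not reproduced here, the same steps can be tried on a correlation matrix computed from the built-in mtcars dataset. This is a minimal sketch under that substitution; the name corr and the plot title are illustrative and do not come from the recipe.

```r
# Build a square correlation matrix from the built-in mtcars data
corr <- cor(mtcars)

# Draw the matrix as a grid of colored cells; image() expects z to have
# one row per x value, hence the transpose
image(x = 1:ncol(corr),
      y = 1:nrow(corr),
      z = t(corr),
      axes = FALSE,
      xlab = "",
      ylab = "",
      main = "mtcars Correlation Matrix")

# Label both axes with the variable names
axis(1, at = 1:ncol(corr), labels = colnames(corr), col = "white",
     las = 2, cex.axis = 0.8)
axis(2, at = 1:nrow(corr), labels = rownames(corr), col = "white",
     las = 1, cex.axis = 0.8)
```

A correlation matrix is symmetric, so the transpose changes nothing here; it is kept so that the call matches the general pattern used for the genes data.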

See also

Heat maps are covered in more detail, with further examples, in Chapter 9, Creating Heat Maps and Contour Plots.
