What this book covers

Chapter 1, RefresheR, reviews the aspects of R that subsequent chapters will assume knowledge of. Here, we learn the basics of R syntax and R's major data structures, write functions, load data, and install packages.

Chapter 2, The Shape of Data, discusses univariate data. We learn about different data types, how to describe univariate data, and how to visualize the shape of these data.

Chapter 3, Describing Relationships, goes on to the subject of multivariate data. In particular, we learn about the three main classes of bivariate relationships and learn how to describe them.

Chapter 4, Probability, kicks off a new unit by laying the foundation. We learn about basic probability theory, Bayes' theorem, and probability distributions.

Chapter 5, Using Data to Reason About the World, discusses sampling and estimation theory. Through examples, we learn about the central limit theorem, point estimation, and confidence intervals.

Chapter 6, Testing Hypotheses, introduces the subject of Null Hypothesis Significance Testing (NHST). We learn many popular hypothesis tests and their non-parametric alternatives. Most importantly, we gain a thorough understanding of the misconceptions and gotchas of NHST.

Chapter 7, Bayesian Methods, introduces an alternative to NHST based on a more intuitive view of probability. We learn the advantages and drawbacks of this approach, too.

Chapter 8, Predicting Continuous Variables, thoroughly discusses linear regression. Before the chapter's conclusion, we learn all about the technique, when to use it, and what traps to look out for.

Chapter 9, Predicting Categorical Variables, introduces four of the most popular classification techniques. By using all four on the same examples, we gain an appreciation for what makes each technique shine.

Chapter 10, Sources of Data, is all about how to use different data sources in R. In particular, we learn how to interface with databases, and request and load JSON and XML via an engaging example.

Chapter 11, Dealing with Messy Data, introduces some of the snags of working with less-than-perfect data in practice. The bulk of this chapter is dedicated to missing data, imputation, and identifying and testing for messy data.

Chapter 12, Dealing with Large Data, discusses some of the techniques that can be used to cope with data sets that are too large to be handled swiftly without a little planning. The key topics of this chapter are parallelization and Rcpp.

Chapter 13, Reproducibility and Best Practices, closes with the extremely important (but often ignored) topic of how to use R like a professional. This includes learning about tooling, organization, and reproducibility.
