官术网_书友最值得收藏!

  • Data Analysis with R
  • Tony Fischetti
  • 421字
  • 2021-07-30 09:55:09

What this book covers

Chapter 1, RefresheR, reviews the aspects of R that subsequent chapters will assume knowledge of. Here, we learn the basics of R syntax, learn R's major data structures, write functions, load data and install packages.

Chapter 2, The Shape of Data, discusses univariate data. We learn about different data types, how to describe univariate data, and how to visualize the shape of these data.

Chapter 3, Describing Relationships, goes on to the subject of multivariate data. In particular, we learn about the three main classes of bivariate relationships and learn how to describe them.

Chapter 4, Probability, kicks off a new unit by laying foundation. We learn about basic probability theory, Bayes' theorem, and probability distributions.

Chapter 5, Using Data to Reason About the World, discusses sampling and estimation theory. Through examples, we learn of the central limit theorem, point estimation and confidence intervals.

Chapter 6, Testing Hypotheses, introduces the subject of Null Hypothesis Significance Testing (NHST). We learn many popular hypothesis tests and their non-parametric alternatives. Most importantly, we gain a thorough understanding of the misconceptions and gotchas of NHST.

Chapter 7, Bayesian Methods, introduces an alternative to NHST based on a more intuitive view of probability. We learn the advantages and drawbacks of this approach, too.

Chapter 8, Predicting Continuous Variables, thoroughly discusses linear regression. Before the chapter's conclusion, we learn all about the technique, when to use it, and what traps to look out for.

Chapter 9, Predicting Categorical Variables, introduces four of the most popular classification techniques. By using all four on the same examples, we gain an appreciation for what makes each technique shine.

Chapter 10, Sources of Data, is all about how to use different data sources in R. In particular, we learn how to interface with databases, and request and load JSON and XML via an engaging example.

Chapter 11, Dealing with Messy Data, introduces some of the snags of working with less than perfect data in practice. The bulk of this chapter is dedicated to missing data, imputation, and identifying and testing for messy data.

Chapter 12, Dealing with Large Data, discusses some of the techniques that can be used to cope with data sets that are larger than can be handled swiftly without a little planning. The key components of this chapter are on parallelization and Rcpp.

Chapter 13, Reproducibility and Best Practices, closes with the extremely important (but often ignored) topic of how to use R like a professional. This includes learning about tooling, organization, and reproducibility.

主站蜘蛛池模板: 漯河市| 德令哈市| 克什克腾旗| 织金县| 新建县| 唐海县| 揭阳市| 和政县| 伽师县| 合作市| 吴忠市| 陆良县| 皮山县| 建平县| 南京市| 滦南县| 普兰县| 元朗区| 革吉县| 桂东县| 辛集市| 怀柔区| 化隆| 洛浦县| 博白县| 花莲县| 洱源县| 康乐县| 崇信县| 佛学| 壶关县| 锡林郭勒盟| 黔西| 普宁市| 兴城市| 卓资县| 彩票| 姜堰市| 靖边县| 邹城市| 拜泉县|