
Chapter 1. Understanding R's Performance – Why Are R Programs Sometimes Slow?

R is a powerful tool for statistical analysis and data processing. When it was first developed in 1993, it was designed as a tool for teaching data analysis courses. Because it is so easy to use, it became increasingly popular over the next 20 years, not only in academia but also in government and industry. R is also open source, so its users can use it for free and contribute new statistical packages to the public R repository, the Comprehensive R Archive Network (CRAN). As CRAN grew to more than 6,000 well-documented, ready-to-use packages at the time of writing this book, R became even more attractive. Over the same 20 years, the volume of data being created, transmitted, stored, and analyzed by organizations and individuals alike has grown exponentially. R programmers who need to process and analyze this ever-growing volume of data sometimes find that R's performance suffers under such heavy loads. Why does R sometimes perform poorly, and how can we overcome its performance limitations? This book examines the factors behind R's performance and offers a variety of techniques for improving the performance of R programs, such as optimizing memory usage, performing computations in parallel, and even tapping the computing power of external data processing systems.

Before we can find solutions to R's performance problems, we need to understand what makes R perform poorly in certain situations. This chapter kicks off our exploration of high-performance R programming by taking a peek under the hood to understand how R is designed and how its design can limit the performance of R programs.

We will examine three main constraints faced by any computational task—CPU, RAM, and disk input/output (I/O)—and then look at how these play out specifically in R programs. By the end of this chapter, you will have some insights into the bottlenecks that your R programs could run into.

This chapter covers the following topics:

  • Three constraints on computing performance—CPU, RAM, and disk I/O
  • R is interpreted on the fly
  • R is single-threaded
  • R requires all data to be loaded into memory
  • Algorithm design affects time and space complexity