
Chapter 1. Understanding R's Performance – Why Are R Programs Sometimes Slow?

R is a great tool for statistical analysis and data processing. When it was first developed in 1993, it was designed as a tool for teaching data analysis courses. Because it is so easy to use, it became increasingly popular over the next 20 years, not only in academia but also in government and industry. R is also open source, so its users can use it for free and contribute new statistical packages to the public R repository, the Comprehensive R Archive Network (CRAN). As CRAN grew to more than 6,000 well-documented, ready-to-use packages at the time of writing this book, R became even more attractive.

Over the same 20 years, the volume of data being created, transmitted, stored, and analyzed by organizations and individuals alike has also grown exponentially. R programmers who need to process and analyze this ever-growing volume of data sometimes find that R's performance suffers under such heavy loads. Why does R sometimes not perform well, and how can we overcome its performance limitations? This book examines the factors behind R's performance and offers a variety of techniques to improve the performance of R programs, such as optimizing memory usage, performing computations in parallel, or even tapping the computing power of external data processing systems.

Before we can find solutions to R's performance problems, we need to understand what makes R perform poorly in certain situations. This chapter kicks off our exploration of high-performance R programming by taking a peek under the hood to understand how R is designed and how its design can limit the performance of R programs.

We will examine the three main constraints faced by any computational task: CPU, RAM, and disk input/output (I/O). We will then look at how these play out specifically in R programs. By the end of this chapter, you will have some insight into the bottlenecks that your R programs could run into.

This chapter covers the following topics:

  • Three constraints on computing performance—CPU, RAM, and disk I/O
  • R is interpreted on the fly
  • R is single-threaded
  • R requires all data to be loaded into memory
  • Algorithm design affects time and space complexity