
Doing reproducible research with R Markdown

The concepts and methods of reproducible research cover a range of topics that could easily fill several books, so this chapter focuses on what R Markdown can do in conjunction with RStudio. First, however, we introduce a few basic terms.

What is Markdown?

Markdown is a text-to-HTML conversion tool for web writers. Markdown allows you to write, using an easy-to-read, easy-to-write plain text format, and then convert it to structurally valid XHTML (or HTML).

--John Gruber, Creator of Markdown (http://daringfireball.net/projects/markdown/)

Because Markdown is a lightweight markup language with a deliberately simple syntax, authors can format text quickly and intuitively. The following are some examples:

  • Hash signs create headings:
    # This is an H1 heading
    
    ## This is an H2 heading
    
    ### This is an H3 heading
    
    #### This is an H4 heading
  • If you want to create an ordered list, you can use numbers with a period:
    1. Blue
    
    2. Green
    
    3. Black
    
    4. Yellow
  • For unordered lists, you can use * (asterisks), + (plus signs), and - (hyphens) as interchangeable list markers, instead of numbers.

What is literate programming?

Literate programming refers to writing computer programs in such a way that they are primarily readable by human beings. Technically, it means that both the documentation and the source code of a program are kept in a single file. A literate programming system must have the following characteristics:

  • The source code and comments can be mixed.
  • The source code sections can be arranged in any order. The literate programming system automatically reassembles the code into a machine-readable, executable sequence.
  • The literate programming system automatically creates a human-readable document with a table of contents, references, citations, and other similar parts.

The process of creating human-readable documents is called weaving, while the creation of machine-readable documents is called tangling.
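
The idea of tangling can be illustrated with a minimal sketch in base R. The in-memory document, the helper name tangle_lines, and the variable names are hypothetical and chosen only for illustration; this is not how knitr or Sweave are implemented, just a small demonstration of separating code from prose under the assumption that code chunks open with a fence followed by {r} and close with a bare fence.

```r
# Hypothetical in-memory document. The fence is built with strrep() so that
# this example can itself be placed inside a fenced block.
fence <- strrep("`", 3)
doc <- c(
  "# Report",
  "",
  "The sample mean is computed below.",
  "",
  paste0(fence, "{r}"),
  "x <- c(2, 4, 6)",
  "mean_x <- mean(x)",
  fence,
  "",
  "Closing remarks."
)

# Tangling: keep only the lines between an opening and a closing fence.
tangle_lines <- function(lines) {
  is_open  <- startsWith(lines, paste0(fence, "{r"))
  is_close <- lines == fence
  # A line is inside a chunk while more fences have opened than closed.
  inside   <- cumsum(is_open) - cumsum(is_close) > 0
  lines[inside & !is_open]
}

code <- tangle_lines(doc)
print(code)

# Run the extracted code in its own environment, as a machine would.
env <- new.env()
eval(parse(text = code), envir = env)
print(env$mean_x)  # 4
```

For real documents, the knitr package provides purl() for tangling and knit() for weaving, so a hand-written extractor like this one is rarely needed.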

To apply the principles of literate programming with R and RStudio, you can use either Sweave or the knitr package.

A brief side note on Sweave

Sweave is part of every R installation and was created by Prof. Friedrich Leisch. It uses LaTeX as the documentation language and R as the programming language. Widespread use of Sweave is limited by the fact that LaTeX is a comparatively complex markup language. Furthermore, the tool lacks important modern features such as caching and support for several programming languages in one document.

If you want to create an R Sweave document in RStudio, click on the new file button and choose R Sweave.

For more information about Sweave, visit http://www.statistik.lmu.de/~leisch/Sweave/.

Dynamic report generation with knitr

The knitr package was designed to be a transparent engine for dynamic report generation with R, solve some long-standing problems in Sweave, and combine features in other add-on packages into one package (knitr ≈ Sweave + cacheSweave + pgfSweave + weaver + animation::saveLatex + R2HTML::RweaveHTML + highlight::HighlightWeaveLatex + 0.2 * brew + 0.1 * SweaveListingUtils + more).

--Yihui Xie, Creator of knitr (http://yihui.name/knitr/)

Since knitr is actively maintained and does not have the technical limitations of Sweave, we will use only this package in the following sections to create R Markdown files. By default, knitr uses R as the programming language, but it can also run code in other languages such as Python, SAS, Perl, and Ruby. Furthermore, the documentation language can be HTML, LaTeX, AsciiDoc, or Markdown.

What is R Markdown?

R Markdown is the integration of plain R code and Markdown, and is based on the knitr package and the open-source document converter, pandoc. It further combines dynamic documents, literate programming, and reproducible research. With the help of R Markdown, you can easily use R code and Markdown to create a report that contains human-readable documentation along with the results of your code. Output formats include HTML, PDF, Microsoft Word, and ioslides presentations, among others.
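
As a rough illustration, the following sketch writes a minimal R Markdown file to a temporary location and renders it. The report title, the choice of the built-in cars data set, and the variable names are assumptions made for this example. The rendering step is guarded, because it only works when the rmarkdown package and pandoc are available.

```r
# Build the chunk fence programmatically so that this example can itself be
# placed inside a fenced block.
fence <- strrep("`", 3)

# Hypothetical minimal R Markdown document: YAML header, Markdown text with
# an inline R expression, and one R code chunk.
rmd <- c(
  "---",
  "title: \"Minimal report\"",
  "output: html_document",
  "---",
  "",
  "## Summary",
  "",
  "The built-in `cars` data set has `r nrow(cars)` rows.",
  "",
  paste0(fence, "{r}"),
  "summary(cars$speed)",
  fence
)

rmd_file <- tempfile(fileext = ".Rmd")
writeLines(rmd, rmd_file)

# Render only if the required tools are present on this machine.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_file, quiet = TRUE)
  print(out_file)  # path of the generated HTML file
}
```

Changing the output field in the header, for example to word_document, selects a different output format for the same source file.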

A side note about LaTeX

LaTeX is an important component of R Markdown. The following are some notes about this technology:

LaTeX is a document preparation system for high-quality typesetting. It is most often used for medium-to-large technical or scientific documents, but it can be used for almost any form of publishing.

LaTeX is based on Donald E. Knuth's TeX typesetting language or certain extensions. LaTeX was first developed in 1985 by Leslie Lamport, and is now being maintained and developed by the LaTeX3 Project.

--(http://latex-project.org/intro.html)

Configuring R Markdown

To get started with R Markdown, you need to install and configure some required software. We assume that you have already installed the latest versions of R and RStudio. RStudio will automatically install the mandatory packages rmarkdown and knitr, as well as pandoc, the markup converter toolbox. To produce PDF output, you also need a LaTeX (TeX) installation. If you want to use Word as an output format, Microsoft Word or LibreOffice should be installed on your computer.
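
A quick way to check the setup from the R console is sketched below. It assumes that a LaTeX installation exposes a pdflatex executable on the search path, which may not hold for every distribution, and the variable names are chosen only for this example.

```r
# Check whether the two required packages can be loaded.
needed <- c("rmarkdown", "knitr")
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Look for a LaTeX engine on the search path (assumption: pdflatex).
latex_path <- unname(Sys.which("pdflatex"))
has_latex <- nzchar(latex_path)
print(has_latex)

# pandoc may be bundled with RStudio rather than on the search path,
# so ask rmarkdown for it when that package is available.
if (installed[["rmarkdown"]]) {
  print(rmarkdown::pandoc_available())
}
```

Any package reported as FALSE can be installed with install.packages(), for example install.packages("rmarkdown").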
