Intro to R Markdown
Clay Ford
Spring 2017
R Markdown
- R Markdown is “an authoring framework for data science”
- It allows you to combine R code and exposition in one document that can be rendered as a report, a presentation, an article, or a web page.
- This means not having to copy and paste statistical results, graphs, and tables into a separate document. In other words, you can do your analysis and write-up in one program.
- Built into RStudio and easy to use
Note: this presentation was created with R Markdown in RStudio.
How it works
- An R Markdown file is a text file with a `.Rmd` extension that you create and edit in RStudio
- It usually contains three types of content
- A header with rendering options
- R code chunks
- text mixed with simple Markdown formatting
- Markdown is a simple language for adding formatting like headers and bulleted lists to plain text.
- To convert an `.Rmd` file into a report (or presentation, or web page) in RStudio, click the Knit button, or press Ctrl + Shift + K (Win) or Command + Shift + K (Mac)
- “Knit” refers to the `knitr` package, which has functions that convert `.Rmd` files to a finished deliverable
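As a rough sketch of what happens behind the Knit button, the snippet below writes a tiny, made-up R Markdown file to a temporary location and runs `knitr::knit()` on it. The file content and variable names are illustrative assumptions. In RStudio the button goes one step further and also runs pandoc to produce the final HTML, PDF, or Word file, which this sketch skips.

```r
library(knitr)

# Build a three-backtick fence programmatically so this example nests cleanly.
fence <- strrep("`", 3)

# A tiny, hypothetical R Markdown file: one heading, one sentence, one chunk.
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c(
  "# Demo",
  "",
  "Adding two numbers:",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
), rmd_file)

# knit() executes the chunk and writes a Markdown file with code and output.
md_file <- knit(rmd_file, output = tempfile(fileext = ".md"), quiet = TRUE)
md_lines <- readLines(md_file)
cat(md_lines, sep = "\n")
```

The resulting Markdown file holds the original text plus the chunk's code and its printed result, which is the intermediate step before the finished deliverable.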
Simple R Markdown example
I did not insert this plot into my presentation. I simply included the R code that creates it. When I generated this presentation, the R code was executed and the plot was inserted into this slide.
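The source of the original slide is not reproduced here, so the following is only a hypothetical sketch of what such a slide might look like in the `Rmd` file. The title, the text, and the choice of the built-in `cars` data set are assumptions.

````markdown
---
title: "Simple R Markdown example"
output: slidy_presentation
---

## Simple R Markdown example

The plot below is created when the file is knit.

```{r echo=FALSE}
# cars is a data set that ships with R
plot(cars)
```
````

The three kinds of content are all visible: the header between the `---` lines, Markdown text, and an R code chunk.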
How to get started
- In RStudio, go to File…New File…R Markdown…
- Select Document, Presentation, Shiny, or From Template
- Enter Title and Author, select an Output Format (e.g., HTML, PDF), and select OK
- This opens a new `Rmd` file with sample text, Markdown, and R code included to get you started. Save it.
- Make changes to the `Rmd` file and click Knit to see your output. Repeat as often as you like.
Note: The “Shiny” and “From Template” options work a little differently, which we’ll discuss.
A note about PDF files
- PDF output requires that you install LaTeX
- MiKTeX on Windows
- MacTeX 2013+ on OS X
- TeX Live 2013+ on Linux
- All LaTeX installations are free
- LaTeX enables you to typeset mathematics in your output
Knitting your Rmd file
- When you knit your `Rmd` file, a separate output file is created and saved wherever you saved your `Rmd` file
- It will have the same name as your `Rmd` file, but with a different extension
- Example: If you’re creating a PDF report, a PDF file is created
- You can’t knit an `Rmd` file until you save it
- Each time you knit an `Rmd` file, the previous output file is overwritten
- If the `Rmd` file cannot be knit, an error message is printed to the R console, which may or may not be helpful (Google is your friend); the usual cause is an error in R code
Activity 1
- Create a new folder called “rmd_workshop”
- In RStudio, start a new R Markdown file that will create an HTML “Document”
- Name it “Rmd Workshop” and make yourself the Author
- Save it as “rmd01” in the folder you created in step 1 (File…Save As)
- Click Knit and check out the result
- Go to the “rmd_workshop” folder and verify you now have an `html` file
Text
- The text is your exposition
- RStudio has a spell checker (`F7`)
- It can be formatted with Markdown
- Go to Help…Markdown Quick Reference to learn how to use Markdown
- Markdown pro: easy to use (very little code to learn)
- Markdown con: not much fine control of the text (e.g., you can’t just make the font size smaller on a particular slide in a presentation)
R Code chunks
- A chunk of R code begins with ```{r} and ends with ```
- Insert a chunk with Ctrl + Alt + I (Win) or Command + Option + I (Mac)
- Code chunks can have options such as `echo = TRUE`, which says show the R code as well as its result
- Other options:
  - `message=FALSE` (suppress any messages)
  - `eval=FALSE` (show the R code but don’t execute it)
  - `include=FALSE` (execute the R code but show neither the code nor its output)
- Separate multiple options with commas
- R code can also be executed “inline” with text, beginning with `r and ending with `
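A short sketch of these pieces together in an `Rmd` file. The variable `x` and its values are made up for illustration.

````markdown
```{r echo=TRUE, message=FALSE}
# Code and result are both shown; messages are suppressed
x <- c(2, 4, 6)
mean(x)
```

```{r eval=FALSE}
# Shown in the output but never run
install.packages("knitr")
```

The mean of x is `r mean(x)`.
````

When the file is knit, the inline expression in the last line is replaced by its value, so the sentence reads as ordinary text.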
Global chunk options
You may have noticed a setup code chunk at the top of the `Rmd` file that calls `knitr::opts_chunk$set()`.
This sets global chunk options.
- For presentations, the default is `echo=FALSE`. This means only show the results of running R code, not the code itself.
- For documents, the default is `echo=TRUE`. This means show both the R code and its result.
- See `?knitr::opts_chunk` for other options
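For reference, the global options chunk in a new document typically looks something like the sketch below. The chunk label `setup` and the exact contents are assumptions based on RStudio's default template and may differ between versions.

````markdown
```{r setup, include=FALSE}
# Applies to every chunk that does not set echo itself
knitr::opts_chunk$set(echo = TRUE)
```
````

Any individual chunk can still override the global setting by stating its own option, for example `{r echo=FALSE}`.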
LaTeX
- You can use LaTeX in `Rmd` files to typeset math
- Surround with single `$` for inline LaTeX
- For example, `$c = \sqrt{a^2 + b^2}$` creates \(c = \sqrt{a^2 + b^2}\)
- Surround with `$$` for a display equation
- For example, `$$P(A) = \frac{2}{3}$$` creates \[P(A) = \frac{2}{3}\]
- LaTeX is an entire typesetting language and completely separate from R Markdown and RStudio
Rmd files in RStudio
- As of RStudio version 1.0.44, `Rmd` files function like notebooks
- You can execute R code and see the results inline with the rest of your text, including plots, without needing to “knit” the whole document
- It also previews your LaTeX equations
- The gutter provides a menu to quickly jump to sections or code chunks
- In the line number margins you can collapse (hide) text and code by clicking on the arrow
- Preface comments in Rmd files with `[//]:`
- And of course you can preview your final document in RStudio
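A minimal sketch of a comment in an `Rmd` file, assuming the common Markdown idiom in which `[//]:` is followed by `#` and the comment text in parentheses. The HTML comment on the second line is an alternative not mentioned above, and the heading is only there for context.

````markdown
[//]: # (This line is a comment and does not appear in the knitted output)

<!-- An HTML comment is another common way to hide text -->

## Results
````

Leaving a blank line before and after the `[//]:` line helps the Markdown converter recognize it.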
Activity 2
- Go to the bottom of the `rmd01` file and enter `## Simulate Data`
- Under the new header, type some text such as "Let's simulate some data"
- Under your text, insert a code chunk with Ctrl + Alt + I (Win) or Command + Option + I (Mac)
- Enter the following R code in your code chunk: `x <- rnorm(100)`
- Below the code chunk type “The proportion of observations greater than 0 is mean(x > 0)”, where mean(x > 0) is between `r and `.
- Click Knit
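The inline expression in this activity works because R treats `TRUE` as 1 and `FALSE` as 0, so the mean of a logical vector is a proportion. A small sketch, where the seed and the helper names `positive` and `prop` are assumptions added for illustration:

```r
set.seed(1)        # hypothetical seed, only to make the run repeatable
x <- rnorm(100)    # 100 draws from a standard normal distribution

positive <- x > 0          # logical vector: TRUE where the draw is above 0
prop <- mean(positive)     # TRUE counts as 1, FALSE as 0
prop
```

Because the draws are centered on 0, the proportion should land near one half, though the exact value changes from run to run without a fixed seed.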
Presentations
- R Markdown is ideal for presentation slides
- File…New File…R Markdown…, Presentations
- Select from two HTML types (Slidy or ioslides), or PDF (PDF version requires LaTeX)
- This HTML presentation was created with Slidy
Markdown in presentations
- Text preceded with `##` becomes the header for a new slide
- Format bullets with `> -` to have them appear incrementally
- You can start a new slide with a horizontal rule `***` if you do not want a header
- You can still use R code chunks, but not much will fit on a slide
- The PDF version is LaTeX Beamer
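A sketch of how these conventions look in the body of a presentation `Rmd` file. The slide titles and bullet text are made up for illustration.

````markdown
## R Functions

> - `mean()` computes an average
> - `sd()` computes a standard deviation

***

This slide has no header.

## A slide with code

```{r}
summary(cars$speed)
```
````

This produces three slides: one with incremental bullets, one without a header, and one showing a short code chunk and its output.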
Activity 3
- Go to File…New File…R Markdown…
- Start a new HTML (Slidy) presentation titled “Rmd Presentation” (or whatever you want)
- Save as “`rmd02`” in your workshop folder and click Knit
- Review the presentation then close
- Add a new slide called “R Functions”
- Enter some text as bullets using “`> -`”
- Click Knit and try out the presentation
PDF and Word documents
- If you have LaTeX and MS Word installed, you can create PDF and Word documents using R Markdown
- These will often have less-than-ideal page breaks due to plots; you may have to tweak image size
- Use the `fig.width` and `fig.height` chunk options (in inches) in the R code chunks
- Example: `{r fig.height=3, fig.width=4}`
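In context, a chunk with these options might look like the following sketch. The histogram of the built-in `iris` data is just an illustrative choice.

````markdown
```{r fig.height=3, fig.width=4}
# Drawn on a device 4 inches wide and 3 inches tall
hist(iris$Sepal.Length)
```
````

Shrinking a figure this way is often enough to keep it on the same page as the text that discusses it.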
Adding a table of contents
- We can “turn on” the table of contents in the header for HTML and PDF documents but not MS Word
- Use header options `toc: true` and `toc_depth: x`, where x is the lowest level of headings to add to the TOC
- Notice how the header is formatted; spacing is important:
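A sketch of such a header for a PDF document. The title and the depth of 2 are illustrative assumptions.

```yaml
---
title: "Rmd PDF report"
output:
  pdf_document:
    toc: true
    toc_depth: 2
---
```

The output format sits on its own indented line ending in a colon, and its options are indented one level further. Getting this indentation wrong is a common reason a header fails to parse.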
Activity 4
- Start a new R Markdown PDF Document titled “Rmd PDF report” (or whatever you want) and click OK
- Save as “`rmd03`” and click Knit
- Close the file and try to make the image smaller so it fits on the first page; try setting `fig.width` and `fig.height` to 4 and 3, respectively
- Add the table of contents options
- Add a new section with `##` and add an R code chunk: try `head(iris, n = 10)`
- Click Knit and examine PDF
R Markdown Reference Guide and Cheat Sheet
In addition to `Rmd` templates and the Markdown Quick Reference, RStudio provides quick access to an R Markdown Reference Guide and an R Markdown Cheat Sheet
- Go to Help…Cheatsheets…
- R Markdown Cheat Sheet
- R Markdown Reference Guide
R Markdown website steps
- Start a new RStudio Project in an existing or new directory
- Create a `_site.yml` file that contains the site architecture and navigation; see links on previous slide for example
- Create an `index.Rmd` file that will serve as a homepage
- Build other Rmd files as needed and link accordingly in the `_site.yml` file
- Run `rmarkdown::render_site()` to build web site files
- Web site files are created in a folder called `_site`
The HTML files in `_site` can be deployed as a stand-alone static website.
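A minimal sketch of a `_site.yml` file for a two-page site. The site name, navigation titles, and the `about.html` page (built from a hypothetical `about.Rmd`) are assumptions for illustration.

```yaml
name: "my-website"
navbar:
  title: "My Website"
  left:
    - text: "Home"
      href: index.html
    - text: "About"
      href: about.html
```

Each `href` points at the HTML file that will be generated from the `Rmd` file of the same name, so `index.Rmd` becomes `index.html` when the site is built.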
Using R Markdown Templates
- Some R packages provide R Markdown templates
- The `rticles` package provides templates for the Journal of Statistical Software, The R Journal, and useR Conference abstracts, among others
- To use `rticles` templates you need to first install the package
- When using an R Markdown template, you are forced to create a new directory since the template involves multiple files
Using Shiny with R Markdown
- Shiny is a web application framework for R that allows you to create interactive web applications
- R Markdown has been extended to support interactive documents that contain Shiny applications
- Shiny is beyond the scope of this workshop but RStudio provides templates to get you started
- Choose either an HTML document or presentation
- You can preview the interactive document in RStudio but can’t email it to someone or upload to Collab
- Beyond previewing in RStudio, output has to be deployed on a Shiny Server or on the ShinyApps hosted service
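For orientation only, here is a small sketch of an interactive document. The title, the input name `n`, and the slider limits are made up. The `runtime: shiny` line in the header is what turns an ordinary HTML document into an interactive one.

````markdown
---
title: "Interactive histogram"
output: html_document
runtime: shiny
---

```{r echo=FALSE}
# Slider input; its current value is available as input$n
sliderInput("n", "Sample size", min = 10, max = 500, value = 100)

# Redrawn whenever the slider moves
renderPlot({
  hist(rnorm(input$n))
})
```
````

Because the plot is recomputed by a running R session, the document cannot be shared as a plain HTML file, which is why deployment on a server is needed.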
The state of R Markdown
- R Markdown is excellent for
- Notebooks
- Presentations
- Informally communicating analysis results
- Web pages
- R Markdown is not (yet) quite perfect for
- writing journal articles, though progress is being made
- highly customized reports or documents with fine-tuned formatting
- It is in active development and continues to evolve