Chapter 2 R Markdown
Before you go ahead and run the code in this coursebook, it’s often a good idea to go through some initial setup. Under the Training Objectives section we’ll outline the syllabus, identify the key objectives and set up expectations for each module. Under the Library and Setup section you’ll see some code to initialize our workspace and the libraries we’ll be using for the projects. You may want to make sure that the libraries are installed beforehand by referring back to the packages listed here.
2.1 Preface
2.1.1 Introduction
Business reporting has become a common task in many business practices. It is the regular provision of operational or financial information to decision-makers within an organization to support their work. Writing a business report is often a time-consuming process. It involves gathering data, analyzing it, and preparing the report, which may require us to operate several software tools at once on very repetitive tasks. This prevents us from focusing on other, more important parts of our business.
This 2-day online workshop is a beginner-friendly introduction to automating business reporting with R. By learning how to automate various business reports, you will have more time to focus on what matters most.
2.1.2 Training Objectives
This is the very first course of Automate: Business Reporting with R. The primary objective of this course is to give participants a comprehensive introduction to the tools and software for producing high-quality publications using R and R Markdown. This course will provide participants with the basic knowledge for developing automated reporting with R. The syllabus covers:
- Introduction to R Markdown
  - R and R Markdown
  - Using R Markdown
  - Writing Codes & Narration
  - Chunk & Global Options
  - Report Template using YAML
- Generate Report from R Markdown
  - HTML Document
  - PDF Document
  - Word Document
  - Interactive PowerPoint Presentation
- Parameterized Reports
  - Declaring parameters
  - Using Parameter in Code Chunks
  - Using Parameter Inline Code
2.1.3 Library and Setup
In this Library and Setup section you’ll see some code to initialize our workspace, and the packages we’ll be using for this project.
Packages are collections of R functions, data, and compiled code in a well-defined format. The directory where packages are stored is called the library. R comes with a standard set of packages. Others are available for download and installation. Once installed, they have to be loaded into the session to be used.
You will need to use install.packages() to install any packages that are not yet downloaded onto your machine. To install a package, type install.packages() with the package name in quotes, for example install.packages("rmarkdown"), on your console and then press ENTER.
Then you need to load the package into your workspace using the library() function. For this course specifically, the rmarkdown package does not need to be loaded using library().
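The install-then-load workflow described above can be sketched as follows. The package vector is an assumption made for illustration; adjust it to the packages your own project needs.

```r
# Packages assumed for this sketch; adjust the vector to your own project
packages <- c("rmarkdown", "knitr")

# Find the packages that are not yet available on this machine
is_installed <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
to_install <- packages[!is_installed]

# Install only what is missing (needs an internet connection)
if (length(to_install) > 0) {
  install.packages(to_install)
}

# Load a package into the current session
library(knitr)
```

Checking with requireNamespace() first avoids reinstalling packages every time the script runs.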
2.2 Introduction to R Markdown
2.2.1 R and R Markdown
Business reporting involves gathering data from various sources, performing data analysis, producing summaries and visualizations, and sometimes even providing a future outlook for a specific business practice within an organization. Many software tools have been developed to support such a rich data analytics process. Over the years, R has developed into one of the most used tools for data analysis, supported by RStudio as an Integrated Development Environment (IDE) which provides an easy-to-use interface for working with R. There are several reasons why R is widely used for data analysis:
Built by Statisticians, for Statisticians.
R is a statistical programming language created by Ross Ihaka and Robert Gentleman at the Department of Statistics at the University of Auckland (New Zealand). R was created for the purpose of data analysis and as such is different in nature from traditional programming languages. R is not just a statistical programming language; it is a complete environment for data analysts and one of the most widely used data analysis tools today.
Plentiful Libraries.
R provides numerous additional packages which add out-of-the-box functionality for various statistical tests and measures (confidence intervals, p-values, t-tests, etc.), time-series analysis, beautiful visualization, and various machine learning tasks such as regression, classification, and clustering algorithms. The R community is noted for its active contributions in terms of packages.
Open Source.
Part of the reason for its active and rapidly growing community is the open-source nature of R. Users can contribute packages, many of which package some of the most advanced statistical tools and customized visualization templates that are not found in other commercial, proprietary statistical computing software.
Used by The Biggest Software Companies in The World.
Diving deeper into R’s analytical capability, R has been used not only for descriptive analytics but also to develop machine learning and artificial intelligence projects at major software companies around the world. R is used by Google to calculate ROI on advertising campaigns and estimate causal effects (say, the impact of an app feature on app downloads, or the number of additional sales from an AdWords campaign); in fact, Google even released its own R packages to allow other R users to do similar analysis using the same tools. Data science employees at Google participate in user groups to discuss how R is used at Google. R is used very widely in production at Google, and Google integrates R with many of its own technologies, publishing its own R client for the Google Prediction API and Google’s R style guide; its developers have also released a number of R packages over the years. Microsoft first used R for Azure capacity planning, Xbox’s TrueSkill matchmaking system, player churn analysis, in-game purchase optimization, fraud detection, and other internal services across Microsoft’s line of products, and then went on to acquire Revolution Analytics, whose products were then rebranded and renewed by Microsoft and are now known as Microsoft R Server, Microsoft R Open, Microsoft Data Science Virtual Machine, etc.
Ready for Big Data.
RHadoop, ParallelR, Revolution R Enterprise and a handful of other toolkits add powerful big data support, allowing data engineers to create custom parallel and distributed algorithms to handle parallel / map-reduce programming in R. This makes R a popular choice for big data analytics and high-performance, enterprise-level analytics platforms.
In addition to using R, we will be using a package named R Markdown. R Markdown is an R package made for easy development of dynamic documents with R. R Markdown turns our analysis into high-quality documents, reports, presentations, and dashboards. It supports the development of various reports and publications using R and opens opportunities for automated reporting.
R Markdown makes analysis in R easier because it enables users to weave together narrative text and code in one document. It supports multiple programming languages including R, Python, and SQL. After the analysis is done, R Markdown supports dozens of static and dynamic output formats including PDF, Word document, HTML document, and Interactive PowerPoint Presentation. All of these are produced from a single R Markdown document, allowing us to export the same analysis into various outputs.
To truly experience the excitement and convenience of using R, the next section will be more interesting because we will directly use R and R Markdown to familiarize ourselves with them.
2.2.2 Using R Markdown
In this first section of R Markdown, we will explain briefly how to use R Markdown, from creating an R Markdown document and understanding its components to producing a report with it.
Before using R Markdown you need to install the package rmarkdown on your machine. To learn more about packages and installation you can see the Library and Setup section above. Make sure that you already have the latest version of R and RStudio installed on your machine. If this is your first time using RStudio, you will see this in your RStudio window:
Above is the default view of RStudio. There are 4 panels, each with its own function:
- Editor: where we can write code and narration in files that can be saved to our computer.
- Console: where we can type code and perform analysis without saving it to our computer.
- Environment: where R stores our data temporarily during data analysis. This allows us to see and track our data while working. There are also History and Connections tabs, though we will not use these in this workshop.
- Files, Packages, Help, etc.: where we can browse the files on our computer, manage our packages, and search for documentation on the specific functions/commands we use in our project. Additionally, there are Plots and Viewer tabs to preview plots and files generated using R.
To easily analyze data and produce business reports using R we will be using R Markdown. We can create a new R Markdown document by clicking on the menu File > New File > R Markdown. Alternatively, we can open the dropdown menu in the left corner of RStudio and then choose “R Markdown”. We will be directed to a pop-up for creating a new R Markdown document.
We can choose the title and author for our project and there are several output options we can choose. For the introduction, let’s use the default HTML output. An R Markdown document, a plain text file with the extension .Rmd, will be created on our Editor panel.
The document contains three types of content:
- YAML Header
  - surrounded by --- before and after its section.
  - this is where we can customize our report template (will be discussed in the following section).
- Code Chunks
  - surrounded by ``` before and after its section, colored gray.
  - this is where we can put R functions/commands for data analysis.
- Text/Narration
  - space colored white.
  - this is where we write paragraphs or explanations for our business report.
  - it can be styled with various text formatting, such as the use of # for headings.
These contents allow us to write both R commands for data analysis and business explanations in one file. That’s like working with Excel and Word at the same time, with the added functionality of exporting to various outputs with a customized template! A lot of work can be done with one document.
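As a rough sketch of these three content types, the snippet below writes a very small R Markdown file from R. The file name, title, and chunk name are hypothetical, chosen only for illustration; the file RStudio generates for you contains more content than this.

```r
# Lines of a minimal R Markdown document (illustrative content only)
rmd_lines <- c(
  "---",                          # YAML header starts
  'title: "My First Report"',
  "output: html_document",
  "---",                          # YAML header ends
  "",
  "# Summary",                    # narration: a level 1 heading
  "",
  "The table below summarises the cars data.",
  "",
  "```{r cars-summary}",          # code chunk starts
  "summary(cars)",
  "```"                           # code chunk ends
)

# Save the lines to a temporary .Rmd file (hypothetical location)
rmd_file <- file.path(tempdir(), "first_report.Rmd")
writeLines(rmd_lines, rmd_file)
```

Opening the saved file in RStudio shows the same three parts: the header between the --- lines, the narration, and the gray code chunk.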
In addition to the versatility, another benefit of using R Markdown is its notebook interface. With R Markdown, the code inside the chunk can be executed independently and interactively, with output displayed immediately beneath the chunk. This allows complex data analysis using R to be performed and previewed easily. For example, if we run the code in the last chunk by clicking the green ‘play’ icon on the right side, a plot will come out.
Finally, we can also export the document into one or multiple formats by using the Knit button in RStudio on the upper part of the document.
If we haven’t saved our document, RStudio will direct us to save our file. In the example below, the document is saved with the name dummy.Rmd inside the same working directory (folder) as our workshop material. The best practice is to store our R Markdown document and the data we use in one working directory. This helps prevent file path errors when importing data and the like. After knitting the document, R will produce the document output based on the format we chose earlier.
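The Knit button has a programmatic counterpart in rmarkdown::render(). The sketch below is self-contained, so it first writes a tiny throwaway document to a temporary file (a hypothetical stand-in for dummy.Rmd) and then renders it. It assumes the rmarkdown package and Pandoc (bundled with RStudio) are available.

```r
library(rmarkdown)

# Hypothetical throwaway document; in practice you would point to your own file,
# for example "dummy.Rmd" in the working directory
input_file <- tempfile(fileext = ".Rmd")
writeLines(
  c("---", 'title: "Demo"', "output: html_document", "---", "", "Hello from R Markdown."),
  input_file
)

# Render using the first output format declared in the YAML header;
# render() returns the path of the generated file
output_file <- render(input_file, quiet = TRUE)
output_file
```

Rendering from code rather than from the Knit button is what makes it possible to schedule or script report generation later on.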
In the next section, we will explore various ways of writing R code and narration in R Markdown, including simple text formatting that will produce elegantly formatted output.
2.2.3 Writing Codes & Narration
2.2.3.1 Codes
Code in R Markdown is written inside a chunk. In the demo we made earlier, we have 3 chunks, each with its own code. We can quickly insert a chunk into our document using:
- keyboard shortcut Ctrl + Alt + I (OS X: Cmd + Option + I)
- click the Add Chunk button in the Editor toolbar.
- manually type the chunk delimiters ```{r} and ```.
When we knit our document, the code output will be displayed beneath the code chunk. Below is an example:
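The chunk behind the output shown next is not reproduced here. As an assumption, a chunk body like the following would print that output when the document is knitted:

```r
# Body of a code chunk; knitting prints the result beneath the chunk
print("Hello!")
```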
#> [1] "Hello!"
Alternatively, you can also insert code directly into R Markdown text, which we call inline code, by enclosing the code with `r `. For example, did you notice that I wrote this using R code? Because I did, using the following syntax.
For example, `r paste("did you notice that I write this using R codes?")`
This lets users include data analysis results in the narration of the business report. R Markdown will display the result of inline code but not the code itself, making it indistinguishable from the surrounding text. This gives flexibility in making an automated business report because the narration can adjust to changing inputs and analysis results.
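To see inline code adapt to data, the sketch below knits a single sentence from the console instead of a whole document. It assumes the knitr package is installed; the sentence and the use of the built-in mtcars data are illustrative choices, not part of the course material.

```r
library(knitr)

# A sentence of narration containing two pieces of inline R code
narration <- "The mtcars data has `r nrow(mtcars)` rows and an average of `r round(mean(mtcars$mpg), 1)` miles per gallon."

# knit(text = ...) evaluates the inline code and returns the finished sentence
knitted <- knit(text = narration, quiet = TRUE)
cat(knitted)
```

If the underlying data changed, the same sentence would report the new numbers without any manual editing.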
2.2.3.2 Narration
Writing narration in R Markdown is mostly similar to writing narration in any word processing tool. The difference is that R Markdown uses Pandoc’s Markdown, a set of markup annotations to format the text for our narration.
There are several commonly used annotations in R Markdown. For convenience, we have made a list of them with a preview below. You can access the document text_formatting.Rmd in the same folder as the course material. These annotations were taken from the R Markdown Cheatsheet, which you can download freely to explore more of the features in R Markdown.
In several cases, you might also want to write a code explanation or narration inside a chunk. When you type narration inside a chunk it will produce an error because R recognizes your narration as a function/command that needs to be executed. To write narration inside a chunk you will need to mark it as a comment by using # before the text. Below is an example:
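The original chunk is not reproduced here. As an assumption, a chunk body like this one, with a comment line followed by a calculation, would give the output shown next:

```r
# This line is a comment: R skips it instead of trying to execute it
10 / 3
```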
#> [1] 3.333333
2.2.4 Chunk & Global Options
2.2.4.1 Chunk Options
If you noticed earlier, most of the code we wrote inside the chunks is displayed, and so is its output. This is because a set of default chunk options is applied when generating R Markdown output. Chunk options are a set of customizable arguments that manage how chunks are rendered when generating outputs. By default, chunks use an argument called echo = TRUE, which means the code is displayed in the generated output.
Above is an example of chunks taken from the dummy.Rmd that we made earlier. The document contains 3 chunks, two of which have chunk options set. The chunk options can be set inside the braces {} of a chunk header, following the chunk name/id (the chunk name is optional). Below is an example:
```{r pressure, echo = FALSE}```
We can see a chunk named pressure followed by a comma , and the chunk option echo, which is set to FALSE. Based on this option, R will execute the code and display the output but will not display (echo) the code in the R Markdown output.
There are several chunk options users commonly use:
- include: defaults to TRUE; if FALSE, the chunk will not be included in the final document but the code will still be executed.
- eval: defaults to TRUE; if FALSE, the code will not be executed.
- echo: defaults to TRUE; if FALSE, the code will not be displayed in the final document.
- message: defaults to TRUE; if FALSE, messages generated by the code will not be displayed.
- warning: defaults to TRUE; if FALSE, warnings generated by the code will not be displayed.
- fig.height, fig.width: the figure height and width used when creating a figure/plot output.
For a complete list of chunk options, you can see the R Markdown Cheatsheet, or the R Markdown Reference Guide for a more compressed version.
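To check the defaults listed above on your own installation, you can query knitr directly. This is a small sketch; it assumes the knitr package is installed and that the options have not been changed earlier in the session.

```r
library(knitr)

# Look up the current values of several chunk options at once
defaults <- opts_chunk$get(c("include", "eval", "echo", "message", "warning"))
str(defaults)
```

The same opts_chunk$get() call works inside a chunk while knitting, which can help when a chunk does not behave as expected.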
2.2.4.2 Global Options
Sometimes you want to apply specific chunk options to all the chunks in a document. We can set these using the global chunk options: we call knitr::opts_chunk$set() in a code chunk, and each option we pass to it is treated as a global chunk option.
The chunk named setup from dummy.Rmd is an example where we set echo = TRUE for all chunks. If we want to hide all code within our document, we can just change the echo option to echo = FALSE. Below is an example of how we hide the code that made the summary of the cars data, which was previously displayed in the generated output.
Note that global chunk options can be overridden by options we set in individual chunk headers. This is why the generated output did not display the code chunk that makes the pressure plot even though echo = TRUE was stated in the global chunk options: we had set echo = FALSE in that individual chunk.
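A setup chunk along these lines is a common way to apply global options. The sketch below shows only the body of such a chunk; the particular options chosen are an assumption for illustration, not the contents of dummy.Rmd.

```r
library(knitr)

# Applied to every chunk that follows in the document
opts_chunk$set(
  echo = FALSE,      # hide code by default
  message = FALSE,   # hide messages such as package start-up notes
  warning = FALSE,   # hide warnings
  fig.width = 6,
  fig.height = 4
)

# An individual chunk header such as {r pressure, echo = TRUE}
# would still override the global echo = FALSE for that chunk only.
opts_chunk$get("echo")
```

Placing this chunk at the top of the document keeps all presentation defaults in one place.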
2.2.5 Report Template using YAML
The last part of the R Markdown introduction is customizing our report template using YAML. The YAML header is stated at the top of any R Markdown document. A default YAML header contains the following:
title: "title of the document"
author: "author of the document"
date: "date of the document"
output: html_document
The default output for an R Markdown document is an HTML document, which also has the richest set of features among the output formats. We can specify other outputs inside the YAML output options (which we will cover in the next section), but understanding how YAML works for an HTML document first will help us a lot in understanding the other outputs.
Before exploring the many features of the HTML document, let’s add more content to our dummy.Rmd. Use the knowledge you have learned from the previous section to do the following tasks:
- Change the title of the document to “Automate: Business Reporting with R”.
- Change the author of the document to your name.
- Change the date of the document into today’s date.
- Add a level 1 Header named “Lessons” before the header “R Markdown”.
- Change “Including Plots” into a level 3 header (thus will be located inside the “R Markdown” section).
2.2.5.1 Table of Contents
To add various features to our HTML output, we can position html_document on a new line, indented with a tab (ENTER + TAB). Indentation is important in YAML and therefore should be checked thoroughly. Note that YAML indentation must consist of spaces; with its default settings, RStudio inserts spaces when you press TAB.
The first feature is the table of contents, which you can add using the toc option below html_document (ENTER + double TAB). You can specify the depth of headers that applies to the table of contents using the toc_depth option (defaults to 3). Using the YAML below, the generated output will display a table of contents for level 1-3 headers.
title: "Automate: Business Reporting with R"
author: "Team Algoritma"
date: "8 April 2020"
output:
  html_document:
    toc: true
    toc_depth: 3
You can also position the table of contents on the left side of the document and make it float by using toc_float. A floating table of contents will always be visible even when the document is scrolled. You can set two additional options under toc_float:
- collapsed: defaults to true; controls whether the table of contents initially displays only the top-level headers (level 1-2).
- smooth_scroll: defaults to true; controls whether page scrolls are animated when table of contents items are navigated to via mouse clicks.
title: "Automate: Business Reporting with R"
author: "Team Algoritma"
date: "8 April 2020"
output:
  html_document:
    toc: true
    toc_depth: 3
    toc_float:
      collapsed: false
      smooth_scroll: true
2.2.5.2 Section Numbering
You can also add section numbering using the number_sections option.
title: "Automate: Business Reporting with R"
author: "Team Algoritma"
date: "8 April 2020"
output:
  html_document:
    toc: true
    toc_depth: 3
    toc_float:
      collapsed: false
      smooth_scroll: true
    number_sections: true
2.2.5.3 Tabbed Sections
If you notice, this coursebook has a tabbed section at Preface. A tabbed section can be made by adding {.tabset} after a header. This will cause all sub-headers of the header with the {.tabset} attribute to appear within tabs rather than as standalone sections.
We can also control the appearance of the tab using:
- .tabset-fade: causes the tabs to fade in and out when switching between tabs.
- .tabset-pills: causes the visual appearance of the tabs to be “pill” rather than traditional tab
Below is an example:
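A tabbed section might be written like the sketch below. The header names and text are hypothetical; the lines are stored in an R character vector only so that they can be printed here or saved into an .Rmd file with writeLines().

```r
# Markdown lines for a tabbed section (illustrative headers and text)
tabset_lines <- c(
  "# Preface {.tabset .tabset-fade .tabset-pills}",
  "",
  "## Introduction",
  "",
  "Text shown in the first tab.",
  "",
  "## Training Objectives",
  "",
  "Text shown in the second tab.",
  "",
  "# Next Section",
  "",
  "A header at the same level as the tabset header ends the tabs."
)

# Print the lines as they would appear in the .Rmd file
cat(tabset_lines, sep = "\n")
```

Each level 2 header under the {.tabset} header becomes one tab in the HTML output.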
2.2.5.4 Appearance and Style
One of the most appealing features in R Markdown is probably the ability to directly specify a theme for the document. You can choose between default, cerulean, journal, flatly, darkly, readable, spacelab, united, cosmo, lumen, paper, sandstone, simplex, and yeti. You can preview each one here, along with some other themes available through additional packages such as prettydoc and rmdformats. Alternatively, you can pass null for no theme, for example if you want to use CSS of your own.
We can also specify highlight for the syntax highlighting style. Supported styles include default, tango, pygments, kate, monochrome, espresso, zenburn, haddock, breezedark, and textmate. Pass null to prevent any syntax highlighting.
In this example, we will use theme united and highlighting style breezedark.
title: "Automate: Business Reporting with R"
author: "Team Algoritma"
date: "8 April 2020"
output:
  html_document:
    toc: true
    toc_depth: 3
    toc_float:
      collapsed: false
      smooth_scroll: true
    number_sections: true
    theme: united
    highlight: breezedark
2.2.5.5 Figure Options
We can also set the parameters for all figures in our R Markdown document using figure options:
- fig_width and fig_height: to control figure width and height (by default is 7x5).
- fig_caption: whether figures will be rendered with captions.
title: "Automate: Business Reporting with R"
author: "Team Algoritma"
date: "8 April 2020"
output:
  html_document:
    toc: true
    toc_depth: 3
    toc_float:
      collapsed: false
      smooth_scroll: true
    number_sections: true
    theme: united
    highlight: breezedark
    fig_width: 6
    fig_height: 4
We can see that the plot is slightly smaller than the one generated earlier.
2.2.5.6 Data Display
Most of the time we will need to print our data in the form of tables in our business report. We can enhance the display of our data via the df_print option. For example, we can display it as paged HTML tables by setting df_print: paged. This will allow us to navigate through paged rows and columns.
title: "Automate: Business Reporting with R"
author: "Team Algoritma"
date: "8 April 2020"
output:
  html_document:
    toc: true
    toc_depth: 3
    toc_float:
      collapsed: false
      smooth_scroll: true
    number_sections: true
    theme: united
    highlight: breezedark
    fig_width: 6
    fig_height: 4
    df_print: paged
For example, below is Motor Trend Car Road Test data. The data was extracted from the 1974 Motor Trend US magazine and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).
#>                      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
#> Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
#> Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
#> Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
#> Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
#> Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
#> Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
#> Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
#> Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
#> Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
#> Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
#> Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
#> Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
#> Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
#> Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
#> Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
#> Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
#> Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
#> Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
#> AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
#> Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
#> Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
#> Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
#> Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
#> Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
#> Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
#> Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
#> Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
#> Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
The df_print: paged option can be very convenient for HTML output, but other formats such as kable might be more suitable for PDF and Word. We will discuss further how to make awesome tables using kableExtra in the following course.
2.2.5.7 Code Folding
Previously we discussed how to display or hide code chunks using chunk options, either globally or in individual chunks. In some cases, we might want the code to be available but not visible by default. The code_folding option lets us include R code in an R Markdown output while choosing whether it is hidden or shown by default. With this, readers can choose to show R code chunks individually or document-wide. For example:
title: "Automate: Business Reporting with R"
author: "Team Algoritma"
date: "8 April 2020"
output:
  html_document:
    toc: true
    toc_depth: 3
    toc_float:
      collapsed: false
      smooth_scroll: true
    number_sections: true
    theme: united
    highlight: breezedark
    fig_width: 6
    fig_height: 4
    df_print: paged
    code_folding: hide
We have briefly explored the YAML options that are commonly used for R Markdown output (the default HTML document output is used for demonstration). Even so, there are many more YAML options to explore, listed and described in R Markdown: The Definitive Guide, a book by Yihui Xie, J. J. Allaire, and Garrett Grolemund which was itself created using R Markdown. Meanwhile, for daily lookup, you can check the full YAML options in the R Markdown Cheatsheet, or the R Markdown Reference Guide for a simplified version.
2.3 Generate Report from R Markdown
Users often need to create business reports in various formats. For example, a person needs their report to be shared with teammates as a Word document, while others ask for more interactivity and the convenience of access via a browser, so they ask for an HTML document. The person also needs a PDF version for printing and a version to present in front of the board/team members as an Interactive PowerPoint Presentation. The most unexpected request then comes from the boss, who wants the report published as an interactive dashboard so that the boss, colleagues, and other departments can access it easily. That’s a lot of things to do and create! The person might end up more focused on polishing one business report for each requested format than on developing an in-depth analysis of the case/project at hand.
This is where R Markdown can help! From a single file, R Markdown offers various outputs including HTML Document, PDF, Word Document, and Interactive PowerPoint Presentation, which are the ones mostly used for business reports (and will be discussed in this section). In addition to those, R Markdown also supports formats including interactive dashboards, websites, books, and more, which you can explore here.
We can specify the desired output by choosing the format in the pop-up that appears when we first create a new R Markdown file. If we click the Presentation tab we will see several formats for presentation. If we click the Shiny tab we can choose between developing interactive documents or presentations. We can also use a built-in or customized template for our document by choosing From Template.
When we choose an output, R Markdown will automatically specify that output in the output options of our YAML. For example, the dummy.Rmd we made earlier uses the html_document output. Below is a glimpse of the YAML from the dummy.Rmd we made earlier.
title: "Automate: Business Reporting with R"
author: "Team Algoritma"
date: "8 April 2020"
output:
  html_document:
    toc: true
When we Knit our document, R will create the first output specified in our YAML options (in this case an HTML Document). We can also directly create another format by clicking the dropdown menu of the Knit button and choosing from the available outputs. For example, if we choose PDF output then this PDF will be generated:
When we look back at our YAML options, R has automatically added output: pdf_document as a result, complete with the default feature options.
The YAML in R Markdown allows users to specify more than one output format for their R Markdown document. Moreover, users can also customize the appearance of all of their desired outputs using only one document. This is quite an advantage because users do not have to spend an excessive amount of time customizing each specific output in its respective software. Why not spend such precious time on something that really matters?
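Producing several formats from one document can also be scripted with rmarkdown::render(). The helper below is a hypothetical wrapper written for this sketch, not part of rmarkdown. It relies on render() accepting output_format = "all" to build every format listed under output in the YAML header. Note that PDF output additionally requires a LaTeX installation.

```r
library(rmarkdown)

# Hypothetical helper: render one R Markdown file to one or more formats.
# format = "all" renders every format listed under output: in the YAML header;
# a single name such as "pdf_document" renders only that format.
render_report <- function(input, format = "all") {
  render(input, output_format = format, quiet = TRUE)
}

# Example calls (not run here; they assume dummy.Rmd exists in the working directory):
# render_report("dummy.Rmd")
# render_report("dummy.Rmd", format = "pdf_document")
```

Keeping the call in a small function makes it easy to reuse the same rendering step across several reports.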
We have understood that there are various outputs that an R Markdown can create. Each output and its format can be set inside the YAML of an R Markdown. Further on, we will be discussing the four outputs that we commonly use for making a business report and how to create one using the R Markdown document we have made earlier.
2.3.1 HTML Document
HTML Document is the default output of an R Markdown document. The dummy.Rmd we made earlier is an example of HTML Document output. Besides choosing HTML output in the “create new R Markdown” pop-up, we can also manually set the output in YAML.
title: "Automate: Business Reporting with R"
author: "Team Algoritma"
date: "8 April 2020"
output: html_document
The above YAML example will result in the default HTML document output, without specific themes or features such as the floating table of contents. When we want to specify themes and features we can add YAML options like the ones we discussed in the previous section, “Report Template using YAML”.
title: "Automate: Business Reporting with R"
author: "Team Algoritma"
date: "8 April 2020"
output:
  html_document:
    toc: true
    toc_depth: 3
    toc_float:
      collapsed: false
      smooth_scroll: true
    number_sections: true
    theme: united
    highlight: breezedark
    fig_width: 6
    fig_height: 4
    df_print: paged
    code_folding: hide
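The same options can also be passed from R code, because each YAML option under html_document corresponds to an argument of rmarkdown::html_document(). The sketch below mirrors the YAML above; the commented render() call assumes a dummy.Rmd file exists in the working directory.

```r
library(rmarkdown)

# Build an output format object equivalent to the YAML options above
report_format <- html_document(
  toc = TRUE,
  toc_depth = 3,
  toc_float = list(collapsed = FALSE, smooth_scroll = TRUE),
  number_sections = TRUE,
  theme = "united",
  highlight = "breezedark",
  fig_width = 6,
  fig_height = 4,
  df_print = "paged",
  code_folding = "hide"
)

# Not run here: render a document with this format instead of its YAML settings
# render("dummy.Rmd", output_format = report_format)
```

Defining the format in code can be handy when the same styling should be applied to many documents.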
2.3.2 PDF
To create a PDF output, we can create a new R Markdown document and choose PDF as the output, or manually set output: pdf_document in YAML. Since many of the HTML document features apply to PDF output, it is preferable to just add the output manually in the YAML if we are working on multiple outputs. After that, the document can be knitted to PDF. For example,
title: "Automate: Business Reporting with R"
author: "Team Algoritma"
date: "8 April 2020"
output:
  html_document:
    toc: true
    toc_float: true
  pdf_document:
    toc: true
Notice that the indentation of pdf_document is aligned with that of the other output, html_document. YAML options for the features of each output can be specified manually too.
2.3.2.1 Features
Many of the features available in HTML documents are also available in PDF and can be set in YAML. These are the table of contents and its depth (which defaults to 2 for pdf_document), section numbering, figure options, and syntax highlighting. Other features such as setting the font, document margin, and specific types of citation can be explored further here.
2.3.2.2 Data Display
Displaying data or tables in PDF is notably different from HTML. While an HTML document can use df_print: paged, PDF cannot, because PDF is a static output. The options available for displaying data in PDF output are default, kable, and tibble, of which kable offers a tidier look than the others. Below is an example of data displayed using the kable option.
title: "Automate: Business Reporting with R"
author: "Team Algoritma"
date: "8 April 2020"
output:
  pdf_document:
    toc: yes
    df_print: kable
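Besides the df_print option, a table can be formatted explicitly inside a chunk with knitr::kable(). The sketch below is one way to do it; the choice of mtcars, the number of rows, and the caption are assumptions for illustration.

```r
library(knitr)

# Format the first five rows of mtcars as a static table
car_table <- kable(head(mtcars, 5), caption = "First five rows of mtcars")
car_table
```

Calling kable() directly gives control over individual tables, while df_print sets the default for every data frame printed in the document.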
2.3.3 Word
To create a Word document, we can create a new R Markdown document and choose Word as the output, or manually set output: word_document in YAML. As with PDF, many of the HTML document features apply to Word output. Therefore, we can just add the output manually in the YAML. For example,
title: "Automate: Business Reporting with R"
author: "Team Algoritma"
date: "8 April 2020"
output:
  html_document:
    toc: true
    toc_float: true
  word_document:
    toc: true
    df_print: kable
R Markdown also has a notable feature for Word output: a style reference document, or Word template, can be used to control the look of the result. The template is passed to the `reference_docx` option of `word_document` in YAML. You can explore more features for Word documents here.
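A minimal sketch of passing a template to `reference_docx` might look like the following. The file name `my-styles.docx` is a hypothetical placeholder for a Word file whose styles you have already customized, assumed here to sit in the same folder as the R Markdown file.

```yaml
---
title: "Automate: Business Reporting with R"
output:
  word_document:
    toc: true
    reference_docx: my-styles.docx   # hypothetical template file
---
```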
2.3.4 Interactive PowerPoint Presentation
The last group of formats we will discuss is the presentation formats, which include:
- Beamer Presentation: PDF presentations with beamer
- Ioslides Presentation: HTML presentations with ioslides
- Slidy Presentation: HTML presentations with slidy
- PowerPoint Presentation: PowerPoint presentation
- Revealjs Presentation: HTML presentations with reveal.js (using additional package revealjs)
In this section, we will focus on the Beamer Presentation, although every format divides our content into slides in a similar way. The following subsections discuss how a Beamer Presentation works, its features, and how to customize its appearance.
2.3.4.1 How It Works
Creating a Beamer Presentation is almost the same as creating a new HTML document. You can create a new R Markdown file and choose Beamer Presentation as the output. Alternatively, you can manually set `output: beamer_presentation` in YAML.
title: "Beamer Presentation"
author: "Team Algoritma"
date: "8 April 2020"
output: beamer_presentation
The difference from the HTML document is that your content is divided into slides according to the `#` and `##` headers of your document. Alternatively, you can start a new slide without a header by using a horizontal rule, `---`. Here is a demo of a Beamer presentation:
With this ability, we can simply create a presentation from the report we made beforehand by adjusting our content into its simplified version.
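The slide-splitting rules described above can be sketched with a small document body. The headings and bullet text are made up for illustration. With the default settings, `#` is expected to start a new section and each `##` a new slide, because `##` is the highest header level that is directly followed by content.

```markdown
# Sales Overview

## Monthly Revenue

- Revenue by region
- Revenue by product

## Top Customers

Content for the second slide.

---

This slide has no title because it starts after a horizontal rule.
```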
2.3.4.2 Themes
Beamer presentations also support themes! You can specify the appearance in YAML using `theme`, `colortheme`, and `fonttheme`, for example:
title: "Beamer Presentation"
author: "Team Algoritma"
date: "8 April 2020"
output:
  beamer_presentation:
    theme: "Ilmenau"
    colortheme: "beaver"
    fonttheme: "structurebold"
To explore a wider variety of themes for Beamer presentations, you can follow the links here and here.
2.3.4.3 Incremental Bullets
If you want to use incremental bullets throughout your presentation, you can simply set the YAML option `incremental: true`. Alternatively, you can make the bullets incremental on specific slides only by adding `>` before each bullet. Below is an example:
Setting incremental bullets in YAML:
title: "Beamer Presentation"
author: "Team Algoritma"
date: "8 April 2020"
output:
  beamer_presentation:
    theme: "Ilmenau"
    colortheme: "beaver"
    fonttheme: "structurebold"
    incremental: true
Setting incremental bullets in each slide:
## Slide with Incremental Bullets
> - Bullet 1
> - Bullet 2
> - Bullet 3
Beamer presentations also support several features that are available in HTML documents, such as the table of contents, figure options, and displaying data or tables.
There is more to explore for Beamer Presentation, as well as for the other presentation outputs available in R Markdown. For example, if you are interested in creating a dynamic presentation, you can explore Ioslides Presentation or the famous Revealjs Presentation. For more features, you can explore the R Markdown Definitive Guide, Chapter 4: Presentations.
We have explored various outputs supported by R Markdown which can be used to develop business reports. Even so, there is still so much more to explore. To look for even more outputs generated from R Markdown, you can explore The R Markdown Gallery.
2.4 Parameterized Reports
R Markdown is great at making reports, but to get the most out of it, let’s discuss how R Markdown can make our reports reproducible by using a Parameterized Report.
When we use a Parameterized Report, R Markdown includes one or more parameters whose values are set when the report is created. This allows us to choose, for example, which dataset is turned into a report, and so makes Automated Reporting possible!
2.4.1 Declaring Parameters
Parameters are declared using the `params` option within the YAML. For example, the file below creates the parameter named head and assigns 5 as its default value.
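A minimal sketch of such a declaration is shown here. The title and output format are assumptions carried over from the earlier examples; only the `params` block is essential.

```yaml
---
title: "Automate: Business Reporting with R"
output: html_document
params:
  head: 5    # parameter named "head" with default value 5
---
```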
2.4.2 Using Parameter in Code Chunks
Parameters are made available in code chunks through `params$<parameter name>`. For example, the code below can be used to display the first 5 observations of the data. This is done by passing 5 (the default value) from `params$head` into the `head()` function, which returns the first n observations of a dataset.
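The idea can be sketched in plain R as follows. Two things here are assumptions for illustration: the built-in `mtcars` dataset stands in for the course data, and the `params` list is created by hand so the snippet runs outside a document. Inside a real R Markdown file, `params` already exists and the assignment should be left out.

```r
# Stand-in for the `params` list that R Markdown builds from the YAML header.
# Omit this line inside an actual parameterized document.
params <- list(head = 5)

# mtcars is a built-in example dataset, used here purely for illustration.
# head() returns the first n rows, with n taken from the parameter.
first_rows <- head(mtcars, n = params$head)
print(first_rows)
```

Changing the value of `head` in the YAML (or at knit time) changes how many rows are shown without touching the code itself.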
As a side note, `head()` and other R code used for data analysis will be discussed further in the following course, “Basic Data Analysis in R”.
2.4.3 Using Parameter Inline Code
Parameters can also be called in inline code. Below is an example of how to use params in a code chunk as well as in inline code.
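For the inline case, a sentence in the body of the document might be written as in this sketch (the wording of the sentence is made up for illustration). With the default value declared earlier, the inline expression is replaced by 5 when the document is knitted.

```markdown
The table above shows the first `r params$head` observations of the data.
```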
A Parameterized Document is convenient when we want to create a document for specific parameters or inputs. We can do so by clicking Knit with Parameters on the Knit drop-down menu.
In the example below, we knit with parameters by specifying ‘10’ as the value of `params$head`. As the right side of the picture shows, the output follows our desired input.
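The same result can also be reached from code rather than the menu, since `rmarkdown::render()` accepts a `params` argument. The helper below is a hypothetical sketch, not part of the course material: the function name, the output file naming scheme, and the `report.Rmd` file in the usage comment are all assumptions.

```r
# Hypothetical helper: knit a parameterized report with a chosen value
# for the `head` parameter. Requires the rmarkdown package when called.
render_with_head <- function(input, n, output_dir = tempdir()) {
  rmarkdown::render(
    input,
    params = list(head = n),                          # overrides the YAML default
    output_file = sprintf("report-head-%d.html", n),  # assumed naming scheme
    output_dir = output_dir,
    quiet = TRUE
  )
}

# Example usage, assuming a parameterized file named report.Rmd exists:
# render_with_head("report.Rmd", n = 10)
```

Wrapping the call in a function makes it straightforward to loop over several parameter values and produce one report per value.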
For further exploration about Parameterized Document, you can access the full guide here.
We have completed the course R Markdown for Automate: Business Reporting with R. In the next chapter, we will be focusing on understanding and exploring the basic data analysis in R which hopefully will enhance your analytical skills when dealing with data.