The goal of reproducible research is to tie specific instructions to data analysis
and experimental data so that scholarship can be recreated, better understood and
verified.
R largely facilitates reproducible research through literate programming: a document
that combines content and data analysis code. The
Sweave
function (in the
base R utils package) and the
knitr
package can be used to blend the subject matter and R code
so that a single document defines the content and the algorithms.
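As a minimal sketch of this workflow, the following example uses only the Sweave function from the base utils package. The file names, the chunk label and the sample values are illustrative assumptions and are not taken from any particular project.

```r
# Write a tiny Sweave (.Rnw) source file to a temporary directory.
# File names and the chunk label are hypothetical, chosen for illustration.
rnw_file <- file.path(tempdir(), "example.Rnw")
tex_file <- file.path(tempdir(), "example.tex")

writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "The sample mean is \\Sexpr{mean(c(1, 2, 3))}.",
  "<<summary-chunk, echo=TRUE>>=",
  "x <- c(1, 2, 3)",
  "summary(x)",
  "@",
  "\\end{document}"
), rnw_file)

# Weave: run the embedded R code and merge its output into a LaTeX file.
utils::Sweave(rnw_file, output = tex_file, quiet = TRUE)

# Inspect the generated LaTeX source.
tex_lines <- readLines(tex_file)
cat(tex_lines, sep = "\n")
```

The resulting .tex file still has to be compiled by a LaTeX installation (for example with tools::texi2pdf) to obtain a PDF; that step is omitted here because it depends on software outside R.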
Basic packages can be structured into the following groups:
-
LaTeX Markup
:
The
Hmisc,
xtable
and
tables
packages contain functions to write R objects into LaTeX
representations.
Hmisc
also includes methods for
translating strings to proper LaTeX markup (e.g., ">=" to
"$\geq$"). Animations can be inserted into LaTeX documents
being converted to PDF via the
animation
package. The
pictex
function in the base grDevices package is a
PicTeX graphics driver and the
tikzDevice
package can convert R graphics to
TikZ
markup. The
tth
package can convert TeX to HTML.
-
HTML Markup
:
The
R2HTML
package
has drivers that allow
Sweave
to process HTML documents.
Both
R2HTML
and
hwriter
can be used to build HTML pages sequentially.
R2HTML,
xtable
and
hwriter
can also convert some
R objects into HTML representations.
knitr
also has facilities to weave
R code with HTML as well as convert markdown to HTML.
-
ODF Markup
:
The
odfWeave
package extends
Sweave
to the
Open Document Format.
Word processing tools, such as OpenOffice.org, can then be used to blend content and programs.
Many word processors can be used to translate the ODF document to other formats
(e.g., Word, PDF or HTML).
-
Microsoft Formats
:
The
R2wd
and
R2PPT
packages for Windows
can be used to communicate between R and Word or PowerPoint via
the COM interface. Document elements (e.g., sections, text
and images) that are created in R can be inserted into the
document from R. The
rtf
package can also be used to create
RTF format documents directly from R. Commercial R products
that work with RTF and/or Word are
RTFGen,
Inference for R
and
SWord.
The output from other
packages (odfWeave
and
R2HTML) can also
be opened by Word.
RExcel
can integrate code with
Microsoft Excel.
Additionally, the
table1xls
package can convert summary tables to
Excel files.
-
Plain Text Formats
:
R code and output in
Sweave
files can be converted into
AsciiDoc
and other structured
text formats using the
ascii
package. The
markdown
and
knitr
packages have tools for
markdown
format.
-
Syntax Highlighting
:
The
SweaveListingUtils
package provides enhanced control over how
R code chunks and their output are rendered in LaTeX.
-
Caching of R Objects
:
The
cacheSweave
and
weaver
packages allow caching of specific
code chunks. The
cacher
and
R.cache
packages can also be used but are
not integrated with
Sweave.
The
SRPM
package (for shared reproducibility package management) creates
an R package that organizes the results of a
Sweave
document into different
directories (e.g., article, figures, etc).
knitr
also has the ability to
cache the results of code chunks.
-
Others
:
The
brew
and
R.rsp
packages contain alternative approaches
to embedding R code into various markups.
knitr
is a comprehensive package
derived from
Sweave
that includes code formatting, highlighting,
caching, fine control of graphics, conditional evaluation, multiple
markup formats and other features. The
pander
package can write R
objects into
Pandoc's
markdown
and can also convert those objects or complex reports to PDF, HTML, docx or ODT. The
rapport
package builds on
pander
and provides a way to create
reproducible statistical report templates with graphs, tables and annotations to
be applied to any R data frame and export the results in different formats. The
installr
package for Windows can download and install MiKTeX, pandoc
(and other software), as well as quickly update R itself.
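To illustrate the weaving of R code with markdown mentioned above, here is a short sketch using knitr. It assumes the knitr package is installed (the code is skipped otherwise), and the report text and variable names are hypothetical.

```r
# Assumes the knitr package is installed; the report text is hypothetical.
woven <- NULL
if (requireNamespace("knitr", quietly = TRUE)) {
  fence <- strrep("`", 3)  # build the chunk delimiter without nesting fences
  src <- c(
    "# Example report",
    "",
    paste0(fence, "{r}"),
    "x <- 1:10",
    "mean(x)",
    fence,
    "",
    "The mean of x is `r mean(x)`."
  )
  # knit() runs the chunk and the inline expression, then returns the
  # woven markdown as a single character string.
  woven <- knitr::knit(text = src, quiet = TRUE)
  cat(woven)
}
```

For a real report, the source would normally live in a file, and caching of an expensive chunk could be requested with the cache=TRUE chunk option. Both are left out here to keep the sketch free of side effects on disk.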
An incomplete list of packages which facilitate literate programming for specific
types of analysis or objects:
-
The base R utils package has generic functions to convert objects to
LaTeX (via
toLatex) and BibTeX (via
toBibtex).
The
bibtex
package can also be used to parse BibTeX files.
-
Functions for creating LaTeX representations of summary statistics and visualizations
can be found in the
Hmisc,
reporttools, and
r2lh
packages.
Hmisc
also has functions for marking up data frames, and the
quantreg
and
memisc
packages can mark up matrices.
-
Cross-tabulations can be converted to LaTeX code using the
Hmisc
and
memisc
packages.
-
The
xtable
and
rms
packages provide LaTeX
representations of some common models (e.g., the Cox proportional hazards model).
For example, processing an
aov
object with the
xtable
function will generate LaTeX markup of the ANOVA table. Similarly, methods exist
for
glm,
prcomp,
ts
and other types of objects.
-
The
quantreg
package contains LaTeX markup functions for quantile regression
fit summaries.
-
Standardized exams can be created using the
exams
package.
-
The
odfWeave.survey
and
TeachingSampling
packages provide
ODF and LaTeX functions, respectively, for survey sampling objects. The
suRtex
package can create LaTeX markup of descriptive statistical summaries of
survey data.
-
The
texreg
package has functions to create LaTeX and HTML representations
of one or more objects (e.g.
lm,
lme4, etc.). The
stargazer
package has similar functionality for showing models and summary tables
in LaTeX and ASCII.
-
The
sparktex
package can create LaTeX representations of
sparklines.
-
LaTeX markup for several object types can be created using the
papeR
package.
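A few of the conversions listed above can be sketched as follows. The toBibtex and toLatex calls use only base R. The xtable part assumes that the xtable package is installed and is skipped otherwise. The choice of the built-in PlantGrowth data set and the variable names are illustrative assumptions.

```r
# Base R: BibTeX entry for citing R itself.
bib <- utils::toBibtex(utils::citation())
print(bib)

# Base R: LaTeX itemized list describing the current session.
session_tex <- utils::toLatex(utils::sessionInfo())
print(session_tex)

# Optional: ANOVA table as LaTeX markup via xtable (assumes xtable is installed).
anova_tex <- NULL
if (requireNamespace("xtable", quietly = TRUE)) {
  fit <- aov(weight ~ group, data = PlantGrowth)  # built-in data set
  # print.results = FALSE returns the markup as a string instead of printing it.
  anova_tex <- print(xtable::xtable(fit), print.results = FALSE)
  cat(anova_tex)
}
```

The generated markup is typically pasted into a LaTeX document or, more commonly, emitted from inside a Sweave or knitr code chunk so that the table is regenerated whenever the analysis changes.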