Good Practice for Graph Production
Current theory is based upon the work of Edward Tufte, a pioneer in the field of data visualisation. The main principles of his work are:
Plotting in R, ggplot2 vs plot
When comparing methods of plotting within R, most packages are based upon one of two styles: the base R plot() function or the Tidyverse ggplot() function. Both are useful, but both have limitations in their usage.
The base R plot() function is ideal for quick and dirty scatter plotting, requiring little more than a formula (y ~ x) and a data = argument. Although variants exist for bar plots (barplot()), histograms (hist()) and others, these (in my opinion) require more knowledge of their individual functions than the ggplot() system as a whole. That said, for those who wish to generate (and master) one particular type of plot, the base functions may be the better choice.
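As a minimal sketch of the base R approach, the snippet below uses the built-in mtcars dataset (an assumption for illustration; it is not part of this session's data). It shows the formula interface for a scatter plot, then the separate functions needed for a bar plot and a histogram, each with its own arguments.

```r
# Scatter plot via the formula interface: y ~ x, with columns looked up in `data`
plot(mpg ~ wt, data = mtcars)

# Other plot types use their own functions and argument conventions.
# barplot() expects pre-computed heights, so the counts are tabulated first.
cyl_counts <- table(mtcars$cyl)
bar_mids <- barplot(cyl_counts, xlab = "Cylinders", ylab = "Number of cars")

# hist() takes a numeric vector rather than a formula and data frame
mpg_hist <- hist(mtcars$mpg, xlab = "Miles per gallon", main = "Distribution of mpg")
```

Note how each function expects its input in a different shape (a formula, a table of counts, a numeric vector), which is the extra per-function knowledge referred to above.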
As mentioned previously, ggplot() is a much more versatile function, which allows for a standard method of plotting across different plot types, with only minor variations depending on the type. This function (as seen in Figure 1, below) works by stacking multiple layers of information to form a plot. This makes it ideal for building up clear, layered plots, with the capacity to edit elements at the individual (layer) or global level accordingly.
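The layering idea can be sketched as follows, again assuming the built-in mtcars dataset and that the ggplot2 package is installed. The column choices and labels are illustrative only. Aesthetics set inside ggplot() apply globally to every layer, while those set inside a single geom apply to that layer alone.

```r
library(ggplot2)

# Global level: data and x/y mappings are shared by every layer added below
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # Layer 1: points, with a colour mapping that applies to this layer only
  geom_point(aes(colour = factor(cyl))) +
  # Layer 2: a single linear trend line drawn over the points
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Labels and theme are further additions that affect the whole plot
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders") +
  theme_minimal()

print(p)
```

Swapping geom_point() for another geom, such as geom_boxplot() or geom_histogram(), changes the plot type while the surrounding structure stays largely the same.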
Overall, although both methods are useful, this session will focus on using ggplot(), given its accessibility and its status as standard practice, which make it easier to create beautiful plots.