20 ggplot2 internals

Throughout this book I have described ggplot2 from the perspective of a user rather than a developer. From the user’s point of view, the important thing is to understand how the interface to ggplot2 works. To make a data visualisation the user needs to know how functions like ggplot() and geom_point() can be used to specify a plot, but rarely does the user need to understand how ggplot2 translates this plot specification into an image. For a ggplot2 developer who hopes to design extensions, however, this understanding is paramount.

When making the jump from user to developer, it is common to encounter frustrations because the nature of the ggplot2 interface is very different to the structure of the underlying machinery that makes it work. As extending ggplot2 becomes more common, so too does the frustration related to understanding how it all fits together. This chapter is dedicated to providing a description of how ggplot2 works “behind the curtains”. I focus on the design of the system rather than technical details of its implementation, and the goal is to provide a conceptual understanding of how the parts fit together. I begin with a general overview of the process that unfolds when a ggplot object is plotted, and then dive into details, describing how the data flows through this whole process and ends up as visual elements in your plot.

20.1 The plot() method

To understand the machinery underpinning ggplot2, it is important to recognise that almost everything related to the plot drawing happens when you print the ggplot object, not when you construct it. For instance, in the code below, the object p is an abstract specification of the plot data, the layers, etc. It does not construct the image itself:

p <- ggplot(mpg, aes(displ, hwy, color = drv)) + 
  geom_point(position = "jitter") +
  geom_smooth(method = "lm", formula = y ~ x) + 
  facet_wrap(vars(year)) + 
  ggtitle("A plot for expository purposes")

The reason ggplot2 is designed this way is to allow the user to continue to add new elements to a plot at a later point, without needing to recalculate anything. One implication of this is that if you want to understand the mechanics of ggplot2, you have to follow your plot as it goes down the plot() rabbit hole. You can inspect the print method for ggplot objects by typing ggplot2:::plot.ggplot at the console, but for this chapter I will work with a simplified version. Stripped to its bare essentials, the ggplot2 plot method has the same structure as the following ggprint() function:

ggprint <- function(x) {
  data <- ggplot_build(x)
  gtable <- ggplot_gtable(data)
  grid::grid.newpage()
  grid::grid.draw(gtable)
  return(invisible(x))
}

This function does not handle every possible use case, but it is sufficient to draw the plot specified above:

ggprint(p) 

The code in our simplified print method reveals four distinct steps:

  • First, it calls ggplot_build() where the data for each layer is prepared and organised into a standardised format suitable for plotting.

  • Second, the prepared data is passed to ggplot_gtable(), which turns it into graphical elements stored in a gtable (we’ll come back to what that is later).

  • Third, the gtable object is converted to an image with the assistance of the grid package.

  • Fourth, the original ggplot object is invisibly returned to the user.

One thing that this process reveals is that ggplot2 itself does none of the low-level drawing: its responsibility ends when the gtable object has been created. Nor does the gtable package (which implements the gtable class) do any drawing. All drawing is performed by the grid package together with the active graphics device. This is an important point, as it means ggplot2 – or any extension to ggplot2 – does not concern itself with the nitty-gritty of creating the visual output. Rather, its job is to convert user data to one or more graphical primitives such as polygons, lines, and points, and then hand responsibility over to the grid package.

Although it is not strictly correct to do so, we will refer to this conversion into graphical primitives as the rendering process. The next two sections follow the data down the rendering rabbit hole through the build step (Section 20.2) and the gtable step (Section 20.3) whereupon – rather like Alice in Lewis Carroll’s novel – it finally arrives in the grid wonderland as a collection of graphical primitives.

20.2 The build step

ggplot_build(), as discussed above, takes the declarative representation constructed with the public API and augments it by preparing the data for conversion to graphic primitives.

20.2.1 Data preparation

The first part of the processing is to get the data associated with each layer and get it into a predictable format. A layer can provide data in one of three ways: it can supply its own (e.g., if the data argument to a geom is a data frame), it can inherit the global data supplied to ggplot(), or it can provide a function that returns a data frame when applied to the global data. In all three cases the result is a data frame that is passed to the plot layout, which orchestrates coordinate systems and facets. When this happens the data is first passed to the plot coordinate system, which may change it (but usually doesn’t), and then to the facet, which inspects the data to figure out how many panels the plot should have and how they should be organised. During this process the data associated with each layer will be augmented with a PANEL column. This column will (must) be kept throughout the rendering process and is used to link each data row to a specific facet panel in the final plot.
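
The three ways of supplying layer data can all be written with the public API. A minimal sketch (the subsets below are chosen purely for illustration):

ggplot(mpg, aes(displ, hwy)) +
  # inherit the global data supplied to ggplot()
  geom_point() +
  # supply a stand-alone data frame
  geom_point(data = mpg[mpg$drv == "r", ], colour = "red") +
  # supply a function that is applied to the global data
  geom_point(data = function(d) d[d$hwy > 40, ], colour = "blue")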

The last part of the data preparation is to convert the layer data into calculated aesthetic values. This involves evaluating all aesthetic expressions from aes() on the layer data. Further, if not given explicitly, the group aesthetic is calculated from the interaction of all non-continuous aesthetics. The group aesthetic is, like PANEL, a special column that must be kept throughout the processing. As an example, consider the first layer of the plot p created earlier, the scatterplot specified by geom_point(). At the end of the data preparation process the first 10 rows of the data associated with this layer look like this:

#>      x  y colour PANEL group
#> 1  1.8 29      f     1     2
#> 2  1.8 29      f     1     2
#> 3  2.0 31      f     2     2
#> 4  2.0 30      f     2     2
#> 5  2.8 26      f     1     2
#> 6  2.8 26      f     1     2
#> 7  3.1 27      f     2     2
#> 8  1.8 26      4     1     1
#> 9  1.8 25      4     1     1
#> 10 2.0 28      4     2     1

20.2.2 Data transformation

Once the layer data has been extracted and converted to a predictable format it undergoes a series of transformations until it has the format expected by the layer geometry.

The first step is to apply any scale transformations to the columns in the data. It is at this stage of the process that any argument to trans in a scale has an effect, and all subsequent rendering will take place in this transformed space. This is the reason why setting a position transform in the scale has a different effect than setting it in the coordinate system. If the transformation is specified in the scale it is applied before any other calculations, but if it is specified in the coordinate system the transformation is applied after those calculations. For instance, our original plot p involves no scale transformations so the layer data remain untouched at this stage. The first three rows are shown below:

#>     x  y colour PANEL group
#> 1 1.8 29      f     1     2
#> 2 1.8 29      f     1     2
#> 3 2.0 31      f     2     2

In contrast, if our plot object is p + scale_x_log10() and we inspect the layer data at this point in processing, we see that the x variable has been transformed appropriately:

#>       x  y colour PANEL group
#> 1 0.255 29      f     1     2
#> 2 0.255 29      f     1     2
#> 3 0.301 31      f     2     2
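
The practical consequence of this ordering can be seen by comparing the two approaches side by side. A minimal sketch; the difference shows up in where the line fitted by geom_smooth() ends up:

# transformation in the scale: displ is log-transformed before the stat
# computations, so the linear model is fitted to the transformed values
p + scale_x_log10()

# transformation in the coordinate system: the model is fitted to the
# untransformed values and the fitted line is bent afterwards
p + coord_trans(x = "log10")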

The second step in the process is to map the position aesthetics using the position scales, which unfolds differently depending on the kind of scale involved. For continuous position scales – such as those used in our example – the out of bounds function specified in the oob argument (Section 10.1.1) is applied at this point, and NA values in the layer data are removed. This makes little difference for p, but if we were plotting p + xlim(2, 8) instead the oob function – scales::censor() in this case – would replace x values below 2 with NA as illustrated below:

#> Warning: Removed 22 rows containing non-finite values (stat_smooth).
#>    x  y colour PANEL group
#> 1 NA 29      f     1     2
#> 2 NA 29      f     1     2
#> 3  2 31      f     2     2

For discrete positions the change is more radical, because the values are matched to the limits values or the breaks specification provided by the user, and then converted to integer-valued positions. Finally, for binned position scales the continuous data is first cut into bins using the breaks argument, and the position for each bin is set to the midpoint of its range. The reason for performing the mapping at this stage of the process is consistency: no matter what type of position scale is used, it will look continuous to the stat and geom computations. This is important because otherwise computations such as dodging and jitter would fail for discrete scales.
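
Although the intermediate state is not directly exposed, the effect of this mapping is still visible in the final built data. A minimal sketch using a discrete x position:

# drv is discrete, but after the mapping its positions are integer-valued
d <- ggplot_build(ggplot(mpg, aes(drv, hwy)) + geom_boxplot())$data[[1]]
unique(d$x)   # integer positions (1, 2, 3) rather than "4", "f", "r"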

At the third stage in this transformation the data is handed to the layer stat where any statistical transformation takes place. The procedure is as follows: first, the stat is allowed to inspect the data and modify its parameters, and then perform a one-off preparation of the data. Next, the layer data is split by PANEL and group, and statistics are calculated before the data is reassembled. Once the data has been reassembled in its new form it goes through another aesthetic mapping process. This is where any aesthetics whose computation has been delayed using stat() (or the old ..var.. notation) get added to the data. Notice that this is why stat() expressions – including the formula used to specify the regression model in the geom_smooth() layer of our example plot p – cannot refer to the original data. It simply doesn’t exist at this point.
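
To make the delayed mapping concrete, here is a minimal sketch in which an aesthetic refers to a variable that only exists after the stat computations:

# y is mapped to the density computed by stat_bin(), a variable that only
# exists once the stat computations have run
ggplot(mpg, aes(displ)) +
  geom_histogram(aes(y = stat(density)), bins = 20)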

As an example consider the second layer in our plot, which produces the linear regressions. Before the stat computations have been performed the data for this layer simply contain the coordinates and the required PANEL and group columns.

#>     x  y colour PANEL group
#> 1 1.8 29      f     1     2
#> 2 1.8 29      f     1     2
#> 3 2.0 31      f     2     2

After the stat computations have taken place, the layer data are changed considerably:

#>      x    y ymin ymax    se flipped_aes colour PANEL group
#> 1 1.80 24.3 23.1 25.6 0.625       FALSE      4     1     1
#> 2 1.86 24.2 22.9 25.4 0.612       FALSE      4     1     1
#> 3 1.92 24.0 22.8 25.2 0.598       FALSE      4     1     1

At this point the geom takes over from the stat (almost). The first action it takes is to inspect the data, update its parameters, and possibly make a first-pass modification of the data (the same setup phase as for the stat). This is where some of the columns may get reparameterised, e.g. x + width gets changed to xmin + xmax. After this the position adjustment gets applied, so that, for example, overlapping bars are stacked. For our example plot p, it is at this step that the jittering is applied in the first layer of the plot and the x and y coordinates are perturbed:

#>      x    y colour PANEL group
#> 1 1.76 28.9      f     1     2
#> 2 1.80 28.9      f     1     2
#> 3 1.97 30.7      f     2     2

Next, and perhaps surprisingly, the position scales are all reset, retrained, and applied to the layer data. This is necessary because, for example, stacking can change the range of one of the axes dramatically. In some cases (e.g., the count computed by a histogram or bar chart stat) one of the position aesthetics may not even be available until after the stat computations, and if the scales were not retrained it would never get trained.
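
A bar chart illustrates both situations; a minimal sketch:

# the count does not exist before the stat computations, and the upper
# limit of the y scale is only known after the bars have been stacked
ggplot(mpg, aes(class, fill = drv)) +
  geom_bar(position = "stack")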

The last part of the data transformation is to train and map all non-positional aesthetics, i.e. to convert whatever discrete or continuous input is mapped to graphical parameters such as colours, linetypes, and sizes. Further, any default aesthetics from the geom are added so that the data is now in a predictable state for the geom. At the very last step, both the stat and the facet get a last chance to modify the data in its final mapped form with their finish_data() methods before the build step is done. For the plot object p, the first few rows from the final state of the layer data look like this:

#>    colour    x    y PANEL group shape size fill alpha stroke
#> 1 #00BA38 1.84 28.9     1     2    19  1.5   NA    NA    0.5
#> 2 #00BA38 1.83 29.3     1     2    19  1.5   NA    NA    0.5
#> 3 #00BA38 2.00 31.3     2     2    19  1.5   NA    NA    0.5
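
You can inspect this final per-layer data yourself. A minimal sketch (the jittered coordinates will differ between runs):

# layer_data() returns the fully built data for a single layer
head(layer_data(p, 1), 3)   # the jittered point layer
head(layer_data(p, 2), 3)   # the fitted regression layer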

20.2.3 Output

The return value of ggplot_build() is a list structure with the ggplot_built class. It contains the computed data, as well as a Layout object holding information about the trained coordinate system and faceting. Further, it holds a copy of the original plot object, but now with trained scales.
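
A sketch of how the components of the return value map onto this description (component names as in current versions of ggplot2):

built <- ggplot_build(p)

built$data[[1]]   # computed data for the jittered point layer
built$data[[2]]   # computed data for the regression layer
built$layout      # the Layout object: trained coordinate system and facets
built$plot        # the original plot object, now with trained scales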

20.3 The gtable step

The purpose of ggplot_gtable() is to take the output of the build step and turn it into a single gtable object that can be plotted using grid. At this point the main elements responsible for further computations are the geoms, the coordinate system, the facet, and the theme. The stats and position adjustments have all played their part already.

20.3.1 Rendering the panels

The first thing that happens is that the data is converted into its graphical representation. This happens in two steps. First, each layer is converted into a list of graphical objects (grobs). As with stats, the conversion happens by splitting the data, first by PANEL, and then by group, with the possibility of the geom intercepting this splitting for performance reasons. While a lot of the data preparation has been performed already, it is not uncommon that the geom does some additional transformation of the data during this step. A crucial part is to transform and normalise the position data. This is done by the coordinate system, and while it often simply means that the data is normalised based on the limits of the coordinate system, it can also include radical transformations such as converting the positions into polar coordinates. The output of this step is, for each layer, a list of gList objects corresponding to each panel in the facet layout.

After this the facet takes over and assembles the panels. It does this by first collecting the grobs for each panel from the layers, along with rendering strips, backgrounds, gridlines, and axes based on the theme, and combining all of this into a single gList for each panel. It then arranges all these panels into a gtable based on the calculated panel layout. For most plots this is simple as there is only a single panel, but for plots using e.g. facet_wrap() it can be quite complicated. The output is the basis of the final gtable object. At this stage in the process our example plot p consists of the two faceted panels, complete with strips, gridlines, axes, and the plotted data, but still without the legend and title.

20.3.2 Adding guides

There are two types of guides in ggplot2: axes and legends. As our plot p illustrates, at this point the axes have already been rendered and assembled together with the panels, but the legends are still missing. Rendering the legends is a complicated process that first trains a guide for each scale. Then multiple guides are potentially merged if their mappings allow it, before the layers that contribute to the legend are asked for key grobs for each key in the legend. These key grobs are then assembled across layers and combined into the final legend in a process that is quite reminiscent of how layers get combined into the gtable of panels. In the end the output is a gtable that holds each legend box arranged and styled according to the theme and guide specifications. Once created, the guide gtable is added to the main gtable according to the legend.position theme setting. At this stage, our example plot is complete in most respects: the only thing missing is the title.
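
Because the legends are assembled separately and only attached at the end, moving them is purely a matter of which cell of the main gtable the guide gtable is inserted into. For example:

# the panels are rendered exactly as before; only the placement of the
# assembled guide gtable changes
p + theme(legend.position = "bottom")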

20.3.3 Adding adornment

The only thing remaining is to add the title, subtitle, caption, and tag, as well as the background and margins, at which point the final gtable is done.
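
These adornments come directly from the plot specification; a small sketch:

# each of these ends up as a separate grob in the outer rows and columns
# of the final gtable
p + labs(subtitle = "A subtitle", caption = "A caption", tag = "A")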

20.3.4 Output

At this point ggplot2 is ready to hand over to grid. Our rendering process is more or less equivalent to the code below and the end result is, as described above, a gtable:

p_built <- ggplot_build(p)
p_gtable <- ggplot_gtable(p_built)

class(p_gtable)
#> [1] "gtable" "gTree"  "grob"   "gDesc"

What is less obvious is that the dimensions of the object are unpredictable and will depend on the faceting, the legend placement, and which titles are drawn. It is thus not advised to depend on row and column placement in your code, should you want to further modify the gtable. All elements of the gtable are named, though, so with a bit of work it is still possible to reliably retrieve, e.g., the grob holding the top-left y-axis (an example follows the printout below). As an illustration, the gtable for our plot p is shown in the code below:

p_gtable
#> TableGrob (13 x 15) "layout": 22 grobs
#>     z         cells        name                                          grob
#> 1   0 ( 1-13, 1-15)  background               rect[plot.background..rect.741]
#> 2   1 ( 8- 8, 5- 5)   panel-1-1                      gTree[panel-1.gTree.612]
#> 3   1 ( 8- 8, 9- 9)   panel-2-1                      gTree[panel-2.gTree.627]
#> 4   3 ( 6- 6, 5- 5)  axis-t-1-1                                zeroGrob[NULL]
#> 5   3 ( 6- 6, 9- 9)  axis-t-2-1                                zeroGrob[NULL]
#> 6   3 ( 9- 9, 5- 5)  axis-b-1-1           absoluteGrob[GRID.absoluteGrob.631]
#> 7   3 ( 9- 9, 9- 9)  axis-b-2-1           absoluteGrob[GRID.absoluteGrob.631]
#> 8   3 ( 8- 8, 8- 8)  axis-l-1-2                                zeroGrob[NULL]
#> 9   3 ( 8- 8, 4- 4)  axis-l-1-1           absoluteGrob[GRID.absoluteGrob.639]
#> 10  3 ( 8- 8,10-10)  axis-r-1-2                                zeroGrob[NULL]
#> 11  3 ( 8- 8, 6- 6)  axis-r-1-1                                zeroGrob[NULL]
#> 12  2 ( 7- 7, 5- 5) strip-t-1-1                                 gtable[strip]
#> 13  2 ( 7- 7, 9- 9) strip-t-2-1                                 gtable[strip]
#> 14  4 ( 5- 5, 5- 9)      xlab-t                                zeroGrob[NULL]
#> 15  5 (10-10, 5- 9)      xlab-b titleGrob[axis.title.x.bottom..titleGrob.694]
#> 16  6 ( 8- 8, 3- 3)      ylab-l   titleGrob[axis.title.y.left..titleGrob.697]
#> 17  7 ( 8- 8,11-11)      ylab-r                                zeroGrob[NULL]
#> 18  8 ( 8- 8,13-13)   guide-box                             gtable[guide-box]
#> 19  9 ( 4- 4, 5- 9)    subtitle         zeroGrob[plot.subtitle..zeroGrob.737]
#> 20 10 ( 3- 3, 5- 9)       title          titleGrob[plot.title..titleGrob.736]
#> 21 11 (11-11, 5- 9)     caption          zeroGrob[plot.caption..zeroGrob.739]
#> 22 12 ( 2- 2, 2- 2)         tag              zeroGrob[plot.tag..zeroGrob.738]
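
Because the names are stable, a particular grob can be retrieved by name rather than by position. A minimal sketch using the names printed above:

# extract the left axis of the first panel by name
gtable::gtable_filter(p_gtable, "axis-l-1-1")

# or index the grobs list directly via the layout names
p_gtable$grobs[[which(p_gtable$layout$name == "axis-l-1-1")]]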

The final plot, as one would hope, looks identical to the original:

grid::grid.newpage()
grid::grid.draw(p_gtable)

20.4 Introducing ggproto

ggplot2 has undergone a couple of rewrites during its long life. A few of these have introduced new class systems to the underlying code. While there are still a few leftovers from older class systems, the code has more or less coalesced around the ggproto class system introduced in ggplot2 v2.0.0. ggproto is a custom-built class system made specifically for ggplot2 to facilitate portable extension classes. Like the better-known R6 system, it uses reference semantics, allowing inheritance and access to methods from parent classes. On top of ggproto sits a set of design principles that, while not enforced by ggproto itself, are essential to how the system is used in ggplot2.

20.4.1 ggproto syntax

A ggproto object is created using the ggproto() function, which takes a class name, a parent class and a range of fields and methods:

Person <- ggproto("Person", NULL,
  first = "",
  last = "",
  birthdate = NA,
  
  full_name = function(self) {
    paste(self$first, self$last)
  },
  age = function(self) {
    days_old <- Sys.Date() - self$birthdate
    floor(as.integer(days_old) / 365.25)
  },
  description = function(self) {
    paste(self$full_name(), "is", self$age(), "old")
  }
)

As can be seen, fields and methods are not differentiated in the construction, and they are not treated differently from a user perspective. Methods can take a first argument self, which gives the method access to its own fields and methods, but it won’t be part of the final method signature. One surprising quirk if you come from other reference-based object systems in R is that ggproto() does not return a class constructor; it returns an object. New instances of the class are constructed by subclassing the object without giving a new class name:

Me <- ggproto(NULL, Person,
  first = "Thomas Lin",
  last = "Pedersen",
  birthdate = as.Date("1985/10/12")
)

Me$description()
#> [1] "Thomas Lin Pedersen is 35 old"

When subclassing and overwriting methods, the parent class and its methods are available through the ggproto_parent() function:

Police <- ggproto("Police", Person,
  description = function(self) {
    paste(
      "Detective",
      ggproto_parent(Person, self)$description()
    )
  }
)

John <- ggproto(NULL, Police,
  first = "John",
  last = "McClane",
  birthdate = as.Date("1955/03/19")
)

John$description()
#> [1] "Detective John McClane is 66 old"

For reasons that we’ll discuss below, the use of ggproto_parent() is not that prevalent in the ggplot2 source code.

All in all, ggproto is a minimal class system that is designed to accommodate ggplot2 and nothing else. Its structure is heavily guided by the proto class system used in early versions of ggplot2, in order to reduce the required changes to the ggplot2 source code during the switch, and its features are those required by ggplot2 and nothing more.

20.4.2 ggproto style guide

While ggproto is flexible enough to be used in many ways, it is used in ggplot2 in a very deliberate way. As you are most likely to use ggproto in the context of extending ggplot2, you will need to understand these conventions.

20.4.2.1 ggproto classes are used selectively

The use of ggproto in ggplot2 is not all-encompassing. Only select functionality is based on ggproto, and it is neither expected nor advised to create new ggproto classes to encapsulate logic in your extensions. This means that you, as an extension developer, will never create ggproto objects from scratch but rather subclass one of the main ggproto classes provided by ggplot2. Later chapters will go into detail on how exactly to do that.

20.4.2.2 ggproto classes are stateless

Except for a few selected internal classes used to orchestrate the rendering, ggproto classes in ggplot2 are stateless. This means that after they are constructed they will not change. This breaks a common expectation for reference-based classes, where methods will alter the state of the object, but it is paramount that you adhere to this principle. If, for example, some of your Stat or Geom extensions changed state during rendering, plotting a saved ggplot object would affect all instances of that object, as all copies would point to the same ggproto objects. State is imposed in two ways in ggplot2: at creation time, which is fine because this state should be shared between all instances anyway, and through a params object managed elsewhere. As you’ll see later, most ggproto classes have a setup_params() method where data can be inspected and specific properties calculated and stored.
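
The pattern typically looks something like the following sketch (StatExample and its bins parameter are purely hypothetical; real stats are covered in later chapters):

StatExample <- ggproto("StatExample", Stat,
  setup_params = function(data, params) {
    # derived properties live in the params list, which is managed outside
    # the ggproto object, so the object itself never changes after creation
    if (is.null(params$bins)) {
      params$bins <- 30
    }
    params
  }
)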

20.4.2.3 ggproto classes have simple inheritance

Because ggproto class instances are stateless it is relatively safe to call methods from other classes inside a method, instead of inheriting directly from the class. Because of this it is relatively common to borrow functionality from other classes without creating an explicit inheritance. As an example, the setup_params() method in GeomErrorbar is defined as:

GeomErrorbar <- ggproto(
  # ...
  setup_params = function(data, params) {
    GeomLinerange$setup_params(data, params)
  }
  # ...
)

While we have seen that parent methods can be called using ggproto_parent(), this pattern is rarely found in the ggplot2 source code, as the pattern shown above is often clearer and just as safe.