23 Extension Case Study: Springs, Part 3

In the last chapter we came as far as possible with our Stat-centered approach to a spring geom. In some cases this is enough. Many of the graphic primitives provided by the ggforce extension package is developed in this manner. Aspects of our spring geom means that we need to go further here (also, this is to learn about extensions so it would be bad pedagogy to stop now). These aspects, specifically, is that there is visual appearances of the spring that need to be unrelated to the coordinate system (the tension and diameter aesthetics). Consequently, in the chapter we will rewrite our geom to be a proper Geom extension.

23.1 Geom extensions

As discussed in the overview of extensions, there are many similarities between Stat and Geom extensions. The biggest difference is that Stat extensions returns a modified version of the input data, whereas Geom extensions return grid grobs (more on that later). Since our geom is basically a path, we can get pretty far by simply extending GeomPath by changing the data before it is rendered:

GeomSpring <- ggproto("GeomSpring", GeomPath,
  ...,
  setup_data = function(data, params) {
    cols_to_keep <- setdiff(names(data), c("x", "y", "xend", "yend"))
    springs <- lapply(seq_len(nrow(data)), function(i) {
      spring_path <- create_spring(data$x[i], data$y[i], data$xend[i], 
                                   data$yend[i], data$diameter[i],
                                   data$tension[i], params$n)
      spring_path <- cbind(spring_path, unclass(data[i, cols_to_keep]))
      spring_path$group <- i
      spring_path
    })
    do.call(rbind, springs)
  },
  ...
)

In the above we have used the setup_data() method to avoid having to think about rendering at all. Since our create_spring() function returns data that GeomPath inherently understand we can simply piggy-back on its draw_*() methods. Now, this is an imperfect solution and not really an improvement over our StatSpring approach. setup_data() is called before default aesthetics and set aesthetics are added to the data, so it will only work if everything is defined within aes(). While setup_data() is very nice for some situations you should always be very aware of the limitation it has in terms of what data is available to it.

To improve upon this we will need to move our generation into the draw_*() methods. However, we can still utilize the GeomPath implementation to avoid having to deal with grid grob generation just yet:

GeomSpring <- ggproto("GeomSpring", Geom,
  setup_params = function(data, params) {
    if (is.null(params$n)) {
      params$n <- 50
    } else if (params$n <= 0) {
      rlang::abort("Springs must be defined with `n` greater than 0")
    }
    params
  },
  setup_data = function(data, params) {
    if (is.null(data$group)) {
      data$group <- seq_len(nrow(data))
    }
    if (anyDuplicated(data$group)) {
      data$group <- paste(data$group, seq_len(nrow(data)), sep = "-")
    }
    data
  },
  draw_panel = function(data, panel_params, coord, n = 50, arrow = NULL,
                        lineend = "butt", linejoin = "round", linemitre = 10,
                        na.rm = FALSE) {
    cols_to_keep <- setdiff(names(data), c("x", "y", "xend", "yend"))
    springs <- lapply(seq_len(nrow(data)), function(i) {
      spring_path <- create_spring(data$x[i], data$y[i], data$xend[i], 
                                   data$yend[i], data$diameter[i],
                                   data$tension[i], n)
      cbind(spring_path, unclass(data[i, cols_to_keep]))
    })
    springs <- do.call(rbind, springs)
    GeomPath$draw_panel(
      data = springs, 
      panel_params = panel_params, 
      coord = coord, 
      arrow = arrow, 
      lineend = lineend, 
      linejoin = linejoin, 
      linemitre = linemitre, 
      na.rm = na.rm
    )
  },
  required_aes = c("x", "y", "xend", "yend"),
  default_aes = aes(
    colour = "black", 
    size = 0.5, 
    linetype = 1L, 
    alpha = NA, 
    diameter = 1, 
    tension = 0.75
  )
)

Developers used to object-oriented design may frown upon this design where we call the method of another kindred object directly (GeomPath$draw_panel()), but since Geom objects are stateless this is as safe as subclassing GeomPath and calling the parent method. You can see this approach all over the place in the ggplot2 source code. If you compare this code to our StatSpring implementation in the last chapter you can see that the compute_panel() and draw_panel() methods are quite similar with the main difference being that we pass on the computed spring coordinates to GeomPath$draw_panel() in the latter method. Our setup_data() method has been greatly simplified because we now relies on the default_aes functionality in Geom to fill out non-mapped aesthetics.

Creating the geom_spring() constructor is almost similar, except that we now uses the identity stat instead of our spring stat and uses the new GeomSpring instead of GeomPath.

geom_spring <- function(mapping = NULL, data = NULL, stat = "identity", 
                        position = "identity", ..., n = 50, arrow = NULL, 
                        lineend = "butt", linejoin = "round", na.rm = FALSE,
                        show.legend = NA, inherit.aes = TRUE) {
  layer(
    data = data, 
    mapping = mapping, 
    stat = stat, 
    geom = GeomSpring, 
    position = position, 
    show.legend = show.legend, 
    inherit.aes = inherit.aes, 
    params = list(
      n = n, 
      arrow = arrow, 
      lineend = lineend, 
      linejoin = linejoin, 
      na.rm = na.rm, 
      ...
    )
  )
}

Without much additional work we now have a proper geom with working default aesthetics and the possibility of setting aesthetics as parameters.

ggplot(some_data) + 
  geom_spring(aes(x = x, y = y, xend = xend, yend = yend),
              diameter = 0.5)

This is basically how far we can get without moving in to grid territory. Creating grid grobs is somewhat of an advanced subject so we will give it its own chapter. Using the techniques described in the last three chapters is enough to solve 95% of your geom extension needs and creating new grobs is almost never required except for when you need to transform your data relative to the physical size of the plot (e.g. setting the diameter of our spring to 1 cm).

23.2 Post-Mortem

In this chapter we finally created a proper Geom extension that behaves as we would expect. You may think that this is always the natural conclusion to the development of new layer types, but there is value in the Stat approach we reached in the last chapter as well, mainly that it makes it possible to use the data transformation with multiple different geoms (e.g. plotting dots at each coordinate instead of connecting them to a path). The final choice is ultimately up to you and should be guided by how you envision the layer to be used.

In this chapter we did not touch too much upon what goes on inside the draw_*() methods. Often it is enough to use a method from another geom and don’t worry about it. Even for something quite complex like the boxplot geom you will see that it simply combines output from multiple different geoms such as GeomSegment and GeomCrossbar. We will get much deeper into the draw_*() methods in the next chapter when we also create our own grob.