7 Space and time

7.1 Surface plots

ggplot2 does not support true 3d surfaces. However, it does support many common tools for representing 3d surfaces in 2d: contours, coloured tiles and bubble plots. These all work similarly, differing only in the aesthetic used for the third dimension. Here is an example of a contour plot:
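A minimal sketch of such a contour plot, assuming the faithfuld dataset that ships with ggplot2 (a 2d density estimate of the Old Faithful eruption data). The object name contour_plot is only illustrative.

```r
library(ggplot2)

# faithfuld has three columns: eruptions, waiting and density.
# density is mapped to z, the "third dimension" that the contours summarise.
# ..level.. is computed by the contour stat; ggplot2 3.3.0 and later also
# accept the spelling after_stat(level).
contour_plot <- ggplot(faithfuld, aes(eruptions, waiting)) +
  geom_contour(aes(z = density, colour = ..level..))

contour_plot
```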

The reference to the ..level.. variable in this code may seem confusing, because there is no variable called ..level.. in the faithfuld data. In this context the .. notation refers to a variable computed internally (see Section 11.6.1). To display the same density as a heat map, you can use geom_raster():
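A sketch of the heat map version, again assuming the faithfuld data from ggplot2. Only the geom and the aesthetic used for the third dimension change: density is mapped to fill rather than to z.

```r
library(ggplot2)

# Each (eruptions, waiting) grid cell becomes a tile coloured by density.
raster_plot <- ggplot(faithfuld, aes(eruptions, waiting)) +
  geom_raster(aes(fill = density))

raster_plot
```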

For interactive 3d plots, including true 3d surfaces, see RGL, http://rgl.neoscientists.org/about.shtml.

7.2 Drawing maps

There are four types of map data you might want to visualise: vector boundaries, point metadata, area metadata, and raster images. Typically, assembling these datasets is the most challenging part of drawing maps. Unfortunately ggplot2 can’t help you with that part of the analysis, but I’ll provide some hints about other R packages that you might want to look at.

I’ll illustrate each of the four types of map data with some maps of Michigan.

7.2.1 Vector boundaries

Vector boundaries are defined by a data frame with one row for each “corner” of a geographical region like a country, state, or county. The data frame requires four variables:

  • lat and long, giving the location of a point.
  • group, a unique identifier for each contiguous region.
  • id, the name of the region.

Separate group and id variables are necessary because sometimes a geographical unit isn’t a contiguous polygon. For example, Hawaii is composed of multiple islands that can’t be drawn using a single polygon.

The following code extracts that data from the maps package using ggplot2::map_data(). The maps package isn’t particularly accurate or up-to-date, but it’s a small package that is easy to install from CRAN, so it’s a reasonable place to start.
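A minimal sketch of that extraction, assuming the maps package is installed (map_data() needs it) and using dplyr to keep only the four variables described above. Renaming subregion to id is a choice made here to match the variable list; map_data() itself returns long, lat, group, order, region and subregion.

```r
library(ggplot2)
library(dplyr)

# County outlines for Michigan: one row per polygon corner.
mi_counties <- map_data("county", "michigan") %>%
  select(long, lat, group, id = subregion)

head(mi_counties)
```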

You can visualise vector boundary data with geom_polygon():
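A sketch of such a plot, assuming the Michigan county outlines from the maps package. The fill and colour settings are illustrative choices, not requirements.

```r
library(ggplot2)

# One row per polygon corner; group identifies each contiguous polygon.
mi_counties <- map_data("county", "michigan")

boundary_plot <- ggplot(mi_counties, aes(long, lat)) +
  # group = group keeps separate polygons from being joined together
  geom_polygon(aes(group = group), fill = NA, colour = "grey50") +
  # approximate aspect-ratio correction for latitude/longitude data
  coord_quickmap()

boundary_plot
```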

Note the use of coord_quickmap(): it’s a quick and dirty adjustment that ensures that the aspect ratio of the plot is set correctly.

Other useful sources of vector boundary data are:

  • The USAboundaries package, https://github.com/ropensci/USAboundaries, which contains state, county and zip code data for the US. As well as current boundaries, it also has state and county boundaries going back to the 1600s.

  • The tigris package, https://github.com/walkerke/tigris, makes it easy to access the US Census TIGER/Line shapefiles. It contains state, county, zip code, and census tract boundaries, as well as many other useful datasets.

  • The rnaturalearth package bundles up the free, high-quality data from http://naturalearthdata.com/. It contains country borders, and borders for the top-level regions within each country (e.g. states in the USA, regions in France, counties in the UK).

  • The osmar package, https://cran.r-project.org/package=osmar, wraps up the OpenStreetMap API so you can access a wide range of vector data including individual streets and buildings.

  • You may have your own shape files (.shp). You can load them into R with maptools::readShapeSpatial().

These sources all generate spatial data frames defined by the sp package. You can convert them into a data frame with fortify():

7.2.3 Raster images

Instead of displaying context with vector boundaries, you might want to draw a traditional map underneath. This is called a raster image. The easiest way to get a raster map of a given area is to use the ggmap package, which allows you to get data from a variety of online mapping sources including OpenStreetMap and Google Maps. Downloading the raster data is often time-consuming so it’s a good idea to cache it in an rds file.
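The caching idea can be sketched in base R. Everything named here is hypothetical: cached() is an illustrative helper, and slow_download() is a stand-in for a real download (for example a call to ggmap::get_map(), which needs network access and is therefore not used in this sketch).

```r
# Hypothetical helper: return the cached value if the rds file exists,
# otherwise compute the value, save it, and return it.
cached <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  value <- compute()
  saveRDS(value, path)
  value
}

# Stand-in for a slow download; replace with the real call in practice.
slow_download <- function() {
  Sys.sleep(0.1)
  matrix(runif(100), nrow = 10)
}

cache_file <- file.path(tempdir(), "mi_raster.rds")
mi_raster <- cached(cache_file, slow_download)        # computed once, then saved
mi_raster_again <- cached(cache_file, slow_download)  # read back from disk
```

Passing the download as a function means it only runs when the cache file is missing.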

(Finding the appropriate scale required a lot of manual tweaking.)

You can then plot it with:

If you have raster data from the raster package, you can convert it to the form needed by ggplot2 with the following code:
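One possible conversion, assuming the raster package is installed. The small raster x is made-up example data with arbitrary extents, and the column names lon, lat and value are illustrative; raster::rasterToPoints() returns one row per non-missing cell with the cell-centre coordinates and the cell value.

```r
library(ggplot2)

# Made-up 3 x 4 raster standing in for real data (extents are arbitrary).
x <- raster::raster(
  matrix(seq_len(12), nrow = 3),
  xmn = -90, xmx = -82, ymn = 41, ymx = 47
)

# Convert to a plain data frame: cell-centre coordinates plus cell value.
df <- as.data.frame(raster::rasterToPoints(x))
names(df) <- c("lon", "lat", "value")

raster_df_plot <- ggplot(df, aes(lon, lat)) +
  geom_raster(aes(fill = value))

raster_df_plot
```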

7.2.4 Area metadata

Sometimes metadata is associated not with a point, but with an area. For example, we can create mi_census which provides census information about each county in MI:
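One way to build such a table, assuming the midwest dataset bundled with ggplot2 as the source of the census figures. The choice of columns is illustrative; county names are lower-cased so that they can later be matched against boundary data from the maps package.

```r
library(ggplot2)
library(dplyr)

# midwest holds county-level demographic data for several US states.
mi_census <- midwest %>%
  filter(state == "MI") %>%
  mutate(county = tolower(county)) %>%
  select(county, area, poptotal, percwhite, percblack)

head(mi_census)
```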

We can’t map this data directly because it has no spatial component. Instead, we must first join it to the vector boundaries data. This is not particularly space efficient, but it makes it easy to see exactly what data is being plotted. Here I use dplyr::left_join() to combine the two datasets and create a choropleth map.
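A sketch of the join and the resulting choropleth, under the same assumptions as before: boundaries come from the maps package via map_data(), census figures come from ggplot2's midwest dataset, and the two are matched on lower-cased county names. Filling by poptotal is an illustrative choice.

```r
library(ggplot2)
library(dplyr)

# Vector boundaries: one row per polygon corner.
mi_counties <- map_data("county", "michigan") %>%
  select(long, lat, group, id = subregion)

# Area metadata: one row per county.
mi_census <- midwest %>%
  filter(state == "MI") %>%
  mutate(county = tolower(county)) %>%
  select(county, area, poptotal, percwhite, percblack)

# The join repeats each county's census values on every corner of its polygon.
census_counties <- left_join(mi_census, mi_counties, by = c("county" = "id"))

choropleth <- ggplot(census_counties, aes(long, lat, group = group)) +
  geom_polygon(aes(fill = poptotal)) +
  coord_quickmap()

choropleth
```

The joined table is much longer than mi_census, which is the space inefficiency mentioned above.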