Package | Description |
---|---|
tidyverse | Collection of packages (visualization, manipulation): ggplot2, dplyr, purrr, etc. |
mapSpain | Administrative boundaries of Spain at different levels |
giscoR | Administrative boundaries of the world |
sf | Simple Feature: import, export and manipulate vector data |
ggtext | Improved text rendering support for ggplot2 |
ggrepel | Provide ggplot2 geometries to repel overlapping text labels |
geomtextpath | Add text to lines that can follow any path and will remain correctly spaced and angled. |
packcircles | Algorithms for finding non-overlapping circles. |
cartogram | Build different types of cartograms |
In my first post of 2024, which unfortunately has not been possible before April, I will explain how we can group in the same location several proportional circles. In 2022 I was looking for how to represent the number of heat wave days according to the degree of severity in Spain. I found the solution using the circle packing which is also used in the Dorling cartogram.
Packages
# install the packages if necessary
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("mapSpain")) install.packages("mapSpain")
if (!require("sf")) install.packages("sf")
if (!require("packcircles")) install.packages("packcircles")
if (!require("ggtext")) install.packages("ggtext")
if (!require("ggrepel")) install.packages("ggrepel")
if (!require("cartogram")) install.packages("cartogram")
if (!require("giscoR")) install.packages("giscoR")
if (!require("rmapshaper")) install.packages("rmapshaper")
if (!require("geomtextpath")) remotes::install_github("AllanCameron/geomtextpath")
# packages
library(tidyverse)
library(lubridate)
library(sf)
library(mapSpain)
library(ggrepel)
library(ggtext)
library(packcircles)
library(cartogram)
library(giscoR)
library(rmapshaper)
library(geomtextpath)
Data
In this post we will use a dataset of heat waves registered in the year 2022 (download). At the time I calculated the Excess Heat Factor index for different locations in Spain. You can find the formulas of the index in Díaz-Poso et al. (2023{:target=“_blank”}). It is a geojson with the number of heat wave days according to severity (low, severe, extreme) for 51 weather stations. The data needs to be projected, in this case it is ETRS89 UTM 30 (EPSG:25830). In addition, we import and modify the provincial and global boundaries to use them as a basis for the map.
# import EHF heat wave 2022 data
<- st_read("ehf_2022_spain.geojson") ehf
Reading layer `ehf_2022_spain' from data source
`C:\Users\xeo19\Downloads\dominicroye.github.io\blog\map-circle-packing\ehf_2022_spain.geojson'
using driver `GeoJSON'
Simple feature collection with 148 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: -839774.5 ymin: 3150771 xmax: 1117513 ymax: 4815734
Projected CRS: ETRS89 / UTM zone 30N
ehf
Simple feature collection with 148 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: -839774.5 ymin: 3150771 xmax: 1117513 ymax: 4815734
Projected CRS: ETRS89 / UTM zone 30N
First 10 features:
stat_name levels n geometry
1 BARCELONA AEROPUERTO low 48 POINT (924574.5 4583673)
2 BARCELONA AEROPUERTO severe 21 POINT (924574.5 4583673)
3 GIRONA AEROPUERTO extreme 2 POINT (978052.4 4656058)
4 GIRONA AEROPUERTO low 53 POINT (978052.4 4656058)
5 GIRONA AEROPUERTO severe 14 POINT (978052.4 4656058)
6 DONOSTIA / SAN SEBASTIÁN, IGELDO extreme 2 POINT (577768.2 4795286)
7 DONOSTIA / SAN SEBASTIÁN, IGELDO low 24 POINT (577768.2 4795286)
8 DONOSTIA / SAN SEBASTIÁN, IGELDO severe 14 POINT (577768.2 4795286)
9 BILBAO AEROPUERTO extreme 6 POINT (507593.1 4793918)
10 BILBAO AEROPUERTO low 31 POINT (507593.1 4793918)
# basic plot
plot(ehf)
# provincial boundaries Spain
<- esp_get_prov(moveCAN = FALSE, epsg = "4326")
esp plot(esp)
<- filter(esp, nuts2.name != "Canarias")
esp_pen
# interior boundaries of Spain
<- rmapshaper::ms_innerlines(esp_pen)
esp_inline
# global boundaries
<- gisco_get_countries(res = 10) |>
eu st_cast("MULTILINESTRING") |>
st_crop(
xmin = -10, ymin = 34.7,
xmax = 4.4, ymax = 43.8
)plot(eu)
## Canary Islands
<- esp_get_ccaa("Canarias", moveCAN = FALSE)
canlim <- st_bbox(canlim) bx
Dorling diagram
When I started investigating possibilities to achieve circles in a grouped position without overlapping, I thought directly of the {cartogram_dorling()
function from the {cartogram}
package. The problem I faced was that it is designed for only one category or level, however, as we can see we have up to three levels of severity. If we use the function for low severity we get the following result.
# Dorling low gravity map
<- cartogram_dorling(filter(ehf, levels == "low"), "n", k = .3)
dor
# map
ggplot(
dor,aes(fill = n)
+
) geom_sf(
data = esp,
fill = "grey80",
colour = "white",
linewidth = .3
+
) geom_sf() +
scale_fill_distiller(palette = "YlOrRd", direction = 1) +
labs(fill = NULL, title = "Number of heat weave days with low severity") +
theme_void() +
theme(
legend.position = "top",
legend.key.height = unit(.5, "lines"),
legend.key.width = unit(3, "lines"),
plot.title = element_text(margin = margin(t = 5, b = 10), hjust = .5)
)
So, what can we do? We should take a closer look at the function, which facilitates the creation of the Dorling map (link). First we could create a map of small multiples, but one problem we encounter is that we cannot set a maximum value of a variable. So, we add and modify it slightly to the line where the area is calculated (dat.init$v
).
<- function(x, weight,
cartogram_dorling2.sf limit_mx = NULL, # new argument
k = 5,
m_weight = 1,
itermax = 1000) {
# proj or unproj
if (sf::st_is_longlat(x)) {
stop('Using an unprojected map. This function does not give correct centroids and distances for longitude/latitude data:\nUse "st_transform()" to transform coordinates to another projection.', call. = F)
}# no 0 values
<- x[x[[weight]] > 0, ]
x # data prep
<- data.frame(sf::st_coordinates(sf::st_centroid(sf::st_geometry(x))),
dat.init v = x[[weight]]
)<- (max(dat.init[, 1]) - min(dat.init[, 1])) * (max(dat.init[, 2]) - min(dat.init[, 2]))
surf # dat.init$v <- dat.init$v * (surf * k / 100) / max(dat.init$v) # old
$v <- dat.init$v * (surf * k / 100) / ifelse(is_null(limit_mx),
dat.initmax(dat.init$v),
limit_mx# new argument
)
# circles layout and radiuses
<- packcircles::circleRepelLayout(
res x = dat.init, xysizecols = 1:3,
wrap = FALSE, sizetype = "area",
maxiter = itermax, weights = m_weight
)# sf object creation
<- sf::st_buffer(
. ::st_as_sf(res$layout,
sfcoords = c("x", "y"),
crs = sf::st_crs(x)
),dist = res$layout$radius
)::st_geometry(x) <- sf::st_geometry(.)
sfreturn(x)
}
If we now use the modified function, we will achieve multiple small Dorling’s. We simply map the new function on each category.
# estimate the maximum value
<- max(ehf$n)
max_value
# map over each category
<- split(ehf, ~levels) |>
dor2 map(cartogram_dorling2.sf,
weight = "n",
k = .3,
limit_mx = max_value
)
# rejoin the tables and set the order
<- bind_rows(dor2) |>
dor2 mutate(levels = factor(levels, c("low", "severe", "extreme")))
# map of small multiples
ggplot(
dor2,aes(fill = n)
+
) geom_sf(
data = esp,
fill = "grey80",
colour = "white",
linewidth = .3
+
) geom_sf() +
facet_grid(~levels) +
scale_fill_distiller(palette = "YlOrRd", direction = 1) +
labs(fill = NULL, title = "Number of heat weave days with low severity") +
theme_void() +
theme(
legend.position = "top",
legend.key.height = unit(.5, "lines"),
legend.key.width = unit(3, "lines"),
plot.title = element_text(margin = margin(t = 5, b = 10), hjust = .5)
)
I found it interesting, but it was not what I was looking for. I also considered making a proportional symbol map with circles in the same location, using color mixing and transparency in the style of this map (link). Finally I decided to modify the function to achieve a grouped position with circle packing. I introduced another argument surf = null
for the extension of the data set, also kept the change of introducing the possibility of the maximum value with the same aim, and I changed the circleRepelLayout()
function to the circleProgressiveLayout()
one. The second one arranges a set of circles, denoted by their sizes, placing consecutively each circle externally tangent to two previously placed circles avoiding overlaps. Unlike the original layout in which the circles are ordered by iterative pairwise repulsion within a bounding rectangle. Another potential alternative could be the use of networks or graphs with {ggraph}
, but I have not yet gotten around to testing it.
<- function(x,
packedcircle_dodge_position.sf
weight,surf = NULL,
limit_mx = NULL, # new argument
k = .5) {
# proj or unproj
if (sf::st_is_longlat(x)) {
stop('Using an unprojected map. This function does not give correct centroids and distances for longitude/latitude data:\nUse "st_transform()" to transform coordinates to another projection.', call. = F)
}# no 0 values
<- x[x[[weight]] > 0, ]
x # data prep
<- data.frame(sf::st_coordinates(sf::st_centroid(sf::st_geometry(x))),
dat.init v = x[[weight]]
)
if (is_null(surf)) surf <- (max(dat.init[, 1]) - min(dat.init[, 1])) * (max(dat.init[, 2]) - min(dat.init[, 2]))
$v <- dat.init$v * (surf * k / 100) / ifelse(is_null(limit_mx),
dat.initmax(dat.init$v),
limit_mx# new argument
) # circles layout and radiuses
<- packcircles::circleProgressiveLayout(
res x = dat.init,
sizecol = "v"
# other layout
)
<- mutate(res, x = dat.init$X + x, y = dat.init$Y + y) # reconvert to lon, lat
res
# sf object creation
<- sf::st_buffer(
. ::st_as_sf(res,
sfcoords = c("x", "y"),
crs = sf::st_crs(x)
),dist = res$radius
)::st_geometry(x) <- sf::st_geometry(.)
sf
return(x)
}
Creation of grouped circles
First, we must define global variables for all locations. Among others, the spatial extent of the data, the maximum value, the scale factor for the size, and the variable of interest.
<- max(ehf$n) # maximum value
limit_mx <- st_coordinates(ehf) # UTM coordinates
longlat <- (max(longlat[, 1]) - min(longlat[, 1])) * (max(longlat[, 2]) - min(longlat[, 2])) # extension
surf <- "n" # variable
weight <- .1 # size factor k
In the next step, we split our data into as many subsets as locations using the split()
function. Then we apply the circle packing function to each one individually with map()
passing all the previously defined global arguments. After that, we would only have to gather all the spatial tables.
# map over all locations
<- split(ehf, ~stat_name) |>
circle_dodge map(packedcircle_dodge_position.sf,
surf = surf, k = k,
weight = weight,
limit_mx = limit_mx
)
# rejoin
<- bind_rows(circle_dodge) |>
circle_dodge mutate(levels = factor(
levels,c("low", "severe", "extreme")
# fijamos orden
))
plot(circle_dodge["n"]) # resultado
Map construction
Legend
In the map construction the first thing we should make is the legend. We must create it manually by defining different magnitudes, in our case, number of days. Then we estimate the area and construct the circles artificially on a constant line of latitude at +45º, position where we will insert the legend as if it were a spatial object.
# areas
<- c(5, 10, 15, 30, 50) * (surf * k / 100) / limit_mx
legend_area
# circles construction at artificial coordinates
<- tibble(
legend_dat area = legend_area, r = sqrt(area / pi),
n = c(5, 10, 15, 30, 50), y = 45
|>
) arrange(area) |>
mutate(x = -4:0) |>
st_as_sf(coords = c("x", "y"), crs = st_crs(esp)) |>
st_transform(25830) %>% # switch to the other pipe for nested placehodler .
st_buffer(dist = .$r) |>
st_cast("LINESTRING")
plot(legend_dat["n"])
A more compact alternative legend would be collapsed circles, which can be achieved by changing the position with respect to the largest circle. However, the legend in this case remains small because of the maximum size of the circles. An increase in the size of the circles would lead to overlapping between different locations.
# circles construction at artificial coordinates
<- tibble(
legend_dat2 area = legend_area, r = sqrt(area / pi),
n = c(5, 10, 15, 30, 50), y = 45
|>
) arrange(area) |>
mutate(x = -3) |>
st_as_sf(coords = c("x", "y"), crs = st_crs(esp)) |>
st_transform(25830)
<- cbind(legend_dat2, st_coordinates(legend_dat2)) |>
legend_dat2 mutate(Y = Y - (max(r) - r)) |>
st_drop_geometry() |>
st_as_sf(coords = c("X", "Y"), crs = 25830) %>%
st_buffer(dist = .$r) |>
st_cast("LINESTRING")
plot(legend_dat2["n"])
Canary Islands
The first map would correspond to the Canary Islands, which is added to the main map in a false position. In ggplot2 any simple feature geometry can be added with geom_sf()
by adjusting the own characteristics of points, lines or polygons.
<- ggplot() +
can geom_sf(data = esp, fill = "grey20", colour = NA) +
geom_sf(
data = esp_inline, colour = "white",
size = .2
+
) geom_sf(
data = circle_dodge,
aes(fill = levels),
alpha = .8,
colour = "white",
stroke = .025,
show.legend = FALSE
+
) scale_fill_manual(
values = c("#feb24c", "#fc4e2a", "#b10026"),
guide = "none"
+
) theme_void(base_family = "Lato") +
labs(title = "Canary Islands") +
theme(
plot.background = element_rect(fill = NA, colour = "black"),
plot.title = element_text(vjust = 8, margin = margin(b = 4, l = 3))
+
) coord_sf(
xlim = bx[c(1, 3)],
ylim = bx[c(2, 4)],
crs = 4326
)
can
Spain
To build the main map, we will exclude the locations of the Canary Islands. In addition, we need to have the new position of the Canary Islands in the southwest of the Peninsula. The main difference with the previous map is that we include the legend as a spatial object using the geom_textsf()
function. It is a variant of the {geomtextpath}
package to add a label following any geometry path. Unfortunately, not all possible arguments of the package (issue) can be used. It is important that the geometry must be of type LINESTRING
. Therefore, we changed the type with st_cast()
above. Finally, we insert the map of the Canary Islands using the annotation_custom()
function. More details about inserted maps can be found here.
# filter out locations in mainland Spain and the Balearic Islands
<- st_filter(circle_dodge, st_transform(esp_pen, 25830))
esp_data
# coordinates of the Canary Islands moved position
<- c(xmin = -11.74, ymin = 34.92, xmax = -7, ymax = 36.70)
can_ext class(can_ext) <- "bbox"
<- st_as_sfc(can_ext) |>
can_bbx st_set_crs(4326) |>
st_transform(25830) |>
st_bbox()
# part one adding the base map limits
<- ggplot() +
g1 geom_sf(data = eu, colour = "grey80", size = .3) +
geom_sf(data = esp_pen, fill = "grey20", colour = NA) +
geom_sf(
data = esp_inline, colour = "white",
size = .2
)
# add the grouped proportional circles
<- g1 + geom_sf(
g2 data = esp_data,
aes(fill = levels),
alpha = .7,
colour = "white",
stroke = .02,
show.legend = "point"
+
) geom_sf(data = legend_dat, aes(label = n)) +
geom_text(
data = st_centroid(legend_dat), # legend in artifical
aes(label = n, geometry = geometry),
stat = "sf_coordinates",
size = 2.5, vjust = 0.5
+
) 1
# geom_textsf(data = legend_dat, # position lat 45º. not working due to bug
# aes(label = n),
# size = 2.5, hjust = .21) +
annotation_custom(ggplotGrob(can), # Canary Islands
xmin = can_bbx[1], xmax = can_bbx[3],
ymin = can_bbx[2], ymax = can_bbx[4]
)
# final adjustments of legend and style
+ scale_fill_manual(values = c("#feb24c", "#fc4e2a", "#b10026")) +
g2 guides(fill = guide_legend(override.aes = list(size = 5, shape = 21))) +
labs(fill = "severity", title = "HEAT WAVE DAYS 2022") +
theme_void() +
theme(
legend.position = "top",
legend.text.position = "bottom",
legend.title.position = "top",
legend.text = element_text(vjust = 3),
legend.title = element_text(
hjust = .5,
size = 14
),legend.margin = margin(t = 5),
plot.title = element_text(
hjust = .5,
face = "bold",
size = 25,
margin = margin(b = 5, t = 20)
),plot.margin = margin(15, 2, 2, 5)
+
) coord_sf(crs = 25830, clip = "off")
- 1
-
There seems to be a bug for the
geom_textsf()
function from thegeomtextpath
package. Therefore I use here an alternative way. (30/12/2024)
Interactive version
As an extra of this post, I show you how we can make the same map interactively using the {ggiraph}
package. We only change the geom_sf()
function for its geom_sf_interactive()
variant. In addition, we indicate what we want to show in the tooltip. We put the result in an object to finish building the interactive version using giraph()
and including style settings.
if (!require("ggiraph")) install.packages("ggiraph")
library(ggiraph) # versión interactiva de geometrias ggplot
<- ggplot() +
g1 geom_sf(data = eu, colour = "grey80", size = .3) +
geom_sf(data = esp_pen, fill = "grey20", colour = NA) +
geom_sf(
data = esp_inline, colour = "white",
size = .2
)
<- g1 + geom_sf_interactive(
g2 data = esp_data,
aes(fill = levels, tooltip = n), # interactive version with tooltip
alpha = .7,
colour = "white",
stroke = .02,
show.legend = "point"
+
) geom_sf(data = legend_dat, aes(label = n)) +
geom_text(
data = st_centroid(legend_dat), # legend in artifical
aes(label = n, geometry = geometry),
stat = "sf_coordinates",
size = 2.5, vjust = 0.5
)
<- g2 + scale_fill_manual(values = c("#feb24c", "#fc4e2a", "#b10026")) +
g_final guides(fill = guide_legend(override.aes = list(size = 5, shape = 21))) +
labs(fill = "severity", title = "HEAT WAVE DAYS 2022") +
theme_void() +
theme(
legend.position = "top",
legend.text.position = "bottom",
legend.title.position = "top",
legend.text = element_text(vjust = 3),
legend.title = element_text(
hjust = .5,
size = 14
),legend.margin = margin(t = 5),
plot.title = element_text(
hjust = .5,
face = "bold",
size = 25,
margin = margin(b = 5, t = 20)
),plot.margin = margin(15, 2, 2, 5)
+
) coord_sf(crs = 25830, clip = "off")
# creation of the final interactive map
girafe(
ggobj = g_final,
width_svg = 14.69,
height_svg = 12,
options = list(
opts_tooltip(opacity = 0.7, use_fill = T)
) )