# install the necessary packages
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("units")) install.packages("units")
if (!require("sf")) install.packages("sf")
if (!require("maps")) install.packages("maps")
if (!require("giscoR")) install.packages("giscoR")
# load packages
library(maps)
library(sf)
library(tidyverse)
library(units)
library(giscoR)Geographic distance
The first post of this year 2020, I will dedicate to a question that I was recently asked. The question was how to calculate the shortest distance between different points and how to know which is the closest point. When we work with spatial data in R, currently the easiest thing is to use the sf package in combination with the tidyverse collection of packages. We also use the units package which is very useful for working with units of measurement.
Packages
In this post we will use the following packages:
| Package | Description |
|---|---|
| tidyverse | Collection of packages (visualization, manipulation): ggplot2, dplyr, purrr, etc. |
| sf | Repel overlapping text labels in ggplot2 |
| units | Support for measurement units in R vectors, matrices and arrays: propagation, conversion, derivation |
| maps | Draw geographical maps |
| giscoR | Administrative boundaries of the world |
Measurement units
The use of vectors and matrices with the units class allows us to calculate and transform units of measurement.
# length
l <- set_units(1:10, m)
l# convert units
set_units(l, cm)# sum different units
set_units(l, cm) + lCapital cities of the world
We will use the capital cities of the whole world with the objective of calculating the distance to the nearest capital city and indicating the name/country.
# set of world cities with coordinates
head(world.cities) # from the maps packageTo convert points with longitude and latitude into a spatial object of class sf, we use the function st_as_sf(), indicating the coordinate columns and the coordinate reference system (WSG84, epsg: 4326).
In the next step, we filter by the capital cities encoded in the column capital with 1. The advantage of the sf package is the possibility of applying functions of the tidyverse collection to manipulate the attributes. In addition, we add a column with new labels using the str_c() function of the stringr package, which is similar to that of R Base paste().
Calculate distances
Geographical distance (Euclidean or greater circle) is calculated with the st_distance() function, either between two points, between one point and others or between all points. In the latter case we obtain a symmetric matrix of distances (NxN), taken pairwise between the elements of the capital city set. In the diagonal we find the combinations between the same points giving all null values.
| A | B | C | |
|---|---|---|---|
| A | 0 | 229 | 272 |
| B | 229 | 0 | 181 |
| C | 272 | 181 | 0 |
For instance, when we want to know the distance from Amsterdam to Abu Dhabi, Washington and Tokyo we pass two spatial objects.
# calculate distance
dist_amsterdam <- st_distance(
slice(capitals, 10),
slice(capitals, c(2, 220, 205))
)
dist_amsterdam # distance in metersThe result is a matrix with a single row or column (depending on the order of the spatial objects) with a class of units. Thus it is possible to convert easily to another unit of measure. If we want to obtain a vector without class units, we only have to apply the function as.vector().
# change from m to km
set_units(dist_amsterdam, "km")# units class to vector
as.vector(dist_amsterdam)In the second step, we estimate the distance matrix between all the capital cities. It is important to convert the null values to NA to subsequently obtain the correct matrix index.
# calculate distance
m_distance <- st_distance(capitals)
# matrix
dim(m_distance)When the result is of the units class, it is necessary to use the same class to be able to make logical queries. For example, set_units(1, m) == set_units(1, m) vs. set_units(1, m) == 1.
To obtain the shortest distance, in addition to its position, we use the apply() function which in turn allows us to apply the function which.min() and min() on each row. It would also be possible to use the function on columns giving the same result. Finally, we add the results as new columns with the mutate() function. The indices in pos allow us to obtain the names of the nearest cities.
# get the index (position) of the city and the distance
pos <- apply(m_distance_km, 1, which.min)
dist <- apply(m_distance_km, 1, min, na.rm = TRUE)
# add the distance and get the name of the city
capitals <- mutate(capitals,
nearest_city = city_country[pos],
geometry_nearest = geometry[pos],
distance_city = dist
)Map of distances to the next capital city
Finally, we build a map representing the distance in proportional circles. To do this, we use the usual grammar of ggplot() by adding the geometry geom_sf(), first for the world map as background and then for the cities. In aes() we indicate, with the argument size = distance_city, the variable which we want to map proportionally. The theme_void() function removes all style elements. In addition, we define with the function coord_sf() a new projection indicating the proj4 format.
# world map
world <- gisco_get_countries(resolution = "10")
# map
ggplot(world) +
geom_sf(fill = "black", colour = "white") +
geom_sf(
data = arrange(capitals, distance_city),
aes(size = distance_city),
alpha = 0.5,
colour = "white",
fill = "#bd0026",
shape = 21
) +
scale_size(range = c(.2, 6), breaks = c(100, 250, 750, 2000)) +
coord_sf(crs = "ESRI:54030") +
labs(size = "km", title = "Distance to the next capital city") +
theme_void() +
theme(plot.title = element_text(hjust = .5, margin = margin(b = 10)),
legend.position = "top",
legend.title.position = "right",
plot.margin = margin(5, 5, 5, 5))
