A while back I saw Chris Campbell's global maps from the Financial Times, like the one in this Tweet, and I thought I had to reproduce it in R. In this first post of 2023 we will see how to access the GFS (Global Forecast System) data and visualize it with {ggplot2}. There are several ways to get the data; in this case we use the Google Earth Engine API via the {rgee} package. We will select the most recent model run and calculate the maximum temperature for the next few days.
Packages
tidyverse: collection of packages for visualization and manipulation (ggplot2, dplyr, purrr, etc.)
lubridate: easy manipulation of dates and times
sf: Simple Features, import, export and manipulate vector data
terra: import, export and manipulate raster data (successor of {raster})
rgee: access to the Google Earth Engine API
giscoR: administrative boundaries of the world
ggshadow: extension to ggplot2 for shadow and glow geometries
fs: cross-platform, uniform interface to file system operations
ggforce: provides missing functionality to ggplot2
# install the packages if necessary
if(!require("tidyverse")) install.packages("tidyverse")
if(!require("sf")) install.packages("sf")
if(!require("terra")) install.packages("terra")
if(!require("fs")) install.packages("fs")
if(!require("rgee")) install.packages("rgee")
if(!require("giscoR")) install.packages("giscoR")
if(!require("ggshadow")) install.packages("ggshadow") # [1]
if(!require("ggforce")) install.packages("ggforce")
if(!require("googledrive")) install.packages("googledrive")

# packages
library(rgee)
library(terra)
library(sf)
library(giscoR)
library(fs)
library(tidyverse)
library(lubridate)
library(ggshadow)
library(ggforce)
library(googledrive)
[1] At the time of writing, the newest version 0.0.5 is not working correctly; here I used version 0.0.4.
Part I. Geoprocessing with Google Earth Engine (GEE)
Before using GEE in R
The first step is to sign up at earthengine.google.com. In addition, it is necessary to install the gcloud CLI (https://cloud.google.com/sdk/docs/install?hl=es-419); just follow Google's instructions. Regarding the GEE language, many of the functions we apply are similar to those known from the {tidyverse}. More help can be found at https://r-spatial.github.io/rgee/reference/rgee-package.html and on the GEE page itself.
The essential point about GEE's native JavaScript language is the way it chains functions and objects with the dot, which is replaced by the $ in R. All {rgee} helper functions start with the prefix ee_* (ee_print(), ee_image_to_drive()).
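As a quick illustration of this dot-to-$ translation, here is a minimal sketch, assuming an already authenticated GEE session and using the GFS collection ID from the GEE data catalogue:

```r
# JavaScript in the GEE code editor:
#   ee.ImageCollection('NOAA/GFS0P25').first()

# the same in R with {rgee}: every dot becomes a $
library(rgee)
ee_Initialize()

img <- ee$ImageCollection("NOAA/GFS0P25")$first()
ee_print(img) # an {rgee} helper, hence the ee_* prefix
```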
Once we have gcloud and the {rgee} package installed we can proceed to create the Python virtual environment. The ee_install() function takes care of installing Anaconda 3 and all necessary packages. To check the correct installation of Python, and particularly of the numpy and earthengine-api packages, we can use ee_check().
ee_install() # create python virtual environment
ee_check()   # check if everything is correct
Before programming with GEE's own syntax, GEE must be authenticated and initialized using the ee_Initialize() function. At the time of writing this is not working; see issue https://github.com/r-spatial/rgee/issues/355.
ee_Initialize(drive = TRUE)
Instead you should do:
ee$Authenticate(auth_mode = 'notebook')
ee$Initialize(project = 'ee-dominicroye')
Access the Global Forecast System
A time series of images, i.e. multidimensional data, is called an ImageCollection in GEE. Each dataset is assigned an ID, and we can access it with the call ee$ImageCollection('ID_IMAGECOLLECTION'); the GFS collection has the ID 'NOAA/GFS0P25'. There are helper functions that convert pure R classes to their JavaScript equivalents, e.g. rdate_to_eedate() for dates. The first thing we do is filter to the most recent date, the last run of the GFS model.
We have to bear in mind that, unlike in R, the calculations are only executed on Google's servers when GEE tasks are submitted; most steps merely create Earth Engine objects, as you will see shortly in this post.
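A minimal sketch of this lazy evaluation, assuming an initialized session: operations return Earth Engine proxy objects instantly, and only getInfo() (or an export task) actually triggers computation on the servers.

```r
library(rgee)
ee_Initialize()

# instant: this only builds a server-side object, nothing is computed yet
n_images <- ee$ImageCollection("NOAA/GFS0P25")$limit(10)$size()
class(n_images) # an Earth Engine Number object, not an R numeric

# only now is the request sent and evaluated on Google's servers
n_images$getInfo()
```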
The model runs every 6 hours (0, 6, 12, 18 UTC), so we use the ee_get_date_ic() function to extract the run dates and choose the most recent one. This is the first time that calculations are actually run.
# GFS 0.25º ImageCollection (ID from the GEE data catalogue)
dataset <- ee$ImageCollection("NOAA/GFS0P25")

# vector of unique run dates
last_run <- ee_get_date_ic(dataset)$time_start |> unique()
last_run

# select the last one
last_run <- max(last_run)
Next we filter on the date of the last run and select the band of the air temperature at 2 m. Forecast dates are attributes of each model run up to 336 hours (14 days) from the day of execution. When we want to make changes to each image in an ImageCollection we must use the map() function, similar to the one we know from the {purrr} package. In this case we replace the date of each image (system:time_start, the run date) with that of the forecast (forecast_time). It is important that the R function to apply is wrapped in ee_utils_pyfunc(), which translates it into Python. Then we extract the dates of the 14-day forecast.
# last run and variable selection
temp <- dataset$filter(ee$Filter$date(rdate_to_eedate(last_run)))$
  select('temperature_2m_above_ground')

# define the forecast dates for each hour
forcast_time <- temp$map(ee_utils_pyfunc(
  function(img) {
    return(ee$Image(img)$set('system:time_start',
                             ee$Image(img)$get("forecast_time")))
  }))

# get the forecast dates
date_forcast <- ee_get_date_ic(forcast_time)
head(date_forcast)
Here we could export the hourly temperature data, but it is also possible to estimate the maximum or minimum daily temperature for the next 14 days. To achieve this we define the beginning and end of the period, and calculate the number of days. In simple terms, we map over the sequence of days, filter the images of each day and apply the max() function (or a similar reducer).
# define start and end of the period
endDate <- rdate_to_eedate(round_date(max(date_forcast$time_start) - days(1), "day"))
startDate <- rdate_to_eedate(round_date(min(date_forcast$time_start), "day"))

# number of days
numberOfDays <- endDate$difference(startDate, 'days')

# calculate the daily maximum
daily <- ee$ImageCollection(
  ee$List$sequence(0, numberOfDays$subtract(1))$map(
    ee_utils_pyfunc(function(dayOffset) {
      start <- startDate$advance(dayOffset, 'days')
      end <- start$advance(1, 'days')
      return(forcast_time$filterDate(start, end)$
               max()$ # alternative: min(), mean()
               set('system:time_start', start$millis()))
    })))

# dates of the daily maximum
head(ee_get_date_ic(daily))
Dynamic map via GEE
Since it is possible to add images to a dynamic map in the GEE code editor, we can also do it from R using the GEE function Map$addLayer(). We simply select the first day with first(). In the visParams argument we define the range of the temperature values and the colour ramp.
Map$addLayer(
  eeObject = daily$first(),
  visParams = list(min = -45, max = 45,
                   palette = rev(RColorBrewer::brewer.pal(11, "RdBu"))),
  name = "GFS") +
  Map$addLegend(
    list(min = -45, max = 45,
         palette = rev(RColorBrewer::brewer.pal(11, "RdBu"))),
    name = "Maximum temperature",
    position = "bottomright",
    bins = 10)
Export multiple images
The {rgee} package has a very useful function for exporting an ImageCollection locally: ee_imagecollection_to_local(). Here, however, we export via Google Drive tasks. Either way, we first need to define the region to export; in this case the entire globe, as a rectangle covering the whole Earth.
# earth extension
geom <- ee$Geometry$Polygon(
  coords = list(c(-180, -90), c(180, -90), c(180, 90), c(-180, 90), c(-180, -90)),
  proj = "EPSG:4326",
  geodesic = FALSE)
geom # EarthEngine object of type geometry

# filter the forecast days to export
data_export <- daily$filter(
  ee$Filter$date(rdate_to_eedate(today()),
                 rdate_to_eedate(today() + days(3))))

# get the list of images
image_list <- data_export$toList(data_export$size())

# function to export each image
export_image <- function(image, index) {
  image <- ee$Image(image_list$get(index))
  task <- ee$batch$Export$image$toDrive(
    image = image$clip(geom),
    description = paste0("Image_", index),
    scale = 20000,
    region = geom,
    folder = "tomorrow", # create the folder on your Drive before submitting
    maxPixels = 1e13)
  task$start()
}

# iterate over the list of images and export them
for (i in 0:(data_export$size()$getInfo() - 1)) {
  export_image(image_list, i)
}

ee_monitoring(eeTaskList = TRUE)
gee_files <- drive_ls(path = "tomorrow")
walk(gee_files$id, drive_download, overwrite = TRUE)
Part II. Orthographic map
Data
Of course, the first step is to import the data with the help of rast(). We also name each layer after its date and set the temporal dimension accordingly.
# paths to downloaded data
forecast_world <- dir_ls(regexp = "tif")

# guarantee the file order
file_ord <- str_extract(forecast_world, "_[0-9]{1,2}") |> parse_number()
forecast_rast <- rast(forecast_world[order(file_ord)]) # import
forecast_rast

# define the temporal dimension as the name of each layer
names(forecast_rast) <- seq(today(), today() + days(2), "day")
time(forecast_rast) <- seq(today(), today() + days(2), "day")
forecast_rast

# plot
plot(forecast_rast)
Now we can define the orthographic projection indicating with +lat_0 and +lon_0 the center of the projection. We then reproject and convert the raster to a data.frame.
# projection definition
ortho_crs <- '+proj=ortho +lat_0=51 +lon_0=0.5 +x_0=0 +y_0=0 +R=6371000 +units=m +no_defs +type=crs'

# reproject the raster
ras_ortho <- project(forecast_rast, ortho_crs)

# convert the raster to a data.frame of xyz
forecast_df <- as.data.frame(ras_ortho, xy = TRUE)

# transform to a long format
forecast_df <- pivot_longer(forecast_df, 3:length(forecast_df),
                            names_to = "date", values_to = "ta")
Administrative boundaries and graticules
We import the administrative boundaries with gisco_get_countries(), which we need to prepare for the orthographic projection. In the same way we create the graticule using st_graticule(). In order to preserve the geometry, it is necessary to clip to only the visible part. The ocean is created from a point at (0, 0) buffered by the radius of the Earth. Using the st_intersection() function we reduce the boundaries to the visible part and reproject them.
# obtain the administrative boundaries
world_poly <- gisco_get_countries(year = "2016", epsg = "4326", resolution = "10")

# get the global graticule
grid <- st_graticule()

# define what would be ocean
ocean <- st_point(x = c(0, 0)) |>
  st_buffer(dist = 6371000) |> # earth radius
  st_sfc(crs = ortho_crs)

plot(ocean)
# select only the visible part of the boundaries and reproject
world <- world_poly |>
  st_intersection(st_transform(ocean, 4326)) |>
  st_transform(crs = ortho_crs)

# plot(world)
For the graticules we must repeat the same selection, although we first remove the line segments that pass over the continents. The ocean boundary is used to create the globe's shadow, but to use it in geom_glowpath() we need to convert it to a data.frame.
# eliminate the lines that pass over the continents
grid_crp <- st_difference(grid, st_union(world_poly))

# select the visible part
grid_crp <- st_intersection(grid_crp, st_transform(ocean, 4326)) |>
  st_transform(crs = ortho_crs)

plot(grid_crp)

# convert the boundary of the globe into a data.frame
ocean_df <- st_cast(ocean, "LINESTRING") |>
  st_coordinates() |>
  as.data.frame()
Map construction
Select tomorrow
First we select tomorrow's date; at the time of writing this post it is February 22, 2023. In addition, we limit the temperature range to -45 ºC and +45 ºC.
forecast_tomorrow <- filter(forecast_df, date == today() + days(1)) |>
  mutate(ta_limit = case_when(ta > 45 ~ 45,
                              ta < -45 ~ -45,
                              TRUE ~ ta))
The shadow of the globe
We create the shadow effect using the geom_glowpath() function from the {ggshadow} package. Aiming for a smoother transition, I duplicate this layer with different transparency and shadow settings.
# build a simple shadow
ggplot() +
  geom_glowpath(data = ocean_df,
                aes(X, Y, group = "L1"),
                shadowcolor = 'grey90',
                colour = "white",
                alpha = .01,
                shadowalpha = 0.05,
                shadowsize = 1.5) +
  geom_glowpath(data = ocean_df,
                aes(X, Y, group = "L1"),
                shadowcolor = 'grey90',
                colour = "white",
                alpha = .01,
                shadowalpha = 0.01,
                shadowsize = 1) +
  coord_sf() +
  theme_void()
# combining several layers of shadow
g <- ggplot() +
  geom_glowpath(data = ocean_df,
                aes(X, Y, group = "L1"),
                shadowcolor = 'grey90',
                colour = "white",
                alpha = .01,
                shadowalpha = 0.05,
                shadowsize = 1.8) +
  geom_glowpath(data = ocean_df,
                aes(X, Y, group = "L1"),
                shadowcolor = 'grey90',
                colour = "white",
                alpha = .01,
                shadowalpha = 0.02,
                shadowsize = 1) +
  geom_glowpath(data = ocean_df,
                aes(X, Y, group = "L1"),
                shadowcolor = 'grey90',
                colour = "white",
                alpha = .01,
                shadowalpha = 0.01,
                shadowsize = .5)
Adding other layers
In the next step we add the temperature layer and both vector layers.
g2 <- g +
  geom_raster(data = forecast_tomorrow, aes(x, y, fill = ta_limit)) +
  geom_sf(data = grid_crp, colour = "white", linewidth = .2) +
  geom_sf(data = world, fill = NA, colour = "grey10", linewidth = .2)
What remains is to define the colour scale, the legend and the general style of the map.
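A possible sketch of those final definitions; the breaks, bar dimensions and theme settings here are my own assumptions, not necessarily the author's original choices:

```r
# hypothetical finishing touches: colour scale, legend and style
g3 <- g2 +
  scale_fill_gradientn(
    colours = rev(RColorBrewer::brewer.pal(11, "RdBu")),
    limits = c(-45, 45),
    name = "Maximum temperature (ºC)",
    guide = guide_colourbar(barwidth = 15, barheight = .5)
  ) +
  theme_void() +
  theme(legend.position = "bottom")
```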
The geom_mark_circle() function allows you to include a circle label at any position. We create the label using str_glue(), interpolating the temperature at both extremes, and we can format the number with number() from the {scales} package.
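A hedged sketch of how such an annotation could look on the map built above; the way the extremes are located and all aesthetic values are assumptions:

```r
library(ggforce)
library(scales)

# hypothetical: locate the warmest and coldest pixels of the forecast
extremes <- forecast_tomorrow |>
  filter(ta == max(ta) | ta == min(ta)) |>
  mutate(lab = str_glue("{number(ta, accuracy = .1)} ºC"))

# circle annotation at both extremes
g2 +
  geom_mark_circle(
    data = extremes,
    aes(x, y, description = lab),
    expand = unit(2, "mm"),
    colour = "grey10"
  )
```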