9 minute readEasy way to compare and see Dutch drought

Dutch drought? That doesn’t sound right…

In all my travels, non-Dutch people have always been astonished that in the wet and rainy Netherlands we can have periods of drought. In 2018 we had a dry year with a deficit of almost 300 mm. But the last few days the weather has been GREAT (written in March 2019) … for the drought. Rain has been pouring for days. It got me thinking, maybe I can make some kind of weather dashboard using R, shiny and ggplot2.

Good thing somebody has been collecting all the weather data for me 🙂

The KNMI (Dutch weather institute) made an interactive map for the drought: http://geoservices.knmi.nl/droogtekaart2018/ . If you click on a pixel the cumulative rain deficit is shown for that location. But I don’t like the interface of their website, so I thought, I will try to remake it :D. So I used their basic information to make an interactive dashboard which shows the drought per measuring location.

But first things first, what is drought?

Wat is drought ?

According to the KNMI, drought develops when more water evaporates than arrives (this could be through groundwater, rivers or rain for example). Daily evaporation in the Netherlands is approximately 5 mm.

So, what is evaporation? This is water that leaves the soil and plants and moves to the atmosphere. This process is faster with warmer, sunnier and windier days than on colder, cloudier and windless days.  

Drought during rainfall?

There can be a drought when it rains as well. If the amount of rainfall is less than the daily evaporation, the water deficit will still grow. But if more rain falls than there is evaporation this deficit becomes smaller. That is why when you will look at the cumulative drought through a season the line rises and drops, although there is a rising trend.

Keep in mind that the shown drought data (except in first tab, will be explained later) is only shown from April to September. That is because the other months hardly have any evaporation.

Weather data is readily available and open!

The KNMI measures the weather every day, and the daily values are presented online. They can be downloaded from https://www.knmi.nl/kennis-en-datacentrum/achtergrond/data-ophalen-vanuit-een-script using a script (technical term: with an API)

The downloaded data is split in three sections, as I see it: first some explanation and information about the weather stations (coordinates, altitude, name), followed by explanation of the variables you selected. Then the actual data is presented for the time range and other variables you selected. This means the data is not presented in columns, each section has its own number of columns. So, some pre-processing and splitting of data is needed. Let dive into it!

Pre-processing: libraries

Firs of all, the libraries I used:


Dplyr is useful for data manipulation, lubridate for date manipulation, and feather and readr for loading and saving files. Feather saves the data in less space, making the eventual app slightly faster.

Downloading the weather data

This is pretty easy, you fill in all the variables you want into an URL and download! You fill in the start and end date (if you don’t fill in the end date, you download until yesterday). You can select all the stations (stns) or name some and choose the variables (vars). Check the documentation if you want to know which weather variables you can download. I chose RH and EV24, which are rainfall and evaporation, which are needed for the drought (= RH-EV24).

startdate <- 20120101

KNMIurl <-  paste0("http://projects.knmi.nl/klimatologie/daggegevens/getdata_dag.cgi?stns=ALL&vars=RH:EV24&start=",startdate)

download.file(url=KNMIurl, destfile = "KNMIdata.txt")  # Saves data as txt file

Depending on how many years of data and what variables you want, this process can take a couple of seconds to some minutes.

And extracting weather station coordinates and names

This was somewhat trickier. The first 4 rows of the text file are not designed to be columns, the 5th row and after are formatted in columns (6 in total, including the #):

How KNMI data is presented in the text file

Here is what I did:

locations0 <- read_table2("KNMIdata.txt", skip = 7) %>% select(-1)

NAindex <- which(is.na(locations0))

locations <- locations0 %>% head(min(NAindex)) %>% rename(Longtitude = `LON(east)`, Latitude = `LAT(north)` , Altitude =`ALT(m)`) %>% mutate(Latitude=as.numeric(Latitude), Longtitude= as.numeric(Longtitude), Altitude= as.numeric(Altitude), STN=substr(STN, 1,3))

Locations0 reads the text file, skips the first couple of rows and deletes the first column, with is the # column. It then reads the row containing the headers as headers but reformats it. I change the names in the third part of the script, and I get rid of the : in the STN column. In the second part of the script I index the NA rows and use it to see where the station data ends, as this is followed by an empty row. NAindex tells me that row 51, 55 and many other rows are empty. But I only need 51, so everything after 51 is deleted.

I then remove any NA rows (= empty rows) and save it for later use:

locations <- locations[complete.cases(locations), ]


Followed by the actual weather data

A similar approach is done for the actual data:

KNMIdata0 <- read_csv("KNMIdata.txt", skip = 120)

names(KNMIdata0)[1] <- "STN"

KNMIdata <- KNMIdata0[-1,]

I delete the first 120 rows, change the name to something I can read (STN column) and delete a NA row.

Joining the data to make one set

The next thing is to add coordinates to the measured data. This is only relevant if you would make a map, like I will do. Otherwise this location data is added weight…

joindata <- left_join(KNMIdata,locations, by="STN") %>% mutate(YYYYMMDD=ymd(YYYYMMDD)) %>%  rename(Date=YYYYMMDD) %>% mutate(year = format(Date, "%Y"), month = as.integer(format(Date,"%m"))) %>%  mutate(Drought = EV24-RH) %>% group_by(NAME, year)  %>% #filter(between(month, 4,9)) %>% 
  mutate(cumDR = cumsum(Drought),cumRH=cumsum(RH), cumEV=cumsum(EV24))  %>% mutate(day = yday(Date))

Then I delete any empty rows (added weight) and save it:

all <- joindata[complete.cases(joindata), ]


Now it is ready to use!

The dashboard itself

I won’t go into details about how the app is structured, there are pretty basic calculations (mainly filters) but I will show snippets of how to script it. Here is the link to the app: https://timonweitkamp.shinyapps.io/DroughtMapNL/ . It is a bit slow to load… I know it has to do with uploading the data, so if anyone has tips please let me know! It might be a bit to large… although it is only from the last 9 years.

I would normally embed it in the post but it takes a while to load… so open the link and read the rest of the post instead 😀

The menu: Drought insights

Menu structure

The menu is basic: there are three ways I presented the data and where you can view and interact (play 😀 ) with it, and there is a link to my website. You can also choose to view the graphs as absolute daily values or as a cumulative line.

First tab: Interactive map and graph

Here you can see where all the measuring stations are in the Netherlands. If you click on one, the graph below will show the drought for that station, for the chosen dates (can be found at the bottom of the page). There is also a textbox showing the latest drought for that time period, in 0.1 mm. The cumulative drought is calculated from the first date up to the last date you select.  

Second tab: comparing stations

Select any amount of stations and the drought (or rainfall or evaporation, your choice 😀 (found at bottom of page)) will be shown. You can see where it is wetter or drier.

Third tab: comparing years

In this tab you can only select one station, but you can compare all the years. Similar tot the second tab.

Snippits structure script

You select any station or year you want and it is used as input for the graph. Here is a short example of how it works:

pickerInput("stationYear", "Select a weather station:", choices = stations$NAME, multiple = F, selected = "ARCEN", options = list(deselectAllText = TRUE,actionsBox=TRUE))

The id is “stationYear”, which is used later on in the graph script. But first the data needs to be filtered for the station you selected:

datatab3 <- reactive({
    data %>% filter(NAME %in% input$stationYear & year %in% input$dateYear& between(month, 4,9) ) %>% group_by(NAME, year) 

You can see that input$stationYear is used, as wel as input$dateYear for the dates. I also filter out the dates that do not fall in the dry months (April-September), because of low evaporation.

And here is the actual part where the ggplot2 graph is made:

  output$plot_3 <- renderPlot({
  g <- ggplot( ) 
  if (input$type == "cum") {
    g <-  g + geom_line(data=datatab3(),aes(day, cumDR, col = year)) + ylab("Cumulative drought in 0.1 mm")
  } else{g}
  if (input$type == "abs") {
    g <-  g + geom_line(data=datatab3(),aes(day, Drought, col = year)) + ylab("Drought in 0.1 mm")
  } else{g}

It uses datatab3() from before (which is where the data is filtered) and then either makes an cumulative or absolute graph (depending on what you chose).

Shiny apps work with reactive data, in short meaning a program computes outputs from a given set of user inputs, in this case a simple filtering of data. But it can be used for any difficult calculation as well (although it might take some time…).

Other ways to make this dashboard more interactive

As I just said, calculations can also be applied. So far I have only made the app with basic filters, but it can still grow to show more. Think of:

  • Which is the driest station for a selected period?
  • What are dry and wet periods of the year, decade, part of NL?
  • Is a dry year significantly drier than others (statistically speaking)?
  • Can a model be made to predict drought for the season?

Your imagination is usually the limit in these kind of things! If you have some nice additions, please contact me!

One Comment

Add a Comment

Your email address will not be published. Required fields are marked *