Electricity Demand During Lockdown

Being in lockdown, one way to keep occupied is with some data analysis. With so many places shut down, an interesting question is how much less electricity is being used during the lockdown.

The first step is to clear the workspace and load the required packages: rvest for web scraping, lubridate for manipulating dates and times, and of course the tidyverse.

rm(list=ls())
library(tidyverse)
library(lubridate)
library(rvest)

Scraping Electricity Demand

Scraping the data involves downloading many zip files so we will create some folders (if they do not already exist) to store these. These folders and files will all be deleted at the end.

if(!dir.exists('data')){dir.create('data')}
if(!dir.exists('monthlyzips')){dir.create('monthlyzips')}
if(!dir.exists('dailyzips')){dir.create('dailyzips')}

The following code finds the links to all the archived zip files of half-hourly operational demand. There is one zip file for each month, each of which contains a zip file for each day of data. The loop downloads these monthly zip files and unzips them.

#URL for archived operational demand
arch_url<-'http://nemweb.com.au/Reports/ARCHIVE/Operational_Demand/ACTUAL_DAILY/'
arch_url%>%
  read_html%>%
  html_nodes("a")%>% #Get hyperlinks
  html_attr("href")%>% #Get link url
  tail(-1) %>% #Remove first link
  paste0('http://nemweb.com.au',.)->monthlyzips

#Extract monthly zip files

for (i in monthlyzips){
  download.file(i,destfile = paste0('monthlyzips/',basename(i)))
  unzip(paste0('monthlyzips/',basename(i)),exdir = 'dailyzips')
}

More recent data can be found at a different URL. The following code scrapes those links and downloads the daily zip files.

#Current data
current_url<-'http://nemweb.com.au/Reports/CURRENT/Operational_Demand/ACTUAL_DAILY/'
current_url%>%
  read_html%>%
  html_nodes("a")%>% #Get hyperlinks
  html_attr("href")%>% #Get link url
  tail(-1) %>% #Remove first link
  paste0('http://nemweb.com.au',.)->dailyzips

for (i in dailyzips){
  download.file(i,destfile = paste0('dailyzips/',basename(i),'.zip'))
}

With all the daily zip files downloaded, these can be unzipped into csv files.

#Unzip all files
dayzips<-dir('dailyzips')

for (i in dayzips){
  unzip(paste0('dailyzips/',i),exdir = 'data')
}

The next block of code defines a function that processes a single csv file. Only the data for Victoria are kept (although all states could be investigated), along with the variables of interest (demand, date-time and date). This function is passed to map_dfr from the purrr package, which creates a single output by concatenating the rows of the resulting data frames. Once completed, the average daily electricity demand is obtained using group_by and summarise.

#Combine data
datafiles<-dir('data')

clean_energy_i<-function(i){
  read_csv(file = paste0('data/',i),skip = 1,n_max = 240)%>%
    filter(REGIONID=="VIC1")%>%
    select(Time=INTERVAL_DATETIME,
           Demand=OPERATIONAL_DEMAND_1)%>%
    mutate(Time=ymd_hms(Time),
           Date=as.Date(Time))
}

hhdata<-map_dfr(datafiles,clean_energy_i)

hhdata%>%group_by(Date)%>%
  summarise(AvDemand=mean(Demand))->energydata

We now have a data frame with average daily electricity demand for Victoria.
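
As a quick check, the first few rows can be inspected (the exact values will depend on when the data are downloaded).

#Inspect the first few rows of the daily demand data
head(energydata)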

Scraping Temperature

When looking at electricity demand, an important variable to control for is temperature. Recent data can easily be scraped from the Bureau of Meteorology. The URLs for the files that need to be downloaded follow a fairly predictable structure. The term “IDCJDW3033” refers to the Melbourne Airport weather station; most electricity demand in Victoria comes from Melbourne, and airports tend to have the most reliable weather measurements.

if(!dir.exists('temps')){dir.create('temps')}

months<-c(paste0('20190',4:9), #change these according to the date you download
  paste0('2019',10:12),
  paste0('20200',1:5))

for(i in months){
  download.file(paste0('http://www.bom.gov.au/climate/dwo/',
                       i,
                       '/text/IDCJDW3033.',
                       i,'.csv'),
                destfile = paste0('temps/',i,'.csv'))
}

The data can be combined by writing a function that cleans one csv file and then uses the map_dfr function to create a single data frame.

clean_temp_i<-function(i){
  read_csv(paste0('temps/',i),skip = 5)%>%
    select(2,4)->a
  colnames(a)<-c('Date','MaxTemp')
  a<-mutate(a,Date=as.Date(Date))
  return(a)
}
  
tempdata<-map_dfr(dir('temps/'),clean_temp_i)
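
As a quick sanity check that the scrape worked, the range of dates covered by the temperature data can be inspected.

#Check the period covered by the temperature data
range(tempdata$Date)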

Joining Data and Clean Up

Finally, the electricity demand data can be joined with the temperature data.

data<-full_join(energydata,tempdata)%>%
  na.omit()%>%
  mutate(wday=wday(Date,label = T)%>%as.character(),
         lockdown=(Date>'2020-03-15'))

unlink('data',recursive = T)
unlink('monthlyzips',recursive = T)
unlink('dailyzips',recursive = T)
unlink('temps',recursive = T)
saveRDS(data,file = 'energy_temp_data.rds')

The code above also creates a day-of-week variable and a dummy for the lockdown, which in Australia began on 15 March 2020. The unlink function deletes all the folders containing zip and csv files, and the data are then saved as an rds file.
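
In a later session the cleaned data can be reloaded without repeating any of the scraping.

#Reload the saved data
data<-readRDS('energy_temp_data.rds')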

Analysis with a Generalised Additive Model

To investigate the impact of the lockdown we can fit a Generalised Additive Model (GAM). This model allows some regression effects to be non-linear, which in this case is the effect of temperature. Temperature exhibits a clear non-linear effect on electricity demand: on hot days more electricity is used for cooling, and on cold days more electricity is used for heating. The model also includes dummy variables for each day of the week and for the lockdown.

The GAM can be fit using the gam function from the mgcv package.

library(mgcv)
gamout<-gam(AvDemand~1+s(MaxTemp)+wday+lockdown,data = data)

The output of this model can be viewed with the summary function.

summary(gamout)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## AvDemand ~ 1 + s(MaxTemp) + wday + lockdown
## 
## Parametric coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5218.005     40.453 128.990  < 2e-16 ***
## wdayMon      -125.280     56.379  -2.222  0.02685 *  
## wdaySat      -484.283     56.825  -8.522 3.45e-16 ***
## wdaySun      -616.518     56.695 -10.874  < 2e-16 ***
## wdayThu        41.474     56.159   0.739  0.46065    
## wdayTue         7.109     56.410   0.126  0.89977    
## wdayWed        19.436     56.316   0.345  0.73018    
## lockdownTRUE -147.519     44.895  -3.286  0.00111 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##              edf Ref.df     F p-value    
## s(MaxTemp) 4.463  5.511 104.4  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.697   Deviance explained = 70.6%
## GCV =  93019  Scale est. = 90143     n = 403
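
The estimated smooth term can also be inspected visually using the plot method from mgcv (the axis labels here are my own additions).

#Plot the estimated smooth effect of temperature on demand
plot(gamout,shade = TRUE,xlab = 'Maximum Temperature',ylab = 'Effect on Average Demand (MW)')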

This indicates that average electricity demand is roughly 147.52 MW lower during the lockdown. To put that into perspective, the capacity of the Dartmouth Power Station (a large hydroelectric plant in Victoria) is 150 MW. Naturally this analysis is crude, but it gives some indication of the reduction in electricity demand during the lockdown.
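
As a rough gauge of precision, an approximate 95% interval for the lockdown effect can be computed from the estimate and standard error reported in the summary.

#Approximate 95% interval: estimate plus/minus two standard errors
-147.519+c(-2,2)*44.895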
