Sisifo's Page prophet vs bsts, time series prediction, MineThatData forecasting challenge 3 January 2021

Introduction & purpose

Prophet is a forecasting procedure and R package developed by Facebook. I have used it professionally and have seen colleagues use it quite successfully in diverse setups.

In this post, I run a kind of benchmark with just one scenario: the MineThatData Forecasting Challenge that the authors of this blog actually won using BSTS.

BSTS and prophet are certainly different techniques (even if they have quite a lot in common). Anyhow, that is not the point of this post. Rather, since prophet is quite straightforward to use, the main question that comes to mind is: can a model be fit with prophet more simply than with BSTS while obtaining comparable results? My guess is that BSTS, as used for the challenge, was far from simple for the readers of MineThatData.

Fit directly and see how lucky we get

Let’s go straight like a crazy gunman and try to fit a model with prophet with a lot of defaults and see what happens.

library(lubridate)
library(prophet)
library(readxl)
library(httr)

GET("http://minethatdata.com/MineThatData_ForecastChallenge_20170914.xlsx",
    write_disk(tf <- tempfile(fileext = ".xlsx")))
## Response [http://minethatdata.com/MineThatData_ForecastChallenge_20170914.xlsx]
##   Date: 2020-12-09 08:13
##   Status: 200
##   Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
##   Size: 21.8 kB
## <ON DISK>  /var/folders/lm/6kvyzp5d3ds_20dxw939n5hw0000gp/T//RtmphKlisw/file585b45bf51fa.xlsx
sales <- read_excel(tf, sheet=1,
                    col_names=TRUE, col_types=rep("numeric", 17))
sales <- sales[! duplicated(sales$Month),]
plot(sales$Month, sales$Rolling_12Month_Sales, type="l")

Let’s prepare the data a bit more for prophet: the package expects a data frame with a date column named ds and a value column named y, in this case at monthly granularity.

target <- sales[order(sales$Month),]
end_date <- as.Date("2017-08-01")
initial_date <- end_date %m-% months(79)

target$date <- seq(initial_date, end_date, by="1 month")

sales4prophet <- target[, c("date", "Rolling_12Month_Sales")]
names(sales4prophet) <- c("ds", "y")
head(sales4prophet)
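The date index above only lines up with the data if the file has exactly as many rows as the sequence has months. As a minimal sketch of that check, the same sequence can be rebuilt in base R; the count of 80 months is an assumption read off `months(79)` in the code above, and `n_months` and `dates` are names introduced here for illustration.

```r
# Sketch: rebuild the monthly date index with base R and inspect it.
# n_months = 80 is an assumption implied by months(79) in the post's code.
end_date <- as.Date("2017-08-01")
n_months <- 80

# Step back 79 months from the end date, then walk forward month by month.
initial_date <- seq(end_date, by = "-1 month", length.out = n_months)[n_months]
dates <- seq(initial_date, end_date, by = "1 month")

length(dates)  # number of monthly timestamps
range(dates)   # first and last month of the index
```

In the post's own workflow, comparing `length(dates)` with `nrow(target)` before assigning `target$date` would catch a mismatch early.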

And let’s go straight for fitting the model like madmen. Just a bit of thought, barely enough: the only seasonality that makes sense in this case, due to the monthly granularity, is yearly seasonality. A Fourier order of 5, as passed to add_seasonality in the code, keeps it from getting very complicated.

m <- prophet(weekly.seasonality=FALSE)
m <- add_seasonality(m, name='yearly', period=365, fourier.order=5)
m <- fit.prophet(m, sales4prophet)

future <- make_future_dataframe(m, 3, 'month')
forecast <- predict(m, future)
plot(m, forecast)
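To get a feeling for what the Fourier order controls, here is a rough sketch of the kind of sine and cosine features a seasonality of a given order is built from: one sine/cosine pair per order, so the order directly sets how wiggly the yearly pattern can be. `fourier_features` is a hypothetical helper written for this illustration, not a function from the prophet package, and the start date is simply the end date minus 79 months from the code above.

```r
# Sketch of the sine/cosine features behind a seasonality of a given order.
# fourier_features is a hypothetical helper, not part of prophet's API.
fourier_features <- function(dates, period, order) {
  t <- as.numeric(dates)  # days since 1970-01-01
  cols <- lapply(seq_len(order), function(k) {
    cbind(sin(2 * pi * k * t / period), cos(2 * pi * k * t / period))
  })
  features <- do.call(cbind, cols)
  colnames(features) <- paste0(rep(c("sin", "cos"), times = order),
                               rep(seq_len(order), each = 2))
  features
}

# Monthly dates matching the series in this post (assumed 80 months).
dates <- seq(as.Date("2011-01-01"), as.Date("2017-08-01"), by = "1 month")

# period = 365 and order = 5 mirror the add_seasonality call above.
X <- fourier_features(dates, period = 365, order = 5)
dim(X)  # one row per month, two columns per Fourier order
```

A lower order gives fewer columns and therefore a smoother, less flexible seasonal curve, which is why a small order is a reasonable starting point with only 80 monthly observations.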