Sisifo's Page

prophet vs bsts, time series prediction, MineThatData forecasting challenge

3 January 2021

Introduction & purpose

Prophet is a forecasting procedure and R package developed by Facebook that I have used professionally, and I have seen colleagues using it pretty successfully in diverse setups.

In this post, I run a kind of benchmark with just one scenario: the MineThatData Forecasting Challenge, which the authors of this blog actually won using BSTS.

BSTS and prophet are certainly different techniques (even if they have quite a bit in common). Anyhow, that is not the point of this post; rather, since prophet is quite straightforward to use, the main question that comes to mind is: can a model be fit with prophet in a simpler way than with BSTS, obtaining comparable results? Certainly, my guess is that BSTS as used for the challenge was far from simple for the readers of MineThatData.

Fit directly and see how lucky we get

Let’s go in straight like a crazy gunman: try to fit a model with prophet using a lot of defaults and see what happens.

library(lubridate)
library(prophet)
library(readxl)
library(httr)

GET("http://minethatdata.com/MineThatData_ForecastChallenge_20170914.xlsx",
    write_disk(tf <- tempfile(fileext = ".xlsx")))
## Response [http://minethatdata.com/MineThatData_ForecastChallenge_20170914.xlsx]
##   Date: 2020-12-09 08:13
##   Status: 200
##   Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
##   Size: 21.8 kB
## <ON DISK>  /var/folders/lm/6kvyzp5d3ds_20dxw939n5hw0000gp/T//RtmphKlisw/file585b45bf51fa.xlsx
sales <- read_excel(tf, sheet=1,
                    col_names=TRUE, col_types=rep("numeric", 17))
sales <- sales[! duplicated(sales$Month),]
plot(sales$Month, sales$Rolling_12Month_Sales, type="l")

Let’s prepare the data a bit more for prophet; the package needs a time series indexed by dates, in this case monthly.

target <- sales[order(sales$Month),]
end_date <- as.Date("2017-08-01")
initial_date <- end_date %m-% months(79)  # lubridate month-safe subtraction: 80 monthly points in total

target$date <- seq(initial_date, end_date, by="1 month")

sales4prophet <- target[, c("date", "Rolling_12Month_Sales")]
names(sales4prophet) <- c("ds", "y")
head(sales4prophet)

And let’s go straight for fitting the model like madmen. Just a bit of thought, barely enough: the only seasonality that makes sense in this case, due to the monthly granularity, is yearly seasonality. A Fourier order of 5 keeps it not overly complicated.

m <- prophet(weekly.seasonality=FALSE)
m <- add_seasonality(m, name='yearly', period=365, fourier.order=5)
m <- fit.prophet(m, sales4prophet)

future <- make_future_dataframe(m, 3, 'month')
forecast <- predict(m, future)
plot(m, forecast)

The last predictions are,

tail(forecast[, c('ds', 'yhat')])

To measure it with the criterion of the MineThatData Forecasting Challenge, let’s calculate the squared error of this solution,

formatC(sum((c(81.8e6, 81.3e6, 80.5e6) - tail(forecast, 3)["yhat"]) **2), format="e")
## [1] "9.9704e+11"

Thus about 1 trillion, which would not win the challenge, but it is not too bad: it would have been the 3rd solution (BSTS won with about 0.4 trillion)… There is certainly one good thing about it: the model gets it right when predicting that sales will continue going down. It simplifies reality a bit too much, disregarding the last 2 bumps in June 2017 and January 2016, but it makes a reasonable prediction with little effort.

Another good point is that the prediction, while maybe not the best, is pretty interesting: it does a very good job for the 3rd month of the horizon, where it would actually be better than any of the solutions proposed in the challenge.
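As a quick check of that claim, the total squared error can be broken down month by month (a minimal sketch; the actuals are the three challenge values used above):

actuals <- c(81.8e6, 81.3e6, 80.5e6)
predicted <- tail(forecast$yhat, 3)
(actuals - predicted) ** 2  # per-month contribution to the total squared error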

In the following plot, the trend vs seasonality components of the prediction are shown,

prophet_plot_components(m, forecast)

The challenge does not say much about the product being sold, but this would point to a product that sells more in summer, with some smaller bumps at Christmas & Easter? Take a look at the scale of the y axis to realise that the seasonality effect is weak anyway, 3 orders of magnitude smaller than the trend.
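This can also be checked in numbers rather than in the plot: the data frame returned by predict has a column per component, including one named after the 'yearly' seasonality added above,

range(forecast$trend)   # trend component, in the tens of millions
range(forecast$yearly)  # seasonal component, orders of magnitude smaller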

Now with cross-validation?

Actually, the post could end with the last paragraph. Mission accomplished: prophet certainly provides a very decent prediction with little effort. But inevitably one feels tempted to answer another question: can we beat BSTS in this case?

This is the model that results from a cross-validation fit; see the annex for details on how these numbers were obtained.

m2 <- prophet(weekly.seasonality=FALSE, changepoint.prior.scale=0.05, seasonality.prior.scale=1.0)
m2 <- add_seasonality(m2, name='yearly', period=365, fourier.order=3)
m2 <- fit.prophet(m2, sales4prophet)

future2 <- make_future_dataframe(m2, 3, 'month')
forecast2 <- predict(m2, future2)
plot(m2, forecast2)

The plot actually looks pretty close to the madman model with defaults above. These are the last datapoints,

tail(forecast2[, c('ds', 'yhat')])

And the squared error is about 20% better: still 3rd place in the challenge, but pretty close to the second solution (which got 7e11 squared error).

formatC(sum((c(81.8e6, 81.3e6, 80.5e6) - tail(forecast2, 3)["yhat"]) **2), format="e")
## [1] "7.8856e+11"

Seasonality is equally weak, and simpler since the Fourier order is 3; only an increase of sales in summer is visible.

prophet_plot_components(m2, forecast2)

So, is it worth the effort? Well, a 20% improvement is not bad at all, and it wasn’t so difficult (see below). But anyway, as a baseline prediction, the 1st one would already be good.

Conclusion

Yes, prophet is a pretty good package for producing baseline predictions, as they claim on their webpage: accurate and fast, automatic & tunable. It might not win a contest, but still…

Annex: how to fit the prior hyperparameters with cross-validation

First of all, note that it is quite an exercise trying not to overfit the results: when you already know the solution, you inevitably overfit, even if subtly and slowly. But yes, the fact that the final result is good but not so amazing maybe gives a hint that I didn’t overfit too badly…

Second, Bayesian priors are not like hyperparameters; one has a prior opinion about them and does not look for their values in a grid (one may fine-tune a prior opinion based on business knowledge). Anyhow, in this case the target series is quite an unknown, which justifies the use of a grid.

prophet provides a couple of functions that make everything easy:

- cross_validation performs forward chaining over the series
- performance_metrics calculates performance metrics over the cross-validation folds above

The following is a utility encapsulating a call to both, where model.function is a function parameter that generates the model we’d like to fit.

prophet.cv.loop <- function(data_train,
                            initial,
                            period,
                            horizon,
                            units,
                            model.function) {
  model <- model.function(data_train)
  # forward chaining over the series, then metrics per horizon
  df.cv <- cross_validation(model, initial=initial, period=period, horizon=horizon, units=units)
  df.p <- performance_metrics(df.cv)
  return(df.p[nrow(df.p),]) # return the last row: metrics at the full horizon
}
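For instance, a quick sanity check of the utility with near-default priors could look like this (just a sketch; it uses the same fold setup as the grid search below):

prophet.cv.loop(
  sales4prophet,
  initial = 68*(365/12),   # ~68 months of pure training data, expressed in days
  period = 365/12,         # move the cutoff forward one month at a time
  horizon = 3*365/12,      # predict 3 months ahead
  units = "days",
  model.function = function(data_train) {
    m <- prophet(weekly.seasonality=FALSE)
    m <- add_seasonality(m, name='yearly', period=365, fourier.order=3)
    fit.prophet(m, data_train)
  }
)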

And the following runs the search:

- The model has as parameters a changepoint.prior.scale, a seasonality.prior.scale and a fourier.order
- The changepoint prior is kept to low values; high values would make the trend actually follow the series, giving a meaningless prediction (the default is 0.05)
- Seasonality is clearly weak in this series, thus values are low (the default is 10)
- The Fourier order is kept to a moderate value, to avoid getting a mysterious wiggly output
- 68 of the 80 months of training data are reserved for pure training
- The horizon of the prediction is 3 months
- Thus forward chaining generates 9 predictions: in the first one the training data goes up to the 68th month and the prediction is for months 69, 70 and 71; in the last one the training goes up to the 77th month and the predictions are for months 78, 79 and 80
- Careful, since this takes quite a while to run (each fit is fast, but a lot of them are done; maybe 10 minutes overall)

res_df1 <- c()
for (changepoint.prior.scale in c(0.025, 0.05, 0.075, 0.1)) {
  for (fourier.order in c(3, 4, 5, 7, 9)) {
    for (seasonality.prior.scale in c(0.1, 1, 10)) {
      res <- prophet.cv.loop(
        sales4prophet,
        initial = 68*(365/12),
        period = 365/12,
        horizon = 3*365/12,
        units = "days",
        model.function = function(data_train) {
          m <- prophet(weekly.seasonality=FALSE,
                       changepoint.prior.scale=changepoint.prior.scale,
                       seasonality.prior.scale=seasonality.prior.scale)
          m <- add_seasonality(m, name='yearly', period=365, fourier.order=fourier.order)
          m <- fit.prophet(m, data_train)  # fit on the data passed in, not the global
          return(m)
        }
      )
      res_row <- data.frame(changepoint.prior.scale = changepoint.prior.scale,
                            fourier.order = fourier.order,
                            seasonality.prior.scale = seasonality.prior.scale)
      res_row <- cbind(res_row, res)
      print(res_row)
      res_df1 <- rbind(res_df1, res_row)
    }
  }
}

And these are the best solutions, ordered by MAPE:

head(res_df1[order(res_df1$mape, decreasing = FALSE), c('changepoint.prior.scale', 'fourier.order', 'seasonality.prior.scale', 'mape')],5)