Prophet vs BSTS
Can a model like Prophet, which is far simpler to fit than BSTS, achieve similar accuracy? Let's use the winning solution of the MineThatData Forecasting Challenge, which used BSTS, as the scenario for comparison.
Under reconstruction! This site may look broken in some browsers or at some resolutions.
In the meantime, find below links to the most recent posts:
Uplift modeling does not work well when the split between targeted and non-targeted customers is not done at random. The problem was raised by a reader of a previous post and is reproduced here with a public dataset. Symptoms: the Qini area is negative, but the diagnostics for the classifier otherwise look fine. It includes code in R and a dataset, so that it may be used as a tutorial.
The challenge: forecasting the next 3 months of a target time series containing sales data from an unknown company. Includes a Markdown document with the R code.
This time the data comes from a pulse oximeter (instead of the accelerometer of previous posts), which still generates a large number of time series: the device is switched on every night with a patient, and a few nights from a healthier person are added for comparison. As a step in an initial exploratory analysis of this set of time series, the objective is to cluster them using Dynamic Time Warping as the distance measure. It includes code in R and a dataset, so that it may be used as a tutorial.
When approaching a somewhat more advanced analysis of the data coming from an accelerometer, the far-from-obvious concepts of time series analysis start to be needed. But are they really needed? Maybe not, since one could always apply a neural network to solve the problem. Is this second path easier in practice? It includes code in R and a dataset, so that it may be used as a tutorial.
The very first analysis to try on the data coming from an accelerometer: measuring the physical activity of the user. Actually, the simplest statistical measures already tell a lot, in particular the standard deviation of the Signal Vector Magnitude (i.e. the modulus). It includes code in R and a dataset, so that it may be used as a tutorial.
Still setting up a lab for a number of wearables, in which all the raw data from sensors (e.g. the accelerometer) goes upstream to a server, to be analyzed later. This post gives some conclusions on how to make the app robust enough that users can do whatever they like with their smartphones during the experiment.
Setting up a lab for a number of wearables, in which all the raw data from sensors (e.g. the accelerometer) goes upstream to a server, to be analyzed later. Some features are not so common in an Android app implementation, e.g. a non-negligible upstream flow of data. Includes detailed explanations and Java code.
How to identify sessions in records from Google Analytics when your customer has not set the custom user id dimension (a follow-up to the Association Rules post). Also, how to identify user sessions in other cases with GA data, e.g. when associating events with sessions. Plus the code in R.
How to mine Association Rules on clickstream data from Google Analytics, as a way to understand how users visit your website. Includes details on how to configure GA and how to extract the data into R, plus the code in R for mining the rules. A seed for a recommendation system on your website.
Is there a way to distinguish the customers that respond (or not) to a marketing action when targeted from the customers that respond (or not) when they are not targeted? Which scenarios are relevant for uplift modeling, compared to a “traditional” model? How to evaluate them from a business perspective? How to fit the uplift model in R?