Introduction & purpose

In the previous post, the data coming from the accelerometer was analysed in an over-simplistic way: some common statistical measures (e.g. the standard deviation) were computed to get an idea of the overall physical activity of the wearable's user over time. What to do when a more advanced analysis of the data is required?

Such a goal would imply analyzing a time series, and thus the need to understand quite a few far-from-obvious concepts from signal processing. But are these really needed?

Not necessarily. For example, in this podcast from O'Reilly it is stated that it should be easier to apply a neural network to time-series problems. This post takes the neural network approach to analyzing the accelerometer data, keeping in mind the question of how easy and practical it really is, and which other problems pop up along the way (e.g. convergence, processing time…).

The post includes code in R, and also the accelerometer dataset, in case you want to follow along.

Dataset used

The dataset is the same as the one used in the previous post; you can find it in this R data file. Just download it and then load it into your R session, for example as sketched below (the file name is a placeholder for the actual .RData file linked here).
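```r
# Minimal sketch: the file name is a placeholder for the R data file linked above
load("accelerometer_data.RData", verbose = TRUE)  # prints the names of the loaded objects
ls()                                              # check which objects are now in the session
```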

Although this post intends to be self-contained, it will of course be easier to follow if you quickly read that previous post first.

References

From the O'Reilly podcast already mentioned, here is a nice quote from Christopher Nguyen, co-founder of Arimo, who was interviewed in it:

Let’s divide the world into before deep learning on time series and after deep learning on time series. Techniques exist for processing time series, yet I would say they’re too hard. It’s possible, but it costs too much or it’s too hard using traditional techniques, and it yields a value that hasn’t been worth the investment. Deep learning flips that around and gets you much larger value for less effort.

Besides, there is quite a lot of information around on how to actually do it. One nice summary of the most straightforward approach can be found in this Cross-Validated answer; in this post, the approach used is even more naive, but along the same lines. Actually, as usually happens, this is far from a new idea: take a look at the following link, a time-series forecasting competition using neural networks in 2010.

On neural networks themselves, possibly one of the best introductions is the very well-known Coursera MOOC by Andrew Ng. As for how to apply them in R, this tutorial covers the essentials. And for the common convergence problems when training a neural net, see Efficient BackProp by Yann LeCun and others, 1998.
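Just to give a flavour of what fitting a neural network in R looks like, here is a minimal sketch with the nnet package (not necessarily the tutorial's exact code; the data frame and column names are made up):

```r
# Minimal sketch with the nnet package; train_df/test_df and their columns
# are placeholders, not objects defined in this post
library(nnet)
set.seed(1)
fit  <- nnet(label ~ feat1 + feat2, data = train_df,
             size  = 5,      # number of hidden units
             maxit = 500,    # iterations; convergence can be an issue (see LeCun et al.)
             decay = 1e-3)   # weight decay, which often helps convergence
pred <- predict(fit, newdata = test_df, type = "raw")
```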

Reformulation of the problem so that the neural net can solve it

The following picture is a summary of the accelerometer data available for the exercise. It corresponds to around 8 hours of activity of a subject wearing the wristband.

A straightforward problem would be to detect what kind of activity the subject is doing. In order to simplify the problem, while still keeping it not too easy for the neural net, let's try to build a "sleep detector". The cutting point is around minute 260, although it is a rather lousy threshold, since the subject reported having woken up several times and even gone to the toilet after minute 260. The following would then be the "lousy" output to predict (see details in the R Code section below.)
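As a sketch of how that output could be built (the per-minute index is an assumption here; the actual construction is in the R Code section below):

```r
# Sketch only: 'minute' is an assumed per-minute time index for the samples
sleep_cut <- 260
asleep <- as.integer(minute >= sleep_cut)  # 1 = "asleep" after the cutting point, 0 = "awake"
```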

This "sleep detection" correlates somewhat with what a simple detection of activity levels would do, since low-activity periods are of course more likely to correspond to sleep. Anyhow, if the output to predict were deduced from a reasonable threshold on the activity level, it would look as follows.
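For reference, such an activity-based label could be derived along these lines (a sketch only; the per-minute activity measure, the column names and the threshold are illustrative assumptions, not the post's exact choices):

```r
# Sketch: 'acc' is assumed to hold accelerometer columns x, y, z and a
# per-minute index 'minute'; the threshold on the per-minute standard
# deviation of the acceleration magnitude is purely illustrative
activity     <- tapply(sqrt(acc$x^2 + acc$y^2 + acc$z^2), acc$minute, sd)
low_activity <- as.integer(activity < 0.05)  # 1 = likely sleep, 0 = likely awake
```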