After a couple of posts on the Android app for upstreaming data from a wearable with an accelerometer, we finally get to analyse the data. The very first objective is relatively simple: measuring the overall physical activity of the user. In fact, the simplest statistical measures already say a lot, in particular the standard deviation of the Signal Vector Magnitude.
The post includes code in R, and also a dataset with the accelerometer data, so that you can follow along as a kind of tutorial.
You can find the dataset in this R data file. Just download it and load it into your R session.
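As a minimal sketch of the loading step: the helper below wraps `load()` and reports which objects the file restored. The helper name `load_rdata` and the file name in the usage comment are hypothetical; the rest of the post assumes the file provides a data frame called `post_data`.

```r
# Load an .RData file and return the names of the objects it restored.
# `load_rdata` is a hypothetical helper, not part of the original post.
load_rdata <- function(path, envir = globalenv()) {
  stopifnot(file.exists(path))
  loaded <- load(path, envir = envir)  # load() returns the restored object names
  loaded
}

# Usage (hypothetical file name; use whatever name you saved the download under):
# load_rdata("accelerometer.RData")
# str(post_data)
```

Checking the returned names is a quick way to confirm that `post_data` is actually in the session before running the rest of the code.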
There’s a lot of information on accelerometers on the web; in particular, this SE question could be useful for understanding the range of values that such a device provides, and how normalization is usually done. For a more serious/academic source on calibration, see e.g. this paper.
A good summary on the basic measures of physical activity may be found here.
Let’s first take a look at the dataset. It corresponds to 486 minutes of accelerometer data, i.e. around 8 hours of data. Some basic statistics:
nrow(post_data)
## [1] 364500
summary(post_data[, c('x', 'y', 'z')])
##        x                 y                  z
##  Min.   :-2.2580   Min.   :-2.48500   Min.   :-2.2660
##  1st Qu.:-0.4850   1st Qu.:-0.47700   1st Qu.:-0.6840
##  Median :-0.1250   Median :-0.20400   Median :-0.5160
##  Mean   :-0.1703   Mean   :-0.04411   Mean   :-0.2351
##  3rd Qu.: 0.1480   3rd Qu.: 0.60100   3rd Qu.: 0.0500
##  Max.   : 1.6050   Max.   : 1.47600   Max.   : 1.5110
There are around 365 thousand rows, and each of them is composed of an id (actually two fields, id, see below for details on the data model) and the 3 columns coming from the accelerometer, for coordinates x, y and z.
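The row count and the duration together give the sampling rate, which is worth working out because the per-minute windows used later depend on it. A small sketch of the arithmetic, using only the two figures quoted in this post (the variable names are illustrative):

```r
n_rows  <- 364500   # number of rows in the dataset
minutes <- 486      # duration of the recording, in minutes

samples_per_minute <- n_rows / minutes         # 750 samples per minute
sampling_rate_hz   <- samples_per_minute / 60  # 12.5 samples per second
hours              <- minutes / 60             # 8.1 hours
```

So one minute of data is exactly 750 rows, and the device samples at 12.5 Hz.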
The first operation should be to calibrate the accelerometer so that the data can be normalized, but in this case the device already provides a normalized output, as will become clear later. The second is to calculate the modulus (length) of the 3D vector and plot the results.
post_data$mod <- sqrt(post_data$x^2 + post_data$y^2 + post_data$z^2)
Note that, in order to avoid a crude plot with some 365 thousand values, the plot below reduces the data manually, drawing only the maximum and the minimum of each block of 100 samples, so it is only useful for a quick visualization of the data. An alternative would be to use a
library(zoo)
# Upper envelope: maximum of each non-overlapping block of 100 samples
plot(rollapply(post_data$mod, width=100, FUN=max, by=100),
     xlab="Samples / 100", ylab="Modulus", main="Overall crude picture",
     type="l", ylim=c(min(post_data$mod), max(post_data$mod)))
# Lower envelope: minimum of each block of 100 samples
lines(rollapply(post_data$mod, width=100, FUN=min, by=100), type="l")
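Because the window width and the step are both 100, the windows do not overlap, so the same block-wise envelope can be sketched in base R by reshaping the vector into a matrix with one column per block. The example below runs on synthetic data standing in for `post_data$mod`; the names `block`, `block_max` and `block_min` are illustrative.

```r
set.seed(1)
# Synthetic stand-in for post_data$mod (assumption: values scattered around 1G)
mod <- abs(rnorm(1050, mean = 1, sd = 0.05))

block  <- 100
n_full <- (length(mod) %/% block) * block        # drop the incomplete last block
m <- matrix(mod[seq_len(n_full)], nrow = block)  # one column per block of 100 samples

block_max <- apply(m, 2, max)  # upper envelope
block_min <- apply(m, 2, min)  # lower envelope
```

The two vectors can then be passed to `plot()` and `lines()` exactly as in the code above.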
A few details on what the “subject” who was wearing the wristband was doing during the 8 hours in which the data was captured:
1st he arrived home and set up the experiment
2nd he attended quite a long phone call
then he prepared dinner and did some stuff at home (laundry, washing dishes, etc.)
he spent some time in front of the computer and then in bed reading
and finally he went to sleep
The last 4 hours correspond to the sleep time, although the “subject” had to interrupt his sleep to go to the bathroom a couple of times, and in general was somewhat worried about the experiment.
From the data (and even before the visualization), it’s easy to see that the mean of the modulus is approximately 1. Let’s try to interpret that value: the average activity of the “subject”, in particular during the last 4 hours, was “being still”; and in the context of an accelerometer, being still corresponds to an acceleration of 1G, since the accelerometer just measures gravity when the subject is still. Indeed, the objective of the normalization would be to set to 1 whatever value comes out of the accelerometer when the device is still and thus measuring 1G, so this step is already done.
Another important point to keep in mind is the meaning of the 0 level on the modulus axis: it corresponds to 0G, i.e. zero gravity or free fall.
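These two reference values can be checked with a few hand-made vectors. The sketch below is illustrative only: the orientations are idealized, and `raw_at_rest` is a hypothetical raw reading for a device that reports counts instead of G, included just to show what the normalization step would look like.

```r
modulus <- function(x, y, z) sqrt(x^2 + y^2 + z^2)

# Device at rest, lying flat: gravity falls entirely on the z axis.
modulus(0, 0, -1)                    # 1
# Device at rest, tilted 45 degrees: gravity is split between y and z.
modulus(0, -sqrt(0.5), -sqrt(0.5))   # 1
# Free fall: no axis registers any acceleration.
modulus(0, 0, 0)                     # 0

# Normalization sketch for a device that reports raw counts.
# raw_at_rest is a hypothetical modulus reading taken while the device is still.
raw_at_rest <- 256
raw_sample  <- c(x = 64, y = -128, z = -200)
normalized  <- raw_sample / raw_at_rest  # now expressed in units of G
```

Whatever the orientation of the wrist, a still device gives a modulus of 1, which is why the modulus is a convenient orientation-independent signal.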
The straightforward way to measure physical activity is called SVM (acceleration Signal Vector Magnitude), which is just an average of the modulus. Below is the calculation in R using a rolling window per minute of data (since 750 samples correspond to 1 minute).
post_svm <- rollapply(post_data$mod, width=750, FUN=mean, by=750)
summary(post_data$mod)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## 0.05152 0.98970 0.99550 0.99840 1.00600 3.29900
summary(post_svm)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##  0.9709  0.9923  0.9965  0.9984  1.0060  1.0350
plot(post_svm, xlab="Minutes", ylab="SVM", main="Mean of physical activity")
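The per-minute mean stays very close to 1 whatever the subject is doing, which is why the introduction points to the standard deviation as the more telling statistic. As a sketch on synthetic data (not the post's dataset), the base R code below computes both per minute; the variable names and the two noise levels are assumptions chosen only to contrast an "active" and a "still" period.

```r
set.seed(42)
samples_per_minute <- 750

# Synthetic stand-in for post_data$mod: 2 "active" minutes, then 2 "still" minutes.
mod <- c(rnorm(2 * samples_per_minute, mean = 1, sd = 0.30),
         rnorm(2 * samples_per_minute, mean = 1, sd = 0.01))

# Minute index (1, 2, 3, ...) for every sample
minute <- (seq_along(mod) - 1) %/% samples_per_minute + 1

svm_mean <- tapply(mod, minute, mean)  # per-minute mean, analogous to post_svm
svm_sd   <- tapply(mod, minute, sd)    # per-minute standard deviation

round(cbind(mean = svm_mean, sd = svm_sd), 3)
```

In this toy example the mean column barely moves between the two periods, while the standard deviation clearly separates them.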