I want to detect a specific pattern of motion on an Android mobile phone, e.g. if I do five sit-stands.
[Note: I am currently detecting the motion, but the motion in all directions looks the same.]
What I need is:
I need to differentiate the motion downward, upward, forward and backward.
I need to find the height of the mobile phone from ground level (and the height of the person holding it).
Is there any sample project which has pattern motion detection implemented?
This isn't impossible, but it may not be extremely accurate, even though the accuracy of the accelerometers and gyroscopes in phones has improved a lot.
What your app will be doing is taking sensor data and performing a regression analysis.
1) You will need to build a model of data that you classify as five sit and stands. This could be done by asking the user to do five sit and stands, or by loading the app with a more fine-tuned model from data that you've collected beforehand. There may be tricks you could do, such as loading several models of people with different heights, and asking the user to submit their own height in the app, to use the best model.
2) When run, your app will be trying to fit the data from the sensors (Android has great libraries for this), to the model that you've made. Hopefully, when the user performs five sit-stands, he will generate a set of motion data similar enough to your definition of five sit-stands that your algorithm accepts it as such.
A lot of the work here is assembling and classifying your model, and playing with it until you get an acceptable accuracy. Focus on what makes a sit-stand distinct from other up-and-down motions. For instance, there might be a telltale sign of extending the legs in the data, followed by a different shape for straightening up fully. Or, if you expect the phone to be in a pocket, you may not have a lot of rotational motion, so you can reject test sets that registered lots of change from the gyroscope.
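As a concrete illustration of step 2, here is a minimal sketch (not a definitive implementation, just one simple way to compare a live window against a recorded model): it scores a window of vertical acceleration against a pre-recorded "five sit-stands" template using normalized cross-correlation. The class name, the assumption that both arrays are resampled to the same length and rate, and the 0.8 threshold are all illustrative.

    // Illustrative sketch only: compares a live window of vertical acceleration
    // against a pre-recorded "five sit-stands" template using normalized
    // cross-correlation. Both arrays are assumed to be resampled to the same
    // length and rate; the 0.8 threshold is a made-up starting point to tune.
    public final class TemplateMatcher {

        public static double normalizedCrossCorrelation(double[] template, double[] window) {
            double meanT = mean(template), meanW = mean(window);
            double num = 0, denT = 0, denW = 0;
            for (int i = 0; i < template.length; i++) {
                double t = template[i] - meanT;
                double w = window[i] - meanW;
                num += t * w;
                denT += t * t;
                denW += w * w;
            }
            return num / Math.sqrt(denT * denW + 1e-9);
        }

        public static boolean looksLikeFiveSitStands(double[] template, double[] window) {
            return normalizedCrossCorrelation(template, window) > 0.8; // tune on real data
        }

        private static double mean(double[] xs) {
            double sum = 0;
            for (double x : xs) sum += x;
            return sum / xs.length;
        }
    }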
It is impossible. You can recognize downward and upward motion by comparing acceleration with the gravity vector, but how do you know whether the phone is in a back pocket while the user stands up, or just in a waving hand while they say hello? Was it 5 stand-ups or 5 hellos?
Forward and backward are even more unpredictable. What is forward for an upside-down phone? What is forward at all, from the phone's point of view?
And ground level, as well as height, is completely out of reach for measurement. The phone will move and produce accelerations in exactly the same way for a dwarf or a giant; it depends more on the person's behaviour or stillness than on their height.
It's a topic of research and probably I'm way too late to post it here, but I'm foraging the literature anyway, so what?
All kinds of machine learning approaches have been set on the issue; I'll mention some along the way. Andrew Ng's MOOC on machine learning gives you an entry point to the field and into Matlab/Octave that you can instantly put into practice, and it demystifies the monsters too ("support vector machine").
I'd like to detect whether somebody is drunk from phone acceleration and maybe angle, so I'm flirting with neural networks for the issue (they're good for basically every issue, if you can afford the hardware), since I don't want to assume pre-defined patterns to look for.
Your task, it seems, could be approached pattern-based: that approach has been applied to classify golf play motions, dancing, everyday walking patterns, and, twice, drunk-driving detection, where one of the papers addresses the issue of finding a baseline for what actually is longitudinal motion as opposed to every other direction. That might contribute to finding the baselines you need, like what ground level is.
It is a dense shrub of aspects and approaches; below are just a few more.
Lim e.a. 2009: Real-time End Point Detection Specialized for Acceleration Signal
He & Yin 2009: Activity Recognition from Acceleration Data Based on Discrete Cosine Transform and SVM
Dhoble e.a. 2012: Online Spatio-Temporal Pattern Recognition with Evolving Spiking Neural Networks utilising Address Event Representation, Rank Order, and Temporal Spike Learning
Panagiotakis e.a.: Temporal segmentation and seamless stitching of motion patterns for synthesizing novel animations of periodic dances
This one uses visual data, but walks you through a Matlab implementation of a neural network classifier:
Symeonidis 2000: Hand Gesture Recognition Using Neural Networks
I do not necessarily agree with Alex's response. This is possible (although maybe not as accurate as you would like) using the accelerometer, device rotation, and a lot of trial/error and data mining.
The way I see this working is by defining a specific way that the user holds the device (or the device is locked and positioned on the user's body). As they go through the motions, the orientation combined with acceleration and time will determine what sort of motion is being performed. You will need to use classes like OrientationEventListener, SensorEventListener, SensorManager, Sensor, and various timers, e.g. Runnables or TimerTasks.
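For reference, a bare-bones sketch of wiring up the accelerometer with SensorManager and SensorEventListener; it only logs timestamped samples, and what you then do with them (the per-movement thresholds described below) is entirely up to you.

    import android.content.Context;
    import android.hardware.Sensor;
    import android.hardware.SensorEvent;
    import android.hardware.SensorEventListener;
    import android.hardware.SensorManager;

    // Bare-bones listener that receives timestamped accelerometer samples, the
    // raw input you would feed into your movement classification.
    public class MotionRecorder implements SensorEventListener {
        private final SensorManager sensorManager;

        public MotionRecorder(Context context) {
            sensorManager = (SensorManager) context.getSystemService(Context.SENSOR_SERVICE);
        }

        public void start() {
            Sensor accel = sensorManager.getDefaultSensor(Sensor.TYPE_ACCELEROMETER);
            sensorManager.registerListener(this, accel, SensorManager.SENSOR_DELAY_GAME);
        }

        public void stop() {
            sensorManager.unregisterListener(this);
        }

        @Override
        public void onSensorChanged(SensorEvent event) {
            long tNanos = event.timestamp;  // nanoseconds since boot
            float ax = event.values[0];     // m/s^2, gravity included
            float ay = event.values[1];
            float az = event.values[2];
            // append (tNanos, ax, ay, az) to your recording / feed your classifier
        }

        @Override
        public void onAccuracyChanged(Sensor sensor, int accuracy) { /* not needed here */ }
    }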
From there, you need to gather a lot of data. Observe, record and study what the numbers are for doing specific actions, and then come up with a range of values that define each movement and sub-movements. What I mean by sub-movements is, maybe a situp has five parts:
1) Rest position, where phone orientation is some x-value at time x
2) Situp started, where phone orientation is in a range of y-values at time y (greater than x)
3) Situp at its final position, where phone orientation is in a range of z-values at time z (greater than y)
4) Situp in rebound (the user is falling back down to the floor), where phone orientation is back in the range of y-values at time v (greater than z)
5) Situp back at the rest position, where phone orientation is the x-value again at time n (the greatest and final time)
Add acceleration to this as well, because there are certain circumstances where acceleration can be assumed. For example, my hypothesis is that people perform the actual situp (steps 1-3 in my above breakdown) at a faster acceleration than when they are falling back. In general, most people fall slower because they cannot see what's behind them. That can also be used as an additional condition to determine the direction of the user. This is probably not true for all cases, however, which is why your data mining is necessary. Because I can also hypothesize that if someone has done many situps, that final situp is very slow and then they just collapse back down to rest position due to exhaustion. In this case the acceleration will be opposite of my initial hypothesis.
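As an illustration only, the five sub-movements above could be encoded as a small state machine driven by a smoothed tilt angle. The tilt thresholds here are hypothetical placeholders that would have to come from your own recordings, and a real version would also check how long each phase lasts and the acceleration direction discussed above.

    // Illustration only: the five sub-movements above as a tiny state machine
    // driven by a smoothed tilt angle (e.g. pitch derived from the rotation
    // vector). REST_TILT and UP_TILT are hypothetical placeholder values.
    enum Phase { REST, RISING, TOP, FALLING }

    final class SitupCounter {
        private static final float REST_TILT = 80f; // degrees: lying back / at rest
        private static final float UP_TILT   = 20f; // degrees: torso upright

        private Phase phase = Phase.REST;
        private int count = 0;

        int onTiltSample(float tiltDegrees) {
            switch (phase) {
                case REST:
                    if (tiltDegrees < REST_TILT - 10f) phase = Phase.RISING;  // started moving up
                    break;
                case RISING:
                    if (tiltDegrees < UP_TILT) phase = Phase.TOP;             // reached the top
                    break;
                case TOP:
                    if (tiltDegrees > UP_TILT + 10f) phase = Phase.FALLING;   // falling back down
                    break;
                case FALLING:
                    if (tiltDegrees > REST_TILT - 5f) {                       // back at rest: one rep
                        phase = Phase.REST;
                        count++;
                    }
                    break;
            }
            return count;
        }
    }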
Lastly, check out Motion Sensors: http://developer.android.com/guide/topics/sensors/sensors_motion.html
All in all, it is really a numbers game combined with your own "guesstimation". But you might be surprised at how well it works. Perhaps (hopefully) good enough for your purposes.
Good luck!
Related
I was looking into implementing an Inertial Navigation System for an Android phone, which I realise is hard given the accelerometer accuracy, and constant fluctuation of readings.
To start with, I set the phone on a flat surface and sampled 1000 accelerometer readings in the X and Y directions (parallel to the table, so no gravity acting in these directions). I then averaged these readings and used this value to calibrate the phone (subtracting this value from each subsequent reading).
I then tested the system by again placing it on the table and sampling 5000 accelerometer readings in the X and Y directions. I would expect, given the calibration, that these accelerations should add up to 0 (roughly) in each direction. However, this is not the case, and the total acceleration over 5000 iterations is nowhere near 0 (averaging around 10 on each axis).
I realise without seeing my code this might be difficult to answer but in a more general sense...
Is this simply an example of how inaccurate the accelerometer readings are on a mobile phone (HTC Desire S), or is it more likely that I've made some errors in my coding?
You get position by integrating the linear acceleration twice but the error is horrible. It is useless in practice.
Here is an explanation why (Google Tech Talk) at 23:20. I highly recommend this video.
It is not the accelerometer noise that causes the problem but the gyro white noise, see subsection 6.2.3 Propagation of Errors. (By the way, you will need the gyroscopes too.)
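To make the first point concrete, here is a sketch of the naive double integration being warned against. It shows where the error compounds; it is not something you should ship, since any constant bias b in the acceleration turns into 0.5*b*t^2 of position error within seconds.

    // Naive dead reckoning by double integration, exactly what the answer warns
    // against: bias and noise in the acceleration are integrated twice, so the
    // position error grows without bound.
    final class NaiveIntegrator {
        private double velocity = 0.0;  // m/s
        private double position = 0.0;  // m
        private long lastTimestampNs = -1;

        void onLinearAcceleration(double accel, long timestampNs) {
            if (lastTimestampNs >= 0) {
                double dt = (timestampNs - lastTimestampNs) * 1e-9;  // ns -> s
                velocity += accel * dt;      // first integration
                position += velocity * dt;   // second integration (error compounds here)
            }
            lastTimestampNs = timestampNs;
        }

        double position() { return position; }
    }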
As for indoor positioning, I have found these useful:
RSSI-Based Indoor Localization and Tracking Using Sigma-Point Kalman Smoothers
Pedestrian Tracking with Shoe-Mounted Inertial Sensors
Enhancing the Performance of Pedometers Using a Single Accelerometer
I have no idea how these methods would perform in real-life applications or how to turn them into a nice Android app.
A similar question is this.
UPDATE:
Apparently there is a newer version of the Oliver J. Woodman report referenced above ("An introduction to inertial navigation"): his PhD thesis,
Pedestrian Localisation for Indoor Environments
I am just thinking out loud, and I haven't played with an android accelerometer API yet, so bear with me.
First of all, traditionally, to get navigation from accelerometers you would need a 6-axis accelerometer. You need accelerations in X, Y, and Z, but also rotations Xr, Yr, and Zr. Without the rotation data, you don't have enough data to establish a vector unless you assume the device never changes its attitude, which would be pretty limiting. No one reads the TOS anyway.
Oh, and you know that INS drifts with the rotation of the earth, right? So there's that too. One hour later and you're mysteriously climbing on a 15° slope into space. That's assuming you had an INS capable of maintaining location that long, which a phone can't do yet.
A better way to utilize accelerometers, even a 3-axis accelerometer, for navigation would be to tie into GPS to calibrate the INS whenever possible. Where GPS falls short, INS complements it nicely. GPS can suddenly shoot you off 3 blocks away because you got too close to a tree. INS isn't great, but at least it knows you weren't hit by a meteor.
What you could do is log the phone's accelerometer data, and a lot of it. Like weeks' worth. Compare it with good (I mean really good) GPS data and use data mining to establish correlation of trends between the accelerometer data and the known GPS data. (Pro tip: you'll want to check the GPS almanac for days with good geometry and a lot of satellites. Some days you may only have 4 satellites, and that's not enough.) What you might be able to do is find that when a person is walking with their phone in their pocket, the accelerometer data logs a very specific pattern. Based on the data mining, you establish a profile for that device, with that user, and what sort of velocity that pattern represents when it had GPS data to go along with it. You should be able to detect turns, climbing stairs, sitting down (calibration to 0 velocity time!) and various other tasks. How the phone is being held would need to be treated as a separate data input entirely.
I smell a neural network being used to do the data mining. Something blind to what the inputs mean, in other words. The algorithm would only look for trends in the patterns, not really paying attention to the actual measurements of the INS. All it would know is that historically, when this pattern occurs, the device is traveling at 2.72 m/s X, 0.17 m/s Y, 0.01 m/s Z, so the device must be doing that now. And it would move the piece forward accordingly. It's important that it's completely blind, because just putting a phone in your pocket might orient it in one of 4 different ways, and 8 if you switch pockets. And there are many ways to hold your phone, as well. We're talking a lot of data here.
You'll obviously still have a lot of drift, but I think you'd have better luck this way because the device will know when you stopped walking, and the positional drift will not keep perpetuating. It knows that you're standing still based on historical data. Traditional INS systems don't have this feature. The drift perpetuates into all future measurements and compounds exponentially. Ungodly accuracy, or having a secondary navigation source to check against at regular intervals, is absolutely vital with traditional INS.
Each device, and each person, would have to have their own profile. It's a lot of data and a lot of calculations. Everyone walks at different speeds, with different steps, and puts their phone in different pockets, etc. Surely, implementing this in the real world would require the number-crunching to be handled server-side.
If you did use GPS for the initial baseline, part of the problem there is that GPS tends to have its own migrations over time, but they are non-perpetuating errors. Sit a receiver in one location and log the data. If there are no WAAS corrections, you can easily get location fixes drifting in random directions 100 feet around you. With WAAS, maybe down to 6 feet. You might actually have better luck with a sub-meter RTK system on a backpack to at least get the ANN's algorithm down.
You will still have angular drift with the INS using my method. This is a problem. But if you went so far as to build an ANN to pore over weeks' worth of GPS and INS data among n users, and actually got it working to this point, you obviously don't mind big data so far. Keep going down that path and use more data to help resolve the angular drift: people are creatures of habit. We pretty much do the same things, like walk on sidewalks, through doors, up stairs, and don't do crazy things like walk across freeways, through walls, or off balconies.
So let's say you are taking a page from Big Brother and start storing data on where people are going. You can start mapping where people would be expected to walk. It's a pretty sure bet that if the user starts walking up stairs, she's at the same base of stairs that the person before her walked up. After 1000 iterations and some least-squares adjustments, your database pretty much knows where those stairs are with great accuracy. Now you can correct angular drift and location as the person starts walking. When she hits those stairs, or turns down that hall, or travels down a sidewalk, any drift can be corrected. Your database would contain sectors that are weighted by the likelihood that a person would walk there, or that this user has walked there in the past. Spatial databases are optimized for this using divide and conquer to only allocate sectors that are meaningful. It would be sort of like those MIT projects where the laser-equipped robot starts off with a black image, and paints the maze in memory by taking every turn, illuminating where all the walls are.
Areas of high traffic would get higher weights, and areas where no one has ever been get 0 weight. Higher-traffic areas have higher resolution. You would essentially end up with a map of everywhere anyone has been and use it as a prediction model.
I wouldn't be surprised if you could determine what seat a person took in a theater using this method. Given enough users going to the theater, and enough resolution, you would have data mapping each row of the theater, and how wide each row is. The more people visit a location, the higher the fidelity with which you could predict where a person is located.
Also, I highly recommend you get a (free) subscription to GPS World magazine if you're interested in the current research into this sort of stuff. Every month I geek out with it.
I'm not sure how great your offset is, because you forgot to include units. ("Around 10 on each axis" doesn't say much. :P) That said, it's still likely due to inaccuracy in the hardware.
The accelerometer is fine for things like determining the phone's orientation relative to gravity, or detecting gestures (shaking or bumping the phone, etc.)
However, trying to do dead reckoning using the accelerometer is going to subject you to a lot of compound error. The accelerometer would need to be insanely accurate otherwise, and this isn't a common use case, so I doubt hardware manufacturers are optimizing for it.
The Android accelerometer is digital: it quantizes acceleration into a fixed number of "buckets". Let's say there are 256 buckets and the accelerometer is capable of sensing from -2g to +2g. This means that your output will be quantized into these "buckets" and will jump around some set of values.
To calibrate an Android accelerometer, you need to sample a lot more than 1000 points and find the "mode" around which the accelerometer is fluctuating. Then find by how many quantization steps the output fluctuates around that mode and use that as the band for your filtering.
I recommend Kalman filtering once you get the mode and +/- fluctuation.
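A rough sketch of that calibration recipe (mode plus fluctuation band), assuming you have already collected a few thousand raw samples while the device is at rest; the 0.01 m/s² rounding granularity and the class name are arbitrary choices for illustration.

    import java.util.HashMap;
    import java.util.Map;

    // Sketch: take the mode of the at-rest samples (the bucket the sensor
    // settles around) as the zero offset, and measure how far readings stray
    // from it to size a dead band for later filtering.
    final class RestCalibration {
        static double mode(double[] samples) {
            Map<Long, Integer> counts = new HashMap<>();
            for (double s : samples) {
                long bucket = Math.round(s * 100);  // 0.01 m/s^2 granularity
                counts.merge(bucket, 1, Integer::sum);
            }
            long best = 0;
            int bestCount = -1;
            for (Map.Entry<Long, Integer> e : counts.entrySet()) {
                if (e.getValue() > bestCount) { best = e.getKey(); bestCount = e.getValue(); }
            }
            return best / 100.0;
        }

        static double maxDeviation(double[] samples, double mode) {
            double max = 0;
            for (double s : samples) max = Math.max(max, Math.abs(s - mode));
            return max;  // use as the +/- dead band before any further filtering
        }
    }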
I realise this is quite old, but the issue at hand is not addressed in ANY of the answers given.
What you are seeing is the linear acceleration of the device including the effect of gravity. If you lay the phone on a flat surface the sensor will report the acceleration due to gravity, which is approximately 9.80665 m/s², hence giving the 10 you are seeing. The sensors are inaccurate, but they are not THAT inaccurate! See here for some useful links and information about the sensor you may be after.
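As an aside, if what you actually want is acceleration with gravity already removed, Android exposes a derived sensor for that. A minimal sketch; null checks are omitted, and not every device provides this sensor.

    import android.hardware.Sensor;
    import android.hardware.SensorEvent;
    import android.hardware.SensorEventListener;
    import android.hardware.SensorManager;

    // Registers for TYPE_LINEAR_ACCELERATION, whose values have the gravity
    // component subtracted by the platform's own sensor fusion, so a phone at
    // rest reads close to zero on all three axes.
    final class GravityFreeListener implements SensorEventListener {
        void register(SensorManager sm) {
            Sensor linear = sm.getDefaultSensor(Sensor.TYPE_LINEAR_ACCELERATION);
            sm.registerListener(this, linear, SensorManager.SENSOR_DELAY_UI);
        }

        @Override
        public void onSensorChanged(SensorEvent event) {
            float ax = event.values[0], ay = event.values[1], az = event.values[2];
            // m/s^2 with gravity removed
        }

        @Override
        public void onAccuracyChanged(Sensor sensor, int accuracy) { }
    }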
You are making the assumption that the accelerometer readings in the X and Y directions, which in this case are entirely hardware noise, would form a normal distribution around your average. Apparently that is not the case.
One thing you can try is to plot these values on a graph and see whether any pattern emerges. If not, then the noise is statistically random and cannot be calibrated against, at least for your particular phone hardware.
I'm building an application for Android devices that requires it to recognize, by accelerometer data, the difference between walking noise and double tapping it. I'm trying to solve this problem using Neural Networks.
At the start it went pretty well, teaching it to recognize the taps from noise such as standing up/ sitting down and walking around at a slower pace. But when it came to normal walking it never seemed to learn even though I fed it with a large proportion of noise data.
My question: Are there any serious flaws in my approach? Is the problem based on lack of data?
The network
I've chosen a 25-input, 1-output multi-layer perceptron, which I am training with backpropagation. The input is the change in acceleration every 20 ms, and the output ranges from -1 (for no-tap) to 1 (for tap). I've tried pretty much every constellation of hidden units there is, but had the most luck with 3-10.
I'm using Neuroph's easyNeurons for the training and exporting to Java.
The data
My total training data is about 50 double-tap samples and about 3k noise samples. But I've also tried training it with proportional amounts of noise to double taps.
The data looks like this (ranges from +10 to -10):
Sitting double taps:
Fast walking:
So to reiterate my questions: Are there any serious flaws in my approach here? Do I need more data for it to recognize the difference between walking and double tapping? Any other tips?
Update
Ok, so after much adjusting we've boiled the essential problem down to being able to recognize double taps while taking a brisk walk. Sitting and regular (in-house) walking we can handle pretty well.
Brisk walk
So this is some test data of me first walking then stopping, standing still, then walking and doing 5 double taps while I'm walking.
If anyone is interested in the raw data, I have linked the latest (brisk walk) data here
Do you insist on using a neural network? If not, here is an idea:
Take a window of 0.5 seconds and consider the area under the curve (or, since your signal is discrete, the sum of the absolute values of each sensor reading: the red area in the attached image). You will probably find that that sum is high when the user is walking and much, much lower when they are sitting and/or tapping. You can set a threshold above which you consider a given window to be taken while the user is walking. Alternatively, since you have labelled data, you can train any binary classifier to differentiate between walking and not walking.
You can probably improve your system by considering other features of the signal, such as how jagged the line is. If the phone is sitting on a table, the line will be almost flat. If the user is typing, the line will be kind of flat, and you will see a spike every now and then. If they are walking, you will see something like a sine wave.
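A minimal sketch of that windowed-energy idea: the window is assumed to hold roughly 0.5 s of samples from one axis (or the magnitude), and the threshold is a placeholder to be tuned on your labelled data.

    // Sum the absolute values of the mean-removed samples inside the window
    // and compare against a threshold: high while walking, much lower while
    // sitting or tapping. WALK_THRESHOLD is a placeholder to tune.
    final class WalkDetector {
        private static final double WALK_THRESHOLD = 50.0;  // tune on real data

        static boolean isWalking(double[] window) {
            double mean = 0;
            for (double v : window) mean += v;
            mean /= window.length;

            double sumAbs = 0;
            for (double v : window) sumAbs += Math.abs(v - mean);
            return sumAbs > WALK_THRESHOLD;
        }
    }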
Have you considered that the "fast walking" and "fast walking + double tapping" signals might be too similar to differentiate using only accelerometer data? It may simply not be possible to achieve accuracy above a certain amount.
Otherwise, neural networks are probably a good choice for your data, and it still may be possible to get better performance out of them.
This very-useful paper (http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf) recommends that you whiten your dataset so that it has a mean of zero and unit covariance.
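A simpler stand-in for full whitening is per-feature standardization (zero mean, unit variance per input); the paper's recipe also decorrelates the inputs, but even this much usually helps backpropagation, and it is easy to sketch:

    // Per-feature standardization of a samples[row][feature] matrix, in place.
    // This is a simplification of whitening: it zero-centers and rescales each
    // input but does not decorrelate them.
    final class Standardizer {
        static void standardizeColumns(double[][] samples) {
            int rows = samples.length, cols = samples[0].length;
            for (int c = 0; c < cols; c++) {
                double mean = 0, var = 0;
                for (double[] row : samples) mean += row[c];
                mean /= rows;
                for (double[] row : samples) var += (row[c] - mean) * (row[c] - mean);
                double std = Math.sqrt(var / rows) + 1e-9;
                for (double[] row : samples) row[c] = (row[c] - mean) / std;
            }
        }
    }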
Also, since your problem is a classification problem, you should make sure that you are training your network using a cross-entropy criterion (http://arxiv.org/pdf/1103.0398v1.pdf) rather than RMSE. (I have no idea whether Neuroph supports cross-entropy or not.)
Another relatively simple thing you could try, as other posters suggested, is transforming your data. Using an FFT or DCT to transform your data to the frequency domain is relatively standard for time-series classification.
You could also try training networks on different sized windows and averaging the results.
If you want to try some more difficult NN architectures, you could look at the Time-Delay-Neural-Network (just google this for the paper), which takes multiple windows into account in its structure. It should be relatively straightforward to use one of the Torch libraries (http://www.torch.ch/) to implement this, but it might be hard to export the network to an Android environment.
Finally, another method of getting better classification performance in time-series data is to consider the relationships between adjacent labels. Conditional Neural Fields (http://code.google.com/p/cnf/ - note: I have never used this code) do this by integrating neural networks into conditional random fields, and, depending on the patterns of behavior in your actual data, may do a better job.
What would probably work is to filter the data using a Fourier transform first. Walking has a sine-like signature; your double taps would stand out in the transform result as a different frequency. I guess a neural network can then determine whether the data contains your double taps because of that extra frequency (the double-tap frequency). Some questions remain:
How long does the sample of data need to be?
Can your phone do all the work it needs to do; does it have enough processing power?
You might even want to consider using the GPU for this.
Another option is to use the Fourier output and some good old Fuzzy Logic.
This sounds like fun...
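If you want to experiment with the frequency-domain idea, here is a sketch using a direct DFT (O(n²), fine for short windows) rather than a proper FFT library. The comment about which bins walking versus tapping occupy is a rough assumption to verify on your own data.

    // Plain magnitude spectrum of one window of accelerometer samples. Rough
    // expectation: walking concentrates energy in low-frequency bins (around
    // 1-3 Hz), while sharp double taps spread energy into higher bins, so a
    // few bins make a compact input for a network or a fuzzy rule set.
    final class Spectrum {
        static double[] magnitudes(double[] x) {
            int n = x.length;
            double[] mag = new double[n / 2 + 1];
            for (int k = 0; k < mag.length; k++) {
                double re = 0, im = 0;
                for (int t = 0; t < n; t++) {
                    double angle = -2.0 * Math.PI * k * t / n;
                    re += x[t] * Math.cos(angle);
                    im += x[t] * Math.sin(angle);
                }
                mag[k] = Math.sqrt(re * re + im * im);
            }
            return mag;
        }
    }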
In an Android app I'm making, I would like to detect when a user holding the phone in his hand makes a gesture like he would when throwing a frisbee. I have seen a couple of apps implementing this, but I can't find any example code or tutorial on the web.
It would be great to get some thoughts on how this could be done, and of course it would be even better with some example code or a link to a tutorial.
The accelerometer provides you with a stream of 3D vectors. If your phone is held still in the hand, the vector's direction is opposite to the earth's gravity pull and its magnitude is the same (this way you can determine the phone's orientation).
If the user lets it fall, the vector's magnitude will go towards 0 (the same process as weightlessness on a space station).
If the user makes some gesture without throwing the phone, the direction will shift and the amplitude will rise, then fall, and then rise again (when the user stops the movement). To determine what this looks like, you can do some research by recording accelerometer data while performing the desired gestures.
Keep in mind that the accelerometer is pretty noisy; you will have to do some averaging over nearby values to get meaningful results.
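A small sketch of that averaging step, assuming you feed it raw accelerometer samples: it tracks the moving average of the acceleration magnitude, which is the quantity whose rise-fall-rise shape (or drop towards zero in free fall) you would look for. The window size is an arbitrary starting point.

    // Magnitude of each 3-D accelerometer sample, smoothed with a simple
    // moving average over a ring buffer. Near-zero smoothed magnitude suggests
    // free fall; a rise-fall-rise shape is a throw-like gesture signature.
    final class SmoothedMagnitude {
        private final double[] ring;
        private int filled = 0, next = 0;
        private double sum = 0;

        SmoothedMagnitude(int windowSize) { ring = new double[windowSize]; }

        double add(float ax, float ay, float az) {
            double mag = Math.sqrt(ax * ax + ay * ay + az * az);
            if (filled == ring.length) sum -= ring[next]; else filled++;
            ring[next] = mag;
            sum += mag;
            next = (next + 1) % ring.length;
            return sum / filled;  // moving average of |a|
        }
    }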
I think one workable approach to matching gestures would be invariant moments (like the Hu moments used in image recognition): the accelerometer vector over time defines a 4-dimensional space, and you will need a set of scaling/rotation-invariant moments. Designing such a set is not easy, but computing them is not complicated.
Once you have your moments, you may use standard techniques for matching vectors to clusters (see the "moments" and "cluster" modules from our javaocr project: http://javaocr.svn.sourceforge.net/viewvc/javaocr/trunk/plugins/ ).
PS: you may get away with just speed over time, which produces a 2-dimensional space and can be analysed with javaocr on the spot.
Not exactly what you are looking for:
Store orientation to an array - and compare
Tracking orientation works well. Perhaps you can do something similar with the accelerometer data (without any integration).
A similar question is Drawing in air with Android phone.
I am curious what other answers you will get.
Problem: Consider an Android device mounted in a vehicle. We want to measure various things using the accelerometer. These measurements should be relative to the vehicle's coordinate system. Thus we need to figure out how the device is oriented in relation to the vehicle. The simple solution would be to just average the "early" acceleration after startup, but I'm worried that the first thing the driver will do is leave a parking lot or turn left onto the road, thus describing a curve. It would be feasible to ask the user to start measuring after getting on the road, but what if there is no acceleration at that point?
Question: Can someone suggest a strategy or an algorithm that would do a reasonable job of telling how the phone is oriented in relation to the vehicle? A pointer to some FOSS source that solves a similar problem would be even better.
Notes:
I do not want to use GPS for this as it would complicate things for the user.
We can interact with the user, for example by requesting that the user starts measurements before starting out.
The accelerometer alone would not provide sufficient information for your purpose, I would hazard: the forces acting upon the device, besides the vehicle's acceleration, will be the vibration of the vehicle itself, road inclines, braking, and centripetal force from turns.
The amount of sensor data due to all those forces would be impractical to aggregate on a phone, hence moving averages or other accumulation approaches would not give even vaguely precise results.
Also, a lot of the acceleration data would be lost between sensor sampling times, even if you were to use the highest available sensor rate.
Recommendation: Use GPS or network positioning information, generate moving averages to account for minor aberrations, and use the result.