Tracking GPS points and finding their nearest neighbours? - android

I have a list of 1 million (slowly) moving points on the globe (stored as latitude and longitude). Every now and then, each point requests a list of the 100 nearest other points (with a configurable max range, if that helps).
Unfortunately, SELECT * ORDER BY compute_geodetic_distance() LIMIT 100 is too slow to be run by each point over and over again. So my question: how should I handle this efficiently? Are there better algorithms/data structures/... known for this? Or is this the only way, and should I look into distributing server load?
(Note: this is for an Android app and the points are users, so in case I'm missing an android-specific solution, feel free to say so!)

Geospatial databases were invented for exactly this task.
There is Oracle Spatial (expensive) and PostgreSQL with PostGIS (free).
These databases store your million points in a geographical index, e.g. a quadtree (Oracle).
Such a query then takes almost no time.
Some people, like me, prefer to skip the database and build the quadtree themselves.
The search and insert operations are easy to implement; update/delete can be more complex. (Cheapest in terms of implementation effort is to rebuild the quadtree every minute.)
Using a quadtree you can perform hundreds or thousands of such nearest-100-points queries within a second.
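A minimal point-quadtree sketch in Java (the class name and leaf capacity are illustrative, not from this answer): it supports insert and a rectangular range query, which gives you a candidate set around a point that you can then sort by exact geodetic distance.

import java.util.ArrayList;
import java.util.List;

/** Minimal point quadtree: insert points, then query a bounding box for candidates. */
class QuadTree {
    static class Pt { final double lat, lon; Pt(double lat, double lon) { this.lat = lat; this.lon = lon; } }

    private static final int CAPACITY = 16;      // max points per leaf before splitting (illustrative)
    private final double minLat, minLon, maxLat, maxLon;
    private final List<Pt> points = new ArrayList<>();
    private QuadTree[] children;                 // null while this node is a leaf

    QuadTree(double minLat, double minLon, double maxLat, double maxLon) {
        this.minLat = minLat; this.minLon = minLon; this.maxLat = maxLat; this.maxLon = maxLon;
    }

    void insert(Pt p) {
        if (children != null) { child(p).insert(p); return; }
        points.add(p);
        if (points.size() > CAPACITY) split();
    }

    /** Collect all points inside [qMinLat..qMaxLat] x [qMinLon..qMaxLon]. */
    void query(double qMinLat, double qMinLon, double qMaxLat, double qMaxLon, List<Pt> out) {
        if (qMaxLat < minLat || qMinLat > maxLat || qMaxLon < minLon || qMinLon > maxLon) return;
        if (children != null) {
            for (QuadTree c : children) c.query(qMinLat, qMinLon, qMaxLat, qMaxLon, out);
            return;
        }
        for (Pt p : points)
            if (p.lat >= qMinLat && p.lat <= qMaxLat && p.lon >= qMinLon && p.lon <= qMaxLon) out.add(p);
    }

    private void split() {
        double midLat = (minLat + maxLat) / 2, midLon = (minLon + maxLon) / 2;
        children = new QuadTree[] {
            new QuadTree(minLat, minLon, midLat, midLon), new QuadTree(minLat, midLon, midLat, maxLon),
            new QuadTree(midLat, minLon, maxLat, midLon), new QuadTree(midLat, midLon, maxLat, maxLon)
        };
        for (Pt p : points) child(p).insert(p);
        points.clear();
    }

    private QuadTree child(Pt p) {
        double midLat = (minLat + maxLat) / 2, midLon = (minLon + maxLon) / 2;
        return children[(p.lat < midLat ? 0 : 2) + (p.lon < midLon ? 0 : 1)];
    }
}

To get the nearest 100 for a point, query a box around it, expand the box if it returns fewer than 100 candidates, and sort the candidates by distance.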

Architecturally I would arrange for each "point" to phone home to a server with its location when it changes by more than a certain amount. On the server you can do the heavy lifting of calculating the distance between the point that moved and each of the other points, and, for each of the other points, updating its list of the 100 closest points if required. You can then push changes to a point's closest-100 list as they happen (trivial if you are using App Engine; Android push is supported).
This reduces the amount of work involved to an absolute minimum:
- Only report a location change when a point moves far enough (sketched below).
- Only recalculate distances when a report is received.
- Don't rebuild the closest-100 list for a point every time; build the list once, then work out whether a point that has moved should be added to or removed from each other point's list.
- Only notify a point of changes to its top-100 list, to preserve bandwidth.
There are algorithms that you can use to make this super-efficient, and the problem has a fork/join feel to it as well, allowing you to throw horsepower at the problem.
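A minimal client-side sketch of the first bullet above, assuming the old framework LocationListener API; reportToServer() is a hypothetical upload helper and the 200 m threshold is made up.

import android.location.Location;
import android.location.LocationListener;
import android.os.Bundle;

/** Only phone home when we have moved far enough since the last report. */
class ThresholdReporter implements LocationListener {
    private static final float MIN_REPORT_DISTANCE_M = 200f; // illustrative threshold
    private Location lastReported;

    @Override
    public void onLocationChanged(Location location) {
        if (lastReported == null || lastReported.distanceTo(location) >= MIN_REPORT_DISTANCE_M) {
            lastReported = location;
            reportToServer(location.getLatitude(), location.getLongitude()); // hypothetical upload call
        }
    }

    private void reportToServer(double lat, double lon) { /* send to your backend */ }

    @Override public void onStatusChanged(String provider, int status, Bundle extras) { }
    @Override public void onProviderEnabled(String provider) { }
    @Override public void onProviderDisabled(String provider) { }
}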

You have to divide the earth into zones and then use an interior-point (point-in-region) test to figure out which zones the phone is in. Each possible subset of zones will, to a fair approximation, determine the 100 closest nodes. You can then get an exact set of 100 nodes by checking the distance to each candidate node one by one, where the candidates (once again) are determined by the subset of zones.
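One simple way to implement such zones (a sketch, not necessarily this answerer's exact scheme) is a fixed-size grid: bucket each point into a cell keyed by truncated lat/lon, then take the candidates from the phone's cell and its eight neighbours before computing exact distances. The cell size is illustrative.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Fixed-size grid of "zones"; candidate lookup scans the surrounding 3x3 cells. */
class ZoneGrid {
    private static final double CELL_DEG = 0.1;                 // roughly 11 km of latitude per cell (illustrative)
    private final Map<Long, List<double[]>> cells = new HashMap<>();

    private long key(double lat, double lon) {
        long row = (long) Math.floor((lat + 90) / CELL_DEG);
        long col = (long) Math.floor((lon + 180) / CELL_DEG);
        return row * 4_000_000L + col;                           // collision-free for this cell size
    }

    void add(double lat, double lon) {
        cells.computeIfAbsent(key(lat, lon), k -> new ArrayList<>()).add(new double[] { lat, lon });
    }

    /** Points in the cell containing (lat, lon) plus the 8 neighbouring cells. */
    List<double[]> candidates(double lat, double lon) {
        List<double[]> out = new ArrayList<>();
        for (int dr = -1; dr <= 1; dr++)
            for (int dc = -1; dc <= 1; dc++) {
                List<double[]> cell = cells.get(key(lat + dr * CELL_DEG, lon + dc * CELL_DEG));
                if (cell != null) out.addAll(cell);
            }
        return out;
    }
}

From the candidate list you compute exact distances and keep the 100 nearest, widening the search if a neighbourhood yields fewer than 100 points.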

Instead of an R-tree or a quadtree (i.e. a spatial index) you can also use a quadkey and a space-filling (monster) curve. Such a curve reduces the dimension and completely fills the space. You can download my PHP Hilbert curve class from phpclasses.org. You can use a simple varchar column for the quadkey and search the levels from left to right. A good explanation is the Microsoft Bing Maps quadkey (tile system) page.
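For reference, a quadkey in the Bing Maps tile system is computed roughly like this (a sketch following the formulas on the Bing Maps tile-system page; the clipping constants are the usual Web Mercator ones):

/** Convert a WGS84 position to a Bing-style quadkey at the given zoom level. */
class QuadKey {
    static String fromLatLon(double lat, double lon, int level) {
        lat = Math.max(-85.05112878, Math.min(85.05112878, lat));   // Web Mercator latitude limits
        lon = Math.max(-180, Math.min(180, lon));

        double x = (lon + 180) / 360;
        double sinLat = Math.sin(lat * Math.PI / 180);
        double y = 0.5 - Math.log((1 + sinLat) / (1 - sinLat)) / (4 * Math.PI);

        int mapSize = 256 << level;
        int tileX = (int) Math.min(mapSize - 1, Math.max(0, x * mapSize)) / 256;
        int tileY = (int) Math.min(mapSize - 1, Math.max(0, y * mapSize)) / 256;

        StringBuilder key = new StringBuilder();
        for (int i = level; i > 0; i--) {                            // one base-4 digit per level
            int digit = 0, mask = 1 << (i - 1);
            if ((tileX & mask) != 0) digit += 1;
            if ((tileY & mask) != 0) digit += 2;
            key.append(digit);
        }
        return key.toString();
    }
}

Because a quadkey's prefixes correspond to its parent tiles, a varchar column lets you find nearby points with a prefix match (LIKE 'key%'), with shorter prefixes covering larger areas.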

Related

Finding data that has high similarity with current data?

I have an Android app, and I want to find all data that have high similarity with a selected item. For example, I have data with values like this:
No  Name         Distance  Rating  Price
1.  Coffee Shop  1.3 km    4.6     40
I want to display all data that are similar to the item above (assuming each attribute has a weight used to compute a 'similarity score').
What kind of algorithm is most suitable and easy to implement for my case?
From what I have been looking at, I found several algorithms that I think could work:
- K-Means Clustering
- K-Nearest Neighbor
- ElasticSearch
- Cosine Similarity
At the moment I am still considering K-Means, because it's the only one of these algorithms I have learned before.
If you use K-Means you will get groups of data clustered together. But here I think k-Nearest Neighbors suits your query better since, from what I understand, you will receive a query item and you are trying to find data similar to it. With k-Nearest Neighbors you can simply adjust how many results you want by asking for, say, the nearest 5 or 50 neighbors. So I would go with kNN in this case.
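A minimal k-nearest-neighbours sketch over the three attributes in the question; the weights and field names are made up for illustration. It computes a weighted distance to the selected item and keeps the k smallest.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Weighted k-nearest-neighbours over (distance, rating, price). */
class SimilarItems {
    static class Item {
        final String name; final double distanceKm, rating, price;
        Item(String name, double distanceKm, double rating, double price) {
            this.name = name; this.distanceKm = distanceKm; this.rating = rating; this.price = price;
        }
    }

    // Illustrative weights; tune them (or normalise each attribute) for your data.
    static double dissimilarity(Item a, Item b) {
        double dDist = (a.distanceKm - b.distanceKm) * 1.0;
        double dRate = (a.rating - b.rating) * 2.0;
        double dPrice = (a.price - b.price) * 0.1;
        return Math.sqrt(dDist * dDist + dRate * dRate + dPrice * dPrice);
    }

    /** Returns the k items most similar to the query item. */
    static List<Item> kNearest(Item query, List<Item> all, int k) {
        List<Item> sorted = new ArrayList<>(all);
        sorted.sort(Comparator.comparingDouble((Item i) -> dissimilarity(query, i)));
        return sorted.subList(0, Math.min(k, sorted.size()));
    }
}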
Use a database like MySQL. SQL has joins and methods to sort similar data.

Best technique for storing custom map polygons in Android and ray casting

I'm fairly new to Android and trying to develop an app that identifies whether a user's location is inside or outside of a given region within a state. My approach is to take the user's LatLng and use ray casting to identify which region they are inside (they must be inside exactly one). My regions are best equated to state park boundaries, but Google does not have these in Google Maps (and they're too irregular for geofencing). As such, I created custom polygons. I'm not struggling with the code, but with the best way to handle the data.
How should I store and access the polygon data for ray casting? I was taking the approach of storing the polygons in an XML file, but I'm worried about the time and processing power it may take to parse the XML and run ray casting across up to 30 polygons in a given state. My polygons are complex enough that the XML file for one state is upwards of 4 MB. My polygons only need to be read, not written, as they'll ship with the app.
I think that your best option is to store your polygons in a geographic database. The best solution I've found so far is SpatiaLite, which is built on top of SQLite and works really well on Android.
Using this approach you store your polygons in the database and query which polygons intersect a given LatLng (point). The query will look like this (not tested):
select * from polygons_table where st_intersects(Geometry, MakePoint(longitude, latitude, 4326));
Note that I use 4326 as the SRID because I assume that you will store your polygons in WGS84.
Here you can find the SpatiaLite 3.0.0-BETA SQL functions reference list.
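If you do end up running the ray cast yourself (instead of, or in addition to, the SpatiaLite query), the standard even-odd test is short. A sketch, assuming each polygon is stored as parallel lat/lon arrays:

/** Even-odd ray casting: returns true if (lat, lon) is inside the polygon. */
class PointInPolygon {
    static boolean contains(double lat, double lon, double[] polyLat, double[] polyLon) {
        boolean inside = false;
        int n = polyLat.length;
        for (int i = 0, j = n - 1; i < n; j = i++) {
            // Does the horizontal ray from the point cross edge (j -> i)?
            boolean crosses = (polyLat[i] > lat) != (polyLat[j] > lat)
                    && lon < (polyLon[j] - polyLon[i]) * (lat - polyLat[i])
                             / (polyLat[j] - polyLat[i]) + polyLon[i];
            if (crosses) inside = !inside;
        }
        return inside;
    }
}

To decide which of the ~30 regions a user is in, test the point against each polygon's bounding box first and only ray-cast the few that match.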

Save GPS Coordinates to file/DB/xml?

I'm working on my first Android app and it should track my route with GPS coordinates.
The app also has five text boxes, each of which lets the user type in about 30 characters.
Coordinates should be saved every 30/60 seconds - is that enough?
Or is it possible to save them every 10 seconds, and what's the right way to save them?
I thought about reading XML from a URL, but I think there could be more data in the future.
What would be a good way to store it locally on the SD card, as XML or a plain file, that I can later parse on a client PC to retrieve the coordinates?
Thanks for your time.
Best Regards
You should only save a position if it's far enough away from the previous position. That way you'll have way less data without losing any information (in other words - it doesn't help to save the same position every 10 seconds).
In my sports tracker app, I save the data in a database table (latitude, longitude, timestamp ... basically all you get in the Location object).
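A minimal sketch of such a table on Android, assuming an SQLiteOpenHelper; the database, table and column names are just illustrative:

import android.content.ContentValues;
import android.content.Context;
import android.database.sqlite.SQLiteDatabase;
import android.database.sqlite.SQLiteOpenHelper;
import android.location.Location;

/** Stores one row per GPS fix: latitude, longitude, altitude, accuracy, timestamp. */
class TrackDbHelper extends SQLiteOpenHelper {
    TrackDbHelper(Context context) { super(context, "tracks.db", null, 1); }

    @Override
    public void onCreate(SQLiteDatabase db) {
        db.execSQL("CREATE TABLE points (" +
                "_id INTEGER PRIMARY KEY AUTOINCREMENT, " +
                "lat REAL NOT NULL, lon REAL NOT NULL, " +
                "alt REAL, accuracy REAL, time INTEGER NOT NULL)");
    }

    @Override
    public void onUpgrade(SQLiteDatabase db, int oldVersion, int newVersion) {
        db.execSQL("DROP TABLE IF EXISTS points");
        onCreate(db);
    }

    /** Insert one fix as delivered by the location provider. */
    void insert(Location loc) {
        ContentValues v = new ContentValues();
        v.put("lat", loc.getLatitude());
        v.put("lon", loc.getLongitude());
        v.put("alt", loc.getAltitude());
        v.put("accuracy", loc.getAccuracy());
        v.put("time", loc.getTime());
        getWritableDatabase().insert("points", null, v);
    }
}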
XML would work but the performance would dramatically decrease as the amount of data in your file increases. I had a similar project a year ago and I used a SQLite database.
The period you want to use depends on your needs; getting a location every 10 seconds might be a lot, and you might want to adapt the period to the speed or the area (city or highway). You can also rely on the network provider (cell towers/Wi-Fi instead of GPS) to get quicker and cheaper (in terms of battery) location fixes that are still reasonably accurate in dense areas (cities).
Consider the speed of the host and the accuracy required when reconstructing the path. If you're walking, a sample every 30 seconds might be fine, but if you're in a car, you might want to sample faster. Also, I'd suggest XML; specifically, look up the GPX format, which would give you portability as well, because other programs understand it and allow import/export.

Android line simplification

I'm looking for some best practice advice.
I have created an app (much like MyTracks) which collects GPS measurements and displays them on a map. I want to be able to record GPS data for ideally 24 hours at a 10 second interval. This is a lot of data, so I am not keeping it in memory; I'm storing it into an SQLite DB as it arrives. Inside the draw() functions I am selecting everything and drawing it as a Path object.
My above approach works great until I have > 4 hours' worth of data. Then the draw function takes forever to execute, which makes the application seem very slow.
I think what I need to do is draw a simplified trajectory onto the map. My question is what is the best way of doing this.
i) Processor heavy: In draw() select everything from the SQLiteDB, construct the simplified trajectory, draw it on the map.
ii) Memory heavy: Maintain a simplified trajectory in memory, update it as new data arrives, in draw() simply draw it to the map.
iii) Magic: Use some special OverlayLay that I don't know about which handles line simplification for you.
Kind regards,
Cathal
My initial semi-random thoughts:
You don't say that you're actually doing so, but don't store one sample per database table row. 24 hours of samples at 10 second intervals is 8640 samples. Each sample is 2 doubles, i.e. 16 bytes, so a day's worth of data is about 135 KB, an amount which easily fits entirely in memory. Your database strategy should probably be to let one table row correspond to one sampling period, whose maximum length is one day. Needless to say, the sample data should be in a BLOB field.
Drawing the path: this depends on the current map zoom and on what part of the sample set is visible. The first thing you do is iterate over your sample collection (max. 8640) and determine the subset which is visible at the current zoom. That should be a pretty quick operation. Let's say, for the sake of example, 5000 are visible. You then select some maximum number of samples for the path based on hardware assumptions... picking a number out of thin air, let's say no more than 500 samples used for the path (i.e. the device won't struggle to draw a path with 500 points). You therefore build the path using every 10th sample (5000/500 = 10), and make sure to include the first and last sample of the visible set.
Note that you don't do all this work every frame. You only need to recalculate the path when the user finishes panning or zooming the map. The rest of the time you just draw the path you already calculated.
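A sketch of that decimation step (the sample counts come from the example above; storing each sample as a double[2] is an assumption): keep at most maxPathPoints of the visible samples, always including the first and last.

import java.util.ArrayList;
import java.util.List;

/** Pick every k-th visible sample so the drawn path stays below a fixed size. */
class PathDecimator {
    static List<double[]> decimate(List<double[]> visibleSamples, int maxPathPoints) {
        int n = visibleSamples.size();
        if (n <= maxPathPoints) return visibleSamples;

        int step = (int) Math.ceil(n / (double) maxPathPoints);   // e.g. 5000 / 500 = 10
        List<double[]> out = new ArrayList<>();
        for (int i = 0; i < n; i += step) out.add(visibleSamples.get(i));
        if ((n - 1) % step != 0) out.add(visibleSamples.get(n - 1)); // always keep the last sample
        return out;
    }
}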
Oddly enough, I was just looking at code I wrote to do something similar, about 5 years ago.
Here are the steps I went through:
Simplify the dataset to only visible detail. I designed with several pluggable simplification strategies, but a common interface for tolerances and feeding in/getting out points to render. More on how to design it below.
Cache a compact, fast-to-access version of the simplified point list. For big data sets, it's helpful to use primitives as much as possible, in preference to Point objects. With double precision locations, you need 128 bytes per point, or ~1.3 MB of memory for 10,000.
Render efficiently, and without creating garbage. Iterating through int/float/double arrays of x and y coordinates is best, with a tight rendering loop that does as much as possible outside the loop. You'll be AMAZED how many points you can render at once if it's just "plot this path."
Notify the Simplifier when new points are added, and run new points through this before adding them to the cached point list. Update it as needed, but try to just process the latest.
Simplification Interface:
There's a bunch of ways to implement this. The best one (for huge point sets) is to feed it an Iterator<Point> and let the simplification algorithm go to work with that. This way, the points don't all have to be in memory, and you can feed it from a DB query. For example, the Iterator can wrap a JDBC ResultSet. Simplifiers should also have a "tolerance" value to determine how close points are before they get ignored.
How to simplify pointsets/polygonal lines:
There are a bunch of algorithms.
The simplest is to remove points that are less than a given tolerance away from the last included point. This is an O(n) implementation (a sketch follows this list).
The Douglas-Peucker algorithm gives an excellent polygon simplification on a large pointset. The weakness is that you need to operate on points in memory; use it on batches of, say, 10,000 points at a time. Runs in O(n log n) average, O(n^2) worst case
Fancy 2D hashing: you can use a 2D hashing algorithm, with one entry possible per pixel. Points that map to an occupied slot aren't rendered. You'll need to run through points twice for this, to find points that lead back to the same spots, if rendering lines and not scatterplots.
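A sketch of the first (O(n)) strategy in the list above, streaming over an Iterator so the points never all need to be in memory; the Point type here is just a plain x/y holder for illustration.

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Drops every point closer than `tolerance` to the last point that was kept. */
class RadialSimplifier {
    static class Point { final double x, y; Point(double x, double y) { this.x = x; this.y = y; } }

    static List<Point> simplify(Iterator<Point> points, double tolerance) {
        List<Point> kept = new ArrayList<>();
        Point last = null;
        double tolSq = tolerance * tolerance;        // compare squared distances, no sqrt needed
        while (points.hasNext()) {
            Point p = points.next();
            if (last == null || distSq(last, p) >= tolSq) {
                kept.add(p);
                last = p;
            }
        }
        return kept;
    }

    private static double distSq(Point a, Point b) {
        double dx = a.x - b.x, dy = a.y - b.y;
        return dx * dx + dy * dy;
    }
}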
Extra tips:
You can boost performance by creating a wrapper that maps your cached, simplified point list to something your graphics primitives can handle easily. I was using Swing, so I created something that acted like a mutable Path2D, which the rendering context could handle directly.
Use your primitives. Seriously, I can't say this enough. If you use objects to store points, the 128 bytes/point can double, which increases memory use and prevents the code from being compiled as optimally.
Using these strategies, it is possible to render millions of points at once (in a reduced form). If I recall correctly, I could run the simplification routine in real-time, operating on 10k+ points at a time, using good code. It might have been 100k+ points. If you can store a hashtable with one slot per pixel, the hashing implementation is ridiculously fast (I know, it was my fastest solution)

How to store lots of longitudes/latitudes on an Android device

I am looking into writing an Android app that has a database of approximately 2000 longitudes and latitudes which are effectively hard coded.
I assume that once my app is installed, I can put this information into the SQLite database, but how should I distribute this information when the app is downloaded?
One option I thought of was some kind of Patricia Trie to minimise the size of the data (the points will be in a number of clusters, rather than evenly distributed), but I'm not sure whether such a collection would work when there are two associated numbers to store, along with perhaps some other information such as place name.
Does anyone have any thoughts, input or suggestions?
Rich
2000 coordinate pairs is not many.
In fact, I recently tried loading up my web app, which has a similar number of lat/lon points. I realized I need to optimize a bit, but its load time wasn't completely terrible.
You may want to just request the data you need at any given moment. There must be some other data associated with the lat/lons that can help you with that... or maybe you should only display pins within some boundary of lat/lon, like +1/-1 in every direction of the center of your map or something.
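One simple way to ship ~2000 hard-coded points with the app (a sketch, not necessarily what the answerer had in mind): bundle them as a CSV in res/raw and read them into memory, or into SQLite, on first run. The resource name and the "name,latitude,longitude" layout are assumptions.

import android.content.Context;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

/** Loads "name,latitude,longitude" lines from a bundled res/raw CSV file. */
class BundledPlaces {
    static class Place {
        final String name; final double lat, lon;
        Place(String name, double lat, double lon) { this.name = name; this.lat = lat; this.lon = lon; }
    }

    static List<Place> load(Context context, int rawResId) throws IOException {
        List<Place> places = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(context.getResources().openRawResource(rawResId)))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",");
                places.add(new Place(parts[0],
                        Double.parseDouble(parts[1]), Double.parseDouble(parts[2])));
            }
        }
        return places;
    }
}

Usage would be something like load(context, R.raw.places), where R.raw.places is the hypothetical bundled file.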
