We are building an application that requires a fair amount of data exchange between different users. We are using SQLite to store the data and a REST API to exchange it with the server.
To ensure high performance with low CPU/memory usage, while still maintaining a good user experience, we need suggestions on the following:
1. We tried running sync at a frequency of 30 seconds, but it hogs resources. Is there any client-side framework that can sync SQLite with MySQL, or do we have to plan for every possible event ourselves?
2. How do applications like Gmail/Twitter work: do they sync only on demand, or keep syncing in the background? I suspect it is on demand, but I'm not sure.
3. Should notifications be server-side or client-side (based on updates in SQLite)? In WhatsApp I observed it is client-side only: if I do not open a received message, I keep getting the notification about it.
4. If we keep notifications server-side and sync on demand, then when the user taps a new notification and the app opens, should we make a sync call at that moment?
We need an expert opinion on how such applications should be designed to manage sync and notifications so that they do not hog resources while still giving the customer an online-feeling experience.
Your question is pretty broad, but I'll at least give you a direction to start.
I've run local databases on iOS and Android that are over 100 MB without incident. SQLite should never be your problem if you use it correctly. Even with 100,000 rows of data, it is fast and efficient. Where people get into trouble is by not indexing the data properly or by over-normalizing it. A quick search will tell you how to use indexes to optimize your queries, so I won't go into that in depth. Over-normalization seems to be less well understood, so I'll cover it in a bit more detail.
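For readers who haven't seen it, here is a minimal indexing sketch. It assumes a hypothetical "messages" table filtered by conversation; the table and column names are invented for illustration:

```java
import android.content.Context;
import android.database.sqlite.SQLiteDatabase;
import android.database.sqlite.SQLiteOpenHelper;

// Hypothetical "messages" table with an index on the column the app
// actually filters by. Without the index, every lookup by
// conversation_id is a full table scan.
public class MessageDbHelper extends SQLiteOpenHelper {
    public MessageDbHelper(Context context) {
        super(context, "app.db", null, 1);
    }

    @Override
    public void onCreate(SQLiteDatabase db) {
        db.execSQL("CREATE TABLE messages ("
                + "id INTEGER PRIMARY KEY, "
                + "conversation_id INTEGER NOT NULL, "
                + "body TEXT)");
        // The index turns "WHERE conversation_id = ?" into a B-tree
        // lookup instead of a scan over every row.
        db.execSQL("CREATE INDEX idx_messages_conversation "
                + "ON messages(conversation_id)");
    }

    @Override
    public void onUpgrade(SQLiteDatabase db, int oldVersion, int newVersion) {
        // No migrations in this sketch.
    }
}
```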
When designing a server database, it is important to minimize the amount of duplicate data. This is often done by breaking up a single record into multiple tables and using foreign keys. On a 1 GB database, this type of data normalization may save 20%, which is fairly significant. That gain in storage comes at the cost of performance: sequential lookups and joins are frequently necessary to get complete data. On a server, there are plenty of CPU cycles and memory, and no one really notices if a request takes an extra millisecond or two.
A mobile app is not a full database server. The user is actively staring at the screen waiting for the app to respond. Additionally, the CPU and memory available are minimal, which makes that delay take even longer. To add insult to injury, mobile databases are only a small fraction of the size of a server database, so the duplicate data is already pretty minimal. The same data normalization that may have saved 200 MB (20% of 1 GB) on the server may now only save 5% of 10 MB, or 500 KB. That minor gain is not worth the effort or the performance hit.
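To make the trade-off concrete, here is a hedged sketch of the two schema styles. All table and column names are invented; the point is only that the device-side variant answers the common query without a join:

```java
import android.database.sqlite.SQLiteDatabase;

// Illustrative comparison of a server-style schema versus a
// device-friendly one; not taken from any real app.
public final class Schemas {

    // Server-style (normalized): reading a message's sender name
    // requires a join against a second table, e.g.
    //   SELECT m.body, s.name FROM messages m
    //   JOIN senders s ON s.id = m.sender_id
    static void createNormalized(SQLiteDatabase db) {
        db.execSQL("CREATE TABLE senders (id INTEGER PRIMARY KEY, name TEXT)");
        db.execSQL("CREATE TABLE messages (id INTEGER PRIMARY KEY, "
                + "sender_id INTEGER REFERENCES senders(id), body TEXT)");
    }

    // Device-friendly (denormalized): the sender name is duplicated on
    // each row, so the common list query is a single-table lookup. The
    // duplication costs a few hundred KB at mobile scale.
    static void createDenormalized(SQLiteDatabase db) {
        db.execSQL("CREATE TABLE messages (id INTEGER PRIMARY KEY, "
                + "sender_name TEXT, body TEXT)");
    }

    private Schemas() {}
}
```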
When you sync, you do not need a full data set each time. You only need to get data that has changed since the last sync. In many cases, that will be no change at all. You should find a way to identify what the device has on it now and only get the changes.
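One common way to do this is to remember a last-sync marker and ask the server only for what changed after it. A minimal sketch; the endpoint URL, the "since" parameter, and the response handling are all assumptions, not a real API:

```java
import android.content.Context;
import android.content.SharedPreferences;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Delta-sync sketch: the client keeps the marker from the last sync and
// requests only changes after it. Often the response is empty and the
// exchange costs almost nothing.
public class DeltaSync {
    private static final String PREFS = "sync";
    private static final String KEY_LAST = "last_sync_marker";

    public static String fetchChanges(Context context) throws Exception {
        SharedPreferences prefs =
                context.getSharedPreferences(PREFS, Context.MODE_PRIVATE);
        long since = prefs.getLong(KEY_LAST, 0L);

        URL url = new URL("https://example.com/api/changes?since=" + since);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), "UTF-8"));
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line);
            }
            // On success, advance the marker. Ideally use a server-issued
            // value from the response rather than the device clock, so
            // clock skew can't make you miss changes.
            prefs.edit().putLong(KEY_LAST, System.currentTimeMillis()).apply();
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }
}
```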
It is important that the UI does not stall waiting for the network request. All network activity should be done on a background thread and notify the UI to refresh once the sync completes.
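A bare-bones sketch of that pattern: run the sync on a worker thread and hop back to the main thread for the refresh. The two runnables stand in for your own sync and UI code:

```java
import android.os.Handler;
import android.os.Looper;

// Run the sync off the main thread and post the UI refresh back to it.
public class BackgroundSyncRunner {
    private final Handler mainHandler = new Handler(Looper.getMainLooper());

    public void startSync(final Runnable syncWithServer, final Runnable refreshUi) {
        new Thread(new Runnable() {
            @Override
            public void run() {
                syncWithServer.run();        // network + database work
                mainHandler.post(refreshUi); // back on the UI thread
            }
        }).start();
    }
}
```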
Lastly, I'll mention that SQLite access is not safe under unrestricted concurrency. It is important to limit concurrency in your database access, for example by funneling all reads and writes through a single shared connection.
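One common way to limit concurrency is a process-wide singleton helper, so every thread shares one connection and SQLite's own locking serializes access. A sketch, reusing the hypothetical MessageDbHelper from the earlier snippet:

```java
import android.content.Context;

// Process-wide singleton: all threads share the same SQLiteOpenHelper
// (and therefore the same underlying connection).
public final class Database {
    private static MessageDbHelper instance;

    public static synchronized MessageDbHelper get(Context context) {
        if (instance == null) {
            instance = new MessageDbHelper(context.getApplicationContext());
        }
        return instance;
    }

    private Database() {}
}
```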
I'm trying to figure out whether or not to use SQLite/SpatiaLite on Android with only 29k rows. I just need to find the locations nearest to the user every time they move more than 100 meters, which could be about every 10 minutes. I feel like querying a spatial database, as opposed to looping over a collection and calculating distances, could be overkill. When is it overkill to use a database in a case like this?
It's not overkill. It's actually probably required for you to store that data somewhere if you don't want your users to hate the app.
Running a constant background process consumes the phone's system resources, and consuming resources kills phone batteries. People don't like apps that kill their batteries. Repeatedly querying a web service endpoint for that many records doesn't seem like the best idea either, since it would eat up your users' data plans. Users tend not to like that either.
Holding 29K records in active memory is probably more of the phone's resources than you should be thinking about consuming, unless you are doing something very, very special.
If your data doesn't change, though, a database isn't the only way to store and query it. There might be a better solution somewhere in the middle, but I would not expect good results from consuming an unnecessary share of users' data plans and/or battery life.
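For what it's worth, even plain SQLite (without SpatiaLite) can answer a nearest-location query cheaply if the data is local: prefilter with an indexed bounding box, then compute exact distances only on the few rows that survive. A hedged sketch; the "places" table, its columns, and the degrees-per-km conversion are illustrative assumptions:

```java
import android.database.Cursor;
import android.database.sqlite.SQLiteDatabase;

// Bounding-box prefilter in plain SQLite, followed by an exact distance
// check in code over the handful of surviving rows.
public class NearbyQuery {
    public static Cursor findNearby(SQLiteDatabase db,
                                    double lat, double lon, double radiusKm) {
        double latDelta = radiusKm / 111.0; // ~111 km per degree of latitude
        double lonDelta = radiusKm / (111.0 * Math.cos(Math.toRadians(lat)));
        // Benefits from an index such as:
        //   CREATE INDEX idx_places_lat ON places(lat)
        return db.rawQuery(
                "SELECT name, lat, lon FROM places "
                        + "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ?",
                new String[]{
                        String.valueOf(lat - latDelta),
                        String.valueOf(lat + latDelta),
                        String.valueOf(lon - lonDelta),
                        String.valueOf(lon + lonDelta)
                });
        // Then compute exact distances (e.g. Location.distanceBetween)
        // over the rows the box lets through.
    }
}
```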
I'm currently writing an Android app that will need to do things such as:
Retrieving sensor data, such as ActivityRecognition data, say every 20-30 seconds.
Retrieving GPS data from time to time (e.g. when activity recognition reports that the user is riding a bike or driving a car), so I'd say a few times per day at most for the average user. GPS data could be sampled every 5-10 seconds, for example.
Of course, this data should be stored somewhere to be analysed later by my app. The analysis part is not the problem here, as I do not need any kind of real-time calculation, so my actual concern is how to store the data efficiently.
So if we consider an average user who will generate about 5,000 sensor readings + 5,000 GPS readings:
How best to store this data? A database? One file per day? I'd say a database, for performance and simplicity of use, but I'm not sure it's good practice to open/close a database connection every 10-20 seconds just to add one row of data. A journaled file (one per day) could also be a good idea, but I think that is pretty bad from a performance point of view, even using serialization.
Will storing these 10,000 entries degrade battery life much more than just retrieving the sensor (ActivityRecognition, GPS) data without storing it? It seems to me that it would consume a bit more, but at the same time GPS already uses so much battery...
Is there another way to do that ?
I also thought of in-memory storage that is flushed to hard storage (SQLite, files) every few minutes, but I'm not sure this is a good idea in terms of safely keeping the data...
Thanks in advance
Compared to the GPS battery consumption, reading from and writing to the database will cost almost nothing (it is flash storage, after all), so no worries there. In my opinion a database is the best option for storing this data, and I don't see a problem with creating a single entry every 10 or 20 seconds.
I also thought of in-memory storage that is flushed to hard storage (SQLite, files) every few minutes, but I'm not sure this is a good idea in terms of safely keeping the data...
This is a very good idea.
I have done it this way, and most systems that deal with files do it this way (it is called a buffer).
If your app crashes due to a bug, some data will be lost, depending on the buffer size.
In all other cases (device shutting down, user terminating the app) you have time to write out (flush) the buffer.
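A minimal sketch of such a buffer, assuming a hypothetical "readings" table and a made-up Reading value type. Flushing in one transaction is also much cheaper than one insert (with one implicit transaction) per sample:

```java
import android.content.ContentValues;
import android.database.sqlite.SQLiteDatabase;
import java.util.ArrayList;
import java.util.List;

// Collect readings in memory and flush them in a single transaction.
public class ReadingBuffer {
    private static final int FLUSH_THRESHOLD = 30; // a few minutes of samples

    public static class Reading {
        final long timestampMillis;
        final double value;
        public Reading(long timestampMillis, double value) {
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    private final List<Reading> buffer = new ArrayList<Reading>();
    private final SQLiteDatabase db;

    public ReadingBuffer(SQLiteDatabase db) {
        this.db = db;
    }

    public synchronized void add(Reading r) {
        buffer.add(r);
        if (buffer.size() >= FLUSH_THRESHOLD) {
            flush();
        }
    }

    // Also call this from onPause()/shutdown hooks so a normal exit
    // doesn't lose the tail of the buffer; only a hard crash does.
    public synchronized void flush() {
        db.beginTransaction();
        try {
            for (Reading r : buffer) {
                ContentValues cv = new ContentValues();
                cv.put("timestamp", r.timestampMillis);
                cv.put("value", r.value);
                db.insert("readings", null, cv);
            }
            db.setTransactionSuccessful();
        } finally {
            db.endTransaction();
        }
        buffer.clear();
    }
}
```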
Just a short additional note.
Other than shutting down peripherals (not really an option in your case), maximize the length of time the processor is inactive. (This is not the same as minimizing the length of time the processor is active.) This allows the processor to drop into lower sleep states (called C-states). The deeper the sleep state, the greater the power savings.
In a general sense, this means:
- no polling; use interrupts instead,
- if you need to periodically wake up to see if anything needs to be done, make sure your interrupt period is the maximum allowable. (Contrary to current practice, waking up every 10 ms does not improve your responsiveness when the average event happens every 500 ms.)
This also applies to peripherals, as they too drop into sleep states (called D-states) when not active.
Minimize the number of cloud accesses you do.
Maximize the length of time between cloud accesses.
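On Android, one concrete way to apply both points is an inexact repeating alarm, which lets the OS batch your wakeup with other apps' wakeups instead of pulling the CPU out of deep sleep just for you. A hedged sketch; SyncReceiver is a hypothetical BroadcastReceiver that would kick off the cloud access:

```java
import android.app.AlarmManager;
import android.app.PendingIntent;
import android.content.Context;
import android.content.Intent;
import android.os.SystemClock;

// Schedule periodic cloud access with an *inexact* repeating alarm.
// ELAPSED_REALTIME (without _WAKEUP) means the alarm will not itself
// wake the device; it fires the next time the device is awake anyway.
public class SyncScheduler {
    public static void schedule(Context context) {
        AlarmManager am =
                (AlarmManager) context.getSystemService(Context.ALARM_SERVICE);
        Intent intent = new Intent(context, SyncReceiver.class);
        PendingIntent pi = PendingIntent.getBroadcast(context, 0, intent, 0);
        am.setInexactRepeating(
                AlarmManager.ELAPSED_REALTIME,
                SystemClock.elapsedRealtime()
                        + AlarmManager.INTERVAL_FIFTEEN_MINUTES,
                AlarmManager.INTERVAL_FIFTEEN_MINUTES,
                pi);
    }
}
```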
I'm not sure whether this question belongs here, as it is purely theoretical; however, I think it fits this Stack Exchange best.
I have 500,000 taxis with Android 4 computers inside them. Every day, after a person or party makes a trip, the computer sends the information about the trip to the Node.js server. There are roughly 35 trips per taxi per day, so that means 500,000 taxis * 35 trips = 17,500,000 reports sent to the Node.js server each day. Each report has roughly 4,000 characters in it and is around 5 KB in size.
The report that the taxi computers send to the Node.js server is just an HTTP POST. Node.js then sends a confirmation back to the taxi. If the taxi does not receive the confirmation for report A within an allotted amount of time, it resends report A.
The Node.js server simply receives the report, sends the confirmation back to the taxi, and then writes the full report to MongoDB.
One potential problem: Taxi 1 sends report A to Node.js. Node.js does not respond within the allotted time, so Taxi 1 resends report A. Node.js eventually processes everything and writes report A to MongoDB twice.
Thus MongoDB is in charge of checking whether it has already received the same report before inserting the data.
I actually have a couple of questions. Is this too much for Node.js to handle (I don't think so, but it could be a problem)? Is this too much for MongoDB to handle? I feel like checking for duplicate reports may severely hinder performance.
How can I make this whole system more efficient? What should I alter or add?
The first potential problem is easy to overcome. Calculate a hash of each trip and store it in Mongo. Put a unique key on that field and then compare every incoming document against the existing hashes. This way checking for duplicates will be extremely easy and really fast. Keep in mind that the hashed content should not include anything like the time of sending.
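A hedged sketch of that idea, shown here with the MongoDB Java driver (the same pattern works from the Node.js driver). The database, collection, and field names are invented; the unique index makes MongoDB itself reject the duplicate, so there is no separate lookup round-trip:

```java
import com.mongodb.MongoWriteException;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.IndexOptions;
import com.mongodb.client.model.Indexes;
import org.bson.Document;
import java.math.BigInteger;
import java.security.MessageDigest;

// Hash-based dedup: hash the payload (excluding send time), put a
// unique index on the hash, and let the insert fail for duplicates.
public class TripDedup {
    public static void main(String[] args) throws Exception {
        MongoClient client = MongoClients.create("mongodb://localhost:27017");
        MongoCollection<Document> trips =
                client.getDatabase("fleet").getCollection("trips");
        trips.createIndex(Indexes.ascending("tripHash"),
                new IndexOptions().unique(true));

        String report = "...raw trip payload, excluding send time...";
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        String hash = new BigInteger(1,
                md.digest(report.getBytes("UTF-8"))).toString(16);

        try {
            trips.insertOne(new Document("tripHash", hash)
                    .append("report", report));
        } catch (MongoWriteException e) {
            // Duplicate key: this exact trip is already stored; just
            // re-acknowledge it to the taxi.
        }
        client.close();
    }
}
```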
Second problem: 17,500,000/day is roughly 200/second, which nonetheless sounds scary, but in reality it is not much for a decent server and is certainly not a problem for MongoDB.
It is hard to say how to make it more efficient, and I highly doubt you should think about that now. Give it a try, build something, check what is not working efficiently, and come back with specific questions.
P.S. I'm putting this here so as not to answer all of it in the comments. You have to understand that the question is extremely vague. No one knows what you mean by a trip document or how big it is. It could be 1 KB, it could be 10 MB, it could be 100 MB (which is bigger than MongoDB's 16 MB document limit). No one knows. When I said that roughly 200 documents/sec is not a problem, I did not say that exactly this amount is the maximum cap, so even if it turns out to be 2 or 3 times more, it still sounds feasible.
You have to try it yourself. Take an average Amazon instance and see how many of YOUR documents (create documents that are close to your real size and structure) it can save per second. If it cannot handle the load, see how much it can handle, or whether a big Amazon instance can.
I gave you a rough estimate that this is possible, and I had no idea that you also wanted to "include admins using MongoDB, to update, select". Did you mention that in your question?
I just finished an app that synchronizes its data with a server (it runs a SyncAdapter in the background). I installed it on my phone, let it run in the background (I barely used my phone), and found that my app accounted for 23% of my applications' battery usage, so I really need to decrease its battery consumption.
Right now I have the sync interval set to 30 seconds. It's a multi-user app, and if other users interact with you, you get a notification, so I can't set the sync interval too high (actually I wanted to decrease it, until I saw the battery usage).
In each synchronization it ALWAYS asks the server for any changes and checks for changes in the local database. Changes in the local database are sent to the server, and changes retrieved from the server are applied to the local database.
Does anybody know about some tips to reduce battery usage?
Probably the best thing you can do is implement GCM (Google Cloud Messaging), using push instead of polling.
This way you will get a "tickle" when something new happens, and you will know when to ask the server for data.
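A hedged sketch of the tickle pattern: when a push message arrives, trigger the existing SyncAdapter once rather than polling every 30 seconds. GCM registration and manifest wiring are omitted, and the account name/type and authority are placeholders for your own:

```java
import android.accounts.Account;
import android.content.BroadcastReceiver;
import android.content.ContentResolver;
import android.content.Context;
import android.content.Intent;
import android.os.Bundle;

// On receiving a push "tickle", request one expedited sync from the
// app's existing SyncAdapter.
public class PushTickleReceiver extends BroadcastReceiver {
    private static final String AUTHORITY = "com.example.app.provider";

    @Override
    public void onReceive(Context context, Intent intent) {
        Account account = new Account("default", "com.example.app");
        Bundle extras = new Bundle();
        extras.putBoolean(ContentResolver.SYNC_EXTRAS_MANUAL, true);
        extras.putBoolean(ContentResolver.SYNC_EXTRAS_EXPEDITED, true);
        ContentResolver.requestSync(account, AUTHORITY, extras);
    }
}
```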
A network poll every 30 seconds is very aggressive. I recommend you read this article from Google: http://developer.android.com/training/efficient-downloads/index.html
However, if you really need to hit the network that often, I don't see any magic trick for you...
I'm faced with another dilemma, this time regarding synchronizing (or updating?) data from a mobile device (using Android) to the server.
I've looked into SyncML as the standard for doing this, but my big concern is that we plan on syncing a large amount of data (not just one record), and probably only doing it once, twice, or at most three times a day, or maybe not even once a day; it all depends on certain circumstances.
The other thing: the device and the server will still be able to function properly without syncing. The sync would essentially just be an update.
From reading the SyncML specs, it seems geared more towards syncing small pieces of data at a fairly fast interval (e.g. every 5-15 minutes, though I guess this can be regulated by the user). At any rate, the synchronization process is more involved, and important for both the device and the server (more so for the device, I guess).
Here's a quote from the documentation that got me thinking:
2.2.3 Data synchronization: SyncML is oriented towards synchronization of small independent records, as the modified records are transmitted entirely. This is adequate for address entries, short messages and similar data. On the primary target of SyncML, mobile devices, most data is of this type. The devices must be able to keep track of which of their records have been changed. Each record is identified by a unique ID, so conflicts can be detected quite simply. As the record IDs may not be arbitrarily chosen but automatically created, mapping between server and client IDs is defined in the protocol. Mapping is always managed by the server. When the client receives a new item from the server, it can send a map update command to tell the server what ID it assigned to the item. From then on, the server uses the client ID in all its messages.
So, I guess my question is whether we should continue to look into SyncML for this, or build an in-house solution, maybe something more tailored to delivering large pieces of data and describing that data as well?
I'm facing this problem too. I prefer a SyncML solution, mainly because it's more extensible.
The data tables we want to sync are indeterminate, so SyncML may be the better choice.