Best way to update a Sqlite DB from a server? - android

I have an Android app with a SQLite database of about 800 MB. Sometimes I need to insert, modify or delete database rows from an external server (over the internet) in order to update the database.
Is there a way to update the database from the server without having to download the entire database (800 MB)?
I was thinking of a homemade solution that consists of adding a new column to the server database that indicates if said row needs to be inserted, deleted or modified by the android app, but I don't know if something is already implemented.

First question- does the database also change locally on the Android device? If so, you're basically into cache coherency. There's an old joke that the two hardest problems in CS are cache coherency and naming things. It's not totally wrong.
If you do need to keep local changes, especially if you need to sync local changes up, this needs to be a small book, so I'm going to assume not for the rest of the answer.
Honestly, if your db needs to scale at all or you need to make changes frequently, downloading a new db is the way to go. Doing any sort of diff against the db is going to cost you a lot of DB processor time, which translates to bigger or faster db servers, which equals money. Or a big perf hit on any other use of the db.
If you do decide you need to do this, you need two extra columns. The first is an isDeleted flag. That way you can easily check for deleted rows (the only other way to do so is to download all rows and see what's missing, which is a very bad idea). Please note you'll need to change every db query you make anywhere to add "and isDeleted=false" as a condition so you don't return deleted rows.
The second column isn't an "isModified" field, it's a "modifiedTime" timestamp. Why a timestamp? Because you can't be assured that a client downloading the db was only 1 version behind. He could be 2. Or 10. You need to be able to get all the changes in all the previous versions as well, so an isModified flag isn't good enough. With a modifiedTime field, you can find the max modifiedTime in the local db, then ask the server for all rows with a modifiedTime greater than yours. You'll then either need to change all your inserts and updates to also set modifiedTime, or use a trigger to do so.
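To make the two-column scheme concrete, here is a minimal sketch, not a definitive implementation. It uses Python's built-in sqlite3 module only because it runs anywhere; the SQL is what matters. The table name `items`, the helper names and the integer timestamps (100, 200, 300) are all assumptions made for the illustration. A real server would set modifiedTime from the clock or a trigger, and would sit behind a network API rather than a second in-memory database.

```python
import sqlite3

SCHEMA = """
CREATE TABLE items (
    id           INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    isDeleted    INTEGER NOT NULL DEFAULT 0,
    modifiedTime INTEGER NOT NULL
);
CREATE INDEX idx_items_modified ON items(modifiedTime);
"""

def open_db():
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db

def changes_since(server, last_seen):
    """Server side: every row touched after the client's newest timestamp."""
    return server.execute(
        "SELECT id, name, isDeleted, modifiedTime FROM items "
        "WHERE modifiedTime > ? ORDER BY modifiedTime",
        (last_seen,),
    ).fetchall()

def sync(client, server):
    """Client side: pull and apply all missed changes in one transaction."""
    last_seen = client.execute(
        "SELECT COALESCE(MAX(modifiedTime), 0) FROM items"
    ).fetchone()[0]
    rows = changes_since(server, last_seen)
    with client:  # commits on success, rolls back on error
        client.executemany(
            "INSERT OR REPLACE INTO items (id, name, isDeleted, modifiedTime) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )
    return len(rows)

def visible_names(db):
    """Every normal query has to filter out soft-deleted rows."""
    return [r[0] for r in db.execute(
        "SELECT name FROM items WHERE isDeleted = 0 ORDER BY id")]

server, client = open_db(), open_db()
with server:
    server.execute("INSERT INTO items VALUES (1, 'apple', 0, 100)")
    server.execute("INSERT INTO items VALUES (2, 'pear', 0, 100)")
first = sync(client, server)   # client is now at timestamp 100

with server:  # two later "versions" that the client has not seen yet
    server.execute("UPDATE items SET name = 'green apple', modifiedTime = 200 WHERE id = 1")
    server.execute("UPDATE items SET isDeleted = 1, modifiedTime = 300 WHERE id = 2")
    server.execute("INSERT INTO items VALUES (3, 'plum', 0, 300)")
second = sync(client, server)  # catches up on both versions in one request
```

The client keeps the soft-deleted row locally, so that MAX(modifiedTime) stays accurate and the next sync asks only for newer changes.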
There are a few other ways to do it- a migration file approach (a file with the SQL commands to alter the data) can work if your changes are small. Really though, just download the db. It's so much simpler and less likely to break things. And if you're doing large updates, it may even be less bandwidth. Most importantly, if you just download the file you know the data is correct- if you try and do some kind of diff like above, you have to worry about bugs or inconsistencies in the data for various reasons (did your app get killed while processing the changes? Do you have a bug? Did you do a query mid change and get broken data, with only half the changes you need? Downloading a new file and swapping the dbs when done fixes all those things).
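For the download-and-swap route, a rough sketch of the swap step might look like the following. It is written in Python for brevity and the file names and helper functions are hypothetical. The idea is to download to a separate file, let SQLite check it, and only then rename it over the live file. It assumes the app has closed its connection to the live database before swapping.

```python
import os
import sqlite3
import tempfile

def is_healthy(path):
    """Run SQLite's built-in consistency check on a database file."""
    db = sqlite3.connect(path)
    try:
        return db.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
    except sqlite3.DatabaseError:
        return False  # e.g. truncated or garbage download
    finally:
        db.close()

def swap_in(downloaded_path, live_path):
    """Replace the live database only if the downloaded file checks out."""
    if not is_healthy(downloaded_path):
        os.remove(downloaded_path)
        return False
    # A rename is atomic when both paths are on the same filesystem.
    os.replace(downloaded_path, live_path)
    return True

def make_db(path, version):
    db = sqlite3.connect(path)
    with db:
        db.execute("CREATE TABLE meta (version INTEGER)")
        db.execute("INSERT INTO meta VALUES (?)", (version,))
    db.close()

def read_version(path):
    db = sqlite3.connect(path)
    try:
        return db.execute("SELECT version FROM meta").fetchone()[0]
    finally:
        db.close()

workdir = tempfile.mkdtemp()
live = os.path.join(workdir, "app.db")
download = os.path.join(workdir, "app.db.download")

make_db(live, 1)
make_db(download, 2)
swapped = swap_in(download, live)      # good file: replaces the live db

with open(download, "wb") as f:        # simulate a corrupted download
    f.write(b"garbage!" * 512)
rejected = not swap_in(download, live)  # bad file: live db is left alone
```

If the app gets killed mid-download, the live file is untouched, which is the main attraction of this approach.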

Related

android application with huge database

Let me explain how my application is supposed to work:
The application will ship with a SQLite database in its assets folder, which will later be copied into the databases folder. It has some content in it (categories, subcategories, products and news), all of which have images. After installing, the user can update the content over the internet, and the application stores the new content in the database so that it can work offline.
So my question is: after a while this content will increase in size. Is that going to cause my application to crash? Let's say I release the application with a 1 MB database and after 2 years the database has grown to around 120 MB. Will that make the application crash?
My other concern is that I'm currently storing the images in the database and loading them from there. Is that a good approach? I don't want the user to be able to clear the cache or delete the images, because a later content update would have to download the deleted images again, which consumes traffic.
Please remember that the application should be able to load content offline.
No, applications don't just crash because they have a large database.
Part of the point of a Cursor is that it gives you a view into a large set of data, without having to load it all into memory at the same time.
If you follow best practices I see no problem - you're using a database. Forget for a second that it's on Android - you should optimize your table structure, indexes, etc, as best you can.
Also, large database or not, don't make any queries to it on the main thread. Use the Loader API if you need to show the result of a query in your UI.
Last, potentially most importantly, rethink why you even need such a large database. Is it really that common that a user will need to access all data ever while offline? Or might it make more sense for you to only store data from the last week or month, etc, and tell them that they need to be online to access older data.
Regarding your 2nd question - please in the future separate that into a separate question. But, no, storing binary blobs (images in this case) in a SQLite database is not a good approach. Also, if they clear data on the app, everything is gone, so there's no advantage to using a database to avoid that. I would suggest storing images in a folder named after your app in external storage of the device, potentially storing image URIs/names in the database.
Any problem with the database will raise an SQLiteException, which you can handle in your app to prevent abnormal termination.
Having said that, a database of 120 MB seems like too much. Are you sure your users will want all that?

sqlite insert vs update vs replace

I am working on a project in which I retrieve data from Facebook about the user's friends. Sometimes a friend's details have changed, and at other times they are the same as what is stored in the db.
I can use the replace command to make sure that the db is consistent with whatever information I retrieve from Facebook.
My question is how efficient this technique will be? In other words, I can use two techniques:
One is to use the replace command and replace the complete record blindly
Second is to first check whether there is any difference from the record saved in the db and update only the fields that have changed
Which of these approaches is going to be more efficient?
I've found that queuing up a number of sqlite commands in a row is much more efficient than is doing anything else in between, even just comparing a few values.
I'd strongly recommend that you just do an update command. SQLite is fast.
My observation is that SQLite is always way faster than I am. So let it do the heavy lifting and just dump the data at it, and let it sort out your updates.
For example, I was searching through about 7,000 records. I pulled the records out into an array, did a quick check for one field, and separated it into two arrays. This was taking me about 5 seconds. I replaced it with two separate SQLite queries that each had to go through the entire database. The revised dual query takes about a quarter second, near as I can tell, because it's so crazy fast.
I've had similar speed luck with Updates in my big database.
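Speed aside, there is one behavioural difference between the two commands that is worth seeing before choosing. REPLACE deletes the conflicting row and inserts a fresh one, so any column you leave out is reset to its default, while UPDATE only touches the columns you name. A small sketch (Python's sqlite3 for convenience; the `friends` table and its columns are made up for the example):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE friends (id INTEGER PRIMARY KEY, name TEXT, city TEXT, note TEXT)"
)
db.execute("INSERT INTO friends VALUES (1, 'Ann', 'Oslo', 'met at school')")
db.execute("INSERT INTO friends VALUES (2, 'Bob', 'Rome', 'cousin')")

# REPLACE: the old row 1 is deleted and a new one inserted, so the
# unlisted 'note' column falls back to its default (NULL here).
db.execute("REPLACE INTO friends (id, name, city) VALUES (1, 'Ann', 'Bergen')")

# UPDATE: only 'city' changes, 'note' is left alone.
db.execute("UPDATE friends SET city = 'Milan' WHERE id = 2")
db.commit()

rows = db.execute(
    "SELECT id, name, city, note FROM friends ORDER BY id"
).fetchall()
# rows == [(1, 'Ann', 'Bergen', None), (2, 'Bob', 'Milan', 'cousin')]
```

So blind REPLACE is fine when every fetch carries the complete record, and UPDATE is the safer choice when the table also holds columns that the fetched data does not cover.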

Optimizing fast access to a readonly sqlite database?

I have a huge database and I want my application to work with it as fast as possible. I'm using Android, so resources are more restricted. I know that it's not a good idea to store huge amounts of data in a SQLite database, but I need this.
Each database contain only ONE table and I use it READ only.
What advice can you give me to optimize the databases as much as possible? I've already read this post, and apart from the PRAGMA commands, what else can I use?
Maybe there are some special table types which are restricted to read-only queries, but fundamentally faster than ordinary table types?
As long as your database fits on the device, there is no problem with that; you'll just have less space for other apps.
There is no special table type. However, if you have queries that use only a subset of a table's columns, and if you have enough space left, consider adding one or more covering indexes.
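To check whether an index actually covers a query, EXPLAIN QUERY PLAN will tell you. A small sketch, using Python's sqlite3 for convenience, with a made-up `products` table and index name:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, category TEXT, name TEXT, description TEXT)"
)
db.executemany(
    "INSERT INTO products (category, name, description) VALUES (?, ?, ?)",
    [("cat%d" % (i % 10), "item%d" % i, "long text " * 20) for i in range(1000)],
)

def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return " ".join(row[-1] for row in db.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT name FROM products WHERE category = 'cat3'"

before = plan(query)  # no index yet: a full table scan
# The index holds both the filter column and the selected column,
# so the query never has to read the (much wider) table rows.
db.execute("CREATE INDEX idx_products_cat_name ON products(category, name)")
after = plan(query)   # mentions "USING COVERING INDEX idx_products_cat_name"
```

The exact wording of the plan differs between SQLite versions, but the phrase COVERING INDEX is what to look for.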
Being read-only allows the database to be optimized on the desktop, before you deploy it:
set page size, etc.;
create useful indexes;
ANALYZE
VACUUM
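The steps in the list above could be scripted on the desktop roughly like this. It is only a sketch: the `words` table, the index and the 8192-byte page size are placeholder choices, not recommendations, and Python is used simply because it ships with SQLite.

```python
import os
import sqlite3
import tempfile

def build_readonly_db(path, rows, page_size=4096):
    """Create, index, analyze and compact a database meant to ship read-only."""
    db = sqlite3.connect(path)
    # Page size has to be set before the first table is created.
    db.execute("PRAGMA page_size = %d" % page_size)
    db.execute("CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT, lang TEXT)")
    with db:  # one transaction for the bulk load
        db.executemany("INSERT INTO words (word, lang) VALUES (?, ?)", rows)
    db.execute("CREATE INDEX idx_words_lang_word ON words(lang, word)")
    db.execute("ANALYZE")  # stores statistics for the query planner
    db.commit()
    db.execute("VACUUM")   # rebuilds the file compactly; must run outside a transaction
    db.close()

path = os.path.join(tempfile.mkdtemp(), "words.db")
sample = [("w%d" % i, "en" if i % 2 else "de") for i in range(500)]
build_readonly_db(path, sample, page_size=8192)

check = sqlite3.connect(path)
page_size = check.execute("PRAGMA page_size").fetchone()[0]
stat_rows = check.execute("SELECT COUNT(*) FROM sqlite_stat1").fetchone()[0]
check.close()
```

The resulting file already contains the planner statistics, so the app does not have to run ANALYZE on the device.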
In your app, you might experiment with increasing the page cache size, but if your working set is larger than free memory, that won't help anyway. In any case, random reads from flash are fast, so that would not be much of a problem.
Huge is relative. But ultimately a device is constrained on storage and memory. So assuming that huge is beyond the typical constraints of a device, you have a few options.
The first option is to store your huge dataset in the cloud and have the connected device offer views into that data, with cloud services exposing something like RESTful APIs to serve the data to the device. If the device and app rely on always being connected, you don't need as much local storage unless you want to cache data.
Another approach is an occasionally connected device (sometimes offline) where you pull down a slice of the most relevant data to work on to the device. In that model, you can work offline and push/pull back to the cloud. In this model, sqlite is the storage mechanism to hold that slice of relevant data.
EDIT based on comments:
Concerning optimizing what you have on the device, see the optimization FAQ here:
http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html
(in rough order of effectiveness)
Use an in-memory database
Use BEGIN TRANSACTION and END TRANSACTION
Use indexes
Use PRAGMA cache_size
Use PRAGMA synchronous=OFF
Compact the database
Replace the memory allocation library
Use PRAGMA count_changes=OFF
Maybe I'm stating the obvious but you should probably just open it with the SQLITE_OPEN_READONLY flag to sqlite3_open_v2: I think that SQLite will take advantage of this fact and optimize the behaviour of the engine.
Note that all normal SQL(ite) optimization tips still apply (e.g. VACUUMing to finalize the database, setting the correct page size at database creation, proper indexes and so on...)
In addition, if you have multiple threads accessing the database in your application, you may want to try out also the SQLITE_OPEN_NOMUTEX and SQLITE_OPEN_SHAREDCACHE flags (they require sqlite3_open_v2, though)
Also, you should switch journaling off, because the data does not change: http://www.sqlite.org/pragma.html#pragma_journal_mode
PRAGMA journal_mode=OFF

Benefits of packaging sqlite db rather than creating?

When my app is first run, it creates 5 tables and inserts about 50 initial values. The user can delete any of these initial values if they want and they will add to them.
In this situation, what are the pros/cons between creating the db file and copying it over on first run and just putting a bunch of create/insert statements in onCreate?
It's crucial that user information doesn't get overwritten and because of that I'm leaning towards the create/insert statements, since those will fail/be minor if some bug triggers onCreate (if that's possible), whereas copying the db file would wipe the db.
There are no benefits to packaging the SQLite db rather than creating it, IMO. Just choose one and code accordingly. Choosing one method over the other because you worry that something is wrong with your code is just a patch over bad programming.
No offense meant, I just think the reason you are asking this question isn't the right one. The real trade-off is that copying the file makes the APK bigger but is perhaps faster than creating and populating the db.
I personally go with creating the DB from scratch. You will only do it "once", unless of course your update requires modifications to the DB or the user deleted the data. I would rather have the user wait a while, just once, and make the APK a considerable amount of KBs lighter.
I think you've answered your own question as far as the cons of copying the whole db over. (Although I do like copying over databases for unit testing.) If you are looking for a less tedious way to populate those fifty values, you might try using .dump in sqlite3 and putting all of those insert statements into a single resource.
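If you have Python handy, `iterdump()` produces the same kind of SQL text as the sqlite3 shell's .dump, which makes it easy to see what the single resource would contain. A quick sketch with a made-up `categories` table:

```python
import sqlite3

# Stand-in for the database you populated by hand during development.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT)")
src.executemany("INSERT INTO categories (name) VALUES (?)", [("books",), ("music",)])
src.commit()

# One string of CREATE/INSERT statements, ready to be saved as a resource.
seed_sql = "\n".join(src.iterdump())

# Replaying the script on an empty database recreates the initial content.
fresh = sqlite3.connect(":memory:")
fresh.executescript(seed_sql)
names = [r[0] for r in fresh.execute("SELECT name FROM categories ORDER BY id")]
```

On Android you would read that text from the resource in onCreate and execute the statements one by one.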
On the other hand, if onCreate gets called when it shouldn't, you probably have bigger problems to worry about.

Best practice for keeping data in memory and database at same time on Android

We're designing an Android app that has a lot of data ("customers", "products", "orders"...), and we don't want to query SQLite every time we need some record. We want to avoid querying the database as much as we can, so we decided to keep certain data always in memory.
Our initial idea is to create two simple classes:
"MemoryRecord": a class that will contain basically an array of objects (string, int, double, datetime, etc...), that are the data from a table record, and all methods to get those data in/out from this array.
"MemoryTable": a class that will contain basically a Map of [Key,MemoryRecord] and all methods to manipulate this Map and insert/update/delete record into/from database.
Those classes will be derived to every kind of table we have in the database. Of course there are other useful methods not listed above, but they are not important at this point.
So, when starting the app, we will load those tables from an SQLite database to memory using those classes, and every time we need to change some data, we will change in memory and post it into the database right after.
But, we want some help/advice from you. Can you suggest something more simple or efficient to implement such a thing? Or maybe some existing classes that already do it for us?
I understand what you guys are trying to show me, and I thank you for that.
But, let's say we have a table with 2000 records, and I will need to list those records. For each one, I have to query other 30 tables (some of them with 1000 records, others with 10 records) to add additional information in the list, and this while it's "flying" (and as you know, we must be very fast at this moment).
Now you'll be going to say: "just build your main query with all those 'joins', and bring all you need in one step. SQLite can be very fast, if your database is well designed, etc...".
OK, but this query will become very complicated and, even though SQLite is very fast, it will be "too" slow (2 to 4 seconds, as I confirmed, and this isn't an acceptable time for us).
Another complicator is that, depending on user interaction, we need to "re-query" all records, because the tables involved are not the same, and we have to "re-join" with another set of tables.
So, an alternative is to bring in only the main records (this will never change, no matter what the user does or wants) with no join (this is very fast!) and query the other tables every time we want some data. Note that on the table with only 10 records, we will fetch the same records many, many times. In this case, it is a waste of time, because no matter how fast SQLite is, it will always be more expensive to query, cursor, fetch, etc... than just grabbing the record from a kind of "memory cache". I want to make clear that we don't plan to keep all data in memory always, just some tables we query very often.
And we came to the original question: What is the best way to "cache" those records? I really like to focus the discussion on that and not "why do you need to cache data?"
The vast majority of the apps on the platform (contacts, Email, Gmail, calendar, etc.) do not do this. Some of these have extremely complicated database schemas with potentially a large amount of data and do not need to do this. What you are proposing to do is going to cause huge pain for you, with no clear gain.
You should first focus on designing your database and schema to be able to do efficient queries. There are two main reasons I can think of for database access to be slow:
You have really complicated data schemas.
You have a very large amount of data.
If you are going to have a lot of data, you can't afford to keep it all in memory anyway, so this is a dead end. If you have complicated structures, you would benefit in either case with optimizing them to improve performance. In both cases, your database schema is going to be key to good performance.
Actually optimizing the schema can be a bit of a black art (and I am no expert on it), but some things to look out for are correctly creating indices on the columns you will query, designing joins so they will take efficient paths, etc. I am sure there are lots of people who can help you with this area.
You could also try looking at the source of some of the platform's databases to get some ideas of how to design for good performance. For example the Contacts database (especially starting with 2.0) is extremely complicated and has a lot of optimizations to provide good performance on relatively large data and extensible data sets with lots of different kinds of queries.
Update:
Here's a good illustration of how important database optimization is. In Android's media provider database, a newer version of the platform changed the schema significantly to add some new features. The upgrade code to modify an existing media database to the new schema could take 8 minutes or more to execute.
An engineer made an optimization that reduced the upgrade time of a real test database from 8 minutes to 8 seconds. A 60x performance improvement.
What was this optimization?
It was to create a temporary index, at the point of upgrade, on an important column used in the upgrade operations. (And then delete it when done.) So this 60x performance improvement comes even though it also includes the time needed to build an index on one of the columns used during upgrading.
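The pattern itself is simple. The sketch below is not the media provider's actual upgrade code; the `media` and `albums` tables and the index name are invented to show the shape of it, in Python for brevity: create the index, run the heavy statements, drop the index again.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE albums (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE media  (id INTEGER PRIMARY KEY, album TEXT, album_id INTEGER);
""")
db.executemany("INSERT INTO albums (title) VALUES (?)",
               [("album%d" % i,) for i in range(200)])
db.executemany("INSERT INTO media (album) VALUES (?)",
               [("album%d" % (i % 200),) for i in range(2000)])
db.commit()

with db:
    # Temporary index on the column the upgrade looks rows up by.
    db.execute("CREATE INDEX tmp_albums_title ON albums(title)")
    # The upgrade step: one lookup per media row, served by the index.
    db.execute(
        "UPDATE media SET album_id = "
        "(SELECT id FROM albums WHERE title = media.album)"
    )
    # Not needed in normal operation, so remove it again.
    db.execute("DROP INDEX tmp_albums_title")

unresolved = db.execute(
    "SELECT COUNT(*) FROM media WHERE album_id IS NULL").fetchone()[0]
leftover = db.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE name = 'tmp_albums_title'"
).fetchone()[0]
```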
SQLite is one of those things where if you know what you are doing it can be remarkably efficient. And if you don't take care in how you use it, you can end up with wretched performance. It is a safe bet, though, if you are having performance issues with it that you can fix them by improving how you are using SQLite.
The problem with a memory cache is of course that you need to keep it in sync with the database. I've found that querying the database is actually quite fast, and you may be pre-optimizing here. I've done a lot of tests on queries with different data sets and they never take more than 10-20 ms.
It all depends on how you're using the data, of course. ListViews are quite well optimized to handle large numbers of rows (I've tested into the 5000 range with no real issues).
If you are going to stay with the memory cache, you may want to have the database notify the cache when its contents change, and then you can update the cache. That way anyone can update the database without knowing about the caching. Also, if you build a ContentProvider over your database, you can use the ContentResolver to notify you of changes if you register using registerContentObserver.
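The notify-then-refresh idea can be sketched without any Android classes. In the Python sketch below, `Store` and `LookupCache` are hypothetical names, and the callback list stands in for what a ContentObserver would do on Android. All writes go through one place, and the cache reloads itself whenever it is told something changed.

```python
import sqlite3

class Store:
    """Single entry point for writes; tells listeners when data changed."""

    def __init__(self, db):
        self.db = db
        self.listeners = []

    def on_change(self, callback):
        self.listeners.append(callback)

    def write(self, sql, params=()):
        with self.db:  # commit first, notify afterwards
            self.db.execute(sql, params)
        for callback in self.listeners:
            callback()

class LookupCache:
    """Keeps one small, frequently read table in a dict."""

    def __init__(self, store):
        self.store = store
        self.rows = {}
        store.on_change(self.reload)
        self.reload()

    def reload(self):
        cur = self.store.db.execute("SELECT id, name FROM statuses")
        self.rows = dict(cur.fetchall())

    def get(self, key):
        return self.rows.get(key)  # no query on the hot path

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE statuses (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO statuses VALUES (1, 'pending')")
db.commit()

store = Store(db)
cache = LookupCache(store)
before = cache.get(1)
store.write("UPDATE statuses SET name = ? WHERE id = ?", ("open", 1))
after = cache.get(1)  # the writer never touched the cache directly
```

Reloading the whole table is only reasonable for tables with a handful of rows, such as the 10-record table from the question. Anything larger would need finer-grained invalidation.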
