I have a huge database and I want my application to work with it as fast as possible. I'm using Android, so resources are more restricted. I know that it's not a good idea to store huge amounts of data in an SQLite database, but I need this.
Each database contains only ONE table, and I use it READ only.
What advice can you give me to optimize the databases as much as possible? I've already read this post; apart from the PRAGMA commands, what else can I use?
Maybe there are some special table types that are restricted to read-only queries but are fundamentally faster than ordinary table types?
As long as your database fits on the device, there is no problem with that; you'll just have less space for other apps.
There is no special table type. However, if you have queries that use only a subset of a table's columns, and if you have enough space left, consider adding one or more covering indexes.
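A minimal sketch of what a covering index looks like, written with Python's sqlite3 module on the desktop rather than as Android code. The places table, its columns and the index name are made up for illustration. The query plan should report a covering index when the index contains every column the query reads.

```python
import sqlite3

# Hypothetical schema: most queries need only name and population
# for a given country, never the long description column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE places (id INTEGER PRIMARY KEY, country TEXT, "
    "name TEXT, population INTEGER, description TEXT)"
)
conn.executemany(
    "INSERT INTO places (country, name, population, description) "
    "VALUES (?, ?, ?, ?)",
    [("NO", "Oslo", 700000, "long text"),
     ("NO", "Bergen", 285000, "long text"),
     ("SE", "Stockholm", 975000, "long text")],
)

# The index holds every column the query touches, so SQLite can answer
# from the index alone without visiting the table rows.
conn.execute(
    "CREATE INDEX places_country_cover ON places (country, name, population)"
)

query = "SELECT name, population FROM places WHERE country = ?"
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("NO",)).fetchall()
plan_text = " ".join(row[3] for row in plan)  # column 3 is the detail text
print(plan_text)

names = sorted(row[0] for row in conn.execute(query, ("NO",)))
```

The price is extra file size, since the indexed columns are stored twice.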
Being read-only allows the database to be optimized on the desktop, before you deploy it:
set page size, etc.;
create useful indexes;
ANALYZE
VACUUM
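The desktop preparation steps above can be sketched as a small Python script. The file name, schema and the page size of 4096 are assumptions for illustration only; the right page size is something to measure on the target device.

```python
import os
import sqlite3
import tempfile

# Hypothetical output file; in practice this is the file shipped with the app.
path = os.path.join(tempfile.mkdtemp(), "app_data.db")
conn = sqlite3.connect(path)

# The page size has to be chosen before the first table is created.
conn.execute("PRAGMA page_size = 4096")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, category TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO items (category, title) VALUES (?, ?)",
    [("cat%d" % (i % 10), "title %d" % i) for i in range(1000)],
)
conn.execute("CREATE INDEX items_category ON items (category)")
conn.commit()

conn.execute("ANALYZE")  # collect statistics for the query planner
conn.commit()
conn.execute("VACUUM")   # rebuild the file compactly, leaving no free pages

page_size = conn.execute("PRAGMA page_size").fetchone()[0]
free_pages = conn.execute("PRAGMA freelist_count").fetchone()[0]
has_stats = conn.execute(
    "SELECT count(*) FROM sqlite_master WHERE name = 'sqlite_stat1'"
).fetchone()[0]
conn.close()
```

Because the database is never written on the device, the statistics and the compact layout stay valid for the lifetime of the file.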
In your app, you might experiment with increasing the page cache size, but if your working set is larger than free memory, that won't help anyway. In any case, random reads from flash are fast, so that would not be much of a problem.
Huge is relative. But ultimately a device is constrained on storage and memory. So assuming that huge is beyond the typical constraints of a device, you have a few options.
The first option is to store your huge dataset in the cloud and have the connected device offer views into that data through cloud services, with something like RESTful APIs serving the data to the device. If the device and app rely on always being connected, you don't need as much local storage unless you want to cache data.
Another approach is an occasionally connected device (sometimes offline) where you pull down a slice of the most relevant data to work on. In that model, you can work offline and push/pull changes back to the cloud, and SQLite is the storage mechanism that holds that slice of relevant data.
EDIT based on comments:
Concerning optimizing what you have on the device, see the optimization FAQ here:
http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html
(in rough order of effectiveness)
Use an in-memory database
Use BEGIN TRANSACTION and END TRANSACTION
Use indexes
Use PRAGMA cache_size
Use PRAGMA synchronous=OFF
Compact the database
Replace the memory allocation library
Use PRAGMA count_changes=OFF
Maybe I'm stating the obvious, but you should probably just open it with the SQLITE_OPEN_READONLY flag. Note that flags are passed to sqlite3_open_v2; plain sqlite3_open takes none. I think that SQLite will take advantage of this fact and optimize the behaviour of the engine.
Note that all normal SQL(ite) optimization tips still apply (e.g. VACUUMing to finalize the database, setting the correct page size at database creation, proper indexes and so on).
In addition, if you have multiple threads accessing the database in your application, you may also want to try out the SQLITE_OPEN_NOMUTEX and SQLITE_OPEN_SHAREDCACHE flags (these are likewise passed to sqlite3_open_v2).
You should also switch journalling off, because the data does not change: http://www.sqlite.org/pragma.html#pragma_journal_mode
PRAGMA journal_mode=OFF
Related
My app sometimes needs to sync with web servers and pull data into the mobile SQLite database for offline use, so the database size keeps growing rapidly.
I want to know how professional apps like WhatsApp, Hike, Evernote etc. manage their offline SQLite databases.
Please suggest steps to solve this problem.
PS: I am asking about managing the offline database (i.e. one that grows in size after syncing); do not confuse this with syncing the database with web servers.
I do not know how large your data is. However, I think storing reasonably large data in an application's internal storage should not be a problem. Internal storage is shared among all applications, so the database can grow until the storage fills up.
In my opinion, the main problem here is query time if your database tables are not properly indexed. Otherwise, keeping the databases in internal storage is completely fine, and I think you do not have to worry about how much data can be stored there, as newer Android devices provide more storage.
If your database is so big that it does not fit into internal storage, you might consider keeping only the data that is used frequently and deleting the rest. This depends heavily on the use case of your application.
In one of the applications that I developed, I stored some large databases in external storage and copied them into internal storage whenever necessary. Copying a database from external into internal storage took some time (a few seconds), but once it was copied I could run queries efficiently.
Let me know if you need any help or clarification for some points. I hope that helps you.
For databases near the maximum size: AFAIK, you don't want to lose what's on the device and force a reload.
Ensure you don't drop the database with each new release of your app when a simple alter table add column will work.
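A minimal sketch of upgrading in place instead of dropping the database, shown with Python's sqlite3 module. The messages table, the starred column and the use of PRAGMA user_version as the schema version are assumptions for the example; on Android the same idea would normally live in SQLiteOpenHelper.onUpgrade.

```python
import sqlite3

# Version 1 of a hypothetical schema, already holding user data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO messages (body) VALUES ('kept across the upgrade')")
conn.commit()
conn.execute("PRAGMA user_version = 1")


def upgrade(db):
    """Bring the schema to version 2 without touching existing rows."""
    version = db.execute("PRAGMA user_version").fetchone()[0]
    if version < 2:
        db.execute(
            "ALTER TABLE messages ADD COLUMN starred INTEGER NOT NULL DEFAULT 0"
        )
        db.execute("PRAGMA user_version = 2")
    db.commit()


upgrade(conn)
upgrade(conn)  # running it again is harmless because of the version check

row = conn.execute("SELECT body, starred FROM messages").fetchone()
version = conn.execute("PRAGMA user_version").fetchone()[0]
```

Existing rows simply pick up the default value of the new column, so nothing has to be downloaded again.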
For whatever you archive and remove from the device, give the user a way to load it again in the background.
Some apps and databases may have documentation you can find, but that is probably the exception.
So to know exactly what's going on, you need to take snapshots of the databases. You can start with one app only or compare several directly, but without analyzing them you won't get a reliable answer.
The reasons may even differ for each app, since databases and app features naturally differ too.
Growth in size that outpaces the amount of incoming content might be related to cache tables or search indexes, but there may be other reasons too. Without verification and some basic information, it's impossible to give a detailed reason.
The table names of a database may already give some hints, but if the table names or even the fields are just meaningless strings, then you have to analyze the data inside, including the changes between snapshots.
The following link will help in understanding what exactly WhatsApp is using:
https://www.quora.com/How-is-the-Whatsapp-database-structured
I'm not sure whether you have to keep all the data on the device all the time, but if you have a choice you can always use cloud services (like FCM, AWS) to store or back up most of the data. If you need to keep all the data on the device, then perhaps one way is to use caching mechanisms in your app.
For example, use an LRU (Least Recently Used) policy to cache the data that you need on the device, while storing the rest in the cloud and deleting what's unneeded from the device. If needed, you can always retrieve the data on demand (e.g. when the user pulls to refresh or triggers another action) and delete it whenever it's not being used.
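To make the LRU idea concrete, here is a small sketch in Python. The LruCache class and the fetch_from_cloud function are hypothetical stand-ins (the latter for a real network call); Android ships its own android.util.LruCache that plays a similar role for in-memory objects.

```python
from collections import OrderedDict


class LruCache:
    """Keep at most `capacity` entries, evicting the least recently used."""

    def __init__(self, capacity, load):
        self.capacity = capacity
        self.load = load            # called on a cache miss
        self.items = OrderedDict()  # oldest entry first

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            return self.items[key]
        value = self.load(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used
        return value


fetched = []


def fetch_from_cloud(key):
    # Hypothetical stand-in for downloading the record on demand.
    fetched.append(key)
    return "payload for %s" % key


cache = LruCache(2, fetch_from_cloud)
cache.get("a")
cache.get("b")
cache.get("a")  # hit, "a" becomes most recent
cache.get("c")  # miss, evicts "b"
cache.get("b")  # miss again, evicts "a"
```

The same policy can be applied to rows in the local database by storing a last-used timestamp and deleting the oldest rows when a size budget is exceeded.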
I have a question about how deletion is done in mobile development, with regard to databases of course. What is the best practice in this field? Is permanently deleting records from the database recommended in mobile development, or is flagging a record as deleted more efficient? I know that in a large-scale DBMS you have to flag a deleted record so you can view past records (I just read that in a DBMS book, though). Does the same principle apply to mobile development, considering that I'm only developing a very small-scale application?
Any input from all you veteran DBAs and mobile devs out there is much appreciated.
As databases on a mobile device are not likely to get large, deletion shouldn't be too inefficient.
But I would say it is useful to flag deleted rows instead, as usually databases in these scenarios are used to hold data locally when the device is off-line pending synchronization to the network, and it's much easier to query deletions and upload them rather than diffing the local rows with remote ones to find what needs to be deleted.
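A sketch of the flagging approach, using Python's sqlite3 module. The notes table and the deleted column are assumptions for illustration, and the upload step is only indicated by a comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT, "
    "deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO notes (body) VALUES (?)", [("a",), ("b",), ("c",)])

# While offline, "delete" a row by flagging it instead of removing it.
conn.execute("UPDATE notes SET deleted = 1 WHERE id = ?", (2,))

# The UI only shows rows that are not flagged.
visible = [r[0] for r in conn.execute(
    "SELECT id FROM notes WHERE deleted = 0 ORDER BY id")]

# At sync time, the pending deletions are a simple query, no diffing needed.
pending = [r[0] for r in conn.execute(
    "SELECT id FROM notes WHERE deleted = 1 ORDER BY id")]

# (upload `pending` to the server here) then purge the rows for real.
conn.execute("DELETE FROM notes WHERE deleted = 1")
conn.commit()
remaining = conn.execute("SELECT count(*) FROM notes").fetchone()[0]
```

Purging after a successful sync keeps the flagged rows from accumulating on the device.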
It depends on the design requirements of your application rather than on the efficiency of the app. What you read applies to large-scale systems where auditing is required. Since you are developing an application that will run on a mobile platform and probably won't require auditing, it's fine to delete such records (at least that's what I would prefer to do).
Flagging records would increase the size of the database, and it is not a good idea to leave unwanted data taking up storage on your device.
I am planning to write an Android application where I'll use its SQLite database. I was wondering what my limit on the number of rows I can store should be. Should I have a limit?
If that limit is crossed, what's the best strategy to handle that situation, given that I need to keep the rows and not delete them?
Right now I can verify that my app runs with a 1.3 MB db with no problems.
If you absolutely must maintain all of the data, and you are having problems, you could utilize the SD card, but for most cases, this argument is somewhat moot.
Here is a discussion about maximum database sizes:
Link
You should limit yourself to the information you actually need to store in the database. Save what you need. Avoid unnecessary rows.
Keep in mind that you can overwrite data in your database, for example when a user edits information.
This will allow you to reuse your same rows.
Hope this answers your question
We're designing an Android app that has a lot of data ("customers", "products", "orders"...), and we don't want to query SQLite every time we need some record. We want to avoid querying the database as much as we can, so we decided to keep certain data always in memory.
Our initial idea is to create two simple classes:
"MemoryRecord": a class that will contain basically an array of objects (string, int, double, datetime, etc...), that are the data from a table record, and all methods to get those data in/out from this array.
"MemoryTable": a class that will contain basically a Map of [Key,MemoryRecord] and all methods to manipulate this Map and insert/update/delete record into/from database.
Those classes will be subclassed for every kind of table we have in the database. Of course there are other useful methods not listed above, but they are not important at this point.
So, when starting the app, we will load those tables from an SQLite database to memory using those classes, and every time we need to change some data, we will change in memory and post it into the database right after.
But, we want some help/advice from you. Can you suggest something more simple or efficient to implement such a thing? Or maybe some existing classes that already do it for us?
I understand what you guys are trying to show me, and I thank you for that.
But, let's say we have a table with 2000 records, and I need to list those records. For each one, I have to query 30 other tables (some of them with 1000 records, others with 10 records) to add additional information to the list, and this while it's "flying" (and as you know, we must be very fast at this moment).
Now you're going to say: "just build your main query with all those 'joins', and bring all you need in one step. SQLite can be very fast, if your database is well designed, etc...".
OK, but this query will become very complicated and, even though SQLite is very fast, it will be "too" slow (2 to 4 seconds, as I confirmed, and this isn't an acceptable time for us).
Another complication is that, depending on user interaction, we need to "re-query" all records, because the tables involved are not the same, and we have to "re-join" with another set of tables.
So, an alternative is to bring only the main records (this will never change, no matter what the user does or wants) with no join (this is very fast!) and query the other tables every time we want some data. Note that for the table with only 10 records, we will fetch the same records over and over. In this case it is a waste of time, because no matter how fast SQLite is, it will always be more expensive to query, open a cursor, fetch, etc. than to just grab the record from a kind of "memory cache". I want to make clear that we don't plan to keep all data in memory always, just some tables we query very often.
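To illustrate the idea in the paragraph above, here is a sketch in Python's sqlite3 module of loading one small lookup table into a dictionary once and resolving it in memory while listing the main records. The orders and status tables are invented for the example and are not the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE status (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO status (id, label) VALUES (?, ?)",
    [(1, "open"), (2, "shipped"), (3, "closed")],
)
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status_id INTEGER)")
conn.executemany(
    "INSERT INTO orders (status_id) VALUES (?)",
    [(1 + i % 3,) for i in range(2000)],
)
conn.commit()

# Load the tiny lookup table once; each row becomes a key/value pair.
status_cache = dict(conn.execute("SELECT id, label FROM status"))

# List the main records with no join and resolve the label from memory.
listing = [
    (order_id, status_cache[status_id])
    for order_id, status_id in conn.execute(
        "SELECT id, status_id FROM orders ORDER BY id")
]
```

The cache is only correct as long as the status table does not change behind its back, which is the synchronization cost of this approach.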
And so we come back to the original question: What is the best way to "cache" those records? I'd really like to focus the discussion on that and not on "why do you need to cache data?"
The vast majority of the apps on the platform (contacts, Email, Gmail, calendar, etc.) do not do this. Some of these have extremely complicated database schemas with potentially a large amount of data and do not need to do this. What you are proposing to do is going to cause huge pain for you, with no clear gain.
You should first focus on designing your database and schema to be able to do efficient queries. There are two main reasons I can think of for database access to be slow:
You have really complicated data schemas.
You have a very large amount of data.
If you are going to have a lot of data, you can't afford to keep it all in memory anyway, so this is a dead end. If you have complicated structures, you would benefit in either case with optimizing them to improve performance. In both cases, your database schema is going to be key to good performance.
Actually optimizing the schema can be a bit of a black art (and I am no expert on it), but some things to look out for are correctly creating indices on the columns you will query, designing joins so they will take efficient paths, etc. I am sure there are lots of people who can help you with this area.
You could also try looking at the source of some of the platform's databases to get some ideas of how to design for good performance. For example the Contacts database (especially starting with 2.0) is extremely complicated and has a lot of optimizations to provide good performance on relatively large data and extensible data sets with lots of different kinds of queries.
Update:
Here's a good illustration of how important database optimization is. In Android's media provider database, a newer version of the platform changed the schema significantly to add some new features. The upgrade code to modify an existing media database to the new schema could take 8 minutes or more to execute.
An engineer made an optimization that reduced the upgrade time of a real test database from 8 minutes to 8 seconds. A 60x performance improvement.
What was this optimization?
It was to create a temporary index, at the point of upgrade, on an important column used in the upgrade operations. (And then delete it when done.) So this 60x performance improvement comes even though it also includes the time needed to build an index on one of the columns used during upgrading.
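The pattern described above can be sketched like this in Python's sqlite3 module. This is not the media provider's actual upgrade code; the media and albums tables and the index name are invented to show the shape of the technique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE media (id INTEGER PRIMARY KEY, album TEXT, album_id INTEGER)"
)
conn.execute("CREATE TABLE albums (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO albums (name) VALUES (?)",
    [("album %d" % i,) for i in range(200)],
)
conn.executemany(
    "INSERT INTO media (album) VALUES (?)",
    [("album %d" % (i % 200),) for i in range(5000)],
)

# Index the column the upgrade looks up, only for the duration of the upgrade.
conn.execute("CREATE INDEX tmp_albums_name ON albums (name)")

# Without the index, each of the 5000 subqueries would scan all of albums.
conn.execute(
    "UPDATE media SET album_id = "
    "(SELECT id FROM albums WHERE albums.name = media.album)"
)

# The index is not needed afterwards, so drop it again.
conn.execute("DROP INDEX tmp_albums_name")
conn.commit()

unresolved = conn.execute(
    "SELECT count(*) FROM media WHERE album_id IS NULL").fetchone()[0]
leftover_index = conn.execute(
    "SELECT count(*) FROM sqlite_master WHERE name = 'tmp_albums_name'"
).fetchone()[0]
```

Building and dropping the index costs time too, but far less than thousands of full scans.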
SQLite is one of those things where if you know what you are doing it can be remarkably efficient. And if you don't take care in how you use it, you can end up with wretched performance. It is a safe bet, though, if you are having performance issues with it that you can fix them by improving how you are using SQLite.
The problem with a memory cache is of course that you need to keep it in sync with the database. I've found that querying the database is actually quite fast, and you may be optimizing prematurely here. I've done a lot of tests on queries with different data sets and they never take more than 10-20 ms.
It all depends on how you're using the data, of course. ListViews are quite well optimized to handle large numbers of rows (I've tested into the 5000 range with no real issues).
If you are going to stay with the memory cache, you may want to have the database notify the cache when its contents change so that you can update the cache. That way anyone can update the database without knowing about the caching. Also, if you build a ContentProvider over your database, you can use the ContentResolver to notify you of changes if you register using registerContentObserver.
I am developing an application that periodically sends information to an external server. I make a local copy of the data being sent, for backup purposes.
What is the best option to store the data in terms of saving battery life? Each data submission is a serialized object (the class has 5 fields, including a date, numbers and strings) of about 5K-10K.
Any other idea?
I don't believe it matters whether you use SQLite or a file, because the SQLite db is simply a file on the system (stored in /data/data/<your_package>/databases/). You'll need to commit to the db at the right times, just as you would need to save a file to storage at the right times. In other words, either way you can end up with just as many writes to storage.
I think that what you choose depends more on what sort of data you are saving. If you need the powers that having a db can bestow (such as querying), then by all means use SQLite. However, if you don't need a db, or you've got data that varies wildly (and can't easily be set up in a relational database) then I'd go with files.
What I can tell you for sure is that you should not use serialization for saving a file, if that is the route you choose to go. Android serialization is slow, slow, slow and creates large files. It is much better to either write your own XML or JSON format for performance reasons.
I have no idea in terms of battery life directly, but one criterion would be: which is easier to manage? Fewer operations to manage the data would mean fewer CPU cycles and in turn longer battery life.
I would say the SQLite option is easier. You can put a date column in the SQLite table that stores your data, which makes removing old submissions you don't need any more very easy, and it is all handled via the native SQL library. Managing a whole load of files, or worse a single file, with your own Java code would be much more work.
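A sketch of the date-column idea in Python's sqlite3 module. The submissions table, the sent_at column and the 7-day retention period are assumptions for the example, and the current time is fixed so the result is reproducible.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE submissions (id INTEGER PRIMARY KEY, "
    "sent_at TEXT NOT NULL, payload TEXT NOT NULL)"
)

# Fixed "now" for the example; a real app would use the current time.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
fmt = "%Y-%m-%d %H:%M:%S"  # sorts correctly as plain text

for age_days in (0, 3, 10, 40):
    sent = (now - timedelta(days=age_days)).strftime(fmt)
    conn.execute(
        "INSERT INTO submissions (sent_at, payload) VALUES (?, ?)",
        (sent, '{"n": %d}' % age_days),
    )

# Removing old backups is a single statement against the date column.
cutoff = (now - timedelta(days=7)).strftime(fmt)
deleted = conn.execute(
    "DELETE FROM submissions WHERE sent_at < ?", (cutoff,)).rowcount
conn.commit()
remaining = conn.execute("SELECT count(*) FROM submissions").fetchone()[0]
```

Doing the cleanup in one statement, and only occasionally, keeps the number of writes low.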
Additionally, you can write data to the database and just forget about it until you need to read it again. If you're storing data in files, you'll need to work out when you should be reading and writing files in terms of the Android application life cycle. If you're worried about battery you probably wouldn't want to write files more often than you have to, and would cache data in memory, but you'd need to make sure you didn't lose any data when your app is Paused or Destroyed. In my opinion it's much easier to use an SQLite database and not worry about any of this.
Is your application multi-threaded? If you have multiple threads accessing the data store then I would go with SQLite. Let SQLite worry about locking issues.