I have been timing database queries and found that the very first run is slower than every subsequent one. The first time I ran the code below it took 11 ms; every later call consistently takes only 1 ms. What makes the query faster after it has been run once?
long time = System.currentTimeMillis();
Cursor cursor = DBHelper.getInstance().getAllContacts();
Logger.log(TAG, "measured time is " + String.valueOf(System.currentTimeMillis() - time));
The getAllContacts() method is:
return getReadableDatabase().rawQuery("SELECT _id, name, title FROM " + CONTACTS_TABLE_NAME, null);
There are a few effects that might be at play here.
First, you are also timing the getReadableDatabase() method: if that is doing significant work the first time (like opening the DB file!) and then caching what it does, you'll have a lot of overhead from that. Try just calling that method on its own before the timed zone so that you know that overhead isn't biting.
Secondly, SQLite works by compiling the SQL to an internal bytecode form and then executing that in its own little virtual machine. The compilation step can have non-trivial costs, and is specific to the SQL and the table(s) queried against. Luckily, you're sending the same SQL in every time (that's a constant table name, yes?) and SQLite is smart enough to implement an LRU cache of the compilations so that if you run the same query twice, it's faster. (I don't know the size of the cache; it's probably set at the time you build SQLite, with some sensible default.) You could test this in theory by just querying against a different table each time, but that would be silly. Closing the connection to the DB would also get rid of this cache, but would guarantee that you get poor performance; you're not supposed to measure just the start of things as then you're bearing lots of extra costs for no good reason.
Thirdly, it's quite possible that SQLite only reads some information about the DB and its tables when it actually needs to, so if this is the first query against that table, you may be bearing a number of extra costs. Try doing a different — otherwise never used — query against the table first, before the timing run. (Also, be aware of my caveats in the previous paragraph.)
Fourthly, the ORM layer might be caching the results, but frankly this is actually pretty unlikely as that's rather hard to get right (the hard part is working out when to flush the query results cache due to the DB being updated, and it's actually easier to just not bother with such complexity).
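To make the second point more concrete, here is a toy model of an LRU cache of compiled statements keyed by SQL text. It is only a sketch of the idea, not SQLite's or Android's actual code: the class name, the capacity and the "compiled:" placeholder are all made up for illustration, and the table name in the usage note is a guess at what CONTACTS_TABLE_NAME holds.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of an LRU cache of "compiled" statements, keyed by SQL text. */
class StatementCacheSketch {
    private final LinkedHashMap<String, String> cache;
    private int compileCount = 0;

    StatementCacheSketch(final int capacity) {
        // accessOrder = true keeps the least recently used entry first,
        // so removeEldestEntry evicts in LRU order.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns the cached "program" for this SQL, compiling only on a miss. */
    String prepare(String sql) {
        String program = cache.get(sql);
        if (program == null) {
            compileCount++;                 // stands in for the expensive compile step
            program = "compiled:" + sql;    // placeholder for the real bytecode
            cache.put(sql, program);
        }
        return program;
    }

    /** Number of times the expensive step actually ran. */
    int getCompileCount() {
        return compileCount;
    }
}
```

Calling prepare("SELECT _id, name, title FROM contacts") twice on the same instance compiles once; only the first call pays the cost, which matches the slow-first-run pattern in the question.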
The first time you run an SQL statement, the SQLite engine has to prepare it, i.e., convert the human-readable text into machine-executable code, which adds a slight overhead.
I've got a table with about 7 million rows in it. I'm inserting on average about one row every second into the database. When I do this, I am noticing that it is taking an incredibly long time (as much as 15 seconds) to run a simple SELECT against the database, e.g. something like:
SELECT * FROM table WHERE rowid > 7100000
This SELECT often returns no rows, since sometimes no data has been inserted into this particular table. The delay often happens even when the writer is inserting into a different table from the one I am reading.
The idea is that there are two separate processes: one is adding data, the other is trying to get all new data that has not yet been read. But the read side is connected to a UI and any noticeable lag is intolerable, much less 15 seconds. This is being run under Android, and the UI thread doesn't like being blocked for that long either, so it is wreaking havoc.
My initial thought was that the insert might require an update to the indices, as originally I had the index on a different field (a time field). This seems at least partially confirmed, because with a database of only a few rows each SELECT completes in a few milliseconds. But when I re-created the table with only the rowid as primary key, it actually got slower. I would expect that inserting a new row at the end would always leave reads very fast when they only compare on the rowid primary key.
I have tried enabling write-ahead logging, but it appears that SQLCipher doesn't support this, at least not directly, as it doesn't adhere to the latest API for android.database.sqlite.SQLiteDatabase. Even using "PRAGMA journal_mode = WAL" in the postKey hook hasn't made any difference.
What's going on here? How can I speed up my selects?
Update: I tried getting rid of SQLCipher and just using plain SQLite to see if that was a factor. I used sqlcipher_export to export to a plaintext database, and then used the default android.database.sqlite.SQLiteDatabase. The delay dropped from 10-20s to 1.8-2.8s. I then removed write-ahead logging and it dropped further to 1.3-2.7s. So the issue is still noticeably there, although it did get a lot better.
SQLite is ultimately file-based, and there is no portable mechanism to communicate to another process which part of a file has changed. So when one process has written something, all other processes must drop their caches when they access the database file the next time.
If possible, modify your architecture so that both parts of the code are in the same process and share the same database connection. (With multiple threads, this requires locking, but SQLite does not have much concurrency anyway.)
Alternatively, write the new data into a separate database, and let the UI app move it to its own database.
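As a rough illustration of the single-process variant, the sketch below has a writer and a reader share one handle guarded by a lock. Everything here is hypothetical: an in-memory list stands in for the table, the list position plus one stands in for the rowid, and the class and method names are invented. A real app would run its inserts and SELECTs on the shared connection inside the locked sections.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Stand-in for one shared database handle used by a writer and a reader thread. */
class SharedStoreSketch {
    private final ReentrantLock lock = new ReentrantLock();
    // Index + 1 plays the role of the rowid in this toy model.
    private final List<String> rows = new ArrayList<>();

    /** Writer side: append a row and return its rowid. */
    long insert(String value) {
        lock.lock();
        try {
            rows.add(value);
            return rows.size();
        } finally {
            lock.unlock();
        }
    }

    /** Reader side: return a copy of all rows with rowid > lastSeenRowId. */
    List<String> readAfter(long lastSeenRowId) {
        lock.lock();
        try {
            // rowid = index + 1, so "rowid > lastSeen" means "index >= lastSeen".
            int from = (int) Math.min(Math.max(lastSeenRowId, 0), rows.size());
            return new ArrayList<>(rows.subList(from, rows.size()));
        } finally {
            lock.unlock();
        }
    }
}
```

The reader remembers the highest rowid it has seen and passes it to readAfter on the next poll, mirroring the "WHERE rowid > N" query from the question.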
I don't know why SQLCipher is so much slower (it's unlikely to be the CPU overhead of the decryption).
I've modified my DatabaseHelper class to use the SQLCipher library.
To do that, I:
Copied the assets into my assets folder and the libraries (armeabi, x86, commons-codec, guava-r09, sqlcipher) into my libs folder.
Changed the imports in my DatabaseHelper class so that they point to import net.sqlcipher.database.* instead.
Call SQLiteDatabase.loadLibs(getApplicationContext()); when the app starts up.
Modified the lines where I call getReadableDatabase() and getWritableDatabase() so that they include a passphrase as a parameter.
Everything seems to work fine as data is read/written properly. My issue is related to performance, as my app may execute DB operations with some frequency, causing it to become slow (after migrating to SQLCipher).
For my DatabaseHelper methods, I believe I'm following the standard approach, e.g.:
/*
* Getting all MyObjects
*/
public List<MyObject> getMyObjects() {
List<MyObject> objects = new ArrayList<MyObject>();
String selectQuery = "SELECT * FROM " + TABLE_NAME;
Log.v(LOG, selectQuery);
// Open
SQLiteDatabase db = this.getReadableDatabase("...the password...");
// I know this passphrase can be figured out by decompiling.
// Cursor with query
Cursor c = db.rawQuery(selectQuery, null);
// looping through all rows and adding to list
if (c.moveToFirst()) {
do {
MyObject object = createMyObjectFromCursor(c); // Method that builds MyObject from Cursor data
// adding to list
objects.add(object);
} while (c.moveToNext());
}
c.close();
db.close();
return objects;
}
I'm not entirely familiar with the internal mechanics of SQLCipher (e.g. does it decrypt the whole DB file when I call getReadableDatabase()?) but, while debugging, it seems that the overhead is in getReadableDatabase(password) and getWritableDatabase(password), which makes sense if my supposition above is true.
Would moving those calls to a DatabaseHelper.open() and DatabaseHelper.close() method which would be called by the Activities whenever they instantiate a DatabaseHelper, instead of calling them on each individual method, be a bad practice? Please share your knowledge on how to address this issue.
EDIT:
I've used DDMS to trace one of the methods and I can see that the overhead is indeed at the SQLiteOpenHelper.getReadableDatabase() (taking ~4 sec. each time). The queries seem to work fast and I don't think I need to worry about them.
If I drill down the calls, following the one with the longest duration every time, I end up with:
SQLiteDatabase.openOrCreateDatabase --> SQLiteDatabase.openDatabase --> SQLiteDatabase.openDatabase --> SQLiteDatabase.setLocale
So SQLiteDatabase.setLocale(java.util.Locale) seems to be the culprit, as it takes ~4 seconds every time getReadableDatabase() is called. I've looked into the source for SQLiteDatabase and it just locks the DB, calls native_setLocale(locale.toString(), mFlags) (the 4 sec. overhead takes place here) and unlocks the DB.
Any idea on why this happens?
The performance issue you are seeing is most likely due to SQLCipher key derivation. SQLCipher's performance for opening a database is deliberately slow, using PBKDF2 to perform key derivation (i.e. thousands of SHA1 operations) to defend against brute force and dictionary attacks (you can read more about this at http://sqlcipher.net/design). This activity is deferred until the first use of the database, which happens to occur in setLocale, which is why you are seeing the performance issue there when profiling.
The best option is to cache the database connection so that it can be used multiple times without having to open and key the database repeatedly. If this is possible, opening the database once during startup is the preferred course of action. Subsequent access on the same database handle will not trigger key derivation, so performance will be much faster.
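Caching the handle could look roughly like the sketch below. It is a generic, hypothetical holder (the class name and the Supplier-based opener are assumptions, not part of SQLCipher or Android): the expensive open runs on the first get() and every later call returns the same object.

```java
import java.util.function.Supplier;

/** Opens an expensive resource once and hands out the same handle afterwards. */
class CachedHandleSketch<T> {
    private final Supplier<T> opener;
    private T handle;

    CachedHandleSketch(Supplier<T> opener) {
        this.opener = opener;
    }

    /** Runs the opener on first use only; synchronized so two threads cannot both open. */
    synchronized T get() {
        if (handle == null) {
            handle = opener.get();
        }
        return handle;
    }
}
```

In the app from the question, the opener would presumably wrap the existing getReadableDatabase(passphrase) call, so key derivation happens once per process instead of once per query method. The per-method db.close() calls would then have to go, since they would close the shared handle.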
If this is not possible the other option is to disable or weaken key derivation. This will cause SQLCipher to use fewer rounds of PBKDF2 when deriving the key. While this will make the database open faster, it is significantly weaker from a security perspective. Thus it is not recommended except in exceptional cases. That said, here is the information on how to reduce the KDF iterations:
http://sqlcipher.net/sqlcipher-api/#kdf_iter
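To get a feel for why the iteration count dominates open time, the following sketch runs PBKDF2 with HMAC-SHA1 through the standard javax.crypto API. It is not SQLCipher's own key derivation code; the passphrase, the all-zero salt, the 256-bit key length and any iteration counts you pass in are illustrative assumptions.

```java
import java.security.GeneralSecurityException;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

class KdfCostSketch {
    /** Derives a 256-bit key with PBKDF2-HMAC-SHA1 using the given iteration count. */
    static byte[] deriveKey(char[] passphrase, byte[] salt, int iterations) {
        PBEKeySpec spec = new PBEKeySpec(passphrase, salt, iterations, 256);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available", e);
        } finally {
            spec.clearPassword();
        }
    }

    /** Returns the elapsed nanoseconds for one derivation at this iteration count. */
    static long timeDerivation(int iterations) {
        byte[] salt = new byte[16]; // all-zero salt, for illustration only
        long start = System.nanoTime();
        deriveKey("example passphrase".toCharArray(), salt, iterations);
        return System.nanoTime() - start;
    }
}
```

Comparing timeDerivation for a small and a large iteration count should show the cost growing roughly in proportion, which is the trade-off described above: fewer rounds open faster but make each brute-force guess cheaper for an attacker as well.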
does it decrypt the whole DB file when I call getReadableDatabase()?
No. It decrypts pages (4KB??) as needed, on the fly.
it seems that the overhead is in getReadableDatabase(password) and getWritableDatabase(password), which makes sense if my supposition above is true
Only call those once for the lifetime of your process. Anything else is insecure, as it requires you to keep the password around, above and beyond any overhead issues.
Of course, you seem to be hard-coding a password, in which case all this encryption is pointless and a waste of time.
Please share your knowledge on how to address this issue.
Use Traceview to determine exactly where your time is being spent.
In one benchmark that I performed -- converting a SQLite benchmark to SQLCipher -- I could not detect any material overhead. Disk I/O swamped the encryption overhead, near as I can tell.
To the extent that a well-written SQLCipher for Android app adds overhead, it will make bad operations worse. So, for example, a query that needs to do a table scan sucks already; SQLCipher will make it suck incrementally harder. The solution there is to add the appropriate indexes (or FTS3) as needed to avoid the table scan.
I'm sort of lost on this. I have an application that is reading from a static SQLite database that has 439397 records (~32MB).
I am querying the database on a column that is indexed, but it takes ~8-12 seconds to finish the query. The current query I am using is to do database.query(tableName, columnHeaders, "some_id=" + id) for a list of ids.
I tried doing the "WHERE some_id IN (id1, id2, id3)" approach, but that took over twice as long. I have a feeling that I might be doing it wrong.
The query is done in an AsyncTask, so I am at a loss as to what else I could do to improve the performance.
UPDATE:
I resolved the problem by changing the behavior of the application.
You can use EXPLAIN QUERY PLAN to confirm that your index is indeed being properly used.
You can try running your query once with a COUNT(*) instead of the real column list, to see if the issue is the act of actually reading the row data off of flash storage (which is possible if there are lots of matches and lots of big columns).
You can try running your query to match on a single ID (rather than N of them), to start to try to get a handle on whether the issue is too many comparisons.
However, please bear in mind that AsyncTask does not somehow make things magically faster. It makes things magically not run on the main application thread.
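The three diagnostic queries suggested above can be generated with a small helper like the one below, as a sketch. The class and method names are hypothetical; some_id comes from the question, while the table name you pass in is whatever your schema uses. The resulting strings would be handed to something like rawQuery with the ids as selection arguments.

```java
import java.util.Collections;

class QueryDiagnosticsSketch {
    /** "?, ?, ?" style placeholder list for an IN clause with n values. */
    static String placeholders(int n) {
        return String.join(", ", Collections.nCopies(n, "?"));
    }

    /** Asks SQLite how it would execute the multi-ID lookup (index search vs. table scan). */
    static String explainPlan(String table, String column, int idCount) {
        return "EXPLAIN QUERY PLAN SELECT * FROM " + table
                + " WHERE " + column + " IN (" + placeholders(idCount) + ")";
    }

    /** Same filter, but counting only, to separate lookup cost from reading row data. */
    static String countOnly(String table, String column, int idCount) {
        return "SELECT COUNT(*) FROM " + table
                + " WHERE " + column + " IN (" + placeholders(idCount) + ")";
    }

    /** Single-ID variant, to check whether the number of comparisons is the issue. */
    static String singleId(String table, String column) {
        return "SELECT * FROM " + table + " WHERE " + column + " = ?";
    }
}
```

If the COUNT(*) version is fast while the full SELECT is slow, the time is probably going into reading row data rather than into finding the matching rows.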
Looks like you don't have an index on that field. Note that the index might need to cover several fields if you use them for filtering/sorting/grouping in your real query (can't tell in more details because I haven't seen it).
I have a SQLite database with just over 6,000 rows of addresses in a table. This is a read-only database: no updates or changes after the app is built and deployed. I have an index on the state field. My app uses a simple select statement to get all rows that match the given state. I have used the explain and explain query plan statements to see that my query is using the index.
Most of the time the query comes back in under a second - not great, but good enough for my application.
Every so often the query takes longer - even up to 14 seconds, often 3-4 seconds. Exact same query on the exact same read-only database (and table) on the same phone, invoked by the exact same binary.
From monitoring logcat, I can see that no garbage collection is occurring and no exceptions are being generated.
There is just a variation that sometimes occurs. A variation that creates an inconsistent user experience.
It appears that the SQLite database system is being shared by other apps - such as the email client. Could it be that my query is being queued behind another app's queries and thus the variation is due to when the shared SQLite database system actually gets to run my query? If this is the case, is it possible to "create my own SQLite instance" so that I can get consistent performance?
If it is not a shared SQLite database system (and thus I do have my own instance) what else could be causing such a large variation in query performance given that everything else is equal?
Note that I can't easily bring the data into memory to run the query there as the rows are pretty long (have more information than just the address) and I have a number of other parts of my code that make use of more complex select queries. I've narrowed the performance variation down to just the simplest "select where state = " query for this question (plea for help).
It appears that the SQLite database system is being shared by other apps - such as the email client.
Not exactly. Storage is shared by other apps. And on Android 1.x and most 2.x devices, internal storage is formatted YAFFS2, which only allows one process to access the storage at a time. This should be less of a problem on Android 3.0+ devices (and some 2.3 devices) that are running ext4 instead of YAFFS2.
Could it be that my query is being queued behind another app's queries and thus the variation is due to when the shared SQLite database system actually gets to run my query?
Not exactly. Your disk I/O could be queued behind another app's disk I/O, though.
In my Android app, I need to get 50,000 database entries (text) and compare them with a value when the activity starts (in onCreate()). I am doing this the simplest way: I read the whole table from the DB into a cursor. However, this is too laggy. Is there a more efficient way to do it?
Edit: The app is a "scrabble solver", which is why I am not using a WHERE clause in my query (I take the whole data set and compare it with combinations of the input letters). At first I was using one big table containing all possible words. Now I am using 26 tables. This reduced the lag, and I am now making the database calls on a separate thread, which solved a lot of problems too. It is still a little bit laggy but much better.
To summarize and add a bit more
Let the database do the work of searching for you, use a WHERE clause when you perform a query!
If your where clause does not operate on the primary key column or a unique column you should create an index for that column. Here is some info on how to do that: http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html#indexes
Google says: "Use question mark parameter markers such as 'phone=?' instead of explicit values in the selection parameter, so that queries that differ only by those values will be recognized as the same for caching purposes."
Run the query analysis using EXPLAIN QUERY PLAN http://www.sqlite.org/lang_explain.html and look for any scan operations; these are much slower than search operations. Use indexes to avoid scan operations.
Don't perform any time consuming tasks in onCreate(), always use an AsyncTask, a Handler running on a background thread or some other non-main thread.
If you need to do full text search please read: http://www.sqlite.org/fts3.html
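The point about question mark markers can be sketched as follows. The helper is hypothetical (the class name, field names and the phone values in the usage note are made up); it only shows that a parameterized selection yields identical SQL text for different values, whereas inlining the value produces a new statement each time.

```java
class SelectionSketch {
    final String selection;   // e.g. "phone=?"
    final String[] args;      // values bound to the markers, in order

    private SelectionSketch(String selection, String[] args) {
        this.selection = selection;
        this.args = args;
    }

    /** Parameterized form: the SQL text does not depend on the value. */
    static SelectionSketch equalsParam(String column, Object value) {
        return new SelectionSketch(column + "=?", new String[] { String.valueOf(value) });
    }

    /** Inlined form, shown only for contrast: the SQL text changes with every value. */
    static String inlined(String column, String value) {
        return column + "='" + value + "'";
    }
}
```

equalsParam("phone", "555-0100") and equalsParam("phone", "555-0199") share the selection string "phone=?", so a statement cache can treat them as the same query. The selection and args pair maps onto the selection and selectionArgs parameters of Android's query methods.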
You should never read from the database on the UI thread. Use a background thread via AsyncTask or regular threading. This will fix the UI lag issue you're having.
Making the database read faster will help with bringing the data faster to the user but it's even more important that the fetching of the data does not block the user from using the app.
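A minimal sketch of the "regular threading" option, using only java.util.concurrent rather than AsyncTask: the class name and the "db-worker" thread name are assumptions, and the Supplier stands in for whatever code actually runs the query.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Runs slow loads on a single background thread so the caller is not blocked. */
class BackgroundLoadSketch {
    private final ExecutorService worker = Executors.newSingleThreadExecutor(runnable -> {
        Thread thread = new Thread(runnable, "db-worker");
        thread.setDaemon(true); // do not keep the process alive just for this thread
        return thread;
    });

    /** Schedules the query on the worker thread and returns immediately. */
    <T> CompletableFuture<T> load(Supplier<T> query) {
        return CompletableFuture.supplyAsync(query, worker);
    }

    /** Stops accepting new work; already queued loads still finish. */
    void shutdown() {
        worker.shutdown();
    }
}
```

The caller would attach a callback to the returned future (for example with thenAccept) instead of waiting on it. On Android the callback still has to hop back to the main thread before touching views, for instance via runOnUiThread or a Handler.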
Check out the following links:
Painless Threading in Android
YouTube: Writing Zippy Android Apps
Use a WHERE clause in your SQL rather than reading the whole thing in. Then add an index on the columns in the WHERE clause.
At the very least you can put an index on the field you compare and use a WHERE clause. If you are comparing numbers, SQLite (the DB engine used by Android) supports functions such as MIN and MAX. If you are comparing partial strings, you can use LIKE. There are many resources on query optimization.