I'm using SQLite on Android using SQLiteDatabase (http://developer.android.com/reference/android/database/sqlite/SQLiteDatabase.html)
I am developing a bible application, which has a single table with the following columns:
book : int
chapter : int
verse : int
wordIdx : int
strongId : string
word : string
Each sentence is broken down into a series of strongId/word pairs: wordIdx is used to order the words, strongId is simply an index into a concordance, and word is the word in the sentence.
The table has 300,000 rows.
The bottleneck appears to be my query to get the list of words for each verse.
My SQL is effectively this:
SELECT strongId, word FROM ? WHERE book=? AND chapter=? AND verse=?
Here is the code:
Cursor cursor = mBible.database().rawQuery(
        "SELECT " + KEY_STRONGID + "," + KEY_WORD + " FROM " + tableName() + " WHERE " + KEY_BOOK + "=? AND " + KEY_CHAPTER + "=? AND " + KEY_VERSE + "=?",
        new String[] { String.valueOf(mChapter.mBook.index()), String.valueOf(mChapter.index()), String.valueOf(verse) });
mWordList = new ArrayList<Word>();
try {
    // moveToNext() returns false on an empty result, so an empty verse no longer throws
    while (cursor.moveToNext()) {
        mWordList.add(new Word(cursor.getString(1), cursor.getString(0)));
    }
} finally {
    cursor.close(); // release the cursor even if reading fails
}
Now, I've tried putting each chapter into its own temporary view (using CREATE TEMP VIEW), which cuts the records down to about 400 in my example; however, it is still taking far too long to query.
It's taking on the order of 30 seconds to generate the text for two chapters to display to the user (with or without a temporary view). It takes about 5 seconds if I set up a dummy list of words to avoid the database query.
How can I improve the performance of this? It seems the temp view is not having the impact on performance that I had hoped for.
A view does not change the performance of a query; it just saves the query itself, not the results of the query.
If you open your database with the sqlite3 command-line tool on your desktop machine, you can use the EXPLAIN QUERY PLAN command to check how efficient your queries are.
Without any indexes, your query always scans the entire table:
> sqlite3 bible.db
SQLite version 3.7.15.2 2013-01-09 11:53:05
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> EXPLAIN QUERY PLAN SELECT strongId, word FROM MyTable WHERE book=1 AND chapter=2 AND verse=3;
0|0|0|SCAN TABLE MyTable (~1000 rows)
With an index on your three lookup fields, SQLite can do a fast search in the index and needs to read only the matching records from the table:
sqlite> CREATE INDEX b_c_v ON MyTable(book, chapter, verse);
sqlite> EXPLAIN QUERY PLAN SELECT strongId, word FROM MyTable WHERE book=1 AND chapter=2 AND verse=3;
0|0|0|SEARCH TABLE MyTable USING INDEX b_c_v (book=? AND chapter=? AND verse=?) (~8 rows)
If you create a covering index (with all fields used in the query, lookup fields first), SQLite does not need to read from the table at all. However, this does not give a big speedup over a normal index, and might not be worth the additional storage cost:
sqlite> CREATE INDEX cov ON MyTable(book, chapter, verse, strongId, word);
sqlite> EXPLAIN QUERY PLAN SELECT strongId, word FROM MyTable WHERE book=1 AND chapter=2 AND verse=3;
0|0|0|SEARCH TABLE MyTable USING COVERING INDEX cov (book=? AND chapter=? AND verse=?) (~8 rows)
Please note that SQLite can use at most one index per table in a query, so it does not always make sense to create multiple indexes.
Use EXPLAIN QUERY PLAN to check which indexes are actually used, and whether you can create a few indexes to optimize most of your queries.
Also see the Query Planning documentation.
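On Android, an index like b_c_v would typically be created once, for example from SQLiteOpenHelper.onCreate(), by passing the statement to SQLiteDatabase.execSQL(). The following is a minimal sketch of building that statement. IndexSql and createIndex are hypothetical helper names, not part of the Android API, and the table and column names are the ones from the examples above.

```java
final class IndexSql {
    private IndexSql() {}

    /**
     * Builds a CREATE INDEX statement for the given columns.
     * Hypothetical helper, not part of the Android API.
     */
    static String createIndex(String indexName, String table, String... columns) {
        if (columns.length == 0) {
            throw new IllegalArgumentException("at least one column is required");
        }
        StringBuilder sql = new StringBuilder();
        // IF NOT EXISTS makes the statement safe to run more than once.
        sql.append("CREATE INDEX IF NOT EXISTS ").append(indexName)
           .append(" ON ").append(table).append("(");
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) sql.append(", ");
            sql.append(columns[i]);
        }
        return sql.append(")").toString();
    }
}
```

With an open database this could be run as db.execSQL(IndexSql.createIndex("b_c_v", "MyTable", "book", "chapter", "verse")).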
I ended up creating temporary tables and performance is now acceptable
I'm having a hard time understanding how to use full text search (FTS) with Android. I've read the SQLite documentation on the FTS3 and FTS4 extensions. And I know it's possible to do on Android. However, I'm having a hard time finding any examples that I can comprehend.
The basic database model
A SQLite database table (named example_table) has 4 columns. However, there is only one column (named text_column) that needs to be indexed for a full text search. Every row of text_column contains text varying in length from 0 to 1000 words. The total number of rows is greater than 10,000.
How would you set up the table and/or the FTS virtual table?
How would you perform an FTS query on text_column?
Additional notes:
Because only one column needs to be indexed, only using an FTS table (and dropping example_table) would be inefficient for non-FTS queries.
For such a large table, storing duplicate entries of text_column in the FTS table would be undesirable. This post suggests using an external content table.
External content tables use FTS4, but FTS4 is not supported before Android API 11. An answer can assume an API >= 11, but commenting on options for supporting lower versions would be helpful.
Changing data in the original table does not automatically update the FTS table (and vice versa). Including triggers in your answer is not necessary for this basic example, but would be helpful nonetheless.
Most Basic Answer
I'm using plain SQL below so that everything is as clear and readable as possible. In your project you can use the Android convenience methods. The db object used below is an instance of SQLiteDatabase.
Create FTS Table
db.execSQL("CREATE VIRTUAL TABLE fts_table USING fts3 ( col_1, col_2, text_column )");
This could go in the onCreate() method of your extended SQLiteOpenHelper class.
Populate FTS Table
db.execSQL("INSERT INTO fts_table VALUES ('3', 'apple', 'Hello. How are you?')");
db.execSQL("INSERT INTO fts_table VALUES ('24', 'car', 'Fine. Thank you.')");
db.execSQL("INSERT INTO fts_table VALUES ('13', 'book', 'This is an example.')");
It would be better to use SQLiteDatabase#insert or prepared statements than execSQL.
Query FTS Table
String[] selectionArgs = { searchString };
Cursor cursor = db.rawQuery("SELECT * FROM fts_table WHERE fts_table MATCH ?", selectionArgs);
You could also use the SQLiteDatabase#query method. Note the MATCH keyword.
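Note that searchString is handed to MATCH as an FTS query expression, not as plain text, so characters such as quotes, '-', ':' and '*', and uppercase words such as OR or NEAR, can be interpreted as query syntax. Below is a minimal sketch of one way to turn free-form user input into a prefix query. FtsQuery and toPrefixQuery are hypothetical names, and stripping everything except letters and digits is an assumption that may be too aggressive for some content.

```java
import java.util.Locale;

final class FtsQuery {
    private FtsQuery() {}

    /** Turns free-form input such as "Hello, wor" into "hello* wor*". */
    static String toPrefixQuery(String userInput) {
        StringBuilder query = new StringBuilder();
        for (String token : userInput.trim().split("\\s+")) {
            // Keep only letters and digits so that punctuation in the input
            // cannot be read as FTS query syntax.
            String cleaned = token.replaceAll("[^\\p{L}\\p{Nd}]", "");
            if (cleaned.isEmpty()) {
                continue;
            }
            if (query.length() > 0) {
                query.append(' ');
            }
            // Lowercase so that words like OR are treated as search terms,
            // then append '*' to match any token starting with the term.
            query.append(cleaned.toLowerCase(Locale.ROOT)).append('*');
        }
        return query.toString();
    }
}
```

The result would be used as the single entry of selectionArgs. An empty result means there is nothing to search for, so the caller should skip the query in that case.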
Fuller Answer
The virtual FTS table above has a problem with it. Every column is indexed, but this is a waste of space and resources if some columns don't need to be indexed. The only column that needs an FTS index is probably the text_column.
To solve this problem we will use a combination of a regular table and a virtual FTS table. The FTS table will contain the index but none of the actual data from the regular table. Instead it will have a link to the content of the regular table. This is called an external content table.
Create the Tables
db.execSQL("CREATE TABLE example_table (_id INTEGER PRIMARY KEY, col_1 INTEGER, col_2 TEXT, text_column TEXT)");
db.execSQL("CREATE VIRTUAL TABLE fts_example_table USING fts4 (content='example_table', text_column)");
Notice that we have to use FTS4 to do this rather than FTS3. FTS4 is not supported in Android before API version 11. You could either (1) only provide search functionality for API >= 11, or (2) use an FTS3 table (but this means the database will be larger because the full text column exists in both tables).
Populate the Tables
db.execSQL("INSERT INTO example_table (col_1, col_2, text_column) VALUES ('3', 'apple', 'Hello. How are you?')");
db.execSQL("INSERT INTO example_table (col_1, col_2, text_column) VALUES ('24', 'car', 'Fine. Thank you.')");
db.execSQL("INSERT INTO example_table (col_1, col_2, text_column) VALUES ('13', 'book', 'This is an example.')");
(Again, there are better ways to do inserts than with execSQL. I am just using it for its readability.)
If you tried to do an FTS query now on fts_example_table you would get no results. The reason is that changing one table does not automatically change the other table. You have to manually update the FTS table:
db.execSQL("INSERT INTO fts_example_table (docid, text_column) SELECT _id, text_column FROM example_table");
(The docid is like the rowid for a regular table.) You have to make sure to update the FTS table (so that it can update the index) every time you make a change (INSERT, DELETE, UPDATE) to the external content table. This can get cumbersome. If you are only making a prepopulated database, you can do
db.execSQL("INSERT INTO fts_example_table(fts_example_table) VALUES('rebuild')");
which will rebuild the whole table. This can be slow, though, so it is not something you want to do after every little change. You would do it after finishing all the inserts on the external content table. If you do need to keep the tables in sync automatically, you can use triggers. Go here and scroll down a little to find directions.
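As a rough sketch of what such triggers could look like for the two tables above: the statements follow the pattern described for external content tables in the SQLite FTS3/FTS4 documentation, adapted to example_table and fts_example_table. The class name and the trigger names are made up for this example.

```java
import java.util.Arrays;
import java.util.List;

final class FtsSyncTriggers {
    private FtsSyncTriggers() {}

    /** Statements that keep fts_example_table in sync with example_table. */
    static List<String> statements() {
        return Arrays.asList(
            // Remove the old index entry before the row changes or disappears.
            "CREATE TRIGGER example_table_bu BEFORE UPDATE ON example_table BEGIN"
                + " DELETE FROM fts_example_table WHERE docid = old._id; END",
            "CREATE TRIGGER example_table_bd BEFORE DELETE ON example_table BEGIN"
                + " DELETE FROM fts_example_table WHERE docid = old._id; END",
            // Index the new text after the row has been written.
            "CREATE TRIGGER example_table_au AFTER UPDATE ON example_table BEGIN"
                + " INSERT INTO fts_example_table (docid, text_column)"
                + " VALUES (new._id, new.text_column); END",
            "CREATE TRIGGER example_table_ai AFTER INSERT ON example_table BEGIN"
                + " INSERT INTO fts_example_table (docid, text_column)"
                + " VALUES (new._id, new.text_column); END");
    }
}
```

Each statement would be passed to db.execSQL() once, right after the two tables are created.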
Query the Tables
String[] selectionArgs = { searchString };
Cursor cursor = db.rawQuery("SELECT * FROM fts_example_table WHERE fts_example_table MATCH ?", selectionArgs);
This is the same as before, except this time you only have access to text_column (and docid). What if you need to get data from other columns in the external content table? Since the docid of the FTS table matches the rowid (and in this case _id) of the external content table, you can look the rows up with a subquery (a join would also work). (Thanks to this answer for help with that.)
String sql = "SELECT * FROM example_table WHERE _id IN " +
"(SELECT docid FROM fts_example_table WHERE fts_example_table MATCH ?)";
String[] selectionArgs = { searchString };
Cursor cursor = db.rawQuery(sql, selectionArgs);
Further Reading
Go through these documents carefully to see other ways of using FTS virtual tables:
SQLite FTS3 and FTS4 Extensions (SQLite docs)
Storing and Searching for Data (Android docs)
Additional Notes
Set operators (AND, OR, NOT) in SQLite FTS queries have Standard Query Syntax and Enhanced Query Syntax. Unfortunately, Android apparently does not support the Enhanced Query Syntax (see here, here, here, and here). That means mixing AND and OR becomes difficult (requiring the use of UNION or checking PRAGMA compile_options it seems). Very unfortunate. Please add a comment if there is an update in this area.
When using an external content table (content=...), don't forget to rebuild the FTS table.
I do this with triggers on update, insert, and delete.
I have a ContentProvider that uses a custom CursorFactory in debug builds to print out the SQL queries (for debugging).
A certain query was returning 0 rows, while I knew there were rows that should have been included. So I copied the query from my logs, replaced the bind values and ran it in sqlite3 shell on the device and got the correct result.
The Query Code
cr.query (contentUri,
Projection.columns,
FeedColumns.FEED_TYPE + "=? AND " +
FeedColumns.SUB_TYPE + "=? AND " +
ProfileUpdateFeedItem.UPDATED_FIELD + "=? AND " +
FeedColumns.IS_NOTIFIED + "=?",
new String[] {FeedType.USER, // 2
WallPostData.WallPostType.PROFILE_UPDATE, // 1
ProfileUpdateData.ProfileField.STATUS, // 0
SQLBoolean.FALSE // 0
},
FeedColumns.CREATED + " ASC");
From the logs:
07-04 12:48:51.339 4067-4314/com.redacted.android D/DATABASE﹕ QUERY: SQLiteQuery: SELECT DISTINCT id, sender, data_1, data_2, photo, feed_type, sub_type, created, expiry, updated, comment_count, comment_unread, reaction_count, reaction_unread, sender_name, sender_photo, _id FROM wall WHERE feed_type=? AND sub_type=? AND data_1=? AND is_notified=? ORDER BY created ASC LIMIT 100
On device:
Enter SQL statements terminated with a ";"
sqlite> SELECT DISTINCT id, sender, data_1, data_2, photo, feed_type, sub_type, created, expiry, updated, comment_count, comment_unread, reaction_count, reaction_unread, sender_name, sender_photo, _id FROM wall WHERE feed_type=2 AND sub_type=1 AND data_1=0 AND is_notified=0 ORDER BY created ASC LIMIT 100;
53b702b827d7482062f52b03|a7e759d78abe4bfa97045ce49a24ab57|0|Educ||2|1|1404502712279|1404761912325|1404502712279|||||Luke Skywalker|pr/e5c2c0398b267f93683c80dc5009722e|49
The ContentProvider, however, doesn't agree and cursor.getCount() returns 0.
Any ideas why this is happening?
feed_type, sub_type, and is_notified are INTEGER columns.
data_1 is a BLOB that is storing an integer for any row that would qualify for this query, but stores strings for other types of data that could go in this table.
When you run it in the shell, I'm surprised you get any rows. The blob data type may not convert the keyed value properly for you. Typically the database API requires a special function to set the blob value as well as to retrieve it.
So the problem here was the BLOB column. It was being evaluated properly in queries (The data in the table is used in a ListView and is displayed differently depending on the contents of the data_1 and data_2 columns).
Everything in the feed category gets parsed into a member of a class hierarchy rooted at an AnstractFeedObject.
Most fields that use both data_1 and data_2 store text in both, but some fields (those who correspond to a subset of the mentioned class hierarchy) use data_1 as a type enumeration that the UI uses to interpret the value stored in data_2. For example, a 0 type means that data_2 is a picture id (construct the url and download), while type 1 means it's actual text content.
What I ended up doing was replacing data_1 with an integer column called type_enumeration and renaming data_2 to data_1. Now that I know BLOB can cause those kinds of issues, I'll be changing data_2 to a TEXT column as well.
If at some point in the future I need to store binary data in the DB, I'll add a bin_data column to the table.
Now usually in a proper normalized schema you'd use linked tables to represent such hierarchy, but in a mobile environment, you want to minimize joins so a few extra columns are cheaper in terms of performance (at least that's been my experience).
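A likely explanation for the difference between the ContentProvider and the shell, offered as an assumption rather than something verified against this app: SQLiteDatabase binds every selection argument as a string, so the provider effectively compares data_1 with the text '0', while the shell query compared it with the integer literal 0. For INTEGER columns SQLite converts the text before comparing, but a BLOB column has no numeric affinity, so a stored integer 0 never equals the text '0'. If changing the schema is not an option, casting the parameter inside the SQL is one possible workaround. The sketch below only builds the selection and its arguments; WallSelection is a hypothetical name, and mapping the boolean to "1"/"0" is assumed from the comments in the query code.

```java
final class WallSelection {
    private WallSelection() {}

    /**
     * Selection for the query above. CAST(? AS INTEGER) turns the
     * string-bound argument back into an integer before it is compared
     * with data_1, which has no numeric affinity of its own.
     */
    static String selection() {
        return "feed_type=? AND sub_type=? AND data_1=CAST(? AS INTEGER) AND is_notified=?";
    }

    /** Selection arguments; Android binds each of them as a string. */
    static String[] args(int feedType, int subType, int updatedField, boolean notified) {
        return new String[] {
            String.valueOf(feedType),
            String.valueOf(subType),
            String.valueOf(updatedField),
            notified ? "1" : "0"   // assumed encoding of SQLBoolean
        };
    }
}
```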
Let's say I have a table T and an index on field f. I want to filter on f based on some integer myF. If the integer is 0, I want all records where f is null. A sure way to write this would be:
db.rawQuery("SELECT someField FROM T WHERE "
+ (myF == 0 ? "f IS NULL" : "f = ?"),
(myF == 0 ? new String[] {} : new String[] {String.valueOf(myF)}));
This is a bit inconvenient; especially if the query is more complex than this and has additional parameters. So I thought I'd write
db.rawQuery("SELECT someField FROM T WHERE IFNULL(f, 0) = ?",
new String[] {String.valueOf(myF)});
instead, which is much simpler, easier to read and easier to maintain.¹
My question is: If there is an index on f, will SQLite still use that index or will it resort to a table scan? I'm asking because in the latter case I'm not comparing a field and a parameter but an expression and a parameter, and I'm not sure how "smart" the SQLite query optimizer is.
¹ Note that there are no records with f = 0 in the database.
It will result in a table scan.
Example using the sqlite3 command line client:
sqlite> create table t(f);
sqlite> create index tf on t(f);
sqlite> explain query plan select * from t where ifnull(f,0)=1;
0|0|0|SCAN TABLE t (~500000 rows)
sqlite> explain query plan select * from t where f is null;
0|0|0|SEARCH TABLE t USING COVERING INDEX tf (f=?) (~10 rows)
sqlite> explain query plan select * from t where f=1;
0|0|0|SEARCH TABLE t USING COVERING INDEX tf (f=?) (~10 rows)
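Since only the plain "f IS NULL" and "f = ?" forms can use the index, one way to keep the calling code manageable for more complex queries is to build the WHERE clause and the bind arguments together. This is a minimal sketch; NullableFilter and its methods are hypothetical names, and treating 0 as "match NULL" is the convention from the question.

```java
import java.util.ArrayList;
import java.util.List;

final class NullableFilter {
    private final StringBuilder where = new StringBuilder();
    private final List<String> args = new ArrayList<String>();

    /** Adds "column IS NULL" when value is 0, otherwise "column = ?". */
    NullableFilter andEqualsOrNull(String column, int value) {
        if (value == 0) {
            append(column + " IS NULL");   // no bind argument needed
        } else {
            append(column + " = ?");
            args.add(String.valueOf(value));
        }
        return this;
    }

    /** Adds an ordinary "column = ?" condition. */
    NullableFilter andEquals(String column, String value) {
        append(column + " = ?");
        args.add(value);
        return this;
    }

    private void append(String condition) {
        if (where.length() > 0) {
            where.append(" AND ");
        }
        where.append(condition);
    }

    String where() {
        return where.toString();
    }

    String[] args() {
        return args.toArray(new String[0]);
    }
}
```

The result could then be used as db.rawQuery("SELECT someField FROM T WHERE " + filter.where(), filter.args()), which keeps the placeholders and the arguments in step no matter which branch was taken.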
My app reads an XML file on the internet, takes note of the time and creates/writes an SQLite database. The next time data is required, if the time is >24hrs the database is updated (xml downloaded again).
The problem is that whenever I relaunch the app in AVD it has to re-download and so I notice that all the data in the database is written again (duplicated). So instead of 10 items, I have 20 (10+10 duplicates). If I relaunch again I get another 10 items duplicated.
I thought about how I could prevent the duplication of the database (or delete the old entries), so I decided to increment the database version every time the content is downloaded. I thought this would trigger the onUpgrade() method so the data would be cleared but nothing changes.
Now I am clueless. How should I go about this?
When you create your database, you'll want to use the UNIQUE constraint. You may not want the ON CONFLICT REPLACE that I use, but you should get the idea.
For example:
private static final String DATABASE_CREATE_NEWS= "create table news (_id integer primary key autoincrement, title text not null, description text not null, date text not null, LastModified text not null, UNIQUE(title, date) ON CONFLICT REPLACE);";
Here is another solid thread that talks about it as well.
SQLite table constraint - unique on multiple columns
Here is some more info on the android sqlite: http://developer.android.com/reference/android/database/sqlite/SQLiteDatabase.html
You should create a unique index on the columns that represent a unique identifier, so that duplicate rows are rejected.
See this article on SQLite's website.
CREATE UNIQUE INDEX ix_tblexample ON TableName ( Column1, Column2, Column3 [, Column4, etc..])
Or (as per your comment) you can select the table into a cursor and check for each one.
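String sql = "select * from " + tableName + " where column1 = ? and column2 = ?";
// Bind the values instead of concatenating them into the SQL string.
Cursor cur = _db.rawQuery( sql, new String[] { String.valueOf(param1), String.valueOf(param2) } );
try {
    if(cur.getCount() == 0)
    {
        //upload
    }
} finally {
    cur.close();
}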
I have an SQLite database that at the moment has a few tables, the biggest of which has over 10,000 rows. This table has four columns: id, term, definition, category. I have used the FTS3 module to speed up searching, which helped a lot. However, now when I try to fetch the 'next' or 'previous' row from the table, it takes longer than it did before I started using FTS3.
This is how I create the virtual table:
CREATE VIRTUAL TABLE profanity USING fts3(_id integer primary key,name text,definition text,category text);
This is how I fetch next/previous rows:
SELECT * FROM dictionary WHERE _id < "+id + " ORDER BY _id DESC LIMIT 1
SELECT * FROM dictionary WHERE _id > "+id + " ORDER BY _id LIMIT 1
When I run these statements on the virtual table:
NEXT term is fetched within ~300ms,
PREVIOUS term is fetched within ~200ms
When I do it with a normal table (the one created without FTS3):
NEXT term is fetched within ~3ms,
PREVIOUS term is fetched within ~2ms
Why is there such a big difference? Is there any way I can improve this speed?
EDITED:
I still can't get it to work!
The virtual table you've created is designed for full-text queries. It is not meant for fast processing of standard queries that use the primary key in the WHERE condition.
In this case there is no index on your _id column, so SQLite probably performs a full table scan.
The next problem is your query, which is totally inefficient. Try something like this (untested):
SELECT * FROM dictionary WHERE _id = (select max(_id) from dictionary where _id < ?)
The next thing you can consider is a redesign of your app. Instead of loading one row at a time, maybe you should fetch, say, 40 rows, keep them in memory, and load more data in the background when fewer than n rows remain before either end. A long SQL operation then becomes invisible to the user, even if it lasts 3 s instead of 0.3 s.
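The windowing idea can be sketched roughly as follows. This is an illustration under several assumptions, not a drop-in implementation: PrefetchWindow and Loader are hypothetical names, the loader stands in for whatever query returns a slice of rows, positions are row offsets rather than _id values, and loading happens synchronously here, whereas on Android it would be moved to a background thread.

```java
import java.util.ArrayList;
import java.util.List;

final class PrefetchWindow<T> {
    /** Loads up to limit rows starting at row offset offset. */
    interface Loader<T> {
        List<T> load(int offset, int limit);
    }

    private final Loader<T> loader;
    private final int windowSize;
    private List<T> rows = new ArrayList<T>();
    private int windowStart = 0;

    PrefetchWindow(Loader<T> loader, int windowSize) {
        this.loader = loader;
        this.windowSize = windowSize;
    }

    /** Returns the row at the given position, or null if there is none. */
    T get(int position) {
        if (position < 0) {
            return null;
        }
        boolean outside = position < windowStart
                || position >= windowStart + rows.size();
        if (outside) {
            // Reload a window roughly centered on the requested position so
            // that stepping in either direction stays in memory for a while.
            windowStart = Math.max(0, position - windowSize / 2);
            rows = loader.load(windowStart, windowSize);
        }
        int index = position - windowStart;
        return index < rows.size() ? rows.get(index) : null;
    }
}
```

With a window of 40 rows, stepping to the next or previous term only touches the database when the user leaves the loaded range.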
If you're running LIMIT 1 to begin with, you can remove the order by clause completely. This may help. I'm not familiar with FTS3, however.
You could also just flat out assign your id variable a ++ or -- and assert WHERE _id = "+id+" LIMIT 1, which would make a single lookup instead of < or >.
Edit: and now that I look back at what I typed, if you do it that way, you can just remove LIMIT 1 completely, since your _id is your pk and must be unique.
hey look, a raw where clause!