I'm creating a database for event objects; each event has a priority (a long) which is unique. Objects with lower priority values are favored. Newer objects are initially assigned progressively higher priorities as the number of objects grows, but the user can reassign priorities at will.
My question is: should I use just _ID as the priority field? I figure that way, when I select the events and put them into my ArrayList in Java, they will already be sorted by priority high to low and save me the trouble of having to search. And depending on what algorithms SQLite actually uses, selecting by priority may be faster since it's also an index. I'm new to SQL and this all seems fine to me, but there might be some drawback that is plain to someone more experienced.
Also, I'm a bit unclear on how insertion into the middle of the table would work. I would probably use an UPDATE to increment the row IDs above the insertion point, but how do I make sure they are incremented in the proper order (so the updates don't tread on each other)?
Something like this?:
UPDATE event_table SET priority = priority + 1 WHERE priority > ?
P.S.: I'm doing this on Android 2.2 with SQLite 3.
I think having a priority value with a decimal datatype would make it easier to insert into the middle of the table, and it would not require you to do massive shifting of rows when you re-prioritize.
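For example, here is a minimal sketch of that idea on Android (the event_table, priority, and name identifiers are assumptions, not taken from your schema): to insert between two neighbouring events, give the new row the midpoint of their priorities, so no other rows have to shift.
// Assumed schema: event_table(priority REAL, name TEXT), lower priority = favored.
void insertBetween(SQLiteDatabase db, double lowerPriority, double higherPriority, String name) {
    ContentValues values = new ContentValues();
    // Midpoint of the two neighbours, e.g. between 2.0 and 3.0 the new row gets 2.5.
    values.put("priority", (lowerPriority + higherPriority) / 2.0);
    values.put("name", name);
    db.insert("event_table", null, values);
}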
I have an SQLite DB where I perform a query like
SELECT * FROM table WHERE col_name NOT IN ('val1','val2')
Basically, I'm getting a huge list of values from the server, and I need to select the rows whose values are not present in that list.
Currently it's working fine with no issues, but the number of values from the server is becoming huge as the server DB is updated frequently.
So I may get thousands of String values that I need to pass to the NOT IN clause.
My question is: will this cause any performance issues in the future? Does the NOT IN clause have any size restriction (like a maximum of 10,000 values you can check)?
Will it cause any crash at some point?
The official SQLite documentation lists the various limits: https://www.sqlite.org/limits.html. I think the Maximum Length Of An SQL Statement limit may relate to your case; its default value is 1,000,000 bytes, and it is adjustable.
Apart from that, I don't think there is any limit on the number of values in a NOT IN clause.
With more than a few values to test for, you're better off putting them in a table that has an index on the column holding them. Then things like
SELECT *
FROM table
WHERE col_name NOT IN (SELECT value_col FROM value_table);
or
SELECT *
FROM table AS t
WHERE NOT EXISTS (SELECT 1 FROM value_table WHERE value_col = t.col_name);
will be reasonably efficient no matter how many records are in value_table because that index will be used to find entries.
Plus, of course, it makes it a lot easier to reuse prepared statements, because you don't have to create a new one and re-bind every value each time you add a value to the set you need to check (you are using prepared statements with placeholders for these values, right, and not splicing their contents into the SQL string?). You just insert the new value into value_table instead.
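On Android, a minimal sketch of that approach might look like the following (my_table, value_table, and the column names are assumptions; db is an open SQLiteDatabase and serverValues is the list of strings received from the server):
// One-time setup: an indexed side table holding the values to exclude.
db.execSQL("CREATE TABLE IF NOT EXISTS value_table (value_col TEXT)");
db.execSQL("CREATE INDEX IF NOT EXISTS idx_value_col ON value_table (value_col)");

// Insert the server-supplied values inside one transaction,
// reusing a single compiled statement instead of rebuilding SQL each time.
SQLiteStatement insert = db.compileStatement("INSERT INTO value_table (value_col) VALUES (?)");
db.beginTransaction();
try {
    for (String value : serverValues) {
        insert.bindString(1, value);
        insert.executeInsert();
    }
    db.setTransactionSuccessful();
} finally {
    db.endTransaction();
}

// The query itself never changes, no matter how many values arrive from the server.
Cursor c = db.rawQuery(
        "SELECT * FROM my_table t WHERE NOT EXISTS " +
        "(SELECT 1 FROM value_table v WHERE v.value_col = t.col_name)", null);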
Yes, if you bind the values as parameters there is a default limit of 999 host parameters per statement (SQLITE_MAX_VARIABLE_NUMBER), as reported in the official documentation: https://www.sqlite.org/limits.html#max_variable_number
I have about 3000 (key, value) pairs. They are fixed and will never change. In my app, there is a page that needs to make around 200 queries; each query takes a key and asks for its value. The queries are also sequential: I have to finish query 1 to get "value 1" before I know the key for query 2 to get "value 2".
I tried to implement this with SQLite. I measured the time and found it is very slow, taking around 600 ms. I wonder if there is a better way to implement it, for example a string array of size 3000, or a HashMap? Thanks for the advice.
Edit: I forgot to mention the size of the keys and values. Each key is 2 Unicode characters and each value is 4-6 characters; in fact, it is similar to a language dictionary lookup.
The answer depends on how much data has to be put in this container. E.g. 3000 pairs of five bytes each are no issue to keep in memory; 3000 pairs of 350 bytes each, however, already come to about 1 MB.
If you have a rather small amount of data, you could think about using a static SparseArray that is initially filled by an SQL query or by assignments in code. SparseArrays are intended to be more efficient than Hashtables.
If the key isn't an integer, a Hashtable (or HashMap) is still much faster than an SQL query.
If you have a rather large data set, you could use an LruCache.
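A sketch of the in-memory approach (the dict_table schema is an assumption): load the 3000 pairs once, then run the ~200 sequential lookups against a plain HashMap instead of issuing 200 separate queries.
// Assumed schema: dict_table(key_col TEXT, value_col TEXT).
static java.util.Map<String, String> sDictionary;

static void loadDictionary(SQLiteDatabase db) {
    sDictionary = new java.util.HashMap<String, String>(4096);
    Cursor c = db.rawQuery("SELECT key_col, value_col FROM dict_table", null);
    try {
        while (c.moveToNext()) {
            sDictionary.put(c.getString(0), c.getString(1));
        }
    } finally {
        c.close();
    }
}

// Each lookup is then a constant-time map access instead of a separate query.
static String lookup(String key) {
    return sDictionary.get(key);
}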
What is the best way to maintain a "cumulative sum" of a particular data column in SQLite? I have found several examples online, but I am not 100% certain how I might integrate these approaches into my ContentProvider.
In previous applications, I have tried to maintain cumulative data myself, updating the data each time I insert new data into the table. For example, in the sample code below, every time I would add a new record with a value score, I would then manually update the value of cumulative_score based on its value in the previous row.
_id    score    cumulative_score
1      100      100
2      50       150
3      25       175
4      25       200
5      10       210
However, this is far from ideal and becomes very messy when handling tables with many columns. Is there a way to somehow automate the process of updating cumulative data each time I insert/update records in my table? How might I integrate this into my ContentProvider implementation?
I know there must be a way to do this... I just don't know how. Thanks!
Probably the easiest way is with a SQLite trigger. That is the closest I know
of to "automation". Just have an insert trigger that takes the previous
cumulative sum, adds the current score and stores it in the new row's cumulative
sum. Something like this (assuming _id is the column you are ordering on):
CREATE TRIGGER calc_cumulative_score AFTER INSERT ON tablename FOR EACH ROW
BEGIN
  UPDATE tablename SET cumulative_score =
    -- running total of the previous row (0 when this is the first row) ...
    IFNULL((SELECT cumulative_score
            FROM tablename
            WHERE _id = (SELECT MAX(_id) FROM tablename WHERE _id < new._id)), 0)
    -- ... plus the score that was just inserted
    + new.score
  WHERE _id = new._id;
END;
Make sure that the trigger and the original insert are in the same transaction. For arbitrary updates of the score column, you would have to implement a recursive trigger that somehow finds the next highest id (perhaps by selecting the minimum id among the rows with an id greater than the current one) and updates its cumulative sum.
If you are opposed to using triggers, you can do more or less the same thing in
the ContentProvider in the insert and update methods manually, though since
you're pretty much locked into SQLite on Android, I don't see much reason not to
use triggers.
I assume you want to do this as an optimization; otherwise you could just calculate the sum on demand (O(n) vs. O(1), so you'd have to consider how big n might get and how often you need the sums).
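If the on-demand calculation is acceptable, it can be done with a correlated subquery at read time (a sketch, reusing the same hypothetical tablename, _id, and score columns):
// Computes the running total when reading instead of storing it.
// Note: this is O(n) per row, so O(n^2) to list the whole table.
Cursor c = db.rawQuery(
        "SELECT _id, score, " +
        "       (SELECT SUM(t2.score) FROM tablename t2 WHERE t2._id <= t1._id) AS cumulative_score " +
        "FROM tablename t1 ORDER BY _id", null);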
I am fetching my data by id, which is an INTEGER PRIMARY KEY.
But after deleting a row, if I then run a SELECT query to show all rows, my app force-closes because one id is missing.
I want the id column to auto-increment and auto-decrement itself.
When I delete a record at the end (e.g. id = 7) and then add a row, the new id must be 7, not 8. Likewise, when I delete a row in the middle (e.g. id = 3), all the following rows should automatically shift down so the ids stay contiguous.
Your ideas would help me.
Most systems with auto-incrementing columns keep track of the last value inserted (or the next one to be inserted) and do not ever reissue a number (give the same number twice), even if the last number issued has been removed from the table.
Judging from what you are asking, SQLite is another such system.
If there is any concurrency in the system, then this is risky, but for a single-user, single-app-at-a-time system, you might get away with:
SELECT MAX(id_column) + 1 FROM YourTable
to find the next available value. Depending on how SQLite behaves, you might be able to embed that in the VALUES list of an INSERT statement:
INSERT INTO YourTable(id_column, ...)
VALUES((SELECT MAX(id_column) + 1 FROM YourTable), ...);
That may not work; you may have to do this as two operations. Note that if there is any concurrency, the two-statement form is a bad idea™. The primary key unique constraint normally prevents disaster, but one of two concurrent statements will fail because it tries to insert a value that the other just inserted, so it has to retry and hope for the best. Clearly, a cell phone has less concurrency than, say, a web server, so the problem is correspondingly less severe. But be careful.
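If you do go the two-operation route on Android, a minimal sketch (with the same hypothetical YourTable/id_column names) would wrap both steps in a single transaction:
db.beginTransaction();
try {
    // Step 1: find the next available id (MAX + 1, or 1 for an empty table).
    long nextId = 1;
    Cursor c = db.rawQuery("SELECT MAX(id_column) FROM YourTable", null);
    if (c.moveToFirst() && !c.isNull(0)) {
        nextId = c.getLong(0) + 1;
    }
    c.close();

    // Step 2: insert with that id. The PRIMARY KEY constraint still protects
    // against duplicates if another writer sneaks in between the two steps.
    ContentValues values = new ContentValues();
    values.put("id_column", nextId);
    // ... put the other columns here ...
    db.insertOrThrow("YourTable", null, values);

    db.setTransactionSuccessful();
} finally {
    db.endTransaction();
}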
On the whole, though, it is best to let gaps appear in the sequence without worrying about it. It is usually not necessary to worry about them. If you must worry about gaps, don't let people make them in the first place. Or move an existing row to fill in the gap when you do a delete that creates one. That still leaves deletes at the end creating gaps when new rows are added, which is why it is best to get over the "it must be a contiguous sequence of numbers" mentality. Auto-increment guarantees uniqueness; it does not guarantee contiguity.
We have about 7-8 tables in our Android application, each having about 8 columns on average. Both read and write operations are performed on the database, and I am experimenting to find ways to enhance the performance of the DataAccess layer. So far I have tried the following:
Use positional arguments (placeholders) in WHERE clauses (reason: so that SQLite reuses the same execution plan)
Enclose inserts and updates in transactions (reason: every DB operation is wrapped in a transaction by default, so doing this removes that overhead)
Indexing: I have not created any explicit indexes other than those created by default on the primary key and unique key columns (reason: indexing will improve seek time)
I have mentioned my assumptions in parentheses; please correct me if I am wrong.
Questions:
Can I add anything else to this list? I read somewhere that avoiding the use of the db journal can improve the performance of updates. Is this a myth or a fact? How can this be done, if recommended?
Are nested transactions allowed in SQLite 3? How do they affect performance?
The thing is, I have a function that runs an update in a loop, so I have enclosed the loop within a transaction block. Sometimes this function is called from another loop inside some other function, and the calling function also encloses its loop within a transaction block. How does such nesting of transactions affect performance?
The WHERE clauses in my queries use more than one column to build the predicate, and these columns are not necessarily primary key or unique columns. Should I create indices on these columns too? Is it a good idea to create multiple indices for such a table?
Pin down exactly which queries you need to optimize. Grab a copy of a typical database and use the REPL to time queries. Use this to benchmark any gains as you optimize.
Use ANALYZE to allow SQLite's query planner to work more efficiently.
For SELECTs and UPDATEs, indexes can speed things up, but only if the indexes you create can actually be used by the queries that you need to speed up. Use EXPLAIN QUERY PLAN on your queries to see which index would be used, or whether the query requires a full table scan. For large tables, a full table scan is bad and you probably want an index. A few things to keep in mind:
Only one index will be used on any given query. If you have multiple predicates, the index that will be used is the one expected to reduce the result set the most (based on ANALYZE).
You can have indexes that contain multiple columns (to assist queries with multiple predicates), but a multi-column index is usable only if the predicates fit the index from left to right with no gaps (unused columns at the end are fine); see the sketch below.
If you use an ordering predicate (<, <=, >, etc.), it needs to be on the last used column of the index.
Using both WHERE predicates and ORDER BY each requires an index, and SQLite can only use one, so that can be a point where performance suffers.
The more indexes you have, the slower your INSERTs will be, so you will have to work out the best trade-off for your situation.
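As an illustration (the events table and its columns are hypothetical, not taken from the question), a multi-column index that serves both a WHERE predicate and an ORDER BY might look like this:
// Index on (category, created_at): equality on the first column plus
// ordering on the second can both be satisfied from this one index.
db.execSQL("CREATE INDEX IF NOT EXISTS idx_events_category_created" +
           " ON events (category, created_at)");

// Check with EXPLAIN QUERY PLAN (see the logcat snippet later in this thread)
// that the query actually uses idx_events_category_created.
Cursor c = db.rawQuery(
        "SELECT * FROM events WHERE category = ? ORDER BY created_at",
        new String[] { "alarm" });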
If you have more complex queries that can't make use of any indexes that you might create, you can de-normalize your schema, structuring your data in such a way that the queries are simpler and can be answered using indexes.
If you are doing a large number of INSERTs, try dropping indexes and recreating them at the end. You will need to benchmark this.
SQLite does support nested transactions using savepoints, but I'm not sure that you'll gain anything there performance-wise.
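On Android specifically, SQLiteDatabase lets you nest beginTransaction() calls; as far as I know the inner calls are just reference-counted rather than turned into real savepoints, so nesting mainly affects when the commit happens, not how much work is done. A sketch (the event_table/score/_id names are hypothetical):
void outer(SQLiteDatabase db) {
    db.beginTransaction();                 // outer transaction
    try {
        for (int i = 0; i < 10; i++) {
            inner(db);                     // nested call is allowed
        }
        db.setTransactionSuccessful();
    } finally {
        db.endTransaction();               // everything commits (or rolls back) here
    }
}

void inner(SQLiteDatabase db) {
    db.beginTransaction();                 // nested: no separate commit of its own
    try {
        db.execSQL("UPDATE event_table SET score = score + 1 WHERE _id = ?",
                   new Object[] { 42 });
        db.setTransactionSuccessful();     // omitting this rolls back the whole outer transaction
    } finally {
        db.endTransaction();
    }
}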
You can gain lots of speed by compromising on data integrity. If you can recover from database corruption yourself, then this might work for you. You could perhaps only do this when you're doing intensive operations that you can recover from manually.
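On the db-journal question above: relaxing journaling and syncing does speed up writes, at the cost of losing data (or the whole database) if the app or device dies mid-write. A hedged sketch, only for data you can rebuild:
// Trade durability for write speed.
SQLiteDatabase db = dbHelper.getWritableDatabase();

// Keep the rollback journal in memory instead of on disk.
// journal_mode returns a result row, so use rawQuery rather than execSQL.
Cursor c = db.rawQuery("PRAGMA journal_mode = MEMORY", null);
c.moveToFirst();   // result is "memory" if the change took effect
c.close();

// Don't wait for the OS to confirm that writes reached the disk.
db.execSQL("PRAGMA synchronous = OFF");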
I'm not sure how much of this you can get to from an Android application. There is a more detailed guide for optimizing SQLite in general in the SQLite documentation.
Here's a bit of code to get EXPLAIN QUERY PLAN results into Android logcat from a running Android app. I'm starting with an SQLiteOpenHelper dbHelper and an SQLiteQueryBuilder qb.
String sql = qb.buildQuery(projection, selection, selectionArgs, groupBy, having, sortOrder, limit);
android.util.Log.d("EXPLAIN", sql + "; " + java.util.Arrays.toString(selectionArgs));
Cursor c = dbHelper.getReadableDatabase().rawQuery("EXPLAIN QUERY PLAN " + sql, selectionArgs);
if (c.moveToFirst()) {
    do {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < c.getColumnCount(); i++) {
            sb.append(c.getColumnName(i)).append(":").append(c.getString(i)).append(", ");
        }
        android.util.Log.d("EXPLAIN", sb.toString());
    } while (c.moveToNext());
}
c.close();
I dropped this into my ContentProvider.query() and now I can see exactly how all the queries are getting performed. (In my case it looks like the problem is too many queries rather than poor use of indexing; but maybe this will help someone else...)
I would add these:
Using rawQuery() instead of building queries with ContentValues will speed things up in certain cases. Of course, it is a little tedious to write raw queries.
If you have a lot of string/text data, consider creating virtual tables using full-text search (FTS3), which can run queries faster. You can search Google for the exact speed improvements.
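A minimal FTS3 sketch (the notes_fts table and its columns are just examples):
// Create an FTS3 virtual table; all columns in an FTS table are text.
db.execSQL("CREATE VIRTUAL TABLE notes_fts USING fts3(title, body)");

// Full-text queries use MATCH instead of LIKE and are served by the FTS index.
Cursor c = db.rawQuery(
        "SELECT title FROM notes_fts WHERE notes_fts MATCH ?",
        new String[] { "sqlite" });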
A minor point to add to Robie's otherwise comprehensive answer: the VFS in SQLite (which is mostly concerned with locking) can be swapped out for alternatives. You may find one of the alternatives like unix-excl or unix-none to be faster but heed the warnings on the SQLite VFS page!
Normalization (of table structures) is also worth considering (if you haven't already) simply because it tends to provide the smallest representation of the data in the database; this is a trade-off, less I/O for more CPU, and one that is usually worthwhile in medium-scale enterprise databases (the sort I'm most familiar with), but I'm afraid I've no idea whether the trade-off works well on small-scale platforms like Android.