Using Like in Sqlite with android for searching in encrypted data - android

I have a table (Table1) in SQLite with one column (Col1). The table has 100,000 rows, and all values in Col1 are encrypted with a special algorithm.
I've used a select ... like query in Android, like this:
Select Col1
from Table1
where Col1 like 'A%';
I want to return all rows that start with the letter 'A'.
But Col1 is actually encrypted! Even if I use this:
"Select Col1 from Table1 where Col1 like '"+my_method_encryption("A")+"%';" .. it will be wrong, because the encrypted values in Col1 for strings starting with 'A' may not start with the return value of my_method_encryption("A").
What should I do?
Actually There is another way to solve it, if I select all 100,000 rows and after that I will decrypt all 100,000 rows and then search. But this way will be very slow, because I may need to run this select ... like more than 10 times.
Thanks

Encrypt the database file that contains the plain-text column and the table key. Create a separate db file for the non-encrypted data and link the two by that key.
Decrypt the file on open, which makes searches possible, and join the other data on the key.

where Col1 like '"+my_method_encryption("A")+"%';"
That won't work. The whole point of encryption is that you cannot tell from the encrypted text what the first character(s) of the plaintext are; otherwise encryption would be meaningless.
What you could do is move your decryption function into SQLite as a user function (https://www.sqlite.org/appfunc.html)
Then use something like:
where my_decrypt(Col1,'key') like 'A%'
But this would be identical to what you wrote:
Actually There is another way to solve it, if I select all 100,000 rows and after that I will decrypt all 100,000 rows and then search.
This is exactly what SQLite will do, just internally: it decrypts every row and then compares.
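A minimal sketch of the user-function approach, using Python's sqlite3 module as a stand-in for the Android API. The names toy_encrypt, toy_decrypt and find_by_prefix are hypothetical, and the XOR "cipher" is only a placeholder for the real algorithm; it is not secure. The point it illustrates is that my_decrypt is called once per row, so the query is still a full scan.

```python
import sqlite3

def toy_encrypt(plaintext, key):
    # Hypothetical stand-in for my_method_encryption; XOR is NOT real encryption.
    return bytes(b ^ key for b in plaintext.encode("utf-8"))

def toy_decrypt(ciphertext, key):
    # Hypothetical stand-in for the real decryption routine.
    return bytes(b ^ key for b in ciphertext).decode("utf-8")

def find_by_prefix(conn, prefix, key):
    # my_decrypt runs for every row: SQLite cannot use an index here.
    rows = conn.execute(
        "SELECT my_decrypt(Col1, ?) FROM Table1 "
        "WHERE my_decrypt(Col1, ?) LIKE ?",
        (key, key, prefix + "%"),
    )
    return sorted(row[0] for row in rows)

KEY = 0x5A  # illustrative single-byte key
conn = sqlite3.connect(":memory:")
# Register the Python function as the SQL function my_decrypt(value, key).
conn.create_function("my_decrypt", 2, toy_decrypt)
conn.execute("CREATE TABLE Table1 (Col1 BLOB)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?)",
    [(toy_encrypt(w, KEY),) for w in ["Apple", "Banana", "Avocado", "apricot"]],
)

# LIKE is case-insensitive for ASCII by default, so 'apricot' matches 'A%' too.
print(find_by_prefix(conn, "A", KEY))
```

On Android the same idea needs a SQLite build that lets you register custom functions, which is an assumption about your setup rather than something the stock API is known to offer.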
However, what you are trying to achieve seems to be conceptually wrong; namely:
your encryption scope is a row:column
you want to query the encrypted data in a way that usually requires an external aggregate structure (an index). That is the only way to get sub-linear search performance.
You should consider expanding your encryption scope to the whole table; for instance, SQLite provides the SQLite Encryption Extension (https://www.sqlite.org/see/doc/trunk/www/readme.wiki), which encrypts the whole database at block level (database scope, block unit). You can also logically join an unencrypted db with an encrypted db: using ATTACH you bring both dbs into the same scope, then use a normal JOIN, maybe even in a view, to bring the data together.
I'm not familiar with the Android ecosystem, but a simple search for "android sqlite encryption extension" shows that there is no shortage of alternatives for DB-level encryption.
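The ATTACH plus JOIN idea can be sketched as follows, again with Python's sqlite3 module. Both files here are ordinary unencrypted SQLite files; in a real setup the file called secret.db would be opened through an encryption extension, which this sketch does not do. The table Extra, the index name and the file names are hypothetical.

```python
import os
import sqlite3
import tempfile

def build_and_query(prefix):
    tmp = tempfile.mkdtemp()
    secret_path = os.path.join(tmp, "secret.db")   # would be the encrypted db
    public_path = os.path.join(tmp, "public.db")   # non-encrypted data

    s = sqlite3.connect(secret_path)
    s.execute("CREATE TABLE Table1 (id INTEGER PRIMARY KEY, Col1 TEXT)")
    # With plaintext visible inside the database, a normal index is possible.
    s.execute("CREATE INDEX table1_col1 ON Table1(Col1)")
    s.executemany("INSERT INTO Table1 VALUES (?, ?)",
                  [(1, "Apple"), (2, "Banana"), (3, "Avocado")])
    s.commit()
    s.close()

    p = sqlite3.connect(public_path)
    p.execute("CREATE TABLE Extra (id INTEGER PRIMARY KEY, note TEXT)")
    p.executemany("INSERT INTO Extra VALUES (?, ?)",
                  [(1, "red"), (2, "yellow"), (3, "green")])
    p.commit()

    # Bring both databases into one connection and join them on the key.
    p.execute("ATTACH DATABASE ? AS secret", (secret_path,))
    rows = p.execute(
        "SELECT s.Col1, e.note "
        "FROM secret.Table1 AS s JOIN main.Extra AS e ON e.id = s.id "
        "WHERE s.Col1 LIKE ? ORDER BY s.Col1",
        (prefix + "%",),
    ).fetchall()
    p.close()
    return rows

print(build_and_query("A"))
```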

Related

Indexes in SQLite and ordering of rows based on PK

Does a PK in SQLite guarantee order of data?
AFAIK index implementations store data in PK order.
Does this apply for SQLite? Even for a composite PK?
The documentation says:
If a SELECT statement that returns more than one row does not have an ORDER BY clause, the order in which the rows are returned is undefined.
The presence of a primary key or any other index does not change this; there is no guarantee that that index is actually used for the query.
If you want the output of a query to be sorted, you must use an ORDER BY. (If the ordering can be trivially implemented with the index, this will not be any less efficient than the same query without the ORDER BY clause.)
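A small sketch of that advice with a composite primary key, using Python's sqlite3 module. The table t and its columns are made up for illustration. The rows are inserted out of order and the explicit ORDER BY is what guarantees the result order; the query plan is printed so you can check on your own SQLite version whether the sort is served from the primary-key index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (a INTEGER, b INTEGER, payload TEXT, PRIMARY KEY (a, b))"
)
# Deliberately inserted out of primary-key order.
conn.executemany("INSERT INTO t VALUES (?, ?, ?)",
                 [(2, 1, "c"), (1, 2, "b"), (1, 1, "a")])

QUERY = "SELECT a, b, payload FROM t ORDER BY a, b"

def rows_in_pk_order(conn):
    # The ORDER BY clause, not the primary key, is what defines the order.
    return conn.execute(QUERY).fetchall()

def query_plan(conn):
    # The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]

print(rows_in_pk_order(conn))
print(query_plan(conn))
```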
Do you want to know the underlying implementation of indexes in SQLite and how SQLite stores data on disk? The "File Format For SQLite Databases" document may help; there is also a write-up in Chinese.

What is the advantage of FTS over custom solution?

I have a biggish database (~32 MB) with lots of text in 4 languages, including Arabic and Urdu. I need to search this text in the most efficient way (speed & size).
I am considering FTS, and trying to find out how to implement it. Right now I am reading http://www.sqlite.org/fts3.html#section_1_2 about it.
It seems to me, an FTS table is just like a normal table used to index all the different words. So my questions are:
1) If to populate FTS I have to do all the inserts myself, then why not make my own indexed word table, what is the difference?
Answer: Yes, there are many advantages and many built-in functions that help, for example with ranking and searching of stems, and the transparent way it all works in Android makes the FTS approach more appealing.
2) On the google docs I read it's a virtual in memory table, now this would be massive right... but it doesn't mention this on the SQLite website. So which is it?
3) Is there an easy way to generate all the different words from my columns?
4) Will the FTS handle arabic words properly?
FTS allows fast searching for words; normal indexes only allow searching for entire values or for the beginning of a value.
If your table has only one word in each field, using FTS does not make sense.
FTS is a virtual table, but not an in-memory table.
You can get individual terms from the full-text index with the fts4aux table.
The default tokenizer works only with ASCII text.
You have to test whether the ICU or UNICODE61 tokenizers work with your data.
1) If to populate FTS I have to do all the inserts myself, then why
not make my own indexed word table, what is the difference?
With your own indexed word table, you would have to parse the sentences into words. You would then need one table for sentences and another for words, and you would have to do all of this efficiently.
2) On the google docs I read it's a virtual in memory table, now this
would be massive right... but it doesn't mention this on the SQLite
website. So which is it?
I don't understand your question. Data is handled via the virtual table extension, but the backing storage is in the database itself (FTS4 creates 5 tables for each virtual table). Check this:
sqlite> CREATE VIRTUAL TABLE docs USING fts4();
sqlite> .schema
CREATE VIRTUAL TABLE docs USING fts4();
CREATE TABLE 'docs_content'(docid INTEGER PRIMARY KEY, 'content');
CREATE TABLE 'docs_segments'(blockid INTEGER PRIMARY KEY, block BLOB);
CREATE TABLE 'docs_segdir'(level INTEGER,idx INTEGER,start_block INTEGER,leaves_end_block INTEGER,end_block INTEGER,root BLOB,PRIMARY KEY(level, idx));
CREATE TABLE 'docs_docsize'(docid INTEGER PRIMARY KEY, size BLOB);
CREATE TABLE 'docs_stat'(id INTEGER PRIMARY KEY, value BLOB);
sqlite>
3) Is there an easy way to generate all the different words from my
columns?
For sure. But that's not easy. That's what FTS does.
4) Will the FTS handle arabic words properly?
I'm not sure. Does Arabic work with the ICU word-boundary rules? From the tokenizer documentation:
The ICU tokenizer implementation is very simple. It splits the input
text according to the ICU rules for finding word boundaries and
discards any tokens that consist entirely of white-space. This may be
suitable for some applications in some locales, but not all. If more
complex processing is required, for example to implement stemming or
discard punctuation, this can be done by creating a tokenizer
implementation that uses the ICU tokenizer as part of its
implementation.

How to generate database with table of variable number of columns?

In my Android app, I need to temporarily store some data in a form of table such as follows:
id | column 1 | column 2 | ... | column n
The data are downloaded from a server whenever the user presses a button. However, the data table doesn't have a fixed number of columns (or rows) every time the user downloads it from the server. For example, the server may send data with 3 columns the first time, then data with 5 columns the second time, etc.
Given this scenario, I think a database is probably the right data structure to use. My plan is to create a database, then add and delete tables as necessary. So I have been reading various tutorials on Android databases (one example is this one http://www.codeproject.com/Articles/119293/Using-SQLite-Database-with-Android#). It seems to me I cannot create a new table with a variable number of columns using the SQLite database. Is this correct? In the onCreate(SQLiteDatabase db) method, the "create table" command must be specified with a known number of columns and their data types. I could provide several "create table" commands, each with a different number of columns, but that seems very crude. Is there a way to create database tables with a variable number of columns on the fly?
Another alternative is probably to use several hash tables, each storing one column of the data table. I'm seriously considering this approach if the database approach is not possible. Any better suggestion is welcome.
There is no such thing as a variable number of columns in an SQLite database. Also, adding and deleting tables dynamically seems like a horrible hack.
It sounds like you want to store an array of values associated with an id. I suggest you think in terms of rows, not columns. Use a table structure like (id, index, value); each array of values returned by the server results in as many rows as necessary to store the values.
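A sketch of the (id, index, value) layout using Python's sqlite3 module. The table name cell and the functions store and load are hypothetical. The position column is called idx here because INDEX is an SQL keyword and would otherwise need quoting.

```python
import sqlite3

def store(conn, table_rows):
    # table_rows maps a row id to its list of values; lists may differ in length.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cell ("
        " id INTEGER, idx INTEGER, value TEXT,"
        " PRIMARY KEY (id, idx))"
    )
    conn.execute("DELETE FROM cell")  # replace the previous download
    conn.executemany(
        "INSERT INTO cell (id, idx, value) VALUES (?, ?, ?)",
        [(rid, i, v)
         for rid, values in table_rows.items()
         for i, v in enumerate(values)],
    )
    conn.commit()

def load(conn):
    # Rebuild each logical row from its cells, in column order.
    result = {}
    for rid, value in conn.execute(
        "SELECT id, value FROM cell ORDER BY id, idx"
    ):
        result.setdefault(rid, []).append(value)
    return result

conn = sqlite3.connect(":memory:")
store(conn, {1: ["a", "b", "c"], 2: ["d", "e", "f"]})            # 3 columns
store(conn, {1: ["a", "b", "c", "d", "e"], 2: ["f", "g", "h", "i", "j"]})  # 5 columns
print(load(conn))
```

The schema never changes between downloads; only the number of rows per id does.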

How to organize sqlite database

This is more a question of theory than anything else. I am writing an Android app that uses a pre-packaged database. The purpose of the app is solely to search through this database and return values. I'll provide some abstract examples to illustrate my implementation and quandary. The user can search by "Thing Name", and what I want returned to the user is values a, b, and c. I initially designed the database to have it all contained in a single sheet, with column 1 as key_index, column 2 as name, column 3 as a, and so on. When the user searches, the cursor returns the key_index, which is then used to pull values a, b and c.
However, in my database "Thing alpha" can have a value a = 4 or a = 6. I do not want to repeat data in the database, i.e. have multiple rows with the same "Thing alpha" that differ only in their "a" values. So what is the best way to organize the data given this situation? Do I keep all the "Thing Names" in a single sheet, and all the data separately? This is really a question of proper database design, which is definitely something foreign to me. Thanks for your help!
There's a thing called database normalization: http://en.wikipedia.org/wiki/Database_normalization. You usually want to avoid redundancy and dependency in the DB entities by using a corresponding design with surrogate keys, foreign keys and so on. Your "Thing alpha" looks like you want a many-to-many table, like e.g. one or many songs belonging to the same or different genres. You may want to create dictionary tables to hold your id, name pairs and have foreign keys referencing these tables. In your case it will be mostly a read-only DB, so you might want to consider creating indexes with a high FILLFACTOR percentage (though I don't think SQLite allows that). There are many ways to design the database. Everything depends on the purpose of the DB. You can even start with the design of your hardware, like RAIDs, file systems and DB block sizes matched to the file system's block sizes to keep the I/O optimal, and where to put your tablespaces/filegroups/indexes to balance the I/O load. DB design theory is a really deep subject, which should not be underestimated and cannot be covered in a few sentences in a Stack Overflow answer. :)
Without understanding your data better, here is my guess at what you are looking for.
table: product
- _id
- name
table: attribute
- product_id
- a
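The two-table layout above could look like this in practice, sketched with Python's sqlite3 module. The column types, the index name and the helper values_for are assumptions added for illustration; only the table and column names come from the outline above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    _id  INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE attribute (
    product_id INTEGER NOT NULL REFERENCES product(_id),
    a          INTEGER NOT NULL
);
CREATE INDEX attribute_product ON attribute(product_id);
""")

# "Thing alpha" is stored once; each of its "a" values is a separate row.
conn.execute("INSERT INTO product (_id, name) VALUES (1, 'Thing alpha')")
conn.execute("INSERT INTO product (_id, name) VALUES (2, 'Thing beta')")
conn.executemany("INSERT INTO attribute (product_id, a) VALUES (?, ?)",
                 [(1, 4), (1, 6), (2, 9)])

def values_for(conn, name):
    # Look the product up by name and collect all of its "a" values.
    rows = conn.execute(
        "SELECT attribute.a FROM product "
        "JOIN attribute ON attribute.product_id = product._id "
        "WHERE product.name = ? ORDER BY attribute.a",
        (name,),
    )
    return [row[0] for row in rows]

print(values_for(conn, "Thing alpha"))
```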

Can I declare table name with escape sequences in sqlite3?

Is it possible to create a table name with escape sequences?
like TableName:exampl's
I have an EditText whose input looks like that, and I want to create a table named after it; there is no restriction on what can be typed into the EditText.
Yes, it is possible. Or at least sqlite3 itself does not forbid this.
The following example would create the table tbl'1
create table "tbl'1"(one varchar(10), two smallint);
But.
There are several reasons why you should not do that:
Naming tables after user input is simply not acceptable. (http://xkcd.com/327/)
I assume that you are using a database wrapper and do not access the sqlite3 file directly. If so, then this solution may eventually fail.
If you have a valid database model, there will be no need to create tables dynamically. Insert rows for new data instead. There you can use as many escape characters as you want.
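A sketch of the "insert rows instead" advice using Python's sqlite3 module. The table lists and the helpers add_list and list_names are hypothetical. The user's text is passed as a bound parameter, so quotes and other special characters are stored verbatim and never interpreted as SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One fixed table holds every user-supplied name as a row.
conn.execute("CREATE TABLE lists (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

def add_list(conn, user_text):
    # The ? placeholder binds the value; no manual escaping is needed.
    cur = conn.execute("INSERT INTO lists (name) VALUES (?)", (user_text,))
    return cur.lastrowid

def list_names(conn):
    return [row[0] for row in conn.execute("SELECT name FROM lists ORDER BY id")]

def table_names(conn):
    sql = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    return [row[0] for row in conn.execute(sql)]

add_list(conn, "TableName:exampl's")
add_list(conn, "Robert'); DROP TABLE lists;--")

print(list_names(conn))
print(table_names(conn))  # still only the one fixed table
```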
