Android SQLite FTS3: how to select words that start with a prefix? - android

For example, if I have these records:
word
AAA
AAB
AAC
BAA AA
With a normal table I would use SQL like
select * from table where word like 'AA%' order by H collate nocase asc
How do I select with an FTS3 table instead?
I would also like to know whether FTS3 still performs better than a normal table for this kind of query.

How do I select with an FTS3 table instead?
Quoting the documentation:
An FTS table may be queried for all documents that contain a specified term (the simple case described above), or for all documents that contain a term with a specified prefix. As we have seen, the query expression for a specific term is simply the term itself. The query expression used to search for a term prefix is the prefix itself with a '*' character appended to it.
The documentation also gives a sample:
-- Query for all documents containing a term with the prefix "lin". This will match
-- all documents that contain "linux", but also those that contain terms "linear",
-- "linker", "linguistic" and so on.
SELECT * FROM docs WHERE docs MATCH 'lin*';
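Applied to the sample records from the question, the prefix query can be sketched as below. This is a minimal illustration, not the asker's schema: the table name words and the helper prefix_search are hypothetical, and Python's sqlite3 module is used only as a harness (it assumes the underlying SQLite build has FTS3 enabled). The same SQL string can be passed to rawQuery on Android.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical FTS3 table holding the sample records from the question.
conn.execute("CREATE VIRTUAL TABLE words USING fts3(word)")
conn.executemany(
    "INSERT INTO words(word) VALUES (?)",
    [("AAA",), ("AAB",), ("AAC",), ("BAA AA",)],
)

def prefix_search(prefix):
    # Assumes prefix is a plain alphanumeric string.
    # The trailing * turns the term into a prefix query; the whole
    # expression is bound as a parameter rather than concatenated.
    rows = conn.execute(
        "SELECT word FROM words WHERE word MATCH ? "
        "ORDER BY word COLLATE NOCASE",
        (prefix + "*",),
    )
    return [r[0] for r in rows]

matches = prefix_search("AA")
print(matches)
```

Note one difference from LIKE 'AA%': MATCH works per token, so the row "BAA AA" is also returned because it contains the separate word "AA".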

Related

How to write contains query in SQLite fts3 fulltext search

I want to do a full-text search on my index table, which is an SQLite FTS3 table.
For example, the data set is { "David Luiz", "David Villa", "Diego Costa", "Diego Ribas", "Diego Milito", "Gabriel Milito" }
When I type "vid i" I want to get {"David Luiz", "David Villa"}
In the SQLite documentation I found this:
http://www.sqlite.org/fts3.html#section_3
but it only covers starts-with queries.
My query is:
SELECT * FROM Table WHERE Table MATCH "*vid* *i*"
I don't know whether this is possible. If a contains search can be done in SQLite FTS3, any help will be appreciated.
The FTS index is optimized for word searches, and supports word prefix searches.
There is no index that can help with searches inside words.
You have to use LIKE '%vid%' (which scans the entire table).
Change your query from
SELECT * FROM Table WHERE Table MATCH "*vid* *i*"
To
SELECT * FROM SOME_TABLE WHERE some_column LIKE '%vid%'
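The LIKE approach can be extended to the "vid i" input from the question by requiring every typed fragment to appear somewhere in the value. The sketch below is one possible way to do that; the table players, the column name and the helpers escape_like and contains_all are all hypothetical, and Python's sqlite3 module is used only as a harness for the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical ordinary table with the data set from the question.
conn.execute("CREATE TABLE players (name TEXT)")
conn.executemany(
    "INSERT INTO players(name) VALUES (?)",
    [("David Luiz",), ("David Villa",), ("Diego Costa",),
     ("Diego Ribas",), ("Diego Milito",), ("Gabriel Milito",)],
)

def escape_like(fragment):
    # Escape LIKE wildcards so user input is matched literally.
    return (fragment.replace("\\", "\\\\")
                    .replace("%", "\\%")
                    .replace("_", "\\_"))

def contains_all(user_input):
    # One "LIKE ? ESCAPE '\'" clause per whitespace-separated fragment,
    # joined with AND; every fragment must occur somewhere in name.
    fragments = user_input.split()
    if not fragments:
        return []
    where = " AND ".join(["name LIKE ? ESCAPE '\\'"] * len(fragments))
    params = ["%" + escape_like(f) + "%" for f in fragments]
    rows = conn.execute(
        "SELECT name FROM players WHERE " + where + " ORDER BY name",
        params,
    )
    return [r[0] for r in rows]

result = contains_all("vid i")
print(result)
```

Each clause still scans the whole table, as the answer above points out, so this only suits tables small enough for a full scan.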

How to do a word search on a large text database

I have a large database in my app. One column is made of text strings that are about a sentence to a paragraph long. I would like to make this column searchable by word(s) that the user inputs.
How would I make a quick search? I've heard of making an index but I don't know how to do that for a text search.
SQLite has a mechanism for indexing and searching large amounts of text; it is called FTS (short for full-text search).
The SQLite bundled with Android includes FTS3, so you can use it directly.
How to use it is explained in the FTS3 documentation (https://www.sqlite.org/fts3.html).
Example for creating a table:
CREATE VIRTUAL TABLE enrondata1 USING fts3(content TEXT); /* FTS3 table */
CREATE TABLE enrondata2(content TEXT); /* Ordinary table */
Query:
SELECT count(*) FROM enrondata1 WHERE content MATCH 'linux'; /* 0.03 seconds */
SELECT count(*) FROM enrondata2 WHERE content LIKE '%linux%'; /* 22.5 seconds */
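For the "search by word(s) the user inputs" case, a multi-word query can be sketched as follows. The table notes, its rows and the helper search_words are made up for illustration, and Python's sqlite3 module is only a harness (assuming an SQLite build with FTS3). Terms separated by spaces in a MATCH expression are combined with an implicit AND.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical FTS3 table with a few short texts.
conn.execute("CREATE VIRTUAL TABLE notes USING fts3(content)")
conn.executemany(
    "INSERT INTO notes(content) VALUES (?)",
    [("Linux is a free operating system kernel",),
     ("The linker resolves symbols at build time",),
     ("A free lunch is hard to find",)],
)

def search_words(user_input):
    # Keep only ASCII letters and digits and lowercase them, so that
    # quotes or uppercase operators (OR, NOT, NEAR) typed by the user
    # are treated as plain search terms.
    terms = [t.lower() for t in re.findall(r"[A-Za-z0-9]+", user_input)]
    if not terms:
        return []
    rows = conn.execute(
        "SELECT content FROM notes WHERE content MATCH ? ORDER BY rowid",
        (" ".join(terms),),
    )
    return [r[0] for r in rows]

both = search_words("free linux")
print(both)
```

Without a trailing *, a term only matches whole words, so searching for "lin" finds neither the "Linux" row nor the "linker" row.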

SQLite Fts select query

I am making a dictionary with over 20,000 words in it. To make searching faster, I am using an FTS3 table.
My select query:
Cursor c=db.rawQuery("Select * from data where Word MATCH '"+word+"*'", null);
This query returns every row that contains 'word', but I want only the rows whose word begins with the search string.
That is, I want it to work like this query:
Cursor c=db.rawQuery("Select * from data where Word like '"+word+"%'", null);
Example: I have apple, app, and, book, bad, cat, car.
When I type 'a', I want it to show only: apple, app, and
How can I solve this?
table(_id primary key not null autoincrement, word text)
An FTS table does not use the attributes above. It ignores data types, and it does not auto-increment any column other than the hidden rowid column, so "_id" will not act as a primary key here. Please verify that you are actually creating an FTS table:
https://www.sqlite.org/fts3.html
a datatype name may be optionally specified for each column. This is
pure syntactic sugar, the supplied typenames are not used by FTS or
the SQLite core for any purpose. The same applies to any constraints
specified along with an FTS column name - they are parsed but not used
or recorded by the system in any way.
As for your original question, match "abc*" already searches from the beginning of the word. For instance match "man*" will not match "woman".
FTS4 tables (but not FTS3 tables) also support ^, which restricts a term to the first token of a column:
SELECT * FROM FtsTable WHERE Word MATCH '^word*'
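The effect of ^ is easiest to see on rows with more than one word. The sketch below uses a hypothetical FTS4 table named phrases, since the SQLite documentation describes ^ as recognized only on FTS4 tables; Python's sqlite3 module serves as a harness and is assumed to be built with FTS3/FTS4 support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical FTS4 table; ^ is not recognized on FTS3 tables.
conn.execute("CREATE VIRTUAL TABLE phrases USING fts4(body)")
conn.executemany(
    "INSERT INTO phrases(body) VALUES (?)",
    [("apple pie",), ("red apple",)],
)

# Prefix query: matches a word starting with "app" anywhere in the row.
anywhere = [r[0] for r in conn.execute(
    "SELECT body FROM phrases WHERE body MATCH 'app*' ORDER BY rowid")]

# Same prefix, but the word must be the first token of the column.
first_only = [r[0] for r in conn.execute(
    "SELECT body FROM phrases WHERE body MATCH '^app*' ORDER BY rowid")]

print(anywhere)
print(first_only)
```

When every row holds a single word, as in the dictionary table from the question, both queries return the same rows.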
However, the full-text search index is designed to find words inside larger texts.
If your Word column contains only a single word, your query is more efficient if you use LIKE 'a%' and rely on a normal index.
To allow an index to be used with LIKE, the table column must have TEXT affinity, and the index must be declared as COLLATE NOCASE (because LIKE is not case sensitive):
CREATE TABLE data (
...
Word TEXT,
...
);
CREATE INDEX data_Word_index ON data(Word COLLATE NOCASE);
If you were to use GLOB instead, the index would have to be case sensitive (the default).
You can use EXPLAIN QUERY PLAN to check whether the query uses the index:
sqlite> EXPLAIN QUERY PLAN SELECT * FROM data WHERE Word LIKE 'a%';
0|0|0|SEARCH TABLE data USING INDEX data_Word_index (Word>? AND Word<?)
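The index check above can be reproduced in a few lines. This sketch fills in the elided columns of the data table with a single hypothetical _id column and uses Python's sqlite3 module as a harness; the exact wording of the EXPLAIN QUERY PLAN output differs between SQLite versions, so the code only looks for the index name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Ordinary table; the _id column is an assumption standing in for the
# columns elided in the answer.
conn.execute(
    "CREATE TABLE data (_id INTEGER PRIMARY KEY AUTOINCREMENT, Word TEXT)")
# NOCASE index so that the case-insensitive LIKE can use it.
conn.execute("CREATE INDEX data_Word_index ON data(Word COLLATE NOCASE)")
conn.executemany(
    "INSERT INTO data(Word) VALUES (?)",
    [(w,) for w in ["apple", "app", "and", "book", "bad", "cat", "car"]],
)

words = [r[0] for r in conn.execute(
    "SELECT Word FROM data WHERE Word LIKE 'a%' ORDER BY Word")]

# The human-readable plan step is the last column of each row.
plan = [row[-1] for row in conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM data WHERE Word LIKE 'a%'")]
uses_index = any("data_Word_index" in step for step in plan)

print(words)
print(plan)
```

If the plan shows a full table scan instead, check that the column has TEXT affinity and that the pattern does not start with a wildcard.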

Search data from sqlite3 database in android

I have an SQLite3 database in Android whose data are sentences like "good afternoon" or "have a nice day". I want a search box that searches them, and I use something like this:
Cursor cursor = sqliteDB.rawQuery("SELECT id FROM category WHERE sentences LIKE '"+ s.toString().toLowerCase()+ "%' LIMIT 10", null);
But it only shows "good afternoon" as a result if the user starts the search with "g", "go", "goo" and so on. How can I retrieve "good afternoon" if the user searches for "a", "af" or "afternoon"?
That is, I want "good afternoon" to be returned when the user's input matches the middle of a value in the SQLite3 database, not only the beginning.
Thanks!
Just put the percent sign in front of your query string: LIKE '%afternoon%'. However, your approach has two flaws:
1. It is susceptible to SQL injection attacks because you insert unfiltered user input into your SQL query string. Use the query parameter syntax instead by rewriting your query as SELECT id FROM category WHERE sentences LIKE ? LIMIT 10 and passing the user input (with the % wildcards added) as a selection argument to your query method call.
2. It will get slower as your database grows, because a LIKE pattern with a leading % cannot use an index and has to scan every row.
To solve number 2 you should use SQLite's FTS3 extension, which greatly speeds up word-based text searches. Instead of LIKE you would use the MATCH operator, which has a different query syntax:
SELECT id FROM category WHERE sentences MATCH 'afternoon' LIMIT 10
As you can see, the MATCH operator does not need percent signs. It finds any occurrence of a word anywhere in the text being searched (in your case the sentences column). To match only the beginning of a word, append *, as in MATCH 'af*'. Read through the FTS3 documentation (https://www.sqlite.org/fts3.html): the MATCH query syntax provides further handy options, similar to early search-engine query syntax, such as:
MATCH 'afternoon OR evening'
The only (minor) downside to the FTS3 extension is that it blows up the database file size by creating additional search index tables and meta-data. But I think it's well worth it for this use case.
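Both suggestions, parameter binding and MATCH, can be tried side by side. The sketch below is an illustration under assumptions: the category table is recreated as an FTS3 table with made-up rows, the helpers like_search and match_search are hypothetical, and Python's sqlite3 module stands in for Android's query methods, where the bound values would go into the selection arguments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical FTS3 version of the category table.
conn.execute("CREATE VIRTUAL TABLE category USING fts3(sentences)")
conn.executemany(
    "INSERT INTO category(sentences) VALUES (?)",
    [("good afternoon",), ("have a nice day",), ("good evening",)],
)

def like_search(text):
    # The wildcards belong to the bound value, not to the SQL string.
    # Escaping of % and _ inside text is omitted for brevity.
    rows = conn.execute(
        "SELECT sentences FROM category WHERE sentences LIKE ? "
        "ORDER BY rowid LIMIT 10",
        ("%" + text + "%",),
    )
    return [r[0] for r in rows]

def match_search(expression):
    # expression is an FTS query such as 'afternoon' or 'af*'.
    rows = conn.execute(
        "SELECT sentences FROM category WHERE sentences MATCH ? "
        "ORDER BY rowid LIMIT 10",
        (expression,),
    )
    return [r[0] for r in rows]

print(like_search("af"))
print(match_search("af*"))
print(match_search("afternoon OR evening"))
```

MATCH 'af' without the trailing * returns nothing here, because "af" is not a complete word in any row.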

unable to retrieve special characters from sqlite fts3

I am having some problems with special characters in my scenario.
I have a sqlite db created using fts3.
When I use SELECT col_1, col_2, offsets(table) FROM table WHERE table MATCH 'h*' LIMIT 50;
I get the words that start with h.
But when I use
SELECT col_1, col_2, offsets(table) FROM table WHERE table MATCH '#*' LIMIT 50;
I do not get the strings that start with #.
Where am I going wrong? Any pointers on the approach would be great.
I think the behavior you describe happens because SQLite FTS3 uses a tokenizer called "simple" by default. The character # is discarded because it is not an alphanumeric character and its code point is not greater than 127. My interpretation is that FTS is not meant for searching special characters; it is meant for searching natural text.
The fix I suggest is not to use FTS for this kind of query but to use the LIKE operator. Alternatively, you could look for another available tokenizer or write your own in C.
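The tokenizer behavior and the LIKE fallback can be seen with a small experiment. The table tags and its rows are invented for illustration, and Python's sqlite3 module is only a harness (assuming an SQLite build with FTS3 and the default "simple" tokenizer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical FTS3 table using the default "simple" tokenizer.
conn.execute("CREATE VIRTUAL TABLE tags USING fts3(col_1)")
conn.executemany(
    "INSERT INTO tags(col_1) VALUES (?)",
    [("#hashtag trending",), ("hashtag without symbol",), ("hello world",)],
)

# The tokenizer drops '#', so both "hashtag" rows share the same token
# and MATCH cannot tell them apart.
by_token = [r[0] for r in conn.execute(
    "SELECT col_1 FROM tags WHERE col_1 MATCH 'hashtag' ORDER BY rowid")]

# LIKE compares the stored text itself, so the leading '#' is honored.
by_like = [r[0] for r in conn.execute(
    "SELECT col_1 FROM tags WHERE col_1 LIKE ? ORDER BY rowid", ("#%",))]

print(by_token)
print(by_like)
```

LIKE 'x%' only covers values that begin with the character; finding it in the middle of a value needs a leading % and therefore a full scan.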
