I did a bit of research on SQL escape characters and COUNT statements but didn't find a solution to my question, even though I tried things like:
SELECT * FROM table WHERE path LIKE '%/_%' ESCAPE '/';
I have a table with a column that contains paths, and I want to select the rows whose path has a certain number of slashes:
ID DIRECTORY
1 root/A
2 root/B
3 root/A/1/2
4 root/B/1/2
5 root/A/1
6 root/B/2
So how do I select, for example, the rows that have exactly 2 slashes?
Edit 1: This needs to work in an Android SQLite database.
You can use a regular expression:
SELECT * FROM table WHERE path REGEXP '^([^/]*)/([^/]+)/([^/]*)$';
The above expression looks specifically for an optional group of characters not containing /, followed by /, followed by another group without /, followed by /, and optionally another set of characters before the end of the string.
So:
/Bxx92/2 -- match
5 root/A/1 -- match
6 root/Bxx92/2 -- match
7 root/Bxx92/ -- match
6 root/2 -- NO match
If there MUST be something before the first and after the last /, change the expression to '^([^/]+)/([^/]+)/([^/]+)$'
You can use this trick to count occurrences of a character in a string:
SELECT LENGTH(path) - LENGTH(REPLACE(path, '/', '')) AS `occurrences`
So you can achieve the goal with
SELECT id, path FROM
(SELECT id, path, LENGTH(path) - LENGTH(REPLACE(path, '/', '')) AS `occurrences`
FROM table) temp
WHERE occurrences = 2
However, I expect performance will be terrible. If you are going to query like that, consider adding a column with the path depth so that you can query directly with
SELECT id, path FROM table WHERE depth = 2
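The counting trick above can be checked quickly with Python's built-in sqlite3 module. This is only a sketch: the table name `items` and the helper `ids_with_slashes` are made up for illustration, and the rows are the ones from the question. It relies only on the built-in LENGTH and REPLACE functions, which matters because stock SQLite does not ship an implementation behind the REGEXP operator unless the application registers one.

```python
import sqlite3

# In-memory database with the sample rows from the question.
# "items" is a hypothetical table name used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, path TEXT)")
rows = [(1, "root/A"), (2, "root/B"), (3, "root/A/1/2"),
        (4, "root/B/1/2"), (5, "root/A/1"), (6, "root/B/2")]
conn.executemany("INSERT INTO items VALUES (?, ?)", rows)


def ids_with_slashes(conn, n):
    """Return the ids whose path contains exactly n '/' characters."""
    # Removing every '/' shortens the string by one per slash,
    # so the length difference is the slash count.
    sql = ("SELECT id FROM items "
           "WHERE LENGTH(path) - LENGTH(REPLACE(path, '/', '')) = ? "
           "ORDER BY id")
    return [r[0] for r in conn.execute(sql, (n,))]


two_slashes = ids_with_slashes(conn, 2)  # rows for root/A/1 and root/B/2
```

If this query runs often, storing the computed count in a `depth` column at insert time, as suggested above, avoids recomputing it for every row.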
In Android SQLite I have a table like this:
domainObjectId: String // like '9876543210'
name: String
description: String
I want to use FTS on this to search without worrying about diacritical marks; however, I also want to let the user select by typing part of the object ID (e.g. the last 4 characters).
I have a select like
SELECT * FROM tabel LEFT JOIN tabel_fts on tabel_fts.domainObjectId = tabel.domainObjectId WHERE tabel_fts MATCH '3210*' OR tabel.domainObjectId LIKE '%3210%'
But in return I get the error
unable to use function MATCH in the requested context (code 1 SQLITE_ERROR);
Is it possible to add an additional condition to a select with MATCH?
Try moving the MATCH into a separate SELECT:
SELECT * FROM tabel LEFT JOIN (select * from tabel_fts WHERE tabel_fts.domainObjectId MATCH '3210*') as tabel_fts WHERE tabel.domainObjectId LIKE '%3210%' OR tabel_fts.ID IS NOT NULL
By the way:
In your "WHERE tabel_fts" it seems you've missed a column name.
There is no ON condition in the JOIN, just a WHERE. Is that OK? Maybe it would be better to use UNION?
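The UNION idea can be sketched with Python's sqlite3 module, assuming the SQLite build has FTS3 compiled in. The table and column names come from the question; the sample rows and the `search` helper are invented for illustration. Each half of the UNION is a simple query, so MATCH is never combined with OR in a single WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tabel (domainObjectId TEXT, name TEXT, description TEXT);
    CREATE VIRTUAL TABLE tabel_fts USING fts3(domainObjectId, name, description);
""")
# Made-up sample rows, stored in both the plain table and the FTS table.
rows = [("9876543210", "Alpha", "first"),
        ("3210000000", "Beta", "second"),
        ("5555555555", "Gamma", "third")]
conn.executemany("INSERT INTO tabel VALUES (?, ?, ?)", rows)
conn.executemany("INSERT INTO tabel_fts VALUES (?, ?, ?)", rows)


def search(conn, term):
    """Ids found by an FTS prefix match OR by a substring match on the id."""
    sql = """
        SELECT domainObjectId FROM tabel_fts WHERE tabel_fts MATCH ?
        UNION
        SELECT domainObjectId FROM tabel WHERE domainObjectId LIKE ?
        ORDER BY domainObjectId
    """
    # term + "*" is an FTS prefix query; "%term%" is a plain substring match.
    return [r[0] for r in conn.execute(sql, (term + "*", "%" + term + "%"))]


hits = search(conn, "3210")
```

Here "3210000000" is found by the prefix MATCH and "9876543210" only by the LIKE branch; UNION removes duplicates when both branches hit the same row. The helper passes the term straight into the FTS query, so user input containing FTS query syntax (quotes, a leading minus) would need escaping first.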
There are a lot of questions about splitting a BigQuery, MySQL column, but I can't find one that fits my situation.
I am processing a large dataset (3rd party) that includes a freeform location field to normalize it for my Android app. When I run a select I'd like to split the column data by commas, take only the last segment and trim it of whitespace.
So far I've come up with the following by Googling documentation:
SELECT RTRIM(LOWER(SPLIT(location, ',')[OFFSET(-1)])) FROM `users` WHERE location <> ''
But the -1 trick to split at last element does not work (with either offset or ordinal). I can't use ARRAY_LENGTH with the same array inline and I'm not exactly sure how to structure a nested query and know the last column index of the row.
I might be approaching this from the wrong angle; I work with Android and NoSQL now, so I haven't used MySQL in a long time.
How do I structure this query correctly?
I'd like to split the column data by commas, take only the last segment ...
You can use the approach below (BigQuery Standard SQL):
SELECT ARRAY_REVERSE(SPLIT(location))[SAFE_OFFSET(0)]
Below is an example illustrating it:
#standardSQL
WITH `project.dataset.table` AS (
SELECT '1,2,3,4,5' location UNION ALL
SELECT '6,7,8'
)
SELECT location, ARRAY_REVERSE(SPLIT(location))[SAFE_OFFSET(0)] last_segment
FROM `project.dataset.table`
with result
Row location last_segment
1 1,2,3,4,5 5
2 6,7,8 8
For trimming, you can use LTRIM(RTRIM()), as in:
SELECT LTRIM(RTRIM(ARRAY_REVERSE(SPLIT(location))[SAFE_OFFSET(0)]))
To get the last part of the split string, I use the len(string) - len(replace(string, delimiter, '')) trick to count the number of delimiters:
split(<string>,'-')[OFFSET(length(<string>)-length(replace(<string>,'-','')))]
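Why the delimiter count works as an index can be seen in a few lines of Python. This is not BigQuery code, only a sketch of the same arithmetic, and `last_segment` is a hypothetical helper: a string with n delimiters splits into n + 1 parts, so the last part sits at 0-based offset n. The length-difference count assumes a single-character delimiter.

```python
def last_segment(location: str, delimiter: str = ",") -> str:
    """Return the last delimiter-separated segment, trimmed and lowercased."""
    parts = location.split(delimiter)
    # Removing every delimiter shortens the string by one per delimiter
    # (single-character delimiter assumed), giving the delimiter count.
    delimiter_count = len(location) - len(location.replace(delimiter, ""))
    # n delimiters -> n + 1 parts -> the last part is at 0-based index n.
    return parts[delimiter_count].strip().lower()


country = last_segment("Paris, Ile-de-France, France ")
```

A value without any delimiter has a count of 0, so the whole string is returned.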
I'm not entirely sure how possible this is in a select statement, or if I'm better off getting all results and doing the checks myself in Android Studio.
I've got 3 tables: a table that stores Recordings, a table that stores Tags, and a table that links the Tags to the Recordings, TagsLink.
The TagsLink table has 2 columns, one that stores the TagsID and one that stores the RecordingsID
What I'm hoping to do is only return RecordingsIDs that meet the selected Tags criteria. So if TagsID 3 is selected, Recordings 1, 2 and 4 are returned. And if TagsID 3 and 4 are selected, it returns only Recordings 2 and 4.
In my mind it's something along the lines of:
SELECT DISTINCT RecordingsID FROM TagsLink WHERE ...
If this isn't entirely possible, any advice on other ways of achieving this (even if it requires restructuring the database) would be greatly appreciated!
With this kind of query:
SELECT
RecordingsID
FROM
TagsLink
WHERE
TagsID IN (3, 4, ...)
GROUP BY
RecordingsID
HAVING COUNT(*) = 2 -- This number must match the number of tag IDs specified in the IN (...) list.
The key is to remember to adjust the count based on the tags you want to filter on.
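One way to avoid adjusting the count by hand is to build the query from the list of selected tags. The sketch below uses Python's sqlite3 module with invented link rows that reproduce the scenario in the question; `recordings_with_all_tags` is a hypothetical helper. It swaps COUNT(*) for COUNT(DISTINCT TagsID), which gives the same result on clean data but also tolerates duplicate link rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TagsLink (TagsID INTEGER, RecordingsID INTEGER)")
# Made-up links: tag 3 is on recordings 1, 2, 4; tag 4 is on 2, 3, 4.
conn.executemany("INSERT INTO TagsLink VALUES (?, ?)",
                 [(3, 1), (3, 2), (3, 4), (4, 2), (4, 3), (4, 4), (5, 1)])


def recordings_with_all_tags(conn, tag_ids):
    """Return the recordings that carry every tag in tag_ids."""
    tag_ids = sorted(set(tag_ids))
    if not tag_ids:
        return []
    # One "?" per tag; the HAVING count is derived from the same list,
    # so the two can never get out of sync.
    placeholders = ", ".join("?" for _ in tag_ids)
    sql = (f"SELECT RecordingsID FROM TagsLink "
           f"WHERE TagsID IN ({placeholders}) "
           f"GROUP BY RecordingsID "
           f"HAVING COUNT(DISTINCT TagsID) = ? "
           f"ORDER BY RecordingsID")
    return [r[0] for r in conn.execute(sql, (*tag_ids, len(tag_ids)))]


both = recordings_with_all_tags(conn, [3, 4])
```

Only the placeholders are interpolated into the SQL text; the tag values themselves are still bound as parameters.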
EDIT
To accommodate additional tables, filtering on different columns, use INTERSECT as follows:
SELECT
RecordingsID
FROM
TagsLink
WHERE
TagsID IN (3, 4, ...)
GROUP BY
RecordingsID
HAVING COUNT(*) = 2 -- This number must match the number of tag IDs specified in the IN (...) list.
INTERSECT
SELECT
RecordingsID
FROM
ContactsLink
WHERE
ContactsID IN (100, ...)
GROUP BY
RecordingsID
HAVING COUNT(*) = 1 -- This number must match the number of contacts IDs specified in the IN (...) list.
This should work:
SELECT RecordingsID FROM tagslink WHERE TagsID = 4
INTERSECT
SELECT RecordingsID FROM tagslink WHERE TagsID = 3
I could not test it with SQLite. However, the operator to use is INTERSECT, without parentheses around the SELECTs, since SQLite does not support them.
I'm seeing some weird behaviour on my FTS enabled SQLite database. I have a table named fingerprints that contains a column named scan. Entries of scan are long strings that look like this:
00:13:10:d5:69:88_-58;0c:85:25:68:b4:30_-75;0c:85:25:68:b4:34_-76;0c:85:25:68:b4:33_-76;0c:85:25:68:b4:31_-76;0c:85:25:68:b4:35_-76;00:23:eb:ad:f6:00_-87; etc
They represent MAC addresses and signal strengths. Now I want to do string matching on the table and try to match, for instance, a MAC address:
SELECT _id FROM fingerprints WHERE scan MATCH "00:13:10:d5:69:88";
This returns a lot of rows that do not contain the specified string, for some reason. The second thing I tried to match is
SELECT _id FROM fingerprints WHERE scan MATCH "00:13:10:d5:69:88_-58";
This returns the same rows as before and is completely wrong.
Does SQLite treat the : _ - characters in any special way?
Thanks
What you're seeing is the effect of the FTS tokenizing your data.
Full-text search doesn't work on unprocessed long strings; it splits your data (and your search terms) into words and indexes them individually. The default tokenizer treats all alphanumeric characters and all characters with a code point >= 128 as word characters, and uses the remaining characters (for example, as you're seeing, : _ -) as word boundaries.
In other words, your search for 00:13:10:d5:69:88 will search for rows containing the words 00 and 13 and 10 and d5 and 69 and 88 in any order.
You can verify this behavior:
sqlite> CREATE VIRTUAL TABLE simple USING fts3(tokenize=simple);
sqlite> INSERT INTO simple VALUES('00:13:10:d5:69:88');
sqlite> SELECT * FROM simple WHERE simple MATCH '69:10';
-> 00:13:10:d5:69:88
EDIT: Apparently SQLite is smarter than I originally gave it credit for: you can use phrase queries to look for word sequences, which would solve your problem. A phrase query is specified by enclosing a sequence of terms, separated by spaces (or other word separators), in double quotes (").
sqlite> SELECT * FROM simple WHERE simple MATCH '"69:10"';
-> No match
sqlite> SELECT * FROM simple WHERE simple MATCH '"69 88"';
-> 00:13:10:d5:69:88
sqlite> SELECT * FROM simple WHERE simple MATCH '"69:88"';
-> 00:13:10:d5:69:88
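The same experiment can be reproduced from Python's sqlite3 module, assuming the SQLite build has FTS3 enabled. The table name `scans` and the helpers `match` and `mac_phrase` are made up for this sketch; the second row is taken from the sample data in the question. The queries use spaces instead of colons so that they contain nothing but terms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "scans" is a hypothetical table; tokenize=simple is spelled out for clarity.
conn.execute("CREATE VIRTUAL TABLE scans USING fts3(scan, tokenize=simple)")
conn.executemany("INSERT INTO scans (scan) VALUES (?)",
                 [("00:13:10:d5:69:88",), ("0c:85:25:68:b4:30",)])


def match(conn, query):
    """Run an FTS query and return the matching scan strings."""
    sql = "SELECT scan FROM scans WHERE scans MATCH ? ORDER BY scan"
    return [r[0] for r in conn.execute(sql, (query,))]


def mac_phrase(mac):
    """Turn a MAC address into a phrase query: terms in order, in double quotes."""
    return '"' + mac.replace(":", " ") + '"'


any_order = match(conn, "69 10")             # both terms occur somewhere in the row
out_of_order_phrase = match(conn, '"69 10"')  # 69 is not directly followed by 10
exact = match(conn, mac_phrase("00:13:10:d5:69:88"))
```

The first query matches because the terms are only required to be present; the phrase versions additionally require the terms to be adjacent and in order, which is what a MAC address lookup needs.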
I'm developing an Android application that has to perform substring search in a large table (about 500'000 entries with street and location names, so just a few words per entry).
CREATE TABLE Elements (elementID INTEGER, type INTEGER, name TEXT, data BLOB)
Note that only 20% of all entries contain strings in the "name" column.
Performing the following query takes almost 2 minutes:
SELECT elementID, name FROM Elements WHERE name LIKE '%foo%'
I now tried to use FTS3 in order to speed up the query. That was quite successful, query time decreased to 1 minute (surprisingly the database file size increased by only 5%, which is also quite good for my purpose).
The problem is, FTS3 seemingly doesn't support substring search, i.e. if I want to find "bar" in "foo bar" and "foobar", I only get "foo bar", although I need both results.
So actually I have two questions:
Is it possible to further speed up the query? My goal is 30 seconds for the query, but I don't know if that's realistic...
How can I get real substring search using FTS3?
Solution 1:
If you can make every character in your database an individual word, you can use phrase queries to search for the substring.
For example, assume "my_table" contains a single column "person":
person
------
John Doe
Jane Doe
you can change it to
person
------
J o h n D o e
J a n e D o e
To search for the substring "ohn", use a phrase query:
SELECT * FROM my_table WHERE person MATCH '"o h n"'
Beware that "JohnD" will match "John Doe", which may not be desired.
To fix it, change the space character in the original string into something else.
For example, you can replace the space character with "$":
person
------
J o h n $ D o e
J a n e $ D o e
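A rough sketch of this idea with Python's sqlite3 module, assuming FTS3 is available and the default simple tokenizer is used. The helpers `spread`, `substring_phrase` and `find` are hypothetical. One assumption differs from the example above: the simple tokenizer treats ASCII punctuation such as "$" as a separator and drops it, so the sketch uses a non-ASCII marker ("§", code point 167) that the tokenizer keeps as a term of its own.

```python
import sqlite3

# Hypothetical word-gap marker. It is non-ASCII on purpose: the simple
# tokenizer keeps characters with code point >= 128 as part of terms.
WORD_GAP = "\u00a7"


def spread(text):
    """Put every character in its own word: 'John Doe' -> 'J o h n § D o e'."""
    return " ".join(WORD_GAP if ch == " " else ch for ch in text)


def substring_phrase(needle):
    """Build a phrase query that matches needle as a substring."""
    return '"' + spread(needle) + '"'


conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE my_table USING fts3(person, tokenize=simple)")
conn.executemany("INSERT INTO my_table (person) VALUES (?)",
                 [(spread("John Doe"),), (spread("Jane Doe"),)])


def find(conn, needle):
    """Return the stored (spread-out) values that contain needle."""
    sql = "SELECT person FROM my_table WHERE person MATCH ? ORDER BY person"
    return [r[0] for r in conn.execute(sql, (substring_phrase(needle),))]


ohn = find(conn, "ohn")
```

With the marker in place, "JohnD" no longer matches "John Doe", while "n D" (with the space) still does. The stored values are the spread-out strings, so an application would typically keep the original text in a separate column or table. A needle containing a double quote would break the phrase syntax and needs to be rejected or escaped.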
Solution 2:
Following the idea of solution 1, you can make every character an individual word with a custom tokenizer and use phrase queries to query substrings.
The advantage over solution 1 is that you don't have to add spaces in your data, which can unnecessarily increase the size of database.
The disadvantage is that you have to implement the custom tokenizer. Fortunately, I have one ready for you. The code is in C, so you have to figure out how to integrate it with your Java code.
You should add an index on the name column in your database; that should speed up the query considerably.
I believe SQLite3 supports sub-string matching like so:
SELECT * FROM Elements WHERE name MATCH '*foo*';
http://www.sqlite.org/fts3.html#section_3
I am facing something similar to your problem. Here is my suggestion: try creating a translation table that translates all the words to numbers, then search for numbers instead of words.
Please let me know if this is helping.
I'm not sure about speeding it up since you're using SQLite, but for substring searches, I have done things like
SET @foo_bar = 'foo bar'
SELECT * FROM table WHERE name LIKE '%' + REPLACE(@foo_bar, ' ', '%') + '%'
Of course, this only returns records that have the word "foo" before the word "bar".