SQLite query for stop times in GTFS data

I am working with GTFS data on Android (SQLite), and I would like to improve the performance of the SELECT queries I run against my database, which is filled with GTFS data.
The query below selects the stop times associated with a route at a stop:
The first subquery gets the regular stop times for a Thursday.
The second gets the exception stop times that are not valid today (2013-07-25).
The third gets the exception stop times that are valid only today (2013-07-25).
I then remove the non-valid ones from the result of the first subquery and add the valid ones.
select distinct stop_times_arrival_time
from stop_times, trips, calendar
where stop_times_trip_id=trip_id
and calendar_service_id=trip_service_id
and trip_route_id='11821949021891616'
and stop_times_stop_id='3377699721872252'
and calendar_start_date<='20130725'
and calendar_end_date>='20130725'
and calendar_thursday=1
and stop_times_arrival_time>='07:40'
except
select stop_times_arrival_time
from stop_times, trips, calendar, calendar_dates
where stop_times_trip_id=trip_id
and calendar_service_id=trip_service_id
and calendar_dates_service_id = trip_service_id
and trip_route_id='11821949021891694'
and stop_times_stop_id='3377699720880977'
and calendar_thursday=1
and calendar_dates_exception_type=2
and stop_times_arrival_time > '07:40'
and calendar_dates_date = 20130725
union
select stop_times_arrival_time
from stop_times, trips, calendar, calendar_dates
where stop_times_trip_id=trip_id
and calendar_service_id=trip_service_id
and calendar_dates_service_id = trip_service_id
and trip_route_id='11821949021891694'
and stop_times_stop_id='3377699720880977'
and calendar_thursday=1
and calendar_dates_exception_type=1
and stop_times_arrival_time > '07:40'
and calendar_dates_date = 20130725;
It takes about 15 seconds to run, which is very long.
I am sure there is a better way to write this query, since it runs three different (and almost identical) subqueries, each of which takes time.
Any idea how to improve it?
EDIT:
Here is the schema:
table|calendar|calendar|2|CREATE TABLE calendar (
calendar_service_id TEXT PRIMARY KEY,
calendar_monday INTEGER,
calendar_tuesday INTEGER,
calendar_wednesday INTEGER,
calendar_thursday INTEGER,
calendar_friday INTEGER,
calendar_saturday INTEGER,
calendar_sunday INTEGER,
calendar_start_date TEXT,
calendar_end_date TEXT
)
index|sqlite_autoindex_calendar_1|calendar|3|
table|calendar_dates|calendar_dates|4|CREATE TABLE calendar_dates (
calendar_dates_service_id TEXT,
calendar_dates_date TEXT,
calendar_dates_exception_type INTEGER
)
table|routes|routes|8|CREATE TABLE routes (
route_id TEXT PRIMARY KEY,
route_short_name TEXT,
route_long_name TEXT,
route_type INTEGER,
route_color TEXT
)
index|sqlite_autoindex_routes_1|routes|9|
table|stop_times|stop_times|12|CREATE TABLE stop_times (
stop_times_trip_id TEXT,
stop_times_stop_id TEXT,
stop_times_stop_sequence INTEGER,
stop_times_arrival_time TEXT,
stop_times_pickup_type INTEGER
)
table|stops|stops|13|CREATE TABLE stops (
stop_id TEXT PRIMARY KEY,
stop_name TEXT,
stop_lat REAL,
stop_lon REAL
)
index|sqlite_autoindex_stops_1|stops|14|
table|trips|trips|15|CREATE TABLE trips (
trip_id TEXT PRIMARY KEY,
trip_service_id TEXT,
trip_route_id TEXT,
trip_headsign TEXT,
trip_direction_id INTEGER,
trip_shape_id TEXT
)
index|sqlite_autoindex_trips_1|trips|16|
And here is the query plan:
2|0|0|SCAN TABLE stop_times (~33333 rows)
2|1|1|SEARCH TABLE trips USING INDEX sqlite_autoindex_trips_1 (trip_id=?) (~1 rows)
2|2|2|SEARCH TABLE calendar USING INDEX sqlite_autoindex_calendar_1 (calendar_service_id=?) (~1 rows)
3|0|3|SCAN TABLE calendar_dates (~10000 rows)
3|1|2|SEARCH TABLE calendar USING INDEX sqlite_autoindex_calendar_1 (calendar_service_id=?) (~1 rows)
3|2|0|SEARCH TABLE stop_times USING AUTOMATIC COVERING INDEX (stop_times_stop_id=?) (~7 rows)
3|3|1|SEARCH TABLE trips USING INDEX sqlite_autoindex_trips_1 (trip_id=?) (~1 rows)
1|0|0|COMPOUND SUBQUERIES 2 AND 3 USING TEMP B-TREE (EXCEPT)
4|0|3|SCAN TABLE calendar_dates (~10000 rows)
4|1|2|SEARCH TABLE calendar USING INDEX sqlite_autoindex_calendar_1 (calendar_service_id=?) (~1 rows)
4|2|0|SEARCH TABLE stop_times USING AUTOMATIC COVERING INDEX (stop_times_stop_id=?) (~7 rows)
4|3|1|SEARCH TABLE trips USING INDEX sqlite_autoindex_trips_1 (trip_id=?) (~1 rows)
0|0|0|COMPOUND SUBQUERIES 1 AND 4 USING TEMP B-TREE (UNION)

Columns that are used for lookups should be indexed, but for a single (sub)query, it is not possible to use more than one index per table.
For this particular query, the following additional indexes would help:
CREATE INDEX some_index ON stop_times(
stop_times_stop_id,
stop_times_arrival_time);
CREATE INDEX some_other_index ON calendar_dates(
calendar_dates_service_id,
calendar_dates_exception_type,
calendar_dates_date);
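One way to check whether such an index is actually picked up is to compare the EXPLAIN QUERY PLAN output before and after creating it. The following is only a minimal sketch using Python's sqlite3 module against an empty in-memory copy of the stop_times table; the helper plan() and the simplified single-table query are assumptions made for illustration, not the full query from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE stop_times (
        stop_times_trip_id TEXT,
        stop_times_stop_id TEXT,
        stop_times_stop_sequence INTEGER,
        stop_times_arrival_time TEXT,
        stop_times_pickup_type INTEGER
    )""")

# Simplified lookup (an assumption): only the stop_times part of the query.
QUERY = """
    SELECT stop_times_arrival_time
    FROM stop_times
    WHERE stop_times_stop_id = ?
      AND stop_times_arrival_time >= ?"""


def plan(connection, sql, params):
    """Hypothetical helper: join the 'detail' column of EXPLAIN QUERY PLAN."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return "\n".join(row[-1] for row in rows)


params = ("3377699721872252", "07:40")
plan_before = plan(conn, QUERY, params)

# Equality column first, range column last.
conn.execute("""
    CREATE INDEX some_index ON stop_times(
        stop_times_stop_id,
        stop_times_arrival_time)""")
plan_after = plan(conn, QUERY, params)

print("before:", plan_before)
print("after: ", plan_after)
```

Without the index the plan should report a full table scan; with it, the plan should mention some_index. The exact wording of the plan lines differs between SQLite versions.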

Related

Database Table Structure for a simple RPG

I'm currently practising on SQLite and making a simple text based RPG, but I need some advice with table structure.
So far I've come up with a "Player" table which stores the Player information.
An "Inventory" table, connected to its "Player" ID.
An "Item" table that holds all the Items.
Here is my issue. I have a "Weapon" model, "Shield", "Chest", "Legs" etc. etc. for each Item-type equipment, which holds maybe 50-100 items each. Should I store ALL items in a long list of "Item" table or should I make Sub-Tables? Like a "Weapon" table, a "Shield" table etc. and remove the "Item" table?
Thank you for your time!

The correct answer is to follow recognised relational database design guidelines, which depends very much on the full functional requirements of the game.
However, I'd suggest that the solution is neither a single list of items nor just separate tables per item type. Rather, use a table of items that has a column for the "type", which references the type-specific tables (Weapon, Armour, Collectable, etc.), and perhaps a type table.
Say, for example, a weapon has a force value (how hard it hits) and a speed value (how frequently it hits), whereas armour only has a defence value and collectables only have a weight value. A single item list would then be quite a complicated affair: in this simple scenario it needs 4 additional columns with quite a bit of redundancy, since collectables and armour each use only 25% of them and weapons 50%, unless you introduce more complicated processing to get by with just 2 columns.
SQLite-wise, perhaps the tables could be :-
CREATE TABLE IF NOT EXISTS rpg_player (_id INTEGER PRIMARY KEY, player_name TEXT);
CREATE TABLE IF NOT EXISTS rpg_item(_id INTEGER PRIMARY KEY, type_reference INTEGER, item_subtype_reference INTEGER, UNIQUE(type_reference,item_subtype_reference));
CREATE TABLE IF NOT EXISTS rpg_inventory(player_id INTEGER, item_id INTEGER, number_held INTEGER, PRIMARY KEY(player_id, item_id)) WITHOUT ROWID;
CREATE TABLE IF NOT EXISTS rpg_item_types(_id INTEGER PRIMARY KEY, type_name TEXT, type_flags INTEGER);
CREATE TABLE IF NOT EXISTS rpg_weapon(weapon_id INTEGER PRIMARY KEY, weapon_name TEXT, weapon_flags INTEGER, weapon_force INTEGER, weapon_speed INTEGER);
CREATE TABLE IF NOT EXISTS rpg_armour(armour_id INTEGER PRIMARY KEY, armour_name TEXT, armour_flags INTEGER, armour_defence INTEGER);
CREATE TABLE IF NOT EXISTS rpg_collectable(collectable_id INTEGER PRIMARY KEY, collectable_name TEXT, collectable_flags INTEGER, collectable_weight INTEGER);
Say these are populated as :-
Player Table
Item Table
The master Item table caters for all items so that they can be easily referenced (e.g. in the inventory). An entry has its own unique id and references the type (weapon, armour, ...) and then the item within the sub-type table :-
The first row has a unique id of 1 (1st column); the item is of type 2 (2nd column), i.e. id 2 in the rpg_item_types table (Armour), and it is the item of armour that has an id of 5 (3rd column) in the rpg_armour table (Arm Thinggies).
Likewise, item 3 is a Weapon (column 2 is 1, so type 1), namely the weapon that has an id of 2 in the weapon table (Great Sword).
This table only references other tables, but every item has a unique id.
type_reference is the type, whilst item_subtype_reference is the id of the item in the respective type table (weapon, armour, collectable).
A table constraint is set so that each combination of type_reference and item_subtype_reference must be unique, as per UNIQUE(type_reference,item_subtype_reference).
Inventory Table
Item Types
This table has an entry for each subclass of items.
Sub Item Tables
Tables that model the specifics of the item e.g. rpg_weapon has a weapon_force and weapon_speed column, whilst rpg_armour only has an armour_defence column
You could create a simple list (output wise) of the Items using the following :-
--LIST ALL ITEMS
SELECT
CASE
WHEN type_name = 'Weapon' THEN weapon_name || ' Type (' || type_name || ')'
WHEN type_name = 'Armour' THEN armour_name || ' Type (' || type_name || ')'
WHEN type_name = 'Collectable' THEN collectable_name || ' Type (' || type_name || ')'
END AS description
FROM rpg_item
JOIN rpg_item_types ON type_reference = rpg_item_types._id
LEFT JOIN rpg_weapon ON item_subtype_reference = weapon_id
LEFT JOIN rpg_armour ON item_subtype_reference = armour_id
LEFT JOIN rpg_collectable ON item_subtype_reference = collectable_id
Resulting in :-
The following would list the all inventory items :-
SELECT
CASE
WHEN rpg_item.type_reference = 1 THEN 'Player - ' || player_name || ' has ' || weapon_name || ' it is a ' || type_name || ' it has a force of ' || weapon_force
WHEN rpg_item_types.type_name = 'Armour' THEN 'Player - ' || player_name || ' has ' || armour_name || ' it is a ' || type_name || ' with a defence rating of ' || armour_defence
WHEN rpg_item.type_reference = 3 THEN 'Player - ' || player_name || ' has ' || collectable_name || ' it is a ' || type_name || ' it has a weight of ' || collectable_weight
END AS description
FROM rpg_inventory
JOIN rpg_player ON player_id = rpg_player._id
JOIN rpg_item ON rpg_inventory.item_id = rpg_item._id
JOIN rpg_item_types ON rpg_item.type_reference = rpg_item_types._id
LEFT JOIN rpg_weapon ON rpg_item.item_subtype_reference = rpg_weapon.weapon_id
LEFT JOIN rpg_armour ON rpg_item.item_subtype_reference = rpg_armour.armour_id
LEFT JOIN rpg_collectable ON rpg_item.item_subtype_reference =rpg_collectable.collectable_id
The result based upon the above (oops, no weapons are held by anyone) :-
A simple WHERE clause (WHERE player_id = ?) would restrict the list to a single player.
e.g. WHERE player_id = 2 would only list Fredrica's inventory.
You may well want to copy and paste the above and use it in an SQLite tool; there are quite a few around (I'm personally quite happy with SQLite Manager, others would recommend other tools; all of the above was done using such a tool). There's quite a good chance that you could create the core database access functionality just using such a tool.
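To try the design without a separate tool, a small in-memory database can be populated and queried from Python's sqlite3 module. This is only a sketch: the sample rows are hypothetical (apart from the names mentioned above), the collectable table is left out for brevity, and the inventory query is a shortened variant of the one shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rpg_player (_id INTEGER PRIMARY KEY, player_name TEXT);
CREATE TABLE rpg_item (_id INTEGER PRIMARY KEY, type_reference INTEGER,
    item_subtype_reference INTEGER,
    UNIQUE(type_reference, item_subtype_reference));
CREATE TABLE rpg_inventory (player_id INTEGER, item_id INTEGER,
    number_held INTEGER, PRIMARY KEY(player_id, item_id)) WITHOUT ROWID;
CREATE TABLE rpg_item_types (_id INTEGER PRIMARY KEY, type_name TEXT,
    type_flags INTEGER);
CREATE TABLE rpg_weapon (weapon_id INTEGER PRIMARY KEY, weapon_name TEXT,
    weapon_flags INTEGER, weapon_force INTEGER, weapon_speed INTEGER);
CREATE TABLE rpg_armour (armour_id INTEGER PRIMARY KEY, armour_name TEXT,
    armour_flags INTEGER, armour_defence INTEGER);

-- Hypothetical sample rows, for illustration only.
INSERT INTO rpg_player VALUES (1, 'Fred'), (2, 'Fredrica');
INSERT INTO rpg_item_types VALUES (1, 'Weapon', 0), (2, 'Armour', 0);
INSERT INTO rpg_weapon VALUES (2, 'Great Sword', 0, 10, 2);
INSERT INTO rpg_armour VALUES (5, 'Arm Thinggies', 0, 4);
INSERT INTO rpg_item VALUES (1, 2, 5), (3, 1, 2);
INSERT INTO rpg_inventory VALUES (2, 1, 1), (2, 3, 1), (1, 3, 2);
""")

# The type name decides which sub-type table supplies the item name.
INVENTORY_SQL = """
SELECT type_name,
       CASE type_name
           WHEN 'Weapon' THEN weapon_name
           WHEN 'Armour' THEN armour_name
       END AS item_name,
       number_held
FROM rpg_inventory
JOIN rpg_item ON rpg_inventory.item_id = rpg_item._id
JOIN rpg_item_types ON rpg_item.type_reference = rpg_item_types._id
LEFT JOIN rpg_weapon ON rpg_item.item_subtype_reference = rpg_weapon.weapon_id
LEFT JOIN rpg_armour ON rpg_item.item_subtype_reference = rpg_armour.armour_id
WHERE player_id = ?
ORDER BY rpg_item._id
"""

fredrica_items = conn.execute(INVENTORY_SQL, (2,)).fetchall()
for row in fredrica_items:
    print(row)
```

The CASE on type_name matters here because the LEFT JOINs match on the sub-type id alone, so a weapon and a piece of armour that happen to share an id would both be joined to the same item row.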

Get all rows from group by query

I have an SQLite database containing 2 tables:
names
n_data
and query:
select
n_data.id,n_data.value, count( n_data.id) as count
from
n_data
INNER JOIN names on names.id = n_data.name_id
group by
n_data.name_id
order by
n_data.id asc
In my activity I read the result with a Cursor and a while loop:
while (res.moveToNext()) {
System.out.println("id=>"+res.getString(0)+" count=>"+res.getString(2)+" =value=>"+res.getString(1));
}
but the result shows only the last row of each group. How can I get all the rows for every group?
CREATE TABLE "names" (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT );
INSERT INTO names (id,name) VALUES
(1,'name_1'),
(2,'name_2'),
(3,'name_3'),
(4,'name_4');
CREATE TABLE "n_data" (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name_id TEXT,
value TEXT );
INSERT INTO n_data (id,name_id,value) VALUES
(1,'3','value_8'),
(2,'2','value_7'),
(3,'2','value_6'),
(4,'2','value_5'),
(5,'1','value_4'),
(6,'1','value_3'),
(7,'1','value_2'),
(8,'1','value_1'),
(9,'3','value_9');

OP is satisfied by:
select
n_data.id,
group_concat(n_data.value) as 'all values',
count( n_data.id) as count
from
n_data INNER JOIN names
on names.id = n_data.name_id
group by n_data.name_id
order by n_data.id asc;
It uses group_concat(n_data.value) instead of n_data.value.
I.e. all the n_data.value entries that are counted by count(n_data.id) are concatenated.
Output (.headers on, .mode column and .width 3 32 6; SQLite 3.18.0 2017-03-28) :
id   all values                        count
---  --------------------------------  ------
4    value_7,value_6,value_5           3
8    value_4,value_3,value_2,value_1   4
9    value_8,value_9                   2
The tailored .width is needed, otherwise for id 8, only 3 values are shown, though 4 are retrieved.
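The same query can be tried outside Android with Python's sqlite3 module. The sketch below uses the sample data from the question; selecting name_id instead of n_data.id and splitting the concatenated string back into a list per group are assumptions made for illustration, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE names (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
INSERT INTO names (id, name) VALUES
    (1,'name_1'), (2,'name_2'), (3,'name_3'), (4,'name_4');
CREATE TABLE n_data (id INTEGER PRIMARY KEY AUTOINCREMENT,
                     name_id TEXT, value TEXT);
INSERT INTO n_data (id, name_id, value) VALUES
    (1,'3','value_8'), (2,'2','value_7'), (3,'2','value_6'),
    (4,'2','value_5'), (5,'1','value_4'), (6,'1','value_3'),
    (7,'1','value_2'), (8,'1','value_1'), (9,'3','value_9');
""")

rows = conn.execute("""
    SELECT n_data.name_id,
           group_concat(n_data.value) AS all_values,
           count(n_data.id) AS count
    FROM n_data INNER JOIN names ON names.id = n_data.name_id
    GROUP BY n_data.name_id
    ORDER BY n_data.name_id
""").fetchall()

# Split each concatenated string back into the values of its group.
# The order inside group_concat is not guaranteed, so sort for display.
values_by_name = {
    name_id: sorted(all_values.split(","))
    for name_id, all_values, _count in rows
}
counts_by_name = {name_id: count for name_id, _values, count in rows}

for name_id, values in values_by_name.items():
    print(name_id, counts_by_name[name_id], values)
```

The grouping column is selected here because a bare column such as n_data.id in an aggregate query takes its value from an arbitrary row of the group. Splitting on a comma only works as long as the values themselves contain no commas.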

How to optimize SQLite database fetching time

My database has 6384 records and I am using the below query:
SELECT T.t_name, S.s_code, S.s_name, R.s_code, R.s_name, M.arrival_time, L.arrival_time, M.dest_time, M.train_id, S.id, R.id
FROM TRAIN_SCHEDULE M,
TRAIN_SCHEDULE L,
TRAIN T,
STATION S,
STATION R
WHERE S.s_name = 'Versova'
AND R.s_name = 'Ghatkopar'
AND M.arrival_time > '00:00:00'
AND M.arrival_time < L.arrival_time
AND M.train_id = L.train_id
AND M.dest_time = L.dest_time
AND T.id = M.train_id
AND S.id = M.station_id
AND R.id = L.station_id
This query takes 8 seconds to fetch the data.
I have also indexed my tables, but that only reduced the fetching time to 2 seconds.
Schema:
CREATE TABLE [STATION] (
[id] INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
[s_code] VARCHAR(10) NOT NULL,
[s_name] VARCHAR(50) NOT NULL);
CREATE TABLE TRAIN_SCHEDULE(
id INT,
station_id INT,
train_id INT,
arrival_time NUM,
departure_time NUM,
dest_time NUM
);
CREATE TABLE TRAIN(id INT,t_name TEXT);
CREATE INDEX idx_arrival_time ON train_schedule (arrival_time);
CREATE INDEX idx_dest_time ON train_schedule (dest_time);
CREATE INDEX idx_id ON train (id);
How can I improve this?

You can check with EXPLAIN QUERY PLAN which indexes are being used.
In this query, the database needs to scan through the STATION table; an index on the name column would improve this (although not by much with such a small table):
CREATE INDEX Station_Name ON STATION(s_name);
Also, lookups on the TRAIN_SCHEDULE table are done over multiple columns.
The query optimizer cannot use more than one index per table instance, so you should create a multi-column index.
And a column with a non-equality comparison must come last (see the documentation):
CREATE INDEX Schedule_Station_Train_DestTime_ArrivalTime
ON TRAIN_SCHEDULE(station_id, train_id, dest_time, arrival_time);
Also execute ANALYZE once to help the optimizer pick the right index.
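The effect of the column order can be checked with EXPLAIN QUERY PLAN. The sketch below, written with Python's sqlite3 module, compares the index above with a hypothetical variant that puts the range column first; the index name Schedule_ArrivalTime_First, the helper plan_with_index() and the simplified single-table lookup are assumptions made for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE TRAIN_SCHEDULE(
    id INT, station_id INT, train_id INT,
    arrival_time NUM, departure_time NUM, dest_time NUM);
"""

# Simplified lookup (an assumption): three equality tests and one range test.
LOOKUP = """
SELECT arrival_time FROM TRAIN_SCHEDULE
WHERE station_id = ? AND train_id = ? AND dest_time = ? AND arrival_time > ?
"""


def plan_with_index(index_sql):
    """Hypothetical helper: build a fresh database with a single index
    and return the query plan details for LOOKUP as one string."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA + index_sql)
    rows = conn.execute(
        "EXPLAIN QUERY PLAN " + LOOKUP, (1, 1, "23:00:00", "00:00:00")
    ).fetchall()
    conn.close()
    return " | ".join(row[-1] for row in rows)


# Equality columns first, range column last (the index from the answer).
good_plan = plan_with_index("""
CREATE INDEX Schedule_Station_Train_DestTime_ArrivalTime
ON TRAIN_SCHEDULE(station_id, train_id, dest_time, arrival_time);
""")

# Range column first: the columns after it cannot narrow the index seek.
bad_plan = plan_with_index("""
CREATE INDEX Schedule_ArrivalTime_First
ON TRAIN_SCHEDULE(arrival_time, station_id, train_id, dest_time);
""")

print(good_plan)
print(bad_plan)
```

With the first index the plan should list all four columns as seek constraints; with the second, station_id should not appear among them. The exact plan text depends on the SQLite version.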

SQLite subquery

I have a query with a subquery that returns multiple rows.
I have a table with lists and a table with users. I created a many-to-many table between these two tables, called list_user.
LIST
id INTEGER
list_name TEXT
list_description TEXT
USER
id INTEGER
user_name TEXT
LIST_USER
id INTEGER
list_id INTEGER
user_id INTEGER
My query with subquery
SELECT * FROM user WHERE id = (SELECT user_id FROM list_user WHERE list_id = 0);
The subquery works (and I use it in code so the 0 is actually a variable) and it returns multiple rows. But the upper query only returns one row, which is pretty logical; I check if the id equals something and it only checks against the first row of the subquery.
How do I change my statement so I get multiple rows in the upper query?

I'm surprised the = works in SQLite. It would return an error in most databases. In any case, you want the IN operator:
SELECT *
FROM user
WHERE id IN (SELECT user_id FROM list_user WHERE list_id = 0);
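The difference between = and IN can be seen on a tiny in-memory database. This is a minimal sketch using Python's sqlite3 module; the user names and the membership rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, user_name TEXT);
CREATE TABLE list_user (id INTEGER PRIMARY KEY,
                        list_id INTEGER, user_id INTEGER);

-- Hypothetical sample rows: users 1 and 3 belong to list 0.
INSERT INTO user VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO list_user (list_id, user_id) VALUES (0, 1), (0, 3), (1, 2);
""")

# With "=", SQLite uses only one row of the subquery as a scalar value.
with_equals = conn.execute("""
    SELECT * FROM user
    WHERE id = (SELECT user_id FROM list_user WHERE list_id = ?)
""", (0,)).fetchall()

# With IN, every row of the subquery is taken into account.
with_in = conn.execute("""
    SELECT * FROM user
    WHERE id IN (SELECT user_id FROM list_user WHERE list_id = ?)
    ORDER BY id
""", (0,)).fetchall()

print(with_equals)
print(with_in)
```

The list id is passed as a bound parameter here, which matches the note in the question that the 0 is really a variable.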

Alternatively, the same rows can be fetched with a join, which may perform better:
SELECT USER.ID,
USER.USER_NAME
FROM LIST,
USER,
LIST_USER
WHERE LIST.ID = LIST_USER.LIST_ID AND
USER.ID = LIST_USER.USER_ID AND
LIST.ID = 0

SQLite ORDER BY result of nested query

I have a table with multiple date fields, and I would like to select the rows with the dates closest to the current date, regardless of which column it is.
There are 2 tables in the database;
Covers:
create table covers (_id integer primary key autoincrement,
stallionName integer not null, mareName integer not null, firstCoverDate text not null, lastCoverDate text not null,
scan14Date text not null, scan28Date text not null, foalingDate text not null, inFoal integer not null, notes text not null,
FOREIGN KEY (stallionName) REFERENCES horses (_id), FOREIGN KEY (mareName) REFERENCES horses (_id))
stallionName and mareName are integers that reference the row _id of the relevant horse in the horse table (As such, the SQL query has multiple joins to get the names for the stallion and mare, instead of just the row _id). All date columns are of the form 'YYYY-MM-DD'
and horses:
create table horses (_id integer primary key autoincrement,
name text not null, type integer not null, birthDate text not null, vaccineDate text not null,
inFoal integer not null, notes text not null)
type is an integer referencing a spinner position (type is either mare, stallion, gelding etc)
I want to select 5 rows from the covers table with values of scan14Date, scan28Date and foalingDate that are nearest to today's date (i.e. the 5 most urgent rows)
This is my effort so far;
SELECT covers.*, horses1.name AS stallionNameText, horses2.name AS mareNameText
FROM horses horses1 JOIN covers ON horses1._id = covers.stallionName JOIN horses horses2 ON horses2._id = covers.MareName
WHERE covers._id IN
(SELECT DISTINCT _id FROM
(SELECT covers.scan14Date AS date, covers._id
FROM covers
WHERE date > dateToday
UNION
SELECT covers.scan28Date AS date, covers._id
FROM covers
WHERE date > dateToday
UNION
SELECT covers.foalingDate AS date, covers._id
FROM covers
WHERE date > dateToday
ORDER BY date)
LIMIT 5)
(It can be assumed that 'dateToday' is also of the form YYYY-MM-DD. It's sorted out in java)
Now, the nested query below successfully selects the 5 'most urgent' row _ids;
(SELECT DISTINCT _id FROM
(SELECT covers.scan14Date AS date, covers._id
FROM covers
WHERE date > dateToday
UNION
SELECT covers.scan28Date AS date, covers._id
FROM covers
WHERE date > dateToday
UNION
SELECT covers.foalingDate AS date, covers._id
FROM covers
WHERE date > dateToday
ORDER BY date)
LIMIT 5)
And the first section successfully gets all the data for the row _id selected above;
SELECT covers.*, horses1.name AS stallionNameText, horses2.name AS mareNameText
FROM horses horses1 JOIN covers ON horses1._id = covers.stallionName JOIN horses horses2 ON horses2._id = covers.MareName
WHERE covers._id IN
....
The issue I'm having is that the order of the row _ids returned by the nested query is lost in the overall query. If, for example, the nested query returns the row _ids (6, 3, 4, 12, 15), the overall query displays them in the order (3, 4, 6, 12, 15).
I would then like to return the stallion name, mare name, lastCoverDate, scan14Date, scan28Date and foalingDate.
My question is: how can I maintain the order returned by the inner query? I naively tried to add ORDER BY date at the very end, but it predictably says there is no such column.
Thank you for taking the time to read.
(Also, I'm sure that my SQL query won't be the most efficient way of doing it, but I'm fairly new to all this.)

Include the date in the result of the inner query, then you can order by date in the outer query. Your Java wrapper can simply ignore the date field in the result.
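One possible way to do this is sketched below with Python's sqlite3 module. It is not the asker's exact query: the inner query is moved into the FROM clause as a derived table, and GROUP BY with MIN(date) replaces DISTINCT so that each cover carries its nearest upcoming date. The alias urgent, the column next_date, the reduced table definitions and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE horses (_id INTEGER PRIMARY KEY AUTOINCREMENT,
                     name TEXT NOT NULL);
CREATE TABLE covers (
    _id INTEGER PRIMARY KEY AUTOINCREMENT,
    stallionName INTEGER NOT NULL REFERENCES horses (_id),
    mareName INTEGER NOT NULL REFERENCES horses (_id),
    scan14Date TEXT NOT NULL, scan28Date TEXT NOT NULL,
    foalingDate TEXT NOT NULL);

-- Hypothetical sample rows.
INSERT INTO horses (_id, name) VALUES
    (1, 'Stallion A'), (2, 'Mare B'), (3, 'Mare C');
INSERT INTO covers VALUES
    (1, 1, 2, '2020-01-10', '2020-01-24', '2020-12-01'),
    (2, 1, 3, '2020-03-05', '2020-03-19', '2021-02-01'),
    (3, 1, 2, '2019-06-01', '2019-06-15', '2020-02-20');
""")

URGENT_SQL = """
SELECT covers._id, horses1.name, horses2.name, urgent.next_date
FROM (
    -- Nearest upcoming date per cover, earliest first.
    SELECT _id, MIN(date) AS next_date
    FROM (
        SELECT _id, scan14Date AS date FROM covers
        UNION ALL
        SELECT _id, scan28Date FROM covers
        UNION ALL
        SELECT _id, foalingDate FROM covers
    )
    WHERE date > :today
    GROUP BY _id
    ORDER BY next_date
    LIMIT 5
) AS urgent
JOIN covers ON covers._id = urgent._id
JOIN horses horses1 ON horses1._id = covers.stallionName
JOIN horses horses2 ON horses2._id = covers.mareName
ORDER BY urgent.next_date
"""

date_today = "2020-02-01"  # supplied by the application in 'YYYY-MM-DD' form
urgent_rows = conn.execute(URGENT_SQL, {"today": date_today}).fetchall()
for row in urgent_rows:
    print(row)
```

Because next_date is part of the result, the outer ORDER BY can refer to it, and the calling code can ignore that column if it is not needed.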
