Should primary key ever be reset? - android

Not sure if this has already been answered, and this is kind of a dumb question, but I'm kinda new to using SQL in android and I've made a simple task app using the language. In the app, I added a feature to delete all tasks. When I create a new one, the primary key keeps counting up. Now, there's nothing wrong with the app or the code or anything, but if all the tasks are deleted, should I reset the primary key, or is it bad practice to do so? If not, will it ever become large enough to provoke a crash?

I would generally keep it incrementing, because that can simplify things like database backups/restores and replication to other database nodes. Things are more predictable when an id, once used, always refers to the same row and is never reused.

From the SQLite documentation:
Except for WITHOUT ROWID tables, all rows within SQLite tables have a 64-bit signed integer key that uniquely identifies the row within its table.
How big is the largest 64-bit signed integer? It is 9,223,372,036,854,775,807. This number is so large that you will almost certainly never exceed it, even with very frequent and massive inserts. In practice, you would run out of storage space long before inserting enough rows to come close to it.
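To make this concrete, here is a minimal sketch using Python's built-in sqlite3 module (the same SQL applies on Android). It assumes a hypothetical tasks table declared with AUTOINCREMENT, which matches the "keeps counting up" behaviour described in the question. It shows that the counter survives a delete-all, that it can be reset through SQLite's internal sqlite_sequence table if you really want to, and roughly how long it would take to exhaust the key space.

```python
import sqlite3

MAX_ROWID = 2**63 - 1  # largest 64-bit signed integer


def id_after_delete_all(reset_sequence: bool) -> int:
    """Insert three tasks, delete them all, insert one more, return its id."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema. With AUTOINCREMENT, SQLite records the highest id
    # ever used for the table in its internal sqlite_sequence table.
    conn.execute(
        "CREATE TABLE tasks (_id INTEGER PRIMARY KEY AUTOINCREMENT, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO tasks (title) VALUES (?)", [("a",), ("b",), ("c",)]
    )
    conn.execute("DELETE FROM tasks")
    if reset_sequence:
        # Optional: forget the stored counter so numbering restarts at 1.
        conn.execute("DELETE FROM sqlite_sequence WHERE name = 'tasks'")
    cur = conn.execute("INSERT INTO tasks (title) VALUES ('d')")
    new_id = cur.lastrowid
    conn.close()
    return new_id


def years_to_exhaust(inserts_per_second: int) -> float:
    """Years needed to use up every positive 64-bit key at a constant rate."""
    seconds = MAX_ROWID / inserts_per_second
    return seconds / (365.25 * 24 * 3600)
```

Even at a sustained one million inserts per second, years_to_exhaust returns a figure in the hundreds of thousands of years, so resetting the key is a cosmetic choice rather than a safety measure.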

Related

What's the price for referential integrity of database tables on Android?

Well, the title says it all - does applying pragma foreign_keys = true to an existing database make my database "less fast"?
Whenever data is added, removed, or changed, the database needs to check that the constraints still hold. The documentation lists some examples of what these checks look like.
In practice, it is very likely that you will need to access all the referenced tables anyway, so everything will be in the cache.
Furthermore, the biggest slowdown in a database write is the transaction overhead. So it is likely that the additional checking does not lead to a noticeable delay.
In any case, SELECT queries are not affected by foreign key constraints.
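As an illustration of what the pragma changes, here is a small sketch using Python's built-in sqlite3 module with a hypothetical lists/tasks schema. The same write succeeds or fails depending only on whether enforcement is switched on.

```python
import sqlite3


def orphan_insert_allowed(enforce_foreign_keys: bool) -> bool:
    """Try to insert a task that references a list that does not exist."""
    # isolation_level=None keeps the connection in autocommit mode; the
    # foreign_keys pragma has no effect while a transaction is open.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    setting = "ON" if enforce_foreign_keys else "OFF"
    conn.execute(f"PRAGMA foreign_keys = {setting}")
    # Hypothetical schema, for illustration only.
    conn.execute("CREATE TABLE lists (_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE tasks ("
        " _id INTEGER PRIMARY KEY,"
        " list_id INTEGER NOT NULL REFERENCES lists(_id),"
        " title TEXT)"
    )
    try:
        # No row with _id 999 exists in lists, so this row is an orphan.
        conn.execute("INSERT INTO tasks (list_id, title) VALUES (999, 'orphan')")
        return True
    except sqlite3.IntegrityError:
        # Raised only when enforcement is on: the write-time check failed.
        return False
    finally:
        conn.close()
```

The extra work happens only on the write path, which is consistent with the point above that SELECT queries are unaffected.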
Although the accepted answer is sufficient, I want to add one specific use case here.
For anyone who has to apply foreign key constraints to an existing database and probably needs to do some refactoring of the database, call setForeignKeyConstraintsEnabled in the onOpen method of the SQLiteOpenHelper.
From the book Android Enterprise:
The onOpen method is called only after onCreate, onUpgrade, and onDowngrade, leaving those methods free to play fast and loose while they rebuild the schema. The occasion in which an application must rename or recreate a table or two during an upgrade might be one of the few times that it is truly a relief that foreign key constraints are not enforced. Enabling foreign key constraints in the onOpen method causes them to be enforced only after the database has been initialized.

Sharing Data between android app users

I have a bit of a theoretical question for which there is no code yet as I am still just in the thinking stage. I want to update an app to allow users to share their data with others through DropBox Datastore or something like that. However, when a user creates data which get populated into multiple sqlite tables on the device, each table has an auto-incremental integer as a primary key that is used as a foreign key in other tables to link the data.
If there is more than one user actually creating the data and sharing it then the primary key columns are obviously going to be an issue. If I download the data and store it locally I obviously can't insert user 1's key value in user 2's data table, firstly because of the auto-increment and secondly because user 2 might already have data that is not shared saved with that key value.
I have thought about a few options but nothing is particularly appealing or robust. I was thinking about creating a UUID to identify the device, that value would have to be stored in each of the tables and the primary key would be a combination of that column and the current primary key integer which would obviously have to have the auto-increment removed. So to pick up all related data from each table the id column and UUID column would both have to be used.
I feel like there must be a more robust method of achieving this, though. Does anyone have any better suggestions?
If I understand correctly, you need some sort of centralised database in the cloud to communicate with your local app, is that right?
A client should never create the ids for such a system. The usual practice in these cases is to always have a remote id which is created by your DB in the cloud, and, as long as you don't have this value yet, to fall back on a local id created on the device (which is different from the remote one).
To illustrate, say your app stores messages in a database and you create messages with local ids 1, 2, 3. Those ids are never meant to be unique in your central database in the cloud; you just use them as a local fallback. As soon as you can send those 3 messages to your centralised database, it gives them 3 new remote ids that you use as the unique identifiers (e.g. 35, 46, 54).
Note that when multiple requesters/users access the same database, there is no way to guarantee uniqueness unless you follow the approach explained above, or you request a batch of unique ids from your database in the cloud in advance, on demand.
Keep in mind that the source of truth can only be the databases on your servers.
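The local-id/remote-id fallback described in this answer could be sketched roughly as follows, using Python's built-in sqlite3 module. The messages table, its column names, and the assign_remote_id callback (standing in for the call to the cloud database) are all assumptions made for illustration.

```python
import sqlite3


def open_local_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # local_id is the device-only fallback key; remote_id stays NULL until
    # the server has assigned the real, globally unique id.
    conn.execute(
        "CREATE TABLE messages ("
        " local_id INTEGER PRIMARY KEY,"
        " remote_id INTEGER UNIQUE,"
        " body TEXT NOT NULL)"
    )
    return conn


def sync_pending(conn: sqlite3.Connection, assign_remote_id) -> int:
    """Send unsynced rows to the server and store the ids it hands back.

    assign_remote_id is a hypothetical callable that takes the message body
    and returns the id chosen by the central database.
    """
    pending = conn.execute(
        "SELECT local_id, body FROM messages"
        " WHERE remote_id IS NULL ORDER BY local_id"
    ).fetchall()
    for local_id, body in pending:
        remote_id = assign_remote_id(body)
        conn.execute(
            "UPDATE messages SET remote_id = ? WHERE local_id = ?",
            (remote_id, local_id),
        )
    conn.commit()
    return len(pending)
```

Rows created offline remain usable through local_id, and anything shared with other users refers only to remote_id.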

What is the best primary key strategy for an online/offline multi-client mobile application with SQLite and Azure SQL database as the central store?

What primary key strategy would be best to use for a relational database model given the following?
tens of thousands of users
multiple clients per user (phone, tablet, desktop)
millions of rows per table (continually growing)
Azure SQL will be the central data store which will be exposed via Web API. The clients will include a web application and a number of native apps including iOS, Android, Mac, Windows 8, etc. The web application will require an “always on” connection and will not have a local data store but will instead retrieve and update via the api - think CRUD via RESTful API.
All other clients (phone, tablet, desktop) will have a local db (SQLite). On first use of this type of client the user must authenticate and sync. Once authenticated and synced, these clients can operate in an offline mode (creating, deleting and updating records in the local SQLite db). These changes will eventually sync with the Azure backend.
The distributed nature of the databases leaves us with a primary key problem and the reason for asking this question.
Here is what we have considered thus far:
GUID
Each client creates its own keys. On sync, there is a very small chance of a duplicate key, but we would need to account for it by writing functionality into each client to update all relationships with a new key. GUIDs are big, and when multiple foreign keys per table are considered, storage may become an issue over time. Likely the biggest problem is the random nature of GUIDs, which means that they cannot (or should not) be used as the clustered index due to fragmentation. This means we would need to create a clustered index (perhaps arbitrary) for each table.
Identity
Each client creates its own primary keys. On sync, these keys are replaced with server-generated keys. This adds additional complexity to the syncing process and forces each client to “fix” its keys, including all foreign keys on related tables.
Composite
Each client is assigned a client id on first sync. This client id is used in conjunction with a local auto-incrementing id as a composite primary key for each table. This composite key will be unique so there should be no conflicts on sync but it does mean that most tables will require a composite primary key. Performance and query complexity is the concern here.
HiLo (Merged Composite)
Like the composite approach, each client is assigned a client id (int32) on the first sync. The client id is merged with a unique local id (int32) into a single column to make an application-wide unique id (int64). This should result in no conflicts during sync. While there is more order to these keys than to GUIDs, since the ids generated by each client are sequential, there will be thousands of unique client ids, so do we still run the risk of fragmentation on our clustered index?
Are we overlooking something? Are there any other approaches worth investigating? A discussion of the pros and cons of each approach would be quite helpful.
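For reference, the HiLo (merged composite) option listed above can be sketched as simple bit packing. This is only an illustration: the function names are made up, and both halves are assumed to be non-negative 32-bit signed values so that the merged key stays positive in a signed 64-bit column.

```python
INT32_MAX = 2**31 - 1


def make_key(client_id: int, local_id: int) -> int:
    """Pack a client id (high 32 bits) and a local id (low 32 bits)."""
    for name, value in (("client_id", client_id), ("local_id", local_id)):
        if not 0 <= value <= INT32_MAX:
            raise ValueError(f"{name} must be between 0 and {INT32_MAX}")
    return (client_id << 32) | local_id


def split_key(key: int):
    """Recover (client_id, local_id) from a merged key."""
    return key >> 32, key & 0xFFFFFFFF
```

Keys from one client sort together and increase with its local counter, while keys from different clients land in different ranges, which is what the fragmentation question is about.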
I've considered this question at length and came to the decision that a GUID is usually the best solution. Here's a little information on why:
Identity
The Identity option sounds like it removes all the negatives, but having implemented a Single Page Web App that implemented this system, I can tell you it adds a significant amount of complexity to the code. A temporary id can spread through your client side data quite quickly, and it's really hard to create a system that has no holes in it when it comes to finding every single possible usage. It usually leads to application and data specific hard-coded information to track foreign keys on the client (which is tedious and error prone as the database changes and you forget to update this information). It also adds a lot of overhead to every sync, as it might have to run through multiple tables each sync to check for temporary ids. There might be a better way to implement this system, but I haven't seen a good approach that doesn't add a ton of complexity and possible ugly error states in your data.
Composite
The composite approaches also add a lot of complexity to your code in generating session ids and creating ids from them, and their only advantage over GUIDs is that uniqueness is guaranteed. But a GUID is unique for all practical purposes. I was scared of the possibility of repeats, but I realized that the chance is infinitesimally small and that there is a really easy way to handle the rare case where a GUID is not unique.
GUIDs
My biggest worries about using a GUID were
they have a large size and aren't traditional ints, which will make transferring large bits of data slower and degrade database performance
if you actually ever do run into a conflict, it can ruin your app, so you have to write complex code to handle a situation you will probably never use.
Then I realized that in an offline style web app, you're not usually transferring large amounts of data at once because it's all stored on the client.
You also don't worry about server database performance much either because that's done behind the scenes in a sync - you just worry about client side data performance.
Last, I realized that handling a conflict is really a trivial thing. Just test for a conflict and if you get one, create a new GUID on the server and continue with the operation. Then send a message back to the client that causes the client to throw up a little error message and then deletes all client side data and re-downloads it fresh from the server. This is really quick and easy to implement, and you probably already want this as a possible operation on an offline web app anyway. While it might sound inconvenient for the user, the likelihood of the user ever seeing this error is almost 0%.
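The server-side conflict check described here might look something like the sketch below. The in-memory dict stands in for the server database, and the function name and return shape are assumptions; it also assumes each record is submitted only once, since a retried upload of the same record would look like a conflict to this check.

```python
import uuid


def store_record(server_rows: dict, client_guid: str, payload: dict):
    """Store a record and return (guid_used, client_must_resync).

    server_rows is a hypothetical stand-in for the server-side table,
    keyed by GUID.
    """
    if client_guid not in server_rows:
        # Normal case: the client-generated GUID is unused.
        server_rows[client_guid] = payload
        return client_guid, False

    # Conflict: generate a fresh GUID on the server and keep going.
    new_guid = str(uuid.uuid4())
    while new_guid in server_rows:
        new_guid = str(uuid.uuid4())
    server_rows[new_guid] = payload
    # The flag tells the client to discard its local data and re-download.
    return new_guid, True
```

The second element of the result is what would trigger the "delete client data and re-download" path on the client.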
Conclusion
In the end, I think that for this type of app, GUIDs are the easiest to implement and work the best, with the least possibility for error and without creating overly complex code.
If your application doesn't have to run offline, but you have a client-side database for performance or other reasons, you can also consider throwing up a loading gif and pausing client side execution until the id is returned via ajax from the server.
The key (pun intended) thing to remember is simply to have a unique key for each object you keep in the persistent store. How you handle the storage of that object is completely up to you and depends on how you access that key. Each of the strategies you listed has its own reasons for doing what it does, but in the end they all store a key for a certain object in the db, so that all of its attributes can be changed while retaining the same object reference in the database.

Android: Use UUID as primary key in SQLite

My app needs to get synced with other app users (on their own devices). I also want to support offline edits that are synchronized to the other collaborating users when the user gets connected to the internet.
So User A changes some data while he is offline (in other words, he updates database entries) or adds new records to the database. When User A gets connected to the internet, all changes and new records are delivered to the other collaborating users. So User B will get the changes/updates and can insert/update them in User B's local device database.
But I need to ensure that the ids of the database entries are unique along the whole system. Therefore I need to use something like UUID.
My question: Is it a bad idea to use a UUID (String / Varchar) as the primary key in an Android SQLite database table instead of an integer that would be auto-incremented?
I guess there would be performance issues by using strings (a UUID has 36 characters) as primary key.
I guess indexing UUIDs instead of integers takes longer (comparing strings vs. comparing integers). I also guess that when I'm using UUIDs, every time a new record is inserted the database needs to reindex the primary key column, since the primary key index is no longer in sorted order. (With an auto-incremented integer primary key it would be, because every new record is added at the end: the new key is always the greatest number so far, so the index is automatically in sorted order.) I also need to do JOINs over 2 - 3 tables, and I guess that comparing strings instead of integers in JOINs would slow down the database query.
However, I can't see any other possibility to implement such a collaborative syncing system, so I must use UUID, right?
Another possibility would be to use an integer auto-increment primary key and a second column, uuid. So to work on the user's local device, I would use this primary key (integer) for JOINs etc., while I would use the uuid column for syncing with the other users.
What do you guys think about that approach? Or is it, in your opinion, too much work, since you wouldn't expect a significant performance issue from using UUID directly as the primary key?
Any other suggestions?
Is it a bad idea to use a UUID (String / Varchar) as the primary key in an Android SQLite database table instead of an integer that would be auto-incremented?
The only for-certain problem that I can think of is that you will not be able to use CursorAdapter and its subclasses for displaying the results of queries on that table. CursorAdapter requires a unique integer _id column in the Cursor, and presumably you will not have one of those. You would have to create your own adapter, perhaps extending BaseAdapter, that handles it.
I guess there would be performance issues by using strings (a UUID has 36 characters) as primary key.
Possibly, but I will be somewhat surprised if it turns into a material problem on device-sized databases.
However, I can't see any other possibility to implement such a collaborative syncing system, so I must use UUID, right?
You need some sort of UUID for your network protocol. Presumably, you will need that UUID in your database. Whether that UUID needs to be the primary key of a table, I can't say, because I don't know your schema.
Another possibility would be to use an integer auto-increment primary key and a second column, uuid. So to work on the user's local device, I would use this primary key (integer) for JOINs etc., while I would use the uuid column for syncing with the other users.
Correct. You would have a UUID->local integer ID mapping table, use the UUIDs in your network protocol, and keep the local database mostly using the local integer IDs. Whether or not this will be a significant performance improvement (particularly given the increased database schema complexity), I can't say.
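A rough sketch of this split, using Python's built-in sqlite3 and uuid modules: the notes table and the helper names are hypothetical, and for brevity the UUID is kept as a second column on the same table (as in the question) rather than in a separate mapping table.

```python
import sqlite3
import uuid


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # _id is the local integer key used for JOINs (and by CursorAdapter);
    # uuid is the identifier exchanged with other devices.
    conn.execute(
        "CREATE TABLE notes ("
        " _id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " uuid TEXT NOT NULL UNIQUE,"
        " body TEXT NOT NULL)"
    )
    return conn


def insert_local(conn: sqlite3.Connection, body: str) -> str:
    """Create a record on this device and return the UUID to sync with."""
    note_uuid = str(uuid.uuid4())
    conn.execute("INSERT INTO notes (uuid, body) VALUES (?, ?)", (note_uuid, body))
    return note_uuid


def apply_remote(conn: sqlite3.Connection, note_uuid: str, body: str) -> int:
    """Insert or update a record received from another user.

    The UUID is translated to the local integer _id, which is returned.
    """
    row = conn.execute(
        "SELECT _id FROM notes WHERE uuid = ?", (note_uuid,)
    ).fetchone()
    if row is None:
        cur = conn.execute(
            "INSERT INTO notes (uuid, body) VALUES (?, ?)", (note_uuid, body)
        )
        return cur.lastrowid
    conn.execute("UPDATE notes SET body = ? WHERE _id = ?", (body, row[0]))
    return row[0]
```

The UNIQUE constraint gives the uuid column its own index, so the lookup during sync does not need a table scan.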
What do you guys think about that approach? Or is it, in your opinion, too much work, since you wouldn't expect a significant performance issue from using UUID directly as the primary key?
IMHO, either run some performance tests so you get some concrete comparable data, or only worry about it if your database I/O seems sluggish.
One set of performance results for UUIDs as binary and text can be found in a somewhat related UUID/SQLite question: https://stackoverflow.com/a/11337522/3103448
Per the results, both binary and string UUIDs can be efficient in SQLite for create and query operations when indexed. A separate trade-off is whether a human-readable string is preferred over the smaller storage size of the binary form.
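The size side of that trade-off is easy to check. The sketch below (Python's built-in sqlite3 and uuid modules, with a throwaway table) stores the same UUID once as text and once as a 16-byte blob and reports the stored lengths. It says nothing about speed, which you would still need to measure on your own data.

```python
import sqlite3
import uuid


def stored_lengths(value: uuid.UUID):
    """Store one UUID as TEXT and as BLOB; return (text_len, blob_len, round_trip)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (as_text TEXT, as_blob BLOB)")
    # str(value) is the usual hyphenated form; value.bytes is the raw 16 bytes.
    conn.execute("INSERT INTO t VALUES (?, ?)", (str(value), value.bytes))
    text_len, blob_len, blob = conn.execute(
        "SELECT length(as_text), length(as_blob), as_blob FROM t"
    ).fetchone()
    conn.close()
    # Rebuild the UUID from the blob to show that nothing is lost.
    return text_len, blob_len, uuid.UUID(bytes=blob)
```

The blob form is less than half the size, at the cost of not being readable when you inspect the table by hand.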

Faster indexing in database and sort with criteria time

Well, I have a database with a few tables and I do not have any id. The fake id that I am using is a datetime field (yyyy-mm-dd-hh-MM-ss), and this field is a string. It works perfectly and it is great for sorting... BUT it is slow :), very slow. I mean, I made a very big mistake: my 'primary' key is a string, and the connection between the tables is made through this field. That is because the datetime field is the only unique field I had...
How should I make this faster? I mean, it would be very easy if this were a SQL database: I would create a foreign key and that's it.
Someone might say 'why don't you just use an integer'. Well, I have a lot of creations and deletions, and if I use an int it will be very complicated to keep the tables ordered, and because of the frequent deletions the ids will look like 1, 22, 55, 79...
I mean, I do not know what the right way to do it is; that is why I am asking.
