Data structure:
houses (collection)
  name (string)
  users (map)
    90c234jc23 (map)
      percentage: 100% (string/number)
Rules:
allow read: request.auth.uid in resource.data.users;
The problem is when I try to query houses which user owns:
FirebaseFirestore.getInstance().collection(House.COLLECTION)
// .whereArrayContains(House.USERS_FIELD, currentUser.getUid()) // does not work
.whereEqualTo("users." + currentUser.getUid(), currentUser.getUid()) // does not work either
.get()
No results are returned.
You cannot perform this type of query in Firestore, as there is no 'map-contains-key' operator. However, there are simple workarounds that require only slight adjustments to your data structure.
Specific Solution
Requirement: For this solution to work, each map value has to be uniquely identifiable in a Firestore query, meaning it cannot be a map or an array.
If your case meets this requirement, you can go with @Dennis Alund's solution, which suggests the following data structure:
{
name: "The Residence",
users: {
uid1: 80,
uid2: 20
}
}
General Solution
If your map values are maps or arrays, you need to add a property to each value which will be constant across all created values of this type. Here is an example:
{
name: "The Residence",
users: {
uid1: {
exists: true,
percentage: 80,
...
},
uid2: {
exists: true,
percentage: 20,
...
},
}
}
Now you can simply use the query:
_firestore.collection('houses').whereEqualTo('users.<uid>.exists', true)
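A minimal sketch of how the marker could be added before a document is written, in plain JavaScript with no Firestore calls. The helper names withExistsMarker and existsFieldPath are hypothetical, and exists is simply the constant property name used in the example above.

```javascript
// Hypothetical helper: give every entry of a users map the constant
// `exists: true` property so it can be matched with an equality filter.
function withExistsMarker(users) {
  const marked = {};
  for (const [uid, value] of Object.entries(users)) {
    marked[uid] = { ...value, exists: true };
  }
  return marked;
}

// Hypothetical helper: build the dotted field path used in the equality query.
function existsFieldPath(uid) {
  return `users.${uid}.exists`;
}

const markedHouse = {
  name: "The Residence",
  users: withExistsMarker({
    uid1: { percentage: 80 },
    uid2: { percentage: 20 },
  }),
};

console.log(markedHouse.users.uid1);
console.log(existsFieldPath("uid1"));
```

The marker has to be written for every new entry in the map, otherwise the equality filter will silently miss that entry.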
Edit:
As @Johnny Oshika correctly pointed out, you can also use orderBy() to filter by field-name.
You can use orderBy to find documents where map contains a certain key. Using this example document:
{
"users": {
"bob": {},
"sam": {},
}
}
.orderBy('users.bob') will only find documents that contain users.bob.
This query is not working because your users field is a map and not an array.
.whereArrayContains(House.USERS_FIELD, currentUser.getUid())
This query
.whereEqualTo("users." + currentUser.getUid(), currentUser.getUid())
is not working because your map value for users.<uid> is a string that says percentage: xx% and that statement is testing if percentage: xx% === <uid>, which is false.
And that strategy will be problematic, since you cannot run queries to find items that "are not null" or "strings not empty", etc.
I'm assuming that the percentage is the user's ownership in the house (?). If so, and if you want to keep the same document structure as in your question, you might have better luck structuring your house document data like this:
{
name: "The Residence",
users: {
uid1: 80,
uid2: 20
}
}
That will allow you to do a query such as
.whereGreaterThan("users." + currentUser.getUid(), 0)
to find the houses in which the current user holds some share of ownership.
A fair word of warning, though: as soon as you need composite indexes, this structure will become hard to maintain. You might instead want to store an array of the users who own the house, for ease of querying.
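As a rough sketch of that last suggestion, the array can be derived from the existing map when the document is written. The helper withUserIds and the field name userIds are assumptions made for illustration. With such an array field in place, an array-membership filter like the whereArrayContains call from the question becomes applicable to it.

```javascript
// Hypothetical helper: derive a `userIds` array from the `users` map so the
// document can also be matched with an array-membership filter.
function withUserIds(house) {
  return { ...house, userIds: Object.keys(house.users) };
}

const residence = withUserIds({
  name: "The Residence",
  users: { uid1: 80, uid2: 20 },
});

console.log(residence.userIds);
```

The map and the array hold the same keys twice, so both need to be updated together whenever ownership changes.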
Related
I have a collection called "service". And the services have two attributes called, serviceableAt (where the service is available) and accessibleBy (who can access it).
service-1
serviceableAt
0 - Location-1
1 - Location-2
accessibleBy
0 - Seller
1 - Customer
I am trying to fetch all the services that are serviceable at Location-1 and accessible to Seller. So my query was:
fun services(at: String, by: UserRole) = firestore().collection(Refs.SERVICE_REF)
.whereArrayContains(Fields.SERVICEABLE_AT, at)
.whereArrayContains(Fields.ACCESSIBLE_BY, by)
It looks like multiple whereArrayContains clauses are not supported. What would be an alternative solution for the time being, until the Firebase team comes up with one?
The common alternative would be to store the information as map fields:
servicableAtMap: {
"Location-1": true,
"Location-2": true
},
accessibleByMap: {
"Seller": true,
"Buyer": true
}
With the above you can now use equality conditions to find the matches, and you can have multiple equality filters in a query.
The downside of the above approach is that it will create/require a separate index for each sub-field, so that takes up more storage and contributes towards the maximum number of indexes you can have.
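A small sketch of the conversion, assuming the arrays from the question are still available when the document is written. toMembershipMap and matches are hypothetical helpers; matches only imitates locally what two equality conditions on the map fields would select.

```javascript
// Hypothetical helper: turn ["a", "b"] into { a: true, b: true }.
function toMembershipMap(values) {
  const map = {};
  for (const value of values) {
    map[value] = true;
  }
  return map;
}

// Local stand-in for two equality conditions on the map fields.
function matches(service, at, by) {
  return service.servicableAtMap[at] === true && service.accessibleByMap[by] === true;
}

const service1 = {
  servicableAtMap: toMembershipMap(["Location-1", "Location-2"]),
  accessibleByMap: toMembershipMap(["Seller", "Customer"]),
};

console.log(matches(service1, "Location-1", "Seller")); // true
console.log(matches(service1, "Location-3", "Seller")); // false
```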
Say I have this kind of structure
A (collection): {
a (doc): {
name:'Tim',
B (collection):{
b (doc): {
color:'blue'
}
}
}
}
where A and B are collections while a and b are documents.
Is there a way to get everything contained in a root document with one query?
If I query like this
db.collection("A").doc("a").get()
I just get the name:'Tim' field. What I want is to also get all of B's documents.
I basically wish my query returns
{
user:'Tim',
B (collection):{
b (doc): {
color:'blue'
}
}
}
Is it possible, or do I really need to make multiple queries, one for each collection?
Say I have a really deeply nested tree of collections representing the user profile. My costs will rise sharply, since each time I load a user profile the number of read requests is multiplied by N, where N is the depth of my tree.
If you are concerned about the cost of each pull, you will need to structure your data according to your common view / pull needs, rather than what you might prefer for a perfect structure. If you need to pull these things together every time, consider using "maps" for things that do not actually need to be sub-collections with documents.
In this example, "preferences" is a map.
{
user: "Tim",
preferences: {
color: "blue",
nickname: "Timster"
}
}
Each document is also limited in size to 1MB. So if you need to store something for this user that will scale and continue to grow, like log records, it makes sense to break the logs out into a sub-collection that only gets pulled when you want it, with each log entry as a separate document. Whether all logs for all users are stored in a separate parent collection or in a sub-collection of each user really depends on how you will be pulling logs and what will give you fast reads, balanced against the cost of pulls. If you're showing this user their last 10 searches, a search log makes good sense as a sub-collection. If you're pulling all search data for all users for analysis, a separate parent-level collection makes sense, because you can pull all logs in one pull instead of pulling the logs of each user separately.
You can also nest your pulls and promises together for convenience purposes.
// Get reference to all of the documents
console.log("Retrieving list of documents in collection");
let documents = collectionRef.limit(1).get()
.then(snapshot => {
snapshot.forEach(doc => {
console.log("Parent Document ID: ", doc.id);
let subCollectionDocs = collectionRef.doc(doc.id).collection("subCollection").get()
.then(snapshot => {
snapshot.forEach(doc => {
console.log("Sub Document ID: ", doc.id);
})
}).catch(err => {
console.log("Error getting sub-collection documents", err);
})
});
}).catch(err => {
console.log("Error getting documents", err);
});
As we know querying in Cloud Firestore is shallow by default. This type of query isn't supported, although it is something Google may consider in the future.
Adding to Matt R's answer: if you're using Babel, or can otherwise use async/await, you can get the same result with less code (no then/catch):
// Get all of the parent documents
console.log("Retrieving list of documents in collection");
const documents = await collectionRef.get();
// for...of (rather than forEach with an async callback) waits for each sub-collection read
for (const doc of documents.docs) {
  console.log("Parent Document ID: ", doc.id);
  const subCollectionDocs = await collectionRef.doc(doc.id).collection("subCollection").get();
  // Each entry is a document snapshot, so read its id directly
  subCollectionDocs.forEach(subCollectionDoc => {
    console.log("Sub Document ID: ", subCollectionDoc.id);
  });
}
On my main page I have a list of users, and I'd like to choose one of them and open a channel to chat with that user.
I am wondering whether using the user ids is the best way to do this, controlling access to a channel named like USERID1-USERID2.
But of course user 2 can open the same channel too, so I'd like to find something easier to control.
If you want to help me, please give me an example in JavaScript using a Firebase URL/array.
Thank you!
A common way to handle such 1:1 chat rooms is to generate the room URL based on the user ids. As you already mention, a problem with this is that either user can initiate the chat and in both cases they should end up in the same room.
You can solve this by ordering the user ids lexicographically in the compound key. For example with user names, instead of ids:
var user1 = "Frank"; // UID of user 1
var user2 = "Eusthace"; // UID of user 2
var roomName = 'chat_'+(user1<user2 ? user1+'_'+user2 : user2+'_'+user1);
console.log(user1+', '+user2+' => '+ roomName);
user1 = "Eusthace";
user2 = "Frank";
var roomName = 'chat_'+(user1<user2 ? user1+'_'+user2 : user2+'_'+user1);
console.log(user1+', '+user2+' => '+ roomName);
A common follow-up question seems to be how to show a list of chat rooms for the current user. The above code does not address that. As is common in NoSQL databases, you need to augment your data model to allow this use case. The easiest way to do this is to add a list of chat rooms for each user to the data model:
"userChatrooms" : {
"Frank" : {
"Eusthace_Frank": true
},
"Eusthace" : {
"Eusthace_Frank": true
}
}
If you're worried about the length of the keys, you can consider using a hash of the combined UIDs instead of the full UIDs.
This last JSON structure above then also helps to secure access to the room, as you can write your security rules to only allow users access for whom the room is listed under their userChatrooms node:
{
  "rules": {
    "chatrooms": {
      "$chatroomid": {
        ".read": "root.child('userChatrooms').child(auth.uid).child($chatroomid).exists()"
      }
    }
  }
}
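To keep the userChatrooms index in step with room creation, the entries for both participants can be built in one go. This is only a sketch: roomKey and userChatroomEntries are hypothetical names, and the path-keyed object is meant as input for whatever write mechanism you use.

```javascript
// Order-independent room key, matching the "Eusthace_Frank" keys shown above.
function roomKey(uidA, uidB) {
  return uidA < uidB ? `${uidA}_${uidB}` : `${uidB}_${uidA}`;
}

// Hypothetical helper: list the room under both participants.
function userChatroomEntries(uidA, uidB) {
  const key = roomKey(uidA, uidB);
  return {
    [`userChatrooms/${uidA}/${key}`]: true,
    [`userChatrooms/${uidB}/${key}`]: true,
  };
}

const entries = userChatroomEntries("Frank", "Eusthace");
console.log(entries);
```

Because the key is computed from the sorted pair, it does not matter which of the two users triggers the write.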
In a typical database schema each Channel / ChatGroup has its own node with unique $key (created by Firebase). It shouldn't matter which user opened the channel first but once the node (& corresponding $key) is created, you can just use that as channel id.
The hashing / MD5 strategy is of course another way to do it, but then you also have to store that "route" info as well as the $key on the same node, which is duplication IMO (unless I'm missing something).
We decided on hashing the users' uids, which means you can look up any existing conversation if you know the other person's uid.
Each conversation also stores a list of the uids for their security rules, so even if you can guess the hash, you are protected.
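One way such a hash could be made independent of who starts the conversation is to sort the two uids before hashing. The sketch below assumes a Node.js environment and uses its built-in crypto module. conversationId is a hypothetical name, and the "_" separator is an arbitrary choice.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper: hash the sorted pair of uids so that both
// participants derive the same conversation id.
function conversationId(uidA, uidB) {
  const [first, second] = [uidA, uidB].sort();
  return createHash("sha256").update(`${first}_${second}`).digest("hex");
}

const idFromA = conversationId("uid-alice", "uid-bob");
const idFromB = conversationId("uid-bob", "uid-alice");
console.log(idFromA === idFromB);
```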
Hashing with a SHA-256 module (crypto-js in the snippet below) worked for me, following the directions of Frank van Puffelen and Eduard.
import SHA256 from 'crypto-js/sha256'
let agentId = 312
let userId = 567
let chatHash = SHA256('agent:' + agentId + '_user:' + userId)
I've read the Firebase docs on Structuring Data. Data storage is cheap, but the user's time is not. We should optimize for get operations, and write in multiple places.
So then I might store a list node and a list-index node, with some duplicated data between the two, at very least the list name.
I'm using ES6 and promises in my javascript app to handle the async flow, mainly of fetching a ref key from firebase after the first data push.
let addIndexPromise = new Promise( (resolve, reject) => {
let newRef = ref.child('list-index').push(newItem);
resolve( newRef.key()); // ignore reject() for brevity
});
addIndexPromise.then( key => {
ref.child('list').child(key).set(newItem);
});
How do I make sure the data stays in sync in all places, knowing my app runs only on the client?
As a sanity check, I put a setTimeout in my promise and closed my browser before it resolved, and indeed my database was no longer consistent: an extra index had been saved without a corresponding list.
Any advice?
Great question. I know of three approaches to this, which I'll list below.
I'll take a slightly different example for this, mostly because it allows me to use more concrete terms in the explanation.
Say we have a chat application, where we store two entities: messages and users. In the screen where we show the messages, we also show the name of the user. So to minimize the number of reads, we store the name of the user with each chat message too.
users
so:209103
name: "Frank van Puffelen"
location: "San Francisco, CA"
questionCount: 12
so:3648524
name: "legolandbridge"
location: "London, Prague, Barcelona"
questionCount: 4
messages
-Jabhsay3487
message: "How to write denormalized data in Firebase"
user: so:3648524
username: "legolandbridge"
-Jabhsay3591
message: "Great question."
user: so:209103
username: "Frank van Puffelen"
-Jabhsay3595
message: "I know of three approaches, which I'll list below."
user: so:209103
username: "Frank van Puffelen"
So we store the primary copy of the user's profile in the users node. In the message we store the uid (so:209103 and so:3648524) so that we can look up the user. But we also store the user's name in the messages, so that we don't have to look this up for each user when we want to display a list of messages.
So now what happens when I go to the Profile page on the chat service and change my name from "Frank van Puffelen" to just "puf"?
Transactional update
Performing a transactional update is the one that probably pops to mind of most developers initially. We always want the username in messages to match the name in the corresponding profile.
Using multipath writes (added on 20150925)
Since Firebase 2.3 (for JavaScript) and 2.4 (for Android and iOS), you can achieve atomic updates quite easily by using a single multi-path update:
function renameUser(ref, uid, name) {
var updates = {}; // all paths to be updated and their new values
updates['users/'+uid+'/name'] = name;
var query = ref.child('messages').orderByChild('user').equalTo(uid);
query.once('value', function(snapshot) {
snapshot.forEach(function(messageSnapshot) {
updates['messages/'+messageSnapshot.key()+'/username'] = name;
})
ref.update(updates);
});
}
This will send a single update command to Firebase that updates the user's name in their profile and in each message.
Previous atomic approach
So when the user changes the name in their profile:
var ref = new Firebase('https://mychat.firebaseio.com/');
var uid = "so:209103";
var nameInProfileRef = ref.child('users').child(uid).child('name');
nameInProfileRef.transaction(function(currentName) {
return "puf";
}, function(error, committed, snapshot) {
if (error) {
console.log('Transaction failed abnormally!', error);
} else if (!committed) {
console.log('Transaction aborted by our code.');
} else {
console.log('Name updated in profile, now update it in the messages');
var query = ref.child('messages').orderByChild('user').equalTo(uid);
query.on('child_added', function(messageSnapshot) {
messageSnapshot.ref().update({ username: "puf" });
});
}
console.log("Wilma's data: ", snapshot.val());
}, false /* don't apply the change locally */);
Pretty involved and the astute reader will notice that I cheat in the handling of the messages. First cheat is that I never call off for the listener, but I also don't use a transaction.
If we want to securely do this type of operation from the client, we'd need:
security rules that ensure the names in both places match. But the rules need to allow enough flexibility for them to temporarily be different while we're changing the name. So this turns into a pretty painful two-phase commit scheme.
change all username fields for messages by so:209103 to null (some magic value)
change the name of user so:209103 to 'puf'
change the username in every message by so:209103 that is null to puf.
that query requires an and of two conditions, which Firebase queries don't support. So we'll end up with an extra property uid_plus_name (with value so:209103_puf) that we can query on.
client-side code that handles all these transitions transactionally.
This type of approach makes my head hurt. And usually that means that I'm doing something wrong. But even if it's the right approach, with a head that hurts I'm way more likely to make coding mistakes. So I prefer to look for a simpler solution.
Eventual consistency
Update (20150925): Firebase released a feature to allow atomic writes to multiple paths. This works similar to approach below, but with a single command. See the updated section above to read how this works.
The second approach depends on splitting the user action ("I want to change my name to 'puf'") from the implications of that action ("We need to update the name in profile so:209103 and in every message that has user = so:209103").
I'd handle the rename in a script that we run on a server. The main method would be something like this:
function renameUser(ref, uid, name) {
ref.child('users').child(uid).update({ name: name });
var query = ref.child('messages').orderByChild('user').equalTo(uid);
query.once('value', function(snapshot) {
snapshot.forEach(function(messageSnapshot) {
messageSnapshot.ref().update({ username: name }); // update through the snapshot's ref; a snapshot itself has no update()
})
});
}
Once again I take a few shortcuts here, such as using once('value' (which is in general a bad idea for optimal performance with Firebase). But overall the approach is simpler, at the cost of not having all data completely updated at the same time. But eventually the messages will all be updated to match the new value.
Not caring
The third approach is the simplest of all: in many cases you don't really have to update the duplicated data at all. In the example we've used here, you could say that each message recorded the name as I used it at that time. I didn't change my name until just now, so it makes sense that older messages show the name I used at that time. This applies in many cases where the secondary data is transactional in nature. It doesn't apply everywhere of course, but where it applies "not caring" is the simplest approach of all.
Summary
While the above are just broad descriptions of how you could solve this problem and they are definitely not complete, I find that each time I need to fan out duplicate data it comes back to one of these basic approaches.
To add to Frank's great reply, I implemented the eventual consistency approach with a set of Firebase Cloud Functions. The functions get triggered whenever a primary value (e.g. a user's name) gets changed, and then propagate the change to the denormalized fields.
It is not as fast as a transaction, but for many cases it does not need to be.
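The propagation step itself can be sketched as a pure function that computes which paths need the new value. The Cloud Functions trigger wiring is left out here because the reply above does not show it. buildRenameUpdates is a hypothetical name, and the messages object mirrors the example data from the answer.

```javascript
// Hypothetical helper: collect every path that stores the user's name.
// `messages` maps message keys to message objects, as in the example data.
function buildRenameUpdates(messages, uid, newName) {
  const updates = { [`users/${uid}/name`]: newName };
  for (const [key, message] of Object.entries(messages)) {
    if (message.user === uid) {
      updates[`messages/${key}/username`] = newName;
    }
  }
  return updates;
}

const exampleMessages = {
  "-Jabhsay3487": { user: "so:3648524", username: "legolandbridge" },
  "-Jabhsay3591": { user: "so:209103", username: "Frank van Puffelen" },
  "-Jabhsay3595": { user: "so:209103", username: "Frank van Puffelen" },
};

const renameUpdates = buildRenameUpdates(exampleMessages, "so:209103", "puf");
console.log(renameUpdates);
```

Keeping this part free of database calls makes it easy to test on its own, whichever trigger ends up calling it.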
I am using a FirebaseRecyclerAdapter to inflate a RecyclerView with data provided by the Firebase Realtime Database.
I began sorting the nodes by their child date, which was set to the value of ServerValue.TIMESTAMP. In the Firebase Database Rules, I added an .indexOn property with the value date to the parent node of the nodes I want to sort.
"parent-node": {
".read": "auth != null",
".write": "auth != null",
".indexOn" : "date",
...
},
This worked fine but newer nodes were added to the end of my RecyclerView. As the FirebaseRecyclerAdapter and its FirebaseArray add nodes with a ChildEventListener, it means that the data was sorted from oldest to newest.
I managed to reverse this by using the negative value of ServerValue.TIMESTAMP.
private void prepareUpload() {
//mDatabase is a reference to the root of the Firebase Database
final DatabaseReference timestampReference = mDatabase.child("timestamp");
final String timestampKey = timestampReference.push().getKey();
timestampReference.child(timestampKey).setValue(ServerValue.TIMESTAMP);
timestampReference.child(timestampKey).addValueEventListener(new ValueEventListener() {
@Override
public void onDataChange(DataSnapshot dataSnapshot) {
if (dataSnapshot.getValue() != null) {
if (!(Long.parseLong(dataSnapshot.getValue().toString()) < 0)) {
timestampReference.child(timestampKey).setValue(0 - Long.parseLong(dataSnapshot.getValue().toString()));
} else {
upload(/*Starting upload with new timestamp here (this is just a dummy method)*/);
timestampReference.child(timestampKey).removeValue();
}
}
}
@Override
public void onCancelled(DatabaseError databaseError) {
Log.e(TAG, databaseError.getMessage());
}
});
}
Now the nodes I want to sort look like this.
{
"-K_tlLWVO21NXUjUn6ko" : {
"date" : -1483806697481,
"debug" : "old",
...
},
"-K_tmjVqTUcKXHaQDphk" : {
"date" : -1483807061979,
"debug" : "newer",
...
},
"-K_uC-AJIvDOuBzhJ3JJ" : {
"date" : -1483813945897,
"debug" : "newest",
...
}
}
They are sorted from newest to oldest and get added to the top of my RecyclerView exactly as I wanted it to be.
Coming to my personal question:
Is this a proper way to sort nodes from newest to oldest? This seems like a big workaround to me.
Am I missing something?
Thanks in advance!
Edit: I just saw that my prepareUpload() method needlessly sends the negative value to the database, only to receive it again afterwards. I will change this to calculate the negative value on the client side. Please ignore this.
No, you are not missing anything. And yes, that's a proper way to sort nodes from newest to oldest.
That's just the way Firebase works. Many times you might feel like you're doing a big workaround and think "There must be a simpler way to do this", but there probably isn't.
Firebase is indeed a great platform to build your application on, but it still has a few things to improve, like the Realtime Database. There is no way to run rich queries on your data: you can do some basic filtering and sorting, but if you want to do more than that, you're on your own.
To help with that, they created the "Firebase Database for SQL Developers" video series. The series explains that there is no single proper/default way to structure your data in Firebase. Every time you start a new project, you'll have to spend some time brainstorming how to structure the data so that it simplifies and reduces queries for better performance.
I hope Google some day implements more searching and filtering capabilities in this data service.
What you have done is absolutely correct. The only difference is, you should have both, a timestamp and a negativeTimestamp key just in case you need either of them. It all depends on your requirements anyways.
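A small sketch of storing both values and sorting on the negative one, using the timestamps from the question. withTimestamps is a hypothetical helper, and passing the time in from the client (for example Date.now()) is an assumption. The question itself used ServerValue.TIMESTAMP.

```javascript
// Hypothetical helper: store the timestamp and its negation side by side.
function withTimestamps(item, nowMs) {
  return { ...item, timestamp: nowMs, negativeTimestamp: -nowMs };
}

const items = [
  withTimestamps({ debug: "old" }, 1483806697481),
  withTimestamps({ debug: "newest" }, 1483813945897),
  withTimestamps({ debug: "newer" }, 1483807061979),
];

// Ascending order on the negative value puts the most recent item first.
const newestFirst = [...items].sort(
  (a, b) => a.negativeTimestamp - b.negativeTimestamp
);

console.log(newestFirst.map((item) => item.debug)); // [ 'newest', 'newer', 'old' ]
```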
In fact, people from Firebase suggest the same. You can watch this YouTube video and skip to 04:40 to see the answer to your question from the Firebase team themselves.
In terms of getting the sort order that you want, yes, inverted time stamps are the answer. Check out this older answer: https://stackoverflow.com/a/25613337/4816918 for more information.
The .indexOn key is only for performance optimization (see the bottom of the "Index Your Data" page), so the correct way to request ordered data is by using .orderByChild, .orderByKey, or .orderByValue in your query. Look for the "Sort data" section of the "Work with Lists of Data on Android" page.