I'm having a bit of trouble with a Firebase query. I want to query for objects where the object's child value contains a certain string. So far I have something that looks like this:
Firebase *ref = [[Firebase alloc] initWithUrl:@"https://dinosaur-facts.firebaseio.com/dinosaurs"];
[[[[ref queryOrderedByKey] queryStartingAtValue:@"b"] queryEndingAtValue:@"b~"]
    observeEventType:FEventTypeChildAdded withBlock:^(FDataSnapshot *snapshot) {
        NSLog(@"%@", snapshot.key);
}];
But that only gives objects whose key starts with "b". I want objects that contain the string "b". How do I do that?
There are no contains or fuzzy matching methods in the query API, which you have probably already guessed if you've scanned the API and the guide on queries.
Not only has this subject been discussed ad nauseam on SO [1] [2] [3] [4] [5], but I've touched several times on why one should use a real search engine, instead of attempting this sort of half-hearted search approach.
There is a reason it's often easier to Google a website to find results than to use the built-in search, and this is a primary component of that failure.
With all of that said, the answer to your question of how to do this manually, since there is no built-in contains, is to set up a server-side process that loads/streams data into memory and does manual searching of the contents, preferably with some sort of caching.
But honestly, ElasticSearch is faster, simpler, and more efficient here. Since that's a vast topic, I'll refer you to the blog post on this subject.
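For what it's worth, the manual approach described above can be sketched roughly as follows. This is only a minimal sketch: it assumes the server process has already streamed the keys into an array, and createContainsSearch is a hypothetical helper, not part of any Firebase SDK.

```javascript
// Hypothetical helper: wraps an in-memory copy of the keys and answers
// "contains" searches with plain string matching, caching each result.
function createContainsSearch(keys) {
  const cache = new Map(); // lower-cased search term -> matching keys

  return function search(term) {
    const needle = term.toLowerCase();
    if (cache.has(needle)) {
      return cache.get(needle);
    }
    const matches = keys.filter((key) => key.toLowerCase().includes(needle));
    cache.set(needle, matches);
    return matches;
  };
}

// Sample keys, assumed to have been loaded from the database beforehand.
const search = createContainsSearch([
  "bruhathkayosaurus",
  "lambeosaurus",
  "linhenykus",
  "pterodactyl",
  "stegosaurus",
  "triceratops",
]);

console.log(search("b")); // → ["bruhathkayosaurus", "lambeosaurus"]
```

The cache here is never invalidated, so a real process would have to rebuild it whenever the underlying data changes.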
I'm messing around with Cloud Firestore. Trying to decide whether I should use it for my next project.
I would like to make a nested query, but all the tutorials and examples I found in the official documentation only query objects which are 2 levels deep and most of the time direct key/id calling.
I need something that I believe is called a "nested query". I may be wrong about that, though; maybe it is not the correct term for such a thing in NoSQL, which I have just started to learn.
This is a skeleton/pilot app for a game where users can create characters, and I would like to query whether a character's name is already taken or not.
Here is my simple DB structure:
The main collection is named "users"
In "users" I have user documents.
In each user document, I have a collection named "characters"
In "characters" I have character documents.
In each character document there are two fields, name and level.
I tried it various ways with queries and the closest thing I could get was iterating through the whole thing which I believe is not the perfect solution.
Can somebody please help me write an efficient nested query that checks whether "Example Name" is already an existing character in the DB, and tell me the correct way to write "infinitely deep" nested queries?
If each user document contains a sub-collection that has the same ("characters") name, then I think you are looking for a collection group query. So a query should look like this:
val queryByName = db.collectionGroup("characters").whereEqualTo("name", "Adam");
Also, don't forget to create an index.
Besides that, Firestore is as fast at level 100 as it is at level 1. So no worries.
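For the "is this name already taken" check specifically, a rough JavaScript equivalent might look like the sketch below. It assumes db is an already-initialized Firestore instance using the namespaced (chained) API; isCharacterNameTaken is a hypothetical helper name.

```javascript
// Sketch: returns true if any "characters" sub-collection, under any user,
// contains a document whose "name" field equals the given name.
// Assumes `db` exposes the namespaced Firestore API (collectionGroup/where/limit/get).
async function isCharacterNameTaken(db, name) {
  const snapshot = await db
    .collectionGroup("characters")
    .where("name", "==", name)
    .limit(1) // one match is enough to know the name is taken
    .get();
  return !snapshot.empty;
}
```

Limiting the query to one result keeps the read cost at a single document no matter how many characters exist.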
I understand that using .indexOn is better for performance, but to understand it further, here's my question: should I design my tree nodes differently?
Let's say I want to search for a name and see if it exists. I could have:
names
  jack : true
  john : true
or
people
  UID1
    name : jack
    age : 10
If I had .indexOn on "name" in the "people" node, would it have the same cost/performance as the first tree? The reason I ask is that I want to avoid creating more tree nodes than necessary.
The cost for reading from the Firebase Realtime Database is based on the bandwidth that is transferred. In the first JSON, you'd only be reading true, while in the second snippet you'd end up reading the entire UID1 node. So that would be (marginally) more expensive.
If on the other hand, you also look up the user profile after reading jack: true from the first JSON, then that approach probably reads more data and would thus be (again: marginally) more expensive.
In the first JSON snippet, you can look up jack directly based on the path, without needing a query. A direct lookup is the fastest way to read a node.
In the second JSON snippet you're going to need a query. When you have only a few users, the performance is going to be quite similar. But as the number of users grows, this query will start taking more time (even when you've defined an index to ensure it happens server-side).
But this performance difference won't be very noticeable until you have hundreds of thousands of users. Before that it is likely dwarfed by the impact of network performance.
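To make the difference concrete, the two reads could be sketched like this. It assumes rootRef is a Realtime Database reference to the root, using the namespaced JavaScript API; both function names are hypothetical.

```javascript
// First tree: direct lookup by path. Only /names/<name> is transferred.
async function nameExistsByPath(rootRef, name) {
  const snapshot = await rootRef.child("names").child(name).once("value");
  return snapshot.exists();
}

// Second tree: query on a child property. For this to be filtered on the
// server, the rules need ".indexOn": ["name"] under the "people" node.
async function nameExistsByQuery(rootRef, name) {
  const snapshot = await rootRef
    .child("people")
    .orderByChild("name")
    .equalTo(name)
    .limitToFirst(1) // stop at the first match
    .once("value");
  return snapshot.exists();
}
```

The query version returns the whole matching UID node, which is where the extra bandwidth mentioned above comes from.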
I'm trying to filter a Firestore collection by pattern. For example, in my Firestore database I have a brand called adidas. The user would have a search input, where typing "adi", "adid", "adida" or "adidas" returns the adidas document. I identified several solutions to do this:
1. Get all documents and perform a front-end filter
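// Read the whole collection once, then filter the documents in memory.
db.collection("brands").get().then((snapshot) => {
  filteredBrands = snapshot.docs.map((doc) => doc.data()).filter((br) => br.name.includes("pattern"));
});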
This solution is obviously not an option due to the Firestore pricing. Moreover, the request could take quite long if the number of documents is high.
2. Use of Elasticsearch or Algolia
This could be interesting. However I think this is a bit overkill to add these solutions' support for only a pattern search, and also this can quickly become expensive.
3. Custom searchName field at object creation
So I had this solution : at document creation, create a field with an array of possible search patterns:
{
...
"name":"adidas",
"searchNames":[
"adi",
"adida",
"adidas"
],
...
}
so that the document could be accessed with :
filteredBrands = db.collection("brands").where("searchNames", "array-contains", "pattern");
So I had several questions:
What do you think about the pertinence and the efficiency of this 3rd solution? How far do you think this could be better than using a third party solution as Elasticsearch or Algolia?
Do you have any other idea for performing pattern filter over a firestore collection?
IMHO, the first solution is definitely not an option. Downloading an entire collection to search for fields client-side isn't practical at all and is also very costly.
The second option is the best option, considering that it will help you enable full-text search in your entire Cloud Firestore database. It's up to you to decide if it is worth using it or not.
What do you think about the pertinence and the efficiency of this 3rd solution?
Regarding the third solution, it might work but it implies that you create an array of possible search patterns even if the brand name is very long. As I see in your schema, you are adding the possible search patterns starting from the 3rd letter, which means that if someone is searching for ad, no result will be found. The downside of this solution is the fact that if you have a brand named Asics Tiger and the user is searching for Tig or Tige, you'll end up having again no results.
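If you do go with the array approach anyway, a sketch like the following would at least cover the two gaps noted above, by generating prefixes from the first letter and for every word. buildSearchNames is a hypothetical helper, and lower-casing the name is my own assumption.

```javascript
// Hypothetical helper: builds every prefix of every word in a brand name,
// starting from the first letter, for use in a "searchNames" array field.
function buildSearchNames(name) {
  const prefixes = new Set();
  const words = name.toLowerCase().split(/\s+/).filter(Boolean);
  for (const word of words) {
    for (let end = 1; end <= word.length; end++) {
      prefixes.add(word.slice(0, end));
    }
  }
  return [...prefixes];
}

console.log(buildSearchNames("Asics Tiger"));
// → ["a", "as", "asi", "asic", "asics", "t", "ti", "tig", "tige", "tiger"]
```

Note that the array grows with the length of the name, and prefixes spanning two words (such as "asics t") are still not covered.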
Do you have any other ideas for performing pattern filters over a Firestore collection?
If you are only interested in matching a single word by the starting letters of the brand name, I recommend a better solution, which is using a query that looks like this:
var brands = db.collection("brands");
brands.orderBy("name").startAt(searchName).endAt(searchName + "\uf8ff")
In this case, a search like a or ad will work perfectly fine. Besides that, there will be no need to create any other arrays. So there will be less document writing.
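To see why the "\uf8ff" bound works, here is a small in-memory model of the same range check. It is only an approximation of how Firestore orders strings, and prefixRange and matchesPrefixRange are hypothetical names.

```javascript
// "\uf8ff" is a very high code point, so start..end covers every string
// that begins with searchName.
function prefixRange(searchName) {
  return { start: searchName, end: searchName + "\uf8ff" };
}

// In-memory stand-in for orderBy("name").startAt(start).endAt(end).
function matchesPrefixRange(name, searchName) {
  const { start, end } = prefixRange(searchName);
  return name >= start && name <= end;
}

const brandNames = ["adidas", "asics", "nike", "adept"];
console.log(brandNames.filter((name) => matchesPrefixRange(name, "ad")));
// → ["adidas", "adept"]
```

The comparison is case-sensitive, so a search for "Ad" would not match "adidas" unless the names are stored in a normalized form.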
I have also written an article called:
How to filter Firestore data cheaper?
That might also help.
Since Firestore has no foreign-key functionality like MySQL does, I am not able to replicate one of my important features: updating a value in one place and having it reflected everywhere. Also, Firebase has no way to update a specific field in all documents at once.
There are already questions like this, but I could not find my solution in them. Suppose I have a million documents containing a field which is the density of a material. Later on, I found that my density value was wrong, so how do I update that value in all documents efficiently? Also, I do not want to use the server/Admin SDK.
If you need to change the contents of 1 million documents, then you will need to query for those 1 million documents, iterate the results, then update each of those 1 million documents individually.
There is no equivalent of a sql "update where" statement that updates multiple documents in one query. It requires one update per document.
If you don't want to use the Admin SDK, then the option that you have is to update the value of your densityMaterial property on the client, which might not be the best solution. However, if you can divide the update operation into smaller chunks, you might succeed.
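Dividing the work into chunks could be sketched like this. It assumes db is an initialized Firestore instance (namespaced API) and docRefs is an array of DocumentReference objects you have already queried; the batch size of 500 is the per-batch limit I remember from the documentation, so treat it as an assumption and check the current quotas.

```javascript
// Assumption: at most 500 writes per batch; verify against the current limits.
const BATCH_SIZE = 500;

// Splits an array into consecutive groups of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Updates densityMaterial on every referenced document, one batch at a time.
async function updateDensity(db, docRefs, newDensity) {
  for (const group of chunk(docRefs, BATCH_SIZE)) {
    const batch = db.batch();
    for (const ref of group) {
      batch.update(ref, { densityMaterial: newDensity });
    }
    await batch.commit(); // still one billed write per document in the batch
  }
}
```

Batching reduces the number of round trips, not the number of billed writes.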
If you are using a POJO class to map each document, then you might be interested in my answer from the following post:
How to update one field from all documents using POJO in Firestore?
And if you are not using a POJO class, please check my answer from the following post:
Firestore firebase Android search and update query
Regarding the cost, you'll be billed with one write operation for every document that is updated. If all 1 MIL documents will be updated, then you'll be billed with 1 MIL write operations.
Edit:
Suppose I have a million documents containing a field which is the density of a material. Later on, I found that my density value was wrong so how to update that value in all documents efficiently.
If all of those 1 MIL documents contain a property called densityMaterial, that holds the exact same value, it doesn't make any sense to store that property within each document. You can create a single document that contains that particular value, and in each and every document of those 1 MIL, simply add only a reference to that document. A DocumentReference is a supported data-type. Now, if you need to change that value, it will incur only a single document write.
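Reading the shared value back through the reference could look roughly like the sketch below. The field names material and density are hypothetical, and the snapshot is assumed to come from the namespaced Firestore API, where a DocumentReference field can be fetched with get().

```javascript
// Sketch: each document stores a DocumentReference in a hypothetical
// "material" field; the shared document holds the hypothetical "density" field.
async function readDensity(documentSnapshot) {
  const materialRef = documentSnapshot.get("material"); // a DocumentReference
  const materialSnapshot = await materialRef.get(); // one extra document read
  return materialSnapshot.get("density");
}
```

The trade-off is one additional read per lookup (unless you cache the shared document) in exchange for a single write when the value changes.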
However, if you have different values for the densityMaterial property and all of them are wrong, then you don't have a problem with the database, you have a problem with the mechanism/people that are adding data. It's not a matter of a database problem if you have added 1 MIL incorrect documents.
Why not choose MySQL?
MySQL cannot scale in the way Cloud Firestore does. Firestore simply scales massively.
Can I avoid this problem anyhow?
Yes, you can, by using a single document for such details.
I have read many forum (and stack overflow) posts regarding escaping characters and sanitizing user input, but I'd like to tie it all together and make it a little more specific to the Android platform. Here's my circumstance:
I have an Android app that communicates with a web service via SOAP XML messages. Here's a sample XML message that might be sent (I'm leaving out the SOAP envelope around it):
<Log>
<Summary>user entered text</Summary>
<Details>user entered text</Details>
</Log>
As you can see, there are 2 places a user can input text in a form that is then inserted into this message to be sent to the web service. I need to:
A) make sure it's valid XML and
B) make sure it doesn't contain any malicious SQL content.
Are there any pre-included utilities in the Android API to escape invalid XML chars (such as &) that the user may have entered? (So that I can simply say "escapeXML(xmlstring);" or something like that)
Is there any way to check for malicious SQL (or other code injection) or should that be handled on the server-side?
As a side note: I'd almost prefer that the user was only able to enter A-z, 0-9 and basic punctuation (so as to avoid weird unicode characters that can't even be seen or interpreted sometimes). Is there a good way to restrict user input to a subset of characters?
I know this is a couple questions built into one, so if you only know part of it, please provide an answer anyways and I will be more than happy to upvote or accept it. Thanks in advance for all the help! (StackOverflow is where I come when I've consumed way too many forum threads and have gotten myself all twisted around about what is appropriate in my circumstance)
The best way to deal with SQL Injection is using parameterized queries. This is done on the server side. Everything else is secondary, unnecessary or barely scratches the surface of the issue.
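As a rough illustration of what "parameterized" means on the server, here is a minimal sketch. It assumes a Node.js service with a node-postgres-style client whose query(text, values) method accepts numbered placeholders; the logs table and its columns are hypothetical.

```javascript
// Sketch: the SQL text is a constant and the user input travels separately
// as values, so it is never parsed as SQL.
// Assumes `client.query(text, values)` in the style of node-postgres.
async function insertLog(client, summary, details) {
  const text = "INSERT INTO logs (summary, details) VALUES ($1, $2)";
  return client.query(text, [summary, details]);
}
```

Even if a user types something like "'); DROP TABLE logs; --" into the form, it is stored as plain text rather than executed.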
You should read these:
Safe DateTime in a T-SQL INSERT statement
problem in inserting the value in the database
http://www.codinghorror.com/blog/2005/04/give-me-parameterized-sql-or-give-me-death.html
On Jeff Atwood's blog, I like where he says:
Non-parameterized SQL is the GoTo statement of database programming.
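On part A of the question (keeping the message valid XML), escaping the five reserved characters before inserting user text is usually enough. The sketch below is in JavaScript for brevity, and the same replacements carry over to Java on Android; escapeXml and buildLogXml are hypothetical helpers, not Android APIs.

```javascript
// Replaces the five XML-reserved characters with their entities.
// "&" must be handled first so already-produced entities are not re-escaped.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Builds the <Log> body from the question with escaped user input.
function buildLogXml(summary, details) {
  return (
    "<Log>" +
    "<Summary>" + escapeXml(summary) + "</Summary>" +
    "<Details>" + escapeXml(details) + "</Details>" +
    "</Log>"
  );
}

console.log(escapeXml("a & b < c")); // → "a &amp; b &lt; c"
```

This only guarantees well-formed XML; it does nothing for SQL injection, which still has to be handled with parameterized queries on the server.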