I'm not sure if this question belongs here, as it is purely theoretical; however, I think it fits this Stack Exchange better than the others.
I have 500,000 taxis with Android 4 computers inside them. Every day, after one person or party makes a trip, the computer sends the information about the trip to the Node.js server. There are roughly 35 trips per taxi per day, so that means 500,000 taxis * 35 trips = 17,500,000 reports sent per day to the Node.js server. Each report contains roughly 4,000 characters and is around 5 KB in size.
The report that the taxi computers send to the Node.js server is just an HTTP POST. Node.js then sends back a confirmation to the taxi. If the taxi does not receive the confirmation for report A within an allotted amount of time, it resends report A.
The Node.js server simply receives the report, sends the confirmation back to the taxi, and then sends the full report on to MongoDB.
One potential problem: Taxi 1 sends report A to Node.js. Node.js does not respond within the allotted time, so Taxi 1 resends report A. Node.js eventually processes everything and sends report A to MongoDB twice.
Thus MongoDB is in charge of checking whether it has received multiple copies of the same report before it inserts the data.
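To make the flow concrete, here is a rough sketch of what the taxi-side sender could look like (the endpoint URL, timeout, and retry count below are invented placeholders, not our real values):

    // Sketch of the taxi-side sender: POST the report and retry until
    // the server's confirmation arrives (all values are examples only).
    public class ReportSender {
        private static final String SERVER_URL = "https://example.com/reports"; // placeholder
        private static final int MAX_ATTEMPTS = 5;
        private static final int TIMEOUT_MS = 10_000;

        public boolean sendReport(String reportJson) {
            for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
                try {
                    java.net.HttpURLConnection conn =
                            (java.net.HttpURLConnection) new java.net.URL(SERVER_URL).openConnection();
                    conn.setRequestMethod("POST");
                    conn.setDoOutput(true);
                    conn.setConnectTimeout(TIMEOUT_MS);
                    conn.setReadTimeout(TIMEOUT_MS);
                    conn.setRequestProperty("Content-Type", "application/json");
                    try (java.io.OutputStream out = conn.getOutputStream()) {
                        out.write(reportJson.getBytes("UTF-8"));
                    }
                    if (conn.getResponseCode() == 200) {
                        return true; // confirmation received, stop resending
                    }
                } catch (java.io.IOException e) {
                    // no confirmation in time; fall through and resend
                }
            }
            return false; // give up for now, keep the report queued locally
        }
    }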
I actually have a couple of questions. Is this too much for Node.js to handle (I don't think so, but it could be a problem)? Is this too much for MongoDB to handle? I feel like checking for duplicate reports may severely hinder performance.
How can I make this whole system more efficient? What should I alter or add?
The first potential problem is easy to overcome: calculate a hash of the trip, store it in Mongo, and put a unique key (index) on that field, so every subsequent document is compared against the existing hashes. This makes duplicate checking extremely easy and really fast. Keep in mind that the hashed content should not include something like the time of sending, or every retry would look like a new trip.
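As a minimal sketch of that idea (shown here with the MongoDB Java driver; the connection string, database, collection, and field names are placeholders), the unique index lets the database itself reject the duplicate on the second insert:

    import com.mongodb.ErrorCategory;
    import com.mongodb.MongoWriteException;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import com.mongodb.client.model.IndexOptions;
    import com.mongodb.client.model.Indexes;
    import org.bson.Document;

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;

    public class TripStore {
        private final MongoCollection<Document> trips =
                MongoClients.create("mongodb://localhost:27017")   // placeholder URI
                        .getDatabase("taxi")
                        .getCollection("trips");

        public TripStore() {
            // Unique index: a second document with the same tripHash is rejected.
            trips.createIndex(Indexes.ascending("tripHash"), new IndexOptions().unique(true));
        }

        public void saveTrip(String reportJson) throws Exception {
            // Hash only the trip content itself, not volatile fields like "time sent".
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha.digest(reportJson.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));

            Document doc = Document.parse(reportJson).append("tripHash", hex.toString());
            try {
                trips.insertOne(doc);
            } catch (MongoWriteException e) {
                if (e.getError().getCategory() != ErrorCategory.DUPLICATE_KEY) {
                    throw e; // some other write problem
                }
                // Duplicate hash: this report was already stored, safe to ignore.
            }
        }
    }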
Second problem: 17,500,000 per day is roughly 200 per second. That may sound scary, but in reality it is not that much for a decent server and is certainly not a problem for MongoDB.
It is hard to tell how to make it more efficient and I highly doubt you should think about it now. Give it a try, do something, check what is not working efficiently and come back with specific questions.
P.S. This is so I don't have to answer all of this in the comments. You have to understand that the question is extremely vague. No one knows what you mean by a trip document or how big it is. It could be 1 KB, it could be 10 MB, it could be 100 MB (which is bigger than MongoDB's 16 MB document limit). No one knows. When I said that ~200 documents/sec is not a problem, I did not say that exactly this amount is the maximum cap, so even if it turns out to be two or three times more, it still sounds feasible.
You have to try it yourself. Take an average Amazon instance and see how many of YOUR documents (create documents that are close to your real size and structure) it can save per second. If it cannot handle the load, see how much it can handle, or whether a bigger Amazon instance can.
I gave you a rough estimate that this is possible, and I had no idea that you also want to "include admins using MongoDB, to update, select". Did you mention that in your question?
Our app with more than a million subscribers is facing huge delivery issues with FCM. It has become worse lately and the service is hardly working anymore. We are receiving errors like:
    { errorInfo:
       { code: 'messaging/message-rate-exceeded',
         message: 'Topic quota exceeded.' },
      codePrefix: 'messaging' }
We get this error a lot. And it seems to be worse during EU / US evenings. In some cases over 90% of the notifications are failing.
We are in contact with the Firebase support team, but so far there seems to be no solution. They gave us a lot of information with some useful facts, though:
Resources are shared between developers, so the maximum message rate can differ depending on other developers taking up resources.
OR queries should be converted to multiple AND queries, because an OR query actually generates messages to the entire user base and only then applies the filtering condition (see the sketch after this list).
240 messages/minute and 5,000 messages/hour to a single device.
Upstream messages are limited to 15,000/minute per project (we don't understand this one).
Upstream messages per device are limited to 1,000/minute.
They also updated their docs at https://firebase.google.com/docs/cloud-messaging/concept-options#topics_throttling
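To make sure we read that second point correctly, this is our interpretation (a sketch with the Firebase Admin SDK for Java; the topic names and payload are made up): rather than one condition containing ||, we send a separate message per topic.

    import com.google.firebase.messaging.FirebaseMessaging;
    import com.google.firebase.messaging.Message;

    public class TopicSendExample {
        // Instead of one fan-out-heavy OR condition like
        //   "'breaking' in topics || 'sports' in topics"
        // send one message per topic (or per AND condition).
        public static void send(FirebaseMessaging fcm) throws Exception {
            String[] topics = { "breaking", "sports" };   // example topic names
            for (String topic : topics) {
                Message msg = Message.builder()
                        .setTopic(topic)
                        .putData("type", "news_update")    // example payload
                        .build();
                fcm.send(msg);
            }
        }
    }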
So we are aware of the message rate limits and the fan-out mechanism. In our case we have approximately 6,000 different topic send requests per hour, with on average 10k subscribers per topic.
A single user will never get more than 50-100 notifications per hour.
We believe we are not hitting the limits set by FCM.
Back in the GCM days everything worked fine, so we are quite unhappy about the current situation. The core functionality of the app is in really bad shape right now, and a solution does not seem to be in sight.
We are considering switching to an SSE (server-sent events) solution.
There is a story about someone who successfully moved away from FCM:
https://f-droid.org/en/2018/09/03/replacing-gcm-in-tutanota.html
But since Google has made it very difficult lately to have background processes running, I wonder what other people with similar experience did.
Or can we still fix this situation?
One such alternative is Cloud Alert: it can replace FCM, and it provides high throughput and unlimited messages. It uses a background job and maintains its own connection to its dedicated servers. While a free plan exists, your 1 million connection requirement would put you into the paid bracket.
Disclosure: I work for Cloud Alert.
As a beginner, I am trying to develop an Android app that is story based, and I would like to know the best way to serve content to the user; I mean continuous updates of content, just like news being updated by the hour. Since users install the app only once, how will they get the latest content of my news or story-based app?
I have access to domain names and hosting if uploading such content through a domain is required.
From your experience, what is the best method to achieve this? I humbly await a response. Thanks.
So, given the clarification in the comments, this is the answer:
The best way is PUSHING the content to the user's device.
Generally speaking, the two ways for new content to reach an app are:
1. Polling your server (or any third-party server) for new data every, say, 20 minutes. The disadvantage of this method is that it drains the battery. Every time the phone connects to the internet, the radio in the phone stays on (or in a standby mode) for something like 2 minutes, and those modes (on and standby) drain the battery. Another problem is that it uses data needlessly, and in some countries cellular data is expensive (Canada, for example).
This could be a solution if the data changes very frequently (for example, a stock price can change many times a day), but generally speaking method 2 is the preferred one.
2. Pushing the content to the user's phone.
Your server sends a message to the device as soon as there is new data you want to deliver (and you can also put that data in the message payload if it is not too large).
This means that the phone will connect only when some new data is available.
This saves battery life, and the app gets the information as soon as it is available!
I recommend using GCM (Google Cloud Messaging) for this purpose, which is free and simple to use. If you have no idea how to do that in Android (which is likely, since you said you are a beginner), it is explained really well in Udacity's Advanced Android App Development course. It is a free course by Udacity and Google, and the section about GCM is only about 15 minutes long.
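Just to give you a feel for the receiving side, here is a bare sketch against the GCM listener API (the payload key is made up, and the service still needs the usual registration code and manifest entries that the course walks you through):

    import android.os.Bundle;
    import com.google.android.gms.gcm.GcmListenerService;

    // Sketch only: the service must also be declared in AndroidManifest.xml,
    // and the app must register for a GCM token before it can receive anything.
    public class MyGcmListenerService extends GcmListenerService {
        @Override
        public void onMessageReceived(String from, Bundle data) {
            String message = data.getString("message");   // example payload key
            // Hand the new content to your storage/UI layer here.
        }
    }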
If you know how to implement a server but don't know how to use GCM on your server (and you don't find Google's documentation helpful), do let me know.
If you don't know how to implement a server... well, then it's something you will have to learn in order to get your content to your users, as that's the best way.
I hope this helps! :)
We are building an application that requires a good amount of data exchange between different users. We are using SQLite to store the data and a REST API to exchange data with the server.
To ensure high performance and less CPU/memory hogging while also maintaining a good user experience, we need suggestions on the following:
1. We tried running sync every 30 seconds, but it hogs resources. Is there any client-side framework that can be used to sync SQLite with MySQL, or do we have to plan for all possible events ourselves?
2. How do applications like Gmail/Twitter work: do they sync only on demand, or do they keep syncing in the background? I feel it is on demand, but I am not sure.
3. Should notifications be server side or client side (based on updates in SQLite)? In WhatsApp I observed it is client side only: if I do not tap a received message, I keep getting the notification about it.
4. If we keep notifications server side and sync on an on-demand basis, then when the app opens after tapping a new notification, should we make a sync call at that point?
We need an expert opinion on how such applications should be designed so that sync and notifications do not hog resources while still giving the customer an 'online' kind of experience.
Your question is pretty broad, but I'll at least give you a direction to start.
I've run local databases in iOS and Android that are over 100 MB without incident. SQLite should never be your problem, if you use it correctly. Even with 100,000 rows of data, it is fast and efficient. Where people get into trouble is by not properly indexing the data or over-normalizing the data. A quick search can tell you how to use indexes to optimize your queries, so I won't go into that any further. Over-normalization seems to be not fully understood, so I'll go into a bit more depth on it.
When designing a server database, it is important to minimize the amount of duplicate data. This is often done by breaking up a single record into multiple tables and using foreign keys. On a 1 GB database, this type of data normalization may save 20%, which is fairly significant. This gain in storage comes at the cost of performance: sequential lookups and joins are frequently necessary to get complete data. On a server, there are plenty of CPU cycles and memory, and no one really notices if a request takes an extra millisecond or two.
A mobile app is not a full database server. The user is actively staring at the screen waiting for the app to respond. Additionally, the CPU and memory available are minimal, which makes that delay take even longer. To add insult to injury, mobile databases are only a small fraction of the size of a server database, and duplicate data is already pretty minimal. The same data normalization that may have saved 200 MB (20% of 1 GB) of server storage may now only save 5% of 10 MB, or 500 KB. That minor gain is not worth the effort or the performance hit.
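As an invented illustration of the difference (table and column names are made up), a server schema might keep sender details in their own table and join on every read, while the mobile copy can simply carry the few fields the UI needs on each row:

    public final class SchemaExamples {
        // Server-style (normalized): sender details live in their own table,
        // so every list query needs a join to show a name next to a message.
        public static final String NORMALIZED =
                "CREATE TABLE messages (id INTEGER PRIMARY KEY, sender_id INTEGER, body TEXT)";

        // Mobile-style (denormalized): the couple of fields the list screen needs
        // are kept inline, so a single SELECT with no join fills the UI.
        public static final String DENORMALIZED =
                "CREATE TABLE messages (id INTEGER PRIMARY KEY, sender_name TEXT, body TEXT)";
    }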
When you sync, you do not need a full data set each time. You only need to get data that has changed since the last sync. In many cases, that will be no change at all. You should find a way to identify what the device has on it now and only get the changes.
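One common way to do that, sketched below with invented names and assuming the server accepts a "since" timestamp, is to remember the last successful sync and only ask for newer changes:

    import android.content.Context;
    import android.content.SharedPreferences;

    // Sketch: remember the last successful sync time and request only changes after it.
    public class DeltaSync {
        private static final String PREFS = "sync_prefs";          // invented names
        private static final String KEY_LAST_SYNC = "last_sync_ms";

        public static String buildSyncUrl(Context context) {
            SharedPreferences prefs = context.getSharedPreferences(PREFS, Context.MODE_PRIVATE);
            long lastSync = prefs.getLong(KEY_LAST_SYNC, 0L);       // 0 = never synced, full download
            return "https://example.com/api/changes?since=" + lastSync;
        }

        public static void markSynced(Context context, long serverTimeMs) {
            // Store the server's clock, not the device's, so client clock drift cannot skip rows.
            context.getSharedPreferences(PREFS, Context.MODE_PRIVATE)
                    .edit()
                    .putLong(KEY_LAST_SYNC, serverTimeMs)
                    .apply();
        }
    }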
It is important that the UI does not stall waiting for the network request. All network activity should be done on a background thread and notify the UI to refresh once the sync completes.
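A bare-bones sketch of that pattern (plain threads for illustration; use whatever async mechanism your stack already provides):

    import android.app.Activity;

    // Sketch: run the sync off the main thread, then hop back to update the UI.
    public class SyncRunner {
        public static void syncInBackground(final Activity activity, final Runnable refreshUi) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    // ... perform the network request and write results to SQLite here ...
                    activity.runOnUiThread(refreshUi);   // tell the UI to reload from the database
                }
            }).start();
        }
    }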
Lastly, I'll mention that SQLite is NOT thread safe. It is important to limit concurrency with your database access.
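A common way to limit that concurrency is to funnel all database access through one shared helper. A sketch, with invented class, database, and table names:

    import android.content.Context;
    import android.database.sqlite.SQLiteDatabase;
    import android.database.sqlite.SQLiteOpenHelper;

    // Sketch: one process-wide helper instance, so all threads share a single
    // SQLiteDatabase connection instead of opening competing ones.
    public class AppDb extends SQLiteOpenHelper {
        private static AppDb instance;

        public static synchronized AppDb get(Context context) {
            if (instance == null) {
                instance = new AppDb(context.getApplicationContext());
            }
            return instance;
        }

        private AppDb(Context context) {
            super(context, "app.db", null, 1);
        }

        @Override
        public void onCreate(SQLiteDatabase db) {
            db.execSQL("CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT)"); // example table
        }

        @Override
        public void onUpgrade(SQLiteDatabase db, int oldVersion, int newVersion) {
            // handle schema migrations here
        }
    }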
I know this is a somewhat abstract question and that an answer depends on the device, network, user preferences and so forth. Nevertheless I'm really in need of some educated opinion with regards to how "excessive" I can allow my polling to be. Let's say one of my typical polling requests consists of an empty request body (a simple GET) and a couple of hundred kilobytes in response, would a service that polls for this every 4 hours be in the category of excessive?
I'm in the unfortunate situation of not being able to use C2DM, so please no answers suggesting this.
Let's say one of my typical polling requests consists of an empty request body (a simple GET) and a couple of hundred kilobytes in response, would a service that polls for this every 4 hours be in the category of excessive?
I wouldn't think so. Every 4 minutes would be unpleasant from a battery standpoint. Ideally (IMHO), make the polling period configurable via a preference.
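Something along these lines, as a sketch only (the preference key and default interval are placeholders, and the alarm is deliberately inexact to be kinder to the battery):

    import android.app.AlarmManager;
    import android.app.PendingIntent;
    import android.content.BroadcastReceiver;
    import android.content.Context;
    import android.content.Intent;
    import android.preference.PreferenceManager;

    // Sketch: read the user's chosen polling period from a preference and schedule
    // an inexact repeating alarm, so the user, not the app, decides the cost.
    public class PollScheduler {
        public static void schedule(Context context) {
            long intervalMs = Long.parseLong(
                    PreferenceManager.getDefaultSharedPreferences(context)
                            .getString("poll_interval_ms",
                                    String.valueOf(AlarmManager.INTERVAL_HOUR * 4)));

            Intent intent = new Intent(context, PollReceiver.class);
            PendingIntent pi = PendingIntent.getBroadcast(context, 0, intent, 0);

            AlarmManager am = (AlarmManager) context.getSystemService(Context.ALARM_SERVICE);
            am.setInexactRepeating(AlarmManager.ELAPSED_REALTIME,
                    android.os.SystemClock.elapsedRealtime() + intervalMs,
                    intervalMs, pi);
        }

        public static class PollReceiver extends BroadcastReceiver {
            @Override
            public void onReceive(Context context, Intent intent) {
                // kick off the actual download on a background thread here
            }
        }
    }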
Also, you might wish to watch Reto Meier's Google I/O 2012 presentation, particularly the "Efficiency" section, which gets into lots of good low-level battery consumption stuff.
Also also, with that payload size, IMHO battery is less of an issue than the bandwidth cost for those on pay-as-you-go plans. You're talking ~1MB/day, ~30MB/month. That's probably not unreasonable, but it would be nice if the user understood why you're downloading all that data (vs., say, some sort of diff or delta approach), and it would be nice if the user could throttle your behavior within your app. Otherwise, the user might elect to throttle you from Settings on Android 4.x+ devices.
I'm faced with another dilemma, this time with regard to synchronizing (or updating?) data to the server from a mobile device (using Android).
I've looked into SyncML as the standard for doing this, but my big concern is that we plan on syncing a large amount of data across (not just one record), and probably only doing it once, twice, or at most three times a day, or maybe not even once a day, all dependent on certain circumstances.
The other thing - the device or server will still be able to function properly without having to sync across. The sync would just be an update, essentially.
From reading up on the SyncML specs, it applies more to syncing small pieces of data at a very fast interval (i.e. every 5-15 minutes, though I guess this can be regulated by the user). At any rate, the synchronization process is more involved, and important for both the device and the server (more so the device, I guess).
Here's a quote from the documentation that got me thinking:
2.2.3 Data synchronization

SyncML is oriented towards synchronization of small independent records, as the modified records are transmitted entirely. This is adequate for address entries, short messages and similar data. On the primary target of SyncML, mobile devices, most data is of this type. The devices must be able to keep track which of their records have been changed. Each record is identified by a unique ID, so conflicts can be detected quite simple. As the record ID's may not be arbitrarily chosen but automatically created, mapping between server and client ID's is defined in the protocol. Mapping is always managed by the server. When the client receives a new item from the server, he can send a map update command to tell the server what ID he assigned to the item. Now the server uses the client ID in all his messages.
So, I guess my question is whether we should continue to look into SyncML for this, or build an in-house solution, maybe something more tailored to delivering large pieces of data, which can define the data as well?
I'm facing the same problem, and I prefer the SyncML solution, mainly because it's more extensible. The data tables we want to sync are not fixed in advance, so SyncML may be a better choice.