Let's say I have an app that has tens of millions of installs and tens of thousands of active users at any given time. I need to log my users' activity data to my servers. Currently, I make HTTP requests from the device to my servers. I have a bunch of machines running a web server, sitting behind Amazon's ELB. They parse the data coming from the devices and put it in MongoDB.
Now, I would like to capture device data using the upstream CCS provided by Google's GCM (so that I can piggyback on GCM for more reliable delivery of data). I have written a prototype XMPP server and I can make the whole thing work, but I am worried about scaling it up. What will happen if Google starts sending me messages faster than I can consume them? Earlier, I was able to use multiple servers behind a load balancer to handle a high request rate. Is there a concept of load balancing here?
If I open multiple connections from my server to Google's server (Google says I can have up to 1000 connections for a given sender ID), will the incoming requests be load balanced between these connections?
Finally, is there a recommended solution that takes care of most of the problems above? Would using ejabberd solve some of them?
Thanks a bunch.
What will happen if Google starts sending me messages at a rate faster than I can consume?
At the end of https://developers.google.com/cloud-messaging/ccs you can read:
Conversely, to avoid overloading the app server, CCS stops sending if there are too many unacknowledged messages. Therefore, the app server should "ACK" upstream messages, received from the client application via CCS, as soon as possible to maintain a constant flow of incoming messages. The aforementioned pending message limit doesn't apply to these ACKs. Even if the pending message count reaches 100, the app server should continue sending ACKs for messages received from CCS to avoid blocking delivery of new upstream messages.
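As a rough illustration of "ACK as soon as possible", here is a minimal Python sketch that acknowledges an upstream message before doing any slow work (such as the database write). The JSON field names follow the CCS upstream and ACK message formats as I understand them; `send_stanza` and `work_queue` are hypothetical stand-ins for whatever your XMPP library and processing pipeline provide, and wrapping the JSON in the XMPP message stanza is left to `send_stanza`.

```python
import json


def build_ack(upstream: dict) -> str:
    """Build the JSON body of an ACK for one upstream CCS message."""
    return json.dumps({
        "to": upstream["from"],                # registration ID of the sending device
        "message_id": upstream["message_id"],  # echo the ID that CCS gave us
        "message_type": "ack",
    })


def handle_upstream(upstream: dict, send_stanza, work_queue: list) -> None:
    """ACK first so CCS keeps delivering, then defer the slow processing.

    send_stanza: hypothetical callable that wraps the JSON in an XMPP
                 stanza and writes it to the CCS connection.
    work_queue:  hypothetical buffer drained by separate worker code.
    """
    send_stanza(build_ack(upstream))
    work_queue.append(upstream)
```

The point of the ordering is that the number of unacknowledged messages stays low even when the database is slow, so CCS has no reason to pause delivery.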
In the same document, you will find a partial answer to your second and third questions:
If at any point the connection fails, you should immediately reconnect. There is no need to back off after a disconnect that happens after authentication.
To me this means that Google implemented simple redundancy logic and probably not a fair load-balancing system (or so I hope). With volumes that high, I suggest you contact them directly.
As for the last questions, ejabberd is a good product; there are a lot of deployed systems with a clustered infrastructure and plenty of documents on how to do that. I suggest you start from here: http://docs.ejabberd.im/admin/guide/clustering/ .
Anyway, for your high volumes I would evaluate RabbitMQ, which is another Erlang jewel.
ejabberd can be clustered and placed behind a load balancer to distribute connections. A three- or four-server cluster should be able to handle that load fine and give you failover protection. You can add servers if needed. Once you get close to 10 servers you may want to consider using Redis for the in-memory DB rather than Mnesia.
Related
To clarify, this question is not a duplicate, since the
situation differs from other related questions.
We are working on a client-side application which will receive data from a server-side PHP-powered web application. The data is critical and must be delivered to the user as soon as possible. It doesn't matter whether the client requests data from the server or the server pushes data to the client; the only thing we need is a reliable and fast option.
There are several methods, but none of them fits our project:
Use GCM push notification ability:
This is a great option, but in practice we lost several pushes, so it's not reliable, and on the other hand the delay is too long. I repeat, the situation is critical, so it must be fast.
Request data from server by the client with a 1 or 2 second interval:
This is what we think is the best solution so far, but it is really expensive. It's reliable and fast. On the other hand, the pressure on our distributed servers gets extremely high and they become useless even with our current number of clients. If the number of clients gets larger, we'll be down.
SMS based push:
The other option for us is to send SMS messages to the clients' phones and use that data to operate the application. Using this method, the pressure on our servers will be really low (just like with the GCM option). But SMS delivery on our country's mobile network is usually delayed, normally by 10 seconds. Although this option has good reliability, the speed is so low that we can't use it.
FM radio signal based push:
We can use the clients' FM radio receivers to get data from local broadcasting stations. This method is reliable and very fast, but the cost of the stations will kill us! And even if we could handle it (read: we can't), clients do not always have their earphones connected to their smartphones.
So, what are the alternatives? What is a reliable and reasonably fast method which does not put a lot of pressure on our servers?
I would probably recommend using WebSockets for the case you describe (using the OkHttp library, for example) - see the following for a nice overview of its use: https://medium.com/#ssaurel/learn-to-use-websockets-on-android-with-okhttp-ba5f00aea988. A common pattern is to combine WebSockets with HTTP REST requests (for an initial catch-up query, for example). Also, you would typically only use WebSockets while the app is in the foreground and rely on push notifications otherwise.
I know that this is the case, but I don't understand why.
Why not simply send queries to the server periodically? Sure, it may drain the battery and increase internet traffic. I understand that. But how can using Google Cloud Messaging eliminate these problems?
I have found an answer, but it isn't very clear to me.
Can anyone give me a clear explanation?
Let's say you have 50 applications on your phone that do not use GCM. Each app developer decides it is appropriate to poll their respective backend once a minute.
Since these are all separate applications, each call will likely not happen at the same time as another API call. The biggest drain on the battery is when the radio within an Android device has to turn back on after being off to make an API call, so multiple calls happening with blocks of time in between drain the battery faster (read this article on the radio state machine to better understand why: https://developer.android.com/training/efficient-downloads/efficient-network-access.html).
In addition, each application will be hitting a separate endpoint. Each time you make an API call, you have to go through the connect process for a given server. With batched API requests or HTTP 2.0, multiple calls going to the same server can be optimized by not having to redo a handshake or the connect process.
Now imagine all 50 applications used GCM. GCM will poll an endpoint at some regular time interval on behalf of all 50 apps. Let's say GCM polls, once a minute, a server to which all the respective apps' backends send the notifications destined for a device. You have reduced 50 different oddly timed API calls, which are likely turning the radio on and off, to one API call. You will use less data for polling. You do not incur the cost of the connect step of an HTTP call to 50 different servers. In addition, Google is using the same polling already in place for checking for OS updates, so there is no additional network overhead from using GCM (this info is based on old docs; see "What technology does GCM (Google Cloud Messaging) use?").
Also, see this explanation straight from the Android website in an article entitled "Minimizing the Effects of Regular Updates" (http://developer.android.com/training/efficient-downloads/regular_updates.html):
Every time your app polls your server to check if an update is required, you activate the wireless radio, drawing power unnecessarily, for up to 20 seconds on a typical 3G connection.
Google Cloud Messaging for Android (GCM) is a lightweight mechanism used to transmit data from a server to a particular app instance. Using GCM, your server can notify your app running on a particular device that there is new data available for it.
Compared to polling, where your app must regularly ping the server to query for new data, this event-driven model allows your app to create a new connection only when it knows there is data to download.
The result is a reduction in unnecessary connections, and a reduced latency for updated data within your application.
GCM is implemented using a persistent TCP/IP connection. While it's possible to implement your own push service, it's best practice to use GCM. This minimizes the number of persistent connections and allows the platform to optimize bandwidth and minimize the associated impact on battery life.
I am an Android user and of course I use WhatsApp, Twitter for Android, Facebook and many other apps that notify me of events.
As a programmer, what keeps me wondering is how fast notifications or WhatsApp messages arrive.
My intuition tells me that it is not possible for the WhatsApp or Twitter server to open a TCP connection to my cellphone on a given port to deliver a new message. If I am on Wi-Fi, the router would block that connection.
And if my WhatsApp client were polling the server every second... Poor server, if it has 1000 clients making a request every second.
What is the approach to this issue?
Is there some other protocol involved?
Those apps use services that utilize "long polling" - primarily based on XMPP or some variation of XMPP (like Jabber - http://www.jabber.org/). The client does not poll often. A quote from the Wiki page:
The original and "native" transport protocol for XMPP is Transmission
Control Protocol (TCP), using open-ended XML streams over long-lived
TCP connections.
It sends a message to the server that basically is a mechanism for the server to send a message back at any time (as long as the client is available). It's like sending a request to an HTTP server and the server "time-out" does not occur for a very long time (hours), so the client just waits. If the server receives a message destined for the client, it sends a "response" to that request. After the time out does occur, the client sends another request and waits.
GCM does the same thing - but does not require you to setup servers for all portions of the connection. It's easy to search for GCM, AWS, etc. to see examples.
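The wait-then-re-request cycle described above can be simulated in a few lines of Python. This is only an in-process sketch: a `queue.Queue` stands in for the server-side mailbox, and the function names and timeout values are made up for illustration. A real client would hold an HTTP or XMPP connection open instead.

```python
import queue


def wait_for_message(mailbox, timeout_s):
    """Hold one 'request' open until a message arrives or the timeout expires."""
    try:
        return mailbox.get(timeout=timeout_s)
    except queue.Empty:
        return None  # timed out: the client would now issue a new request


def client_loop(mailbox, timeout_s, max_requests):
    """Issue up to max_requests long-poll requests and collect what arrives."""
    received = []
    for _ in range(max_requests):
        message = wait_for_message(mailbox, timeout_s)
        if message is not None:
            received.append(message)
    return received
```

Note that the client only does work when a message arrives or when a (long) timeout expires, which is why the server is not hit every second.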
Typically GCM should be used if you don't need to guarantee immediate delivery and it is okay for your app to miss out on certain messages.
This is because GCM tries to optimize by bundling several messages (even from other apps) into a single package. And it has a limited buffer to maintain the messages per device (in case the device is not reachable).
Here is just one way to do the job.
I'm developing a multiplayer Android game with push notifications by using Google GCM.
My web server has a REST API. Most of the requests sent to this API send a request to Google GCM server to send a notification to the opponent.
The thing is that, on average, a call to my API takes ~140 ms, and ~100 ms of that is due to the HTTP request sent to the Google server.
What can I do to speed this up? I was thinking (I have full control of my server; my stack is Bottle/gunicorn/nginx) of creating an independent process with a database that would work through a queue of GCM requests, but maybe there's a much simpler way to do that directly in Bottle or in pure Python.
The problem is that your clients are waiting for your server to send the GCM push notifications. There is no logic to this behavior.
You need to change your server-side code to process your API requests, close the connection to your client, and only then send the push notifications.
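One way to get that behavior, sketched here with only the Python standard library, is to hand the notification to a background worker and return to the client right away. `PushDispatcher`, `handle_move` and the `send` callable are hypothetical names used for illustration; `send` stands in for the slow HTTP call to GCM. With several gunicorn workers each process would get its own thread, so a separate task queue such as Celery may be a better fit in production.

```python
import queue
import threading


class PushDispatcher:
    """Sends notifications from a background thread so API calls return fast."""

    def __init__(self, send):
        self._send = send  # hypothetical callable doing the slow GCM request
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def enqueue(self, notification):
        """Called from the request handler; returns immediately."""
        self._queue.put(notification)

    def drain(self):
        """Block until everything queued so far has been processed."""
        self._queue.join()

    def _run(self):
        while True:
            notification = self._queue.get()
            try:
                self._send(notification)
            except Exception:
                pass  # a real worker would log the error and retry
            finally:
                self._queue.task_done()


def handle_move(dispatcher, opponent_id, payload):
    """Hypothetical API handler: queue the push, then answer the client."""
    dispatcher.enqueue({"to": opponent_id, "data": payload})
    return {"status": "ok"}
```

The handler no longer waits for the ~100 ms round trip to Google; it only pays the cost of putting an item on an in-memory queue.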
The best thing you can do is make all networking asynchronous, if you haven't done so already.
The issue is that there will always be users with a slow internet connection and there isn't a generic approach to bring them fast internet :/.
Other than that, ideas are to
send only a few packets, or one large packet, rather than many small packets (that's faster)
use UDP over TCP, UDP being connectionless and naturally faster
I've solved my problem thanks to this thread:
I'm using Celery to send my notifications through a task queue.
I can't believe how simple it is!
Thanks anyway :)
I have a server with an SQL database.
I also have about 100k users on an Android application.
What I need now is to send notifications immediately from the server to all devices.
I'm researching the GCM system, but as far as I can see there's a huge delay on the receiving side.
What I need is that, when I click the send button on my server, every device receives the message within a few seconds.
Is the delay only happening when using the HTTP connection?
Is it going to be different with the XMPP connection?
You are trying to broadcast a message to nearly 100k users, and currently XMPP downstream messaging does not support broadcasting. Use the HTTP server to send the message to 1000 devices at a time. This can be improved by using multi cURL; see https://github.com/mseshachalam/GCMMessage-MultiCURL
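To make "1000 devices at a time" concrete, here is a small Python sketch that splits a list of registration IDs into batches and builds one request body per batch. The `registration_ids` and `data` fields follow the GCM HTTP JSON format as I understand it; the function names are made up for illustration, and actually posting the bodies (in parallel, as the multi cURL project does) is left out.

```python
import json


def batches(registration_ids, size=1000):
    """Yield consecutive slices of at most `size` registration IDs."""
    for start in range(0, len(registration_ids), size):
        yield registration_ids[start:start + size]


def build_payloads(registration_ids, data, size=1000):
    """Return one JSON request body per batch of recipients."""
    return [
        json.dumps({"registration_ids": batch, "data": data})
        for batch in batches(registration_ids, size)
    ]
```

For 100k users this yields 100 request bodies, which can then be sent concurrently rather than one after another.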
In general, GCM is the right choice for massive broadcasting.
On the other hand, the messages are not guaranteed to be delivered immediately; the delay might be up to 25(!) minutes, even if all devices have your app up and running.
See "Google Cloud Messaging - messages either received instantly or with long delay" for an explanation of why.