How does a SIP based app scheme work?

How does a SIP based app scheme work? - android

This is an abstract question on how the SIP protocol works. Let us say I have a SIP server (Asterisk/Yate). And I have two Android devices that wish to connect to each other to have an audio call. ( I am looking for a purely VoIP call, no need for telephone numbers or carrier information).
How would this work? Do the packets have to pass through the server? or does the connection happen between the end-points. If the packets has to pass through the server, does the SIP server also provides profiles, or do profiles have to be created by a third party?
I need to understand how the scheme works in order to start planning building the system.
I have read lots of technical documentations, but none show an abstraction of the system. If you can provide me with resources, that would be great too.
Thanks

Since your device not know where each other located(ip/port), they call sip server or proxy.
Sip server match dialplan and send request changed(server) or unchanged(proxy) to other side.
In INVITE request each peer send address/port and info about media stream RTP
If that info unchanged(proxy) they can see each other rtp info and send rtp packets directly.
Also there is posible enother INVITE after call bridged,called re-INVITE with info about new stream for rtp(can be sound on other ip/port or video)
There are nothing called profiles in sip standart, sorry.
Anyway seams bad idea start planning voip system if you have limited REAL experience with sip server.
There are alot of articles(including wikipedia), videos on youtube with topic "how sip works", there are no way put all that here in one answer.

Related

Does WebRTC causes any load on the broadcaster site?

Imagine someone is broadcasting an audio or video world wide through WebRTC i.e one to many communication (app like periscope which i think is not done using WebRTC). Will it get affected by the broadcasters less bandwidth ? will it increase the load on broadcaster side causing loss of packets which will decrease quality of communication ? As this topic is new and very little content is available on net please suggest some good books and online tutorials.

WebRTC facilitates peer to peer communication. Going by this login a simple WebRTC broadcast application will heavily depend on broadcaster's bandwidth as the number of recipients will be same to the number of outbound media streams.
This is one of the main reasons why WebRTC Gateways or Media Servers or similar terms have been developed. In this case the broadcaster simply sends a single stream to the intermediary Gateway or Server and the other recipients then connect to the Gateway or Server and receive the stream from there.
To put it in simple terms, you basically add a central WebRTC client to which everyone connects.
You can read more at Janus, Kurento, Licode, etc.
Or find official RTCPeerConnection documentation here.

Connecting two Android devices using RTP without any server needed

I am developing a VoIP service where two devices can have an audio conversation.
I have been reading about signalling protocols like SIP. I understand the power of SIP but I prefer to have my own signalling protocol since my service will be very basic and will not require the full power of SIP.
As far as I understand, the main purpose of SIP and other protocols is to find the address of the remote party. This information I can get from my server, so no need for SIP here.
Establishing the connection from the caller should be relatively easy.
The problem I am facing however: How can I make the callee Android system listen for RTP packets specific to my app? I need this so I can fire up an intent to deal with answering the call etc.
A ServerSocket has been suggested, but I am not using TCP, and I am trying to make a client-to-client connection, not a server-client one.

SIP is not only used to find the IP address of remote party but also to agree on ports involved in media communication.
For 1-to-1 calls, when caller establishes connection it sends SIP INVITE to callee IP and SIP port. Caller knows that IP because it can query for it from the Registrar server for example.
The SIP INVITE sent from caller to callee has RTP/RTCP ports of caller stored in SDP payload of SIP INVITE message.
When receiving SIP INVITE the callee immediately responds to caller with 180 Ringing, the callee then may display an alert indicating that incoming call is received. Assuming that user accepts the call 200 OK is generated and sent to caller. This 200 OK message has SDP payload with RTP/RTCP ports that will be used by callee to listen for media packets. When 200 OK is received by caller both parties know the ports that will be used for the communication.
If I were you I wouldn't invent any protocols. I would rather try using 3rd party libraries such as pjsip, sofia-sip or some Android SIP stack.
If this is not the case, I would try to adapt the above message flow to your protocol. The problem is that the caller has to know in advance callee's IP and port where signalling message has to be sent. This may be difficult to achieve without any server involved (to query for callee port and IP).

How to communicate App with Server without internet connection?

I want to know if there is any way to communicate users without internet connection to a server.
I thought it might be possible through SMS and machine-readable encoding. However this question confirms that iOS apps allows sending but not reading SMS: iphone app reading sms
I've also read a lot about using USSD but it seems that mobile opened messages aren't possible in iOS (dial USSD code from iphone programatically) and while in Android is possible to call a code programatically there is no USSD API to, nor it's possible to send USSD messages silently.
Is there any way to transfer data between my app and my server with only basic voice-sms signal?

You are correct that SMS could be used to communicate with the server, if the server has access to a GSM modem. For large volumes of SMSes, you'd expect to have an internet connection between your server and an SMS gateway that connects directly to the messaging centre in the mobile network, instead of a modem.
This is the model of the SMS voting servers.
You can send/receive SMS as long as you have a GSM network, and you're right; no IP connection is needed.
The Android platform lets you send and receive SMSes - see here.
Disadvantages are that SMS can be expensive, and has no guaranteed delivery, and no guaranteed delivery time. It's not suitable for real time communication.
USSD is another form of communication between a mobile device and the network that's built directly into the GSM network, but USSD messages are owned or licenced by the network operators, and aren't free for customers to use, as SMS is.
EDIT: USSD isn't a native protocol in CDMA, but various implementations are available from different operators. For example, here's a patent application describing an idea for one such system, which does not appear to require an IP connection. Googling "USSD CDMA" also gives various news items about commercial implementations, technical details unknown. I think you just have to find out what your target operator(s) offer.
I haven't worked directly with WAP, but a glance at the WAP Protocol Stack shows that it can indeed run over CDMA, or GSM without an IP connection. There is also a very useful Wikipedia article. My experience testing MMS is that it usually doesn't work without an IP connection, even though it is supposed to (according to that Wiki article, with WAP/SMS). So I would question how far European operators or mobile devices are supporting or testing WAP. Whether WAP is a practical choice could come down to pricing/availability at the end of the day, rather than technical issues.

Well, there are yet other options, depending on the amount of data you need to transfer back and forth to the server.
Instead of sms, for short amount of data you could try to implement an Asterisk PBX, whith your mobile calling your server and then sending other DTMF digits as your data, which would be interpreted by your Asterisk PBX (like an interactive phone audio menu from your cable company). Asterisk is GPL open-source.
Another option (an expensive one and with heavy work to do), would be to generate an audio signal encoded with your data content, and dial a phone number linked with an attached fax/modem pci board on a server, sending that audio like a call. It would'nt need to be a long call, as you could fit lots of data in a short bursts of audio stream.
Your server could check that data by acessing the content on the receiving end. Simply recording the call from the attached fax/modem pci board, or you could use an Asterisk PBX server on a local computer to save the audio file and then process by your server software.
Anyway, you would need to create a new protocol and data encoding type, like you mentioned "machine-readable encoding".
So, for the data types you could just save a lot of short audio files on your mobile and play them as your datatypes, but it would be easier to just go for the DTMF already mentioned above. Or you could encode like this: get the sound spectrum allowed to use through the voice call (Wideband/Narrowband), and divide it by the amount of single characters or chunks needed (take a look at how to encode in base64, to have some ideas). Then create a function to just encode your data as a short audio stream (read about PCM encoding and also read more about Fast Fourier transforms, if you want to complicate (but speed up) even further.
Create a simple protocol like this: first audio packet is a sequence of tones that makes a request, authenticates, and waits for acknowledgement response from the server (which could be just by not dropping the call 1 second after that). The 2nd audio packet is the size of the first frame of data, then the 3d audio packet onwards is the data itself with the size shown before. And so on. Have a look at the ftp protocol description for simplicity. Then you need to refine it so that the time of every packet above is the least possible while mantaining confiability.
For costs saving on voice calling, you could also explore phone number options, like Google Voice, Skype or any Voip service.

initiating a rtsp connection over cellular with android

The majority of sim accounts are public dynamic. Most if not all cellular providers do not allow incoming connections to public dynamic ip addresses. (3g anyway, maybe not 4g/LTE)
The issue of connecting is not one of dynamic ips, but rather blocked incoming ports.
So, if I wanted to stream video from an android phone on demand (based on information gleaned from this conversation (Streaming video from Android camera to server)), what would be the chain of events to properly intitiate a connection.
My idea of this (roughly):
app on android phone initiates and keeps open some sort of connection to media server (wowza or something).
At some point when server wants video from phone, it uses the open connection to request a video stream.
Android phone pushes rtsp stream to server.
Is this correct, and if so, what type of connection should i use as the permanent control connection. Also, is it possible to push rtsp or would i have to do something else?
Thanks!

I know this is an old question but if anybody else is searching for something similar the following is now available:
http://developer.android.com/guide/google/gcm/index.html
This essentially allows a message to be sent from a server to an app on an Android device (it replaces C2DM which did a similar thing).
Update
Google GCM has now being replaced in turn by Google Firebase Cloud Messaging:
https://firebase.google.com/docs/cloud-messaging/
Using a could based app messaging service like this, the steps would be:
Add a message subscription service to your app (e.g. Firebase)
The App registers with the cloud messaging service when it starts up
When the server wants video from the phone (as noted in the questions above) the server sends a message to the app
The app opens a connections to the streaming server and starts to stream video to the server.
Note: there is a comment below about how this approach does not allow an incoming connection from the server to the Android phone.
This, in fact, is not how streaming from a phone typically works. The phone actually makes an 'outgoing' connection to a streaming server which it then streams the video to. Other devices wanting to see the video then stream it form here.
There are several reasons why this is the preferred approach, one of the key ones being that supporting a quality streaming service that will play back on most common devices, browsers, OS's etc requires transcoding the video into multiple bit rates, and even encodings in some cases, and packaging and serving in the appropriate streaming packaging format. Doing all this on the mobile device would be very compute and storage intensive.

SIP vs direct TCP sockets

I am implementing a real time - multi user voice transfer application for Android.
I have read that, as a standard - RTP packets are encapsulated into SIP and then sent to destination(s). What is the advantage of doing so?
my idea was to use a server, just to receive control messages from nodes and open sockets. All these nodes would be in 1 group. THen, I send out the IP addresses of each of these nodes, so that a single sender can multicast its packets directly to the destination.
is there a fatal flaw here? ( iam not concerned about the power consumption)
How does SIP do better? or does it ?
Thanks

RTP isn't "encapsulated into SIP packets". SIP is a signalling protocol. RTP is a media stream protocol. SIP is used to negotiate and set up (and tear down) media streams.
TCP is a horrible choice for media (RTP) packets; it's not clear from your writing if you're suggesting that.
Multicast is unlikely to work for many network paths/recipients.
routers play merry hell with incoming data; you'll need more than just SIP to deal with users on the open net. See STUN, TURN, ICE, UPnP, etc.

SIP or Session Initiation Protocol is a protocol that was designed specifically to address the problem you're trying to solve. Generally, the reason you should reuse (rather than reinvent the wheel) is because other people have studied the same problem and presumably have come up with a better solution as a collective group than you could as an individual. Of course that's not always true, but generally speaking it holds!
If you want to learn about SIP you could study the RFC 3261 specification, or start with the Wikipedia entry if you want a quick overview.
That being said, if you don't need the overhead of a complete and carefully tested protocol you could roll your own but make sure that when you're making that decision you know what you're foregoing and have a good reason to do so.
SIP is a signaling protocol that usually runs over TCP (although not required) and if you look at it closely you will see it is very similar to HTTP in many respects. Just like HTTP it can transport a great deal of payloads and it does so with text headers, much like HTTP can be used to transport HTML, XML, plain text or any arbitrary binary payload.

In the most simple system, you could just use Voice packets over RTP over UDP.
But you'd have no way to turn audio off, and would have to know IP addresses, port numbers, they type of codec and its characteristics beforehand.
In an overly simple view, SIP is a way to:
1. find the ip address of another endpoint from a URL. (May need STUN, TURN, ICE, etc)
2. agree on which codec to use and its options
There's a lot more to SIP than that, you may want to investigate SIP's conferencing features based on what you wrote.
You may write your own signaling protocol, and if this is for a school project, that will work just fine.
But if you are doing a commercial project, bear in mind that there is a lot more to telephony than meets the eye. The original SIP spec was greatly revised and is now a cluster of RFCs which are still being modified and added to. I recommend that you take advantage of this work rather than possibly reinventing the mistakes others have made.

Develop Reference

The Android operating system is a mobile operating system that was developed by Google (GOOGL?) to be primarily used for touchscreen devices, cell phones, and tablets.