I want to be able to send an audio stream to Android/iOS devices.
The current encoding for the stream is MP3 at 128 kbps. If I send this over the network, it will consume a huge amount of mobile data.
I was thinking of compressing the data with gzip, but I think that would make no difference, since MP3 is already a compressed format.
Is there any way to reduce the size of the stream and play it on the mobile device?
Thanks,
Dan
First off, your math is ignoring a key unit. Your MP3 stream is 128 kilobits (note: bits, not bytes) per second. That comes out to a little under 60 megabytes per hour once you factor in a little bit of overhead and metadata.
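As a back-of-the-envelope check on that figure, the arithmetic works out like this (a sketch assuming a constant-bitrate payload and ignoring container overhead; the class and method names are my own):

```java
public class StreamSize {
    // Megabytes consumed per hour by a constant-bitrate audio stream.
    static double megabytesPerHour(int kilobitsPerSecond) {
        double bytesPerSecond = kilobitsPerSecond * 1000.0 / 8.0;
        return bytesPerSecond * 3600.0 / 1_000_000.0;
    }

    public static void main(String[] args) {
        // 128 kbit/s -> 57.6 MB/hour before overhead and metadata
        System.out.println(megabytesPerHour(128));
        // Dropping to 64 kbit/s halves that to 28.8 MB/hour
        System.out.println(megabytesPerHour(64));
    }
}
```

Add overhead and metadata on top of the 57.6 MB and you land at the "a little under 60 MB per hour" figure.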
Now, as Mark said you can use a different bitrate and/or codec. For most mobile streams, I choose either a 64kbit or 96kbit stream, and then either MP3 or AAC depending on compatibility. AAC does compress a bit better, providing a better sounding stream at those low bitrates, but you will still need an MP3 stream for some devices.
Also note that you should not assume your users are using the mobile network on their mobile devices. Give your users a choice of which stream to use. Some have unlimited data and great coverage. Others use WiFi all the time.
All you can do is re-compress to a lower bit rate and/or use a different compression method, e.g. AAC. AAC should sound better at the same bit rate.
I've got a rather complicated problem that I need to solve at work. It's pretty far out of my remit of "Android App Developer" - I would class it as a very specialized audio engineering problem.
I am tasked with developing an application, which needs to be able to stream either a local audio file or audio from streaming service apps such as, but not limited to, Spotify, to another device over Bluetooth.
In addition, the app needs to be able to estimate the BPM of the streamed audio (it is assumed all audio will be musical) and use this BPM value to control the playback speed of a lighting sequence.
This question is about how to estimate the BPM of the streamed music.
For the case where the audio file is local, I can think of some solutions for this, such as hardcoding the BPM into the app, in a map against the audio resources URL.
I have also investigated and experimented with a "static" library (aubio) that can estimate BPM from an audio file, but not on the fly; it assumes .wav format. This won't be sufficient for what we are trying to achieve here.
However, given the requirement to stream external audio from services such as Spotify, a static analysis solution is pointless: it wouldn't work for the streaming service case, whereas a solution that handles the streaming service case would work for both.
Therefore, I have come to the conclusion that somehow, I need to on the fly analyze the streamed audio, perhaps with FFT or peak detection algorithms.
This question isn't about the actual BPM estimation algorithm itself (or the implementation details of how I would get there) and is about the basic starting point of such a solution:
How might I go about getting A) the raw bytes of streamed audio for both the local file case and the external streaming service app case and B) how might I process these bytes into a data structure representing the audio stream in a way amenable to running audio analysis algorithms on it.
I realize this is a very open-ended, quite vague question, but this is so far out of my comfort zone that I've no idea how to formulate a more coherent one.
Any help would be greatly appreciated!
I'd start by creating some separate, more tightly defined questions for the different pieces. For example, ask how to get access to the raw bytes when streaming local file, or streaming URL-sourced audio. Android has some nice support for streaming, including the ability to stream PCM, so I'd be pretty surprised if getting a hook for access to the byte stream were not possible.
Once you have a hooking point, to convert the bytes to "something useful" I'd look at using the audio format to tell you how to read the incoming bytes. The format should tell you how many channels (mono or stereo), the encoding (e.g., signed PCM is common, might be normalized floats), the number of bits per value (16 is common) and the order of the bytes (big-endian vs little endian).
I know that there are posts that will explain how to convert the raw audio bytes to PCM values based on this info, including some on stackoverflow. They should be reachable via search. I think signed normalized floats is the most common data representation used for processing audio signals.
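That conversion can be sketched for the common case of 16-bit signed little-endian PCM (the class and method names here are my own, and the format assumption must be checked against what your stream actually reports):

```java
public class PcmDecoder {
    // Convert interleaved 16-bit signed little-endian PCM bytes to
    // floats normalized to [-1.0, 1.0). Assumes the stream's reported
    // format is 16-bit LE signed PCM; adjust for other encodings.
    static float[] toFloats(byte[] raw) {
        float[] samples = new float[raw.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int lo = raw[2 * i] & 0xFF;  // low byte first (little-endian)
            int hi = raw[2 * i + 1];     // high byte carries the sign
            samples[i] = (short) ((hi << 8) | lo) / 32768f;
        }
        return samples;
    }
}
```

The resulting float array (per channel, if you de-interleave stereo) is the kind of structure that FFT and peak-detection code typically expects.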
How do streaming apps like YouTube, Hotstar, or any other video player app programmatically detect that the network is getting slow at run-time, and change video quality based on changes in network speed?
Many streaming services nowadays use HTTP-based streaming protocols. But there are exceptions, especially with low-latency streaming, e.g. WebRTC- or WebSocket-based solutions.
Assuming you're using an HTTP-based protocol like HLS or MPEG-DASH, the "stream" is a long chain of video segments that are downloaded one after another. A video segment is a file in "TS" or "MP4" format (in some MP4 cases, video and audio are split into separate files); typically a segment contains 2, 6, or 10 seconds of audio and/or video.
Based on the playlist or manifest (or sometimes simply from decoding the segment), the player knows how many seconds a single segment contains. It also knows how long it took to download that segment. You can measure the available bandwidth by dividing the (average) size of a video segment file by the (average) time it took to download.
The moment it takes more time to download a segment than to play it, you know the player will stall as soon as the buffer is empty; stalling is generally referred to as "buffering". Adaptive Bitrate (aka ABR) is a technique that tries to prevent buffering; see https://en.wikipedia.org/wiki/Adaptive_bitrate_streaming (or search for the term). When the player notices that the available bandwidth is lower than the bit rate of the video stream, it can switch to another version of the same stream that has a lower bit rate (typically achieved by higher compression and/or lower resolution, which means less quality, but that's better than buffering).
PS #1: WebRTC and Websocket-based streaming solutions cannot use this measuring trick and must implement other solutions
PS #2: New/upcoming variants of HLS (eg. LL-HLS and LHLS) and MPEG-DASH use other HTTP technologies (like chunked-transfer or HTTP PUSH) to achieve lower latency - these typically do not work well with the mentioned measuring technique and use different techniques which I consider outside scope here.
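The segment-based measurement described above can be sketched roughly like this (hypothetical helper; the 0.8 safety margin is an illustrative choice, not a standard value, and the rendition list is assumed sorted ascending and non-empty):

```java
import java.util.List;

public class AbrPicker {
    // Estimate throughput from one downloaded segment:
    // bits transferred divided by download time.
    static double measuredKbps(long segmentBytes, double downloadSeconds) {
        return segmentBytes * 8.0 / downloadSeconds / 1000.0;
    }

    // Pick the highest rendition whose bitrate fits under the measured
    // bandwidth with a safety margin; fall back to the lowest rendition.
    static int pickBitrateKbps(List<Integer> renditionsKbps, double measuredKbps) {
        int best = renditionsKbps.get(0); // lowest rendition as fallback
        for (int kbps : renditionsKbps) {
            if (kbps <= measuredKbps * 0.8 && kbps > best) best = kbps;
        }
        return best;
    }
}
```

Real players average the measurement over several segments to avoid switching renditions on every momentary blip.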
You have to use a streaming server to do that. Wowza is one of them (not free). The client and server exchange information about the connection and distribute chunks of the video, depending on the network speed.
I used this code to capture audio in Android Studio, but the resulting audio is big (1 min = 1 MB).
How can I compress the audio without quality loss?
AudioRecorder.setAudioSource(MediaRecorder.AudioSource.MIC);
AudioRecorder.setOutputFormat(MediaRecorder.OutputFormat.AAC_ADTS);
AudioRecorder.setAudioEncoder(MediaRecorder.OutputFormat.AMR_NB);
AudioRecorder.setAudioChannels(1);
AudioRecorder.setAudioEncodingBitRate(44100);
AudioRecorder.setAudioSamplingRate(128000);
You're setting the encoder to AMR_NB. About 1 MB per minute sounds about right for those settings. If you want to compress it further you can choose another encoding, such as Vorbis.
Please note that the different encoders have different purposes. None of them will compress without quality loss: we're converting analog sound to digital values, and there is always quality loss with that. Different encoders are optimized for different uses. AMR is optimized for human voice; it filters out frequencies outside the vocal range. That's a quality loss, but it may be one you want (as in a call). You can't get everything: you're going to have to sacrifice size or quality. I suggest you study the different encodings at https://developer.android.com/reference/android/media/MediaRecorder.AudioEncoder.html and figure out what's best for you.
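For comparison, here is a sketch of an AAC configuration (the values are illustrative starting points, not a definitive recipe). Note that setAudioSamplingRate() takes samples per second and setAudioEncodingBitRate() takes bits per second; the code in the question passes them the other way around:

```java
import android.media.MediaRecorder;
import java.io.IOException;

class VoiceCapture {
    // Sketch: AAC in an MP4 container instead of AMR_NB, with the
    // bit rate and sample rate in the right calls. Values are
    // illustrative; tune them for your size/quality trade-off.
    static MediaRecorder startAacRecorder(String outputPath) throws IOException {
        MediaRecorder recorder = new MediaRecorder();
        recorder.setAudioSource(MediaRecorder.AudioSource.MIC);
        recorder.setOutputFormat(MediaRecorder.OutputFormat.MPEG_4);
        recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AAC);
        recorder.setAudioChannels(1);
        recorder.setAudioSamplingRate(44100);    // Hz (samples per second)
        recorder.setAudioEncodingBitRate(96000); // bits per second
        recorder.setOutputFile(outputPath);
        recorder.prepare();
        recorder.start();
        return recorder;
    }
}
```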
I am developing an Android application that needs to send short (<60 second) voice messages to a server.
File size is very important because we don't want to eat up data plans. Sound quality is important to the point the message needs to be recognizable, but it should require significantly less bandwidth/quality than music files.
Which of the standard Android audio encoders (http://developer.android.com/reference/android/media/MediaRecorder.AudioEncoder.html) and file formats (http://developer.android.com/reference/android/media/MediaRecorder.OutputFormat.html) are likely to be best for this application?
Any hints on good starting places for bit rates, etc. would be welcome as well.
We need to ultimately be able to play them on Windows and iOS, but it's okay if that takes some back-end conversion. There doesn't seem to be an efficient cross-platform format/encoder so that's where we'll put in the work.
AMR is aimed precisely at speech compression, and is the codec most commonly used for normal circuit-switched voice calls. The narrow-band variant (AMR-NB, 8 kHz sample rate) is still the most widely used and should be supported on pretty much any mobile phone you can find. The wide-band variant (AMR-WB, 16 kHz sample rate) offers better quality and is preferred if the target device supports it and you can spare the bandwidth.
Typical bitrates for AMR range from around 6 to 14 kbit/s.
I'm not sure if there are any media players for Windows that handle .3GP files with AMR audio directly (VLC might). There are converters that can be used, though.
HE-AAC (v1) could also be used for speech encoding, however this page suggests that encoding support on Android is limited to Android 4.1 and above. Suitable rates might be 16 kHz / 64 kbps.
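To put those bitrates in perspective for a full 60-second message, the payload sizes are simple arithmetic (ignoring container overhead; the helper name is my own):

```java
public class MessageSize {
    // Approximate payload size of a constant-bitrate voice clip,
    // ignoring container overhead.
    static double kilobytes(double kbps, double seconds) {
        return kbps * 1000.0 / 8.0 * seconds / 1000.0;
    }

    public static void main(String[] args) {
        // AMR-NB at 12.2 kbit/s: ~91.5 KB for a 60-second message
        System.out.println(kilobytes(12.2, 60));
        // HE-AAC at 64 kbit/s: ~480 KB for the same message
        System.out.println(kilobytes(64, 60));
    }
}
```

Either way, a short voice message stays well under a megabyte, which should be gentle on data plans.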
I'm doing some experiments with video streaming from the front camera of an Android device to a local server. Currently I plan to use WiFi; I may move to Bluetooth 4.0 in the future.
I'm looking for insights, experience and DOs and DON'Ts and other ideas that I should consider in relation to protocol options (TCP, UDP, ...? ) and video codec. The image quality should be good enough to run computer vision algorithms such as face and object detection, recognition and tracking on the server side. The biggest concern is power. I want to make sure that the streaming is as power efficient as possible. I understand more power efficiency means a lower frame rate.
Also, I need a way to send the video frames without displaying them on the screen.
Thanks.
You didn't mention whether you will be doing encoding or decoding on the device.
Some tips:
UDP will generally be less power-hungry, especially under deteriorating network conditions:
See http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.134.5517&rep=rep1&type=pdf
You can find more papers on this via search.
In terms of codecs, the general ordering is H.264 > MPEG-4 > H.263 for the power needed for both encoding and decoding.
The higher the bitrate, the more power is needed for decoding, but the codec choice makes a bigger difference than the bitrate. I say this because to get the same quality as an H.264 stream with H.263 you need a higher bitrate, yet H.263 at that higher bitrate should still consume less power than H.264 at the lower one. So don't compare bitrates across codecs; within the codec you've chosen, use the lowest bitrate/framerate you can.
In encoding, though, very low bitrates can make the encoder work harder and so increase power consumption. Encoding bitrates should be low, but not so low that the encoder is stretched. This means choosing a reasonable bitrate that does not produce a continuously blocky stream but gives decent output.
Within each codec if you can control the encoding then you can also control the decoding power. The following applies to both:
E.g., deblocking and B-pictures add to power requirements. Keeping to lower profiles [Baseline for H.264, Simple Profile for MPEG-4, Baseline for H.263] results in lower power requirements in both encoding and decoding. In MPEG-4, switch off 4MV support if you can; it makes streams even simpler to decode. Remember each of these also has a quality impact, so you have to find what is acceptable quality.
Also, unless you can actually measure the power consumption, I'm not sure you need very fine tweaking of the toolsets. Just sticking to the lower profiles should suffice.
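On Android, sticking to a lower profile can be expressed via MediaFormat when configuring a MediaCodec encoder (a sketch; KEY_PROFILE is honored from API 21 and not by every encoder, and the numeric values are illustrative starting points):

```java
import android.media.MediaCodecInfo;
import android.media.MediaFormat;

class LowPowerEncoderConfig {
    // Sketch: configure an H.264 encoder for Baseline profile at a
    // modest bitrate and framerate to keep power consumption down.
    static MediaFormat baselineH264(int width, int height) {
        MediaFormat format =
            MediaFormat.createVideoFormat(MediaFormat.MIMETYPE_VIDEO_AVC, width, height);
        format.setInteger(MediaFormat.KEY_BIT_RATE, 500_000);   // 500 kbit/s
        format.setInteger(MediaFormat.KEY_FRAME_RATE, 15);      // lower fps, lower power
        format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 2); // seconds between keyframes
        // Baseline profile: no B-pictures, simpler encode/decode paths.
        format.setInteger(MediaFormat.KEY_PROFILE,
            MediaCodecInfo.CodecProfileLevel.AVCProfileBaseline);
        return format;
    }
}
```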
The worse the video quality during capture, the more power is needed during encoding. So brightly lit video needs less effort to encode; low-light video needs more power.
There is no need to send video to a screen. You receive video over a socket and do whatever you want with that data; that is up to you. You do not have to decode and display it.
EDIT: Adding a few more things I could think off
In general the choice of codec and its profile will be the biggest thing affecting a video encoding/decoding system's power consumption.
The biggest difference may come from the device configuration. If the device has hardware accelerators for a particular codec, it may be cheaper to use those than a software codec for another one. So although H.264 may require more power than MPEG-4 when both run in software, if the device has H.264 in hardware then it may be cheaper than MPEG-4 in software. So check your device's hardware capability.
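One way to check what the device offers is to enumerate the available encoders via MediaCodecList (a sketch; on API 29+ you can additionally call MediaCodecInfo.isHardwareAccelerated(), while on older devices software codec names typically start with "OMX.google."):

```java
import android.media.MediaCodecInfo;
import android.media.MediaCodecList;

class CodecCheck {
    // Sketch: print the names of encoders that support a given MIME
    // type, e.g. "video/avc" for H.264.
    static void listEncoders(String mimeType) {
        MediaCodecList list = new MediaCodecList(MediaCodecList.REGULAR_CODECS);
        for (MediaCodecInfo info : list.getCodecInfos()) {
            if (!info.isEncoder()) continue;
            for (String type : info.getSupportedTypes()) {
                if (type.equalsIgnoreCase(mimeType)) {
                    System.out.println(info.getName());
                }
            }
        }
    }
}
```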
Also, video resolution matters: smaller videos are cheaper to encode, and you can clock the device at lower speeds when running smaller resolutions.