I am able to see the video playing in my TextureView but it is fairly corrupted. I have verified that I am receiving complete packets in the correct order. I have been able to parse the RTP header correctly. I believe my issue is related to the SPS and PPS and the MediaCodec.
My understanding is that you are supposed to strip the RTP header from the message and prepend the Annex-B start code 0x00000001 (an H.264 NAL start code, not part of RTP) so that your input buffer to the decoder is of the form 0x00000001[sps] 0x00000001[pps] 0x00000001[video data].
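In other words, the input buffer I am building looks like this (a sketch with placeholder byte values, not real stream data):

```python
# Assemble the decoder input as start code + SPS + start code + PPS
# + start code + slice data, as described above.
START_CODE = b"\x00\x00\x00\x01"

def build_input_buffer(sps: bytes, pps: bytes, video: bytes) -> bytes:
    return START_CODE + sps + START_CODE + pps + START_CODE + video

buf = build_input_buffer(b"\x67\x64", b"\x68\xee", b"\x65\xb8")
print(buf.count(START_CODE))  # 3
```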
My confusion is that the MediaCodec appears to require a MediaFormat with the SPS and PPS manually defined separately. I have found this example that I am currently using along with the message format I have defined above:
MediaFormat format = MediaFormat.createVideoFormat(MediaFormat.MIMETYPE_VIDEO_AVC, width, height);
// from avconv, when streaming sample.h264.mp4 from disk
byte[] header_sps = {0, 0, 0, 1, 0x67, 0x64, (byte) 0x00, 0x1e, (byte) 0xac, (byte) 0xd9, 0x40, (byte) 0xa0, 0x3d,
(byte) 0xa1, 0x00, 0x00, (byte) 0x03, 0x00, 0x01, 0x00, 0x00, 0x03, 0x00, 0x3C, 0x0F, 0x16, 0x2D, (byte) 0x96}; // sps
byte[] header_pps = {0, 0, 0, 1, 0x68, (byte) 0xeb, (byte) 0xec, (byte) 0xb2, 0x2C}; // pps
format.setByteBuffer("csd-0", ByteBuffer.wrap(header_sps));
format.setByteBuffer("csd-1", ByteBuffer.wrap(header_pps));
As you can see, I am not providing the MediaFormat with the SPS and PPS from my video stream, but instead using a hard-coded set from an internet example. I've tried to find sources explaining how to extract the SPS and PPS from a packet, but haven't been able to find anything.
Questions:
Am I supposed to strip the SPS and PPS from my buffer before passing it to the MediaCodec if the MediaFormat is already being provided the SPS and PPS?
How do you correctly parse the SPS and PPS from a message?
Here's the first few bytes of one of my RTP packets with the header included:
80 a1 4c c3 32 2c 24 7a f5 5c 9f bb 47 40 44 3a 40 0 ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0 0 1 c0 0 71 80 80
5 21 0 5d d6 d9 ff fb 12 c4 e7 0 5 5c 41 71 2c 30 c1 30 b1 88 6c
f5 84 98 2c 82 f5 84 82 44 96 72 45 ca 96 30 35 91 83 86 42 e4 90
28 b1 81 1a 6 57 a8 37 b0 60 56 81 72 71 5c 58 a7 4e af 67 bd 10
13 1 af e9 71 15 13 da a0 15 d5 72 38 36 2e 35 11 31 10 a4 12 1e
26 28 40 b5 3b 65 8c 30 54 8a 96 1b c5 a7 b5 84 cb a9 aa 3d d4 53
47 0 45 34 55 0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff bf 9 95 2b 73 93 4e c3 f9 b1 d0 5f f5 de c9 9e f7 f8 23 ab
a5 aa
Yes, you are correct that MediaCodec requires the SPS and PPS to be initialized first. You must extract the SPS/PPS from the SDP response, which is the reply to the DESCRIBE command sent to the server (camera) during the RTSP handshake. Within the SDP response there is a sprop-parameter-sets attribute which contains the SPS/PPS. You can see them in Wireshark as:
Media format specific parameters: sprop-parameter-sets=Z2QAKKwbGoB4AiflwFuAgICgAAB9AAAOph0MAHz4AAjJdd5caGAD58AARkuu8uFAAA==,aO44MAA=
They are separated by a comma and must be decoded using Base64. See this for an explanation: How to decode sprop-parameter-sets in a H264 SDP?
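For instance, the sprop-parameter-sets value above can be decoded in a few lines (a Python sketch; the assembled buffers correspond to the MediaFormat "csd-0"/"csd-1" keys):

```python
import base64

# Decode the sprop-parameter-sets from the SDP answer quoted above.
# The two comma-separated fields are Base64-encoded SPS and PPS NAL units.
sprop = ("Z2QAKKwbGoB4AiflwFuAgICgAAB9AAAOph0MAHz4AAjJdd5caGAD58AARkuu8uFAAA==,"
         "aO44MAA=")
sps_b64, pps_b64 = sprop.split(",")
sps = base64.b64decode(sps_b64)
pps = base64.b64decode(pps_b64)

# The NAL unit type is the low 5 bits of the first byte: 7 = SPS, 8 = PPS.
print(sps[0] & 0x1F, pps[0] & 0x1F)  # 7 8

# Prepend Annex-B start codes to get the csd-0 / csd-1 buffers.
csd0 = b"\x00\x00\x00\x01" + sps
csd1 = b"\x00\x00\x00\x01" + pps
```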
I'm really stuck on getting a video stream to play in a JavaFX project.
-- Short version:
I'm streaming h264/avcc flavor video from an Android phone to a desktop computer. However, JavaFX doesn't have an easy solution for displaying the stream, so I'm attempting to use javacv/ffmpeg to make this work. However, I am getting errors from ffmpeg.
1) Is there a better way to display streaming video on javafx?
2) Do you have a sample project or good tutorial for javacv ffmpegframegrabber?
3) I think I may be missing some small detail in my code, but I'm not sure what it would be.
-- Longer Version:
1) On the Android end I'm getting video using MediaRecorder. In order to get the SPS/PPS info I record and save a small movie to the device and then parse the SPS and PPS data.
2) Next, on the Android side, I split up the NALUs to meet the MTU requirement and send them over a UDP connection to my desktop.
3) On my desktop I reassemble the NALUs (or discard them if they lose data) and feed them to an input stream that I gave to the FrameGrabber constructor.
-- The Code and Logs:
The errors are long and numerous depending on the flavor of data I feed it. Here are two separate examples, which are usually repeated at great length:
[h264 # 0000020225907a40] non-existing PPS 0 referenced
[h264 # 0000020225907a40] decode_slice_header error
[h264 # 0000020225907a40] no frame!
[h264 # 00000163d8637a40] illegal aspect ratio
[h264 # 00000163d8637a40] pps_id 3412 out of range
[AVBSFContext # 00000163e28a0e00] Invalid NAL unit 0, skipping.
!! One big caveat that I am aware of is that I have not implemented the timestamps I created on the Android device when feeding ffmpeg. I think it should still show distorted images without this, though.
Because I have spent all day guessing and trying, I have several "flavors" of data I have pushed through. I am only showing the first section of each NAL; I believe that, if correct, it would at least show a garbage image as long as my SPS and PPS are right.
sps: 67 80 80 1E E9 01 68 22 FD C0 36 85 09 A8
pps: 68 06 06 E2
Below is Annex-B style. These were each prefixed with either 00 00 01 or 00 00 00 01.
Debug transfer 65 B8 40 0B E5 B8 7B 80 5B 85
Debug transfer 41 E2 20 7A 74 34 3B D6 BE FA
Debug transfer 41 E4 40 2F 01 E0 0C 06 EE 91
Debug transfer 41 E6 60 3E A1 20 5A 02 3C 6D
Debug transfer 41 E8 80 13 B0 B9 82 C3 03 F4
Debug transfer 41 EC C0 1B A3 0C 28 F1 B0 C8
Debug transfer 41 EE E0 1F CE 07 30 EE 05 06
Debug transfer 41 F1 00 08 ED 80 9C 20 09 73
Debug transfer 41 F3 20 09 E9 00 86 60 21 C3
VideoDecoderaddPacket type: 24
Debug transfer 67 80 80 1E E9 01 68 22 FD C0
Debug transfer 68 06 06 E2
Debug transfer 65 B8 20 00 9F 80 78 00 12 8A
Debug transfer 41 E2 20 09 F0 1E 40 7B 0C E0
Debug transfer 41 E4 40 09 F0 29 30 D6 00 AE
Debug transfer 41 E6 60 09 F1 48 31 80 99 40
[h264 # 000001c771617a40] non-existing PPS 0 referenced
Here I tried AVCC style. You can see the first line is the combination of the SPS and PPS, followed by an IDR slice and then repeated non-IDR slices.
Debug transfer 18 00 0E 67 80 80 1E E9 01 68
Debug transfer 00 02 4A 8F 65 B8 20 00 9F C5
Debug transfer 00 02 2F DA 41 E2 20 09 E8 0F
Debug transfer 00 02 2C 34 41 E4 40 09 F4 20
Debug transfer 00 02 4D 92 41 E6 60 09 FC 2B
Debug transfer 00 02 47 02 41 E8 80 09 F0 72
Debug transfer 00 02 52 50 41 EA A0 09 EC 0F
Debug transfer 00 02 58 8A 41 EC C0 09 FC 6F
Debug transfer 00 02 55 F9 41 EE E0 09 FC 6E
Debug transfer 00 02 4D 79 41 F1 00 09 F0 3E
Debug transfer 00 02 4D B6 41 F3 20 09 E8 64
The following class is where I try to get javacv/ffmpeg to show the video. I don't think it's an ideal solution and am researching CanvasFrame as a replacement for the ImageView.
public class ImageDecoder {

    private final static String TAG = "ImageDecoder ";

    private ImageDecoder() {
    }

    public static void streamImageToImageView(
            final ImageView view,
            final InputStream inputStream,
            final String format,
            final int frameRate,
            final int bitrate,
            final String preset,
            final int numBuffers) {
        System.out.println("Image Decoder Starting...");
        try (final FrameGrabber grabber = new FFmpegFrameGrabber(inputStream)) {
            final Java2DFrameConverter converter = new Java2DFrameConverter();
            grabber.setFrameRate(frameRate); // was setFrameNumber(frameRate), which seeks rather than sets the rate
            grabber.setFormat(format);
            grabber.setVideoBitrate(bitrate);
            grabber.setVideoOption("preset", preset);
            grabber.setNumBuffers(numBuffers);
            System.out.println("Image Decoder waiting on grabber.start...");
            grabber.start(); // ---- this call is blocking the loop
            System.out.println("Image Decoder Looping----------------------------------- hit stop");
            while (!Thread.interrupted()) {
                // System.out.println("Image Decoder Looping");
                final Frame frame = grabber.grab();
                if (frame != null) {
                    final BufferedImage bufferedImage = converter.convert(frame);
                    if (bufferedImage != null) {
                        Platform.runLater(() ->
                                view.setImage(SwingFXUtils.toFXImage(bufferedImage, null)));
                    } else {
                        System.out.println("no buf im");
                    }
                } else {
                    System.out.println("no fr");
                    Thread.currentThread().interrupt();
                }
            }
        } catch (Exception e) {
            System.out.print(TAG + e);
        }
    }
}
Any help is greatly appreciated.
So I had two problems.
The first was that my SPS/PPS parsing method had a mistake; notice that the 2nd and 3rd bytes of the SPS above are the same (80 80), which tripped it up.
The second was that I accidentally oversized a buffer and created large 0x00-padded areas which emulated start codes.
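The second pitfall can be reproduced with a minimal Annex-B start-code scanner (a sketch, not the parser from the project): when an oversized buffer leaves a run of 0x00 padding and the next byte happens to be 0x01, the scanner cannot tell it apart from a real start code.

```python
def split_nal_units(buf: bytes):
    """Split an Annex-B stream on 00 00 01 start codes (a 4-byte
    00 00 00 01 code leaves a trailing zero, stripped from the previous unit)."""
    units = []
    i = buf.find(b"\x00\x00\x01")
    while i != -1:
        j = buf.find(b"\x00\x00\x01", i + 3)
        end = len(buf) if j == -1 else j
        unit = buf[i + 3:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        i = j
    return units

# A well-formed stream: SPS then PPS.
good = b"\x00\x00\x00\x01\x67\x64\x00\x00\x00\x01\x68\x06"
print([u[0] & 0x1F for u in split_nal_units(good)])  # [7, 8]

# An oversized buffer: zero padding followed by a 0x01 byte inside one
# slice looks exactly like a start code and wrongly splits the unit in two.
padded = b"\x00\x00\x01\x65\xB8\x40" + b"\x00" * 6 + b"\x01\x12\x34"
print(len(split_nal_units(padded)))  # 2 units instead of 1
```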
This was a big project for me and I want to help others. Please visit my website, where I wrote a lengthy multi-part discussion about streaming h264.
I'm currently working on a project with the Open Mobile API. Basically, I hit this problem when I exchange APDUs with the UICC: all my commands are automatically converted to an extended logical channel APDU (CLA: 0xC1). I'm using a Samsung Galaxy S6 Edge for this test, Android version 5.0.2.
APDU > Header [CLA INS P1 P2] 00 70 00 00 194,69 etu MANAGE CHANNEL
< Outgoing data 01
< Return code [SW1 SW2] 90 00
APDU > Header [CLA INS P1 P2] 01 A4 04 00 194,69 etu SELECT
Incoming data A0 00 00 05 59 10 10 FF FF FF FF 89 00 00 01 00
< Outgoing data 6F 1A 84 10 A0 00 00 05 59 10 10 FF FF FF FF 89
00 00 01 00 A5 06 73 00 9F 65 01 FF
< Return code [SW1 SW2] 90 00
APDU > Header [CLA INS P1 P2] C1 E2 91 00 187,69 etu
Incoming data BF 2D 00
< Return code [SW1 SW2] 6D 00
APDU > Header [CLA INS P1 P2] 00 70 80 01 192,69 etu MANAGE CHANNEL
< Return code [SW1 SW2] 90 00
What could be the problem? Who is responsible for changing my CLA byte to 0xC1? Why does the phone change the CLA byte to 0xC1?
Note: Based on my application log, I send this: 81 E2 91 00 02 BF 2D 00
Thanks for your help.
I am trying to play an AAC audio live stream coming from a Red5 server, so to decode the audio data I am using javacv-ffmpeg. Data is received as packets of byte[].
Here is what I tried:
public Frame decodeAudio(byte[] adata, long timestamp) {
    BytePointer audio_data = new BytePointer(adata);
    avcodec.AVCodec codec1 = avcodec.avcodec_find_decoder(avcodec.AV_CODEC_ID_AAC); // for AAC
    if (codec1 == null) {
        Log.d("showit", "avcodec_find_decoder() error: Unsupported audio format or codec not found.");
    }
    audio_c = avcodec.avcodec_alloc_context3(codec1);
    audio_c.sample_rate(44100);
    audio_c.sample_fmt(3);
    audio_c.bits_per_raw_sample(16);
    audio_c.channels(1);
    if ((ret = avcodec.avcodec_open2(audio_c, codec1, (PointerPointer) null)) < 0) {
        Log.d("showit", "avcodec_open2() error " + ret + ": Could not open audio codec.");
    }
    if ((samples_frame = avcodec.avcodec_alloc_frame()) == null) {
        Log.d("showit", "avcodec_alloc_frame() error: Could not allocate audio frame.");
    }
    avcodec.av_init_packet(pkt2);
    pkt2.data(audio_data);
    pkt2.size(audio_data.capacity());
    pkt2.pts(timestamp);
    pkt2.pos(0);
    int len = avcodec.avcodec_decode_audio4(audio_c, samples_frame, got_frame, pkt2);
}
But len after decoding returns -1 for the first frame and then always -22.
The first packet is always like this:
AF 00 12 08 56 E5 00
Further packets are like
AF 01 01 1E 34 2C F0 A4 71 19 06 00 00 95 12 AE AA 82 5A 38 F3 E6 C2 46 DD CB 2B 09 D1 00 78 1D B7 99 F8 AB 41 6A C4 F4 D2 40 51 17 F5 28 51 9E 4C F6 8F 15 39 49 42 54 78 63 D5 29 74 1B 4C 34 9B 85 20 8E 2C 0E 0C 19 D2 E9 40 6E 9C 85 70 C2 74 44 E4 84 9B 3C B9 8A 83 EC 66 9D 40 1B 42 88 E2 F3 65 CF 6D B3 20 88 31 29 94 29 A4 B4 DE 26 B0 75 93 3A 0C 57 12 8A E3 F4 B9 F9 23 9C 69 C9 D4 BF 9E 26 63 F2 78 D6 FD 36 B9 32 62 01 91 19 71 30 2D 54 24 62 A1 20 1E BA 3D 21 AC F3 80 33 3A 1A 6C 30 3C 44 29 F2 A7 DC 9A FF 0F 99 F2 38 85 AB 41 FD C7 C5 40 5C 3F EE 38 70
Couldn't figure out where the problem is, whether in setting up the AVCodec context audio_c or in setting up the packet for the decoder.
Any help appreciated. Thanks in advance.
The first packet (the config packet) describes the stream data, if I'm not mistaken; the following packets are the encoded audio. You can't assume the sample rate etc. as you have done above; you need to pull that out of the config data, which is marked "AF 00".
I have a similar problem. I've intercepted the packets with Wireshark and here is what it told me:
AF is the control byte of the AAC frame, and it decodes to the following bits:
1010 .... = Format: HE-AAC
.... 11.. = Sample rate: 44kHz (although FFMPEG shows me 48kHz, and I would lean towards believing it)
.... ..1. = Sample size: 16 bit
.... ...1 = Channels: stereo
I still can't figure out how to universally decode this data.
edit:
Ha! I've got something :)
I guess the first 2 bytes are RTMP-specific bytes. The second one seems to state whether it is the configuration (0) or actual payload (1); I've found no sources confirming that, it is just my assumption.
Then the first, short packet is the AAC configuration description described here:
http://thompsonng.blogspot.com/2010/03/aac-configuration.html
In my case it is:
11 90
which is binary:
0001 0001 1001 0000
and that decodes to:
0001 0... .... .... = 2 = AAC LC
.... .001 1... .... = 3 = 48 kHz
.... .... .001 0... = 2 = 2 channels (stereo)
.... .... .... .0.. = 0 = 1024 sample length
.... .... .... ..0. = 0 = doesn't depend on core coder (?)
.... .... .... ...0 = 0 = extension flag
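That two-byte AudioSpecificConfig layout (5 bits object type, 4 bits sample-rate index, 4 bits channel configuration, per ISO/IEC 14496-3) can be decoded with a small sketch; it also decodes the 12 08 config bytes from the original question's first packet (AF 00 12 08 ...):

```python
def parse_audio_specific_config(b0: int, b1: int):
    """Decode the first two bytes of an AAC AudioSpecificConfig:
    5 bits audio object type, 4 bits sampling-frequency index,
    4 bits channel configuration (ISO/IEC 14496-3)."""
    bits = (b0 << 8) | b1
    object_type = bits >> 11          # 2 = AAC LC
    freq_index = (bits >> 7) & 0xF    # index into the standard rate table
    channels = (bits >> 3) & 0xF
    rates = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
             22050, 16000, 12000, 11025, 8000, 7350]
    return object_type, rates[freq_index], channels

print(parse_audio_specific_config(0x11, 0x90))  # (2, 48000, 2): AAC LC, 48 kHz, stereo
print(parse_audio_specific_config(0x12, 0x08))  # (2, 44100, 1): AAC LC, 44.1 kHz, mono
```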
I am looking to use the adb screencap utility without the -p flag. I imagined the output would be dumped in a raw format, but it doesn't look like it. My attempts at opening the raw image file with the Pillow (Python) library resulted in:
$ adb pull /sdcard/screenshot.raw screenshot.raw
$ python
>>> from PIL import Image
>>> Image.open('screenshot.raw')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/....../lib/python2.7/site-packages/PIL/Image.py", line 2025, in open
raise IOError("cannot identify image file")
IOError: cannot identify image file
Having found out this is not the right way to read raw images, I even gave the following a shot: How to read a raw image using PIL?
>>> with open('screenshot.raw', 'rb') as f:
... d = f.read()
...
>>> from PIL import Image
>>> Image.frombuffer('RGB', len(d), d)
__main__:1: RuntimeWarning: the frombuffer defaults may change in a future release; for portability, change the call to read:
frombuffer(mode, size, data, 'raw', mode, 0, 1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/..../lib/python2.7/site-packages/PIL/Image.py", line 1896, in frombuffer
return frombytes(mode, size, data, decoder_name, args)
File "/Users/..../lib/python2.7/site-packages/PIL/Image.py", line 1821, in frombytes
im = new(mode, size)
File "/Users/..../lib/python2.7/site-packages/PIL/Image.py", line 1787, in new
return Image()._new(core.fill(mode, size, color))
TypeError: must be 2-item sequence, not int
All possible mode options lead to the same TypeError exception.
Here is what hexdump utility reveals:
$ hexdump -C img.raw | head
00000000 d0 02 00 00 00 05 00 00 01 00 00 00 1e 1e 1e ff |................|
00000010 1e 1e 1e ff 1e 1e 1e ff 1e 1e 1e ff 1e 1e 1e ff |................|
*
000038c0 1e 1e 1e ff 1e 1e 1e ff 21 21 21 ff 2b 2b 2b ff |........!!!.+++.|
000038d0 1e 1e 1e ff 1e 1e 1e ff 1e 1e 1e ff 1e 1e 1e ff |................|
*
00004400 1e 1e 1e ff 1e 1e 1e ff 47 47 47 ff 65 65 65 ff |........GGG.eee.|
00004410 20 20 20 ff 1e 1e 1e ff 1e 1e 1e ff 1e 1e 1e ff | .............|
00004420 1e 1e 1e ff 1e 1e 1e ff 1e 1e 1e ff 1e 1e 1e ff |................|
*
On osx:
$ file screenshot.raw
screenshot.raw: data
The screencap help page doesn't reveal much about the format of the output data without the -p flag either:
$ adb shell screencap -h
usage: screencap [-hp] [FILENAME]
-h: this message
-p: save the file as a png.
If FILENAME ends with .png it will be saved as a png.
If FILENAME is not given, the results will be printed to stdout.
Format:
4 bytes as uint32 - width
4 bytes as uint32 - height
4 bytes as uint32 - pixel format
(width * height * bytespp) bytes as byte array - image data, where bytespp is bytes per pixel and depends on the pixel format. Usually bytespp is 4.
This info is from the source code of screencap.
For your example:
00000000 d0 02 00 00 00 05 00 00 01 00 00 00 1e 1e 1e ff
d0 02 00 00 - width - uint32 0x000002d0 = 720
00 05 00 00 - height - uint32 0x00000500 = 1280
01 00 00 00 - pixel format - uint32 0x00000001 = 1 = PixelFormat.RGBA_8888 => bytespp = 4 => RGBA
1e 1e 1e ff - first pixel data - R = 0x1e; G = 0x1e; B = 0x1e; A = 0xff;
Pixel data is stored in an array of bytes of size 720*1280*4.
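Putting that together, the header can be parsed in a few lines (a sketch assuming the three little-endian uint32 fields described above; newer Android builds may append further fields):

```python
import struct

def parse_screencap_header(data: bytes):
    """Return (width, height, pixel_format) from a screencap dump.
    Fields are little-endian uint32; format 1 = RGBA_8888 (4 bytes/pixel)."""
    return struct.unpack_from("<III", data, 0)

# The first 12 bytes of the hexdump above:
header = bytes.fromhex("d0020000" "00050000" "01000000")
print(parse_screencap_header(header))  # (720, 1280, 1)
```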
Thanks to the extract of your file, I guess your raw file is formatted as:
width x height, then the whole set of RGBA pixels (32 bits each, width x height times).
Here I see you get a 720x1280 image captured.
The ImageMagick toolset may help you view/convert it to a more appropriate file format.
Below is a sample that may help you
(ImageMagick convert command, for osx see http://cactuslab.com/imagemagick/ )
# skip header info
dd if=screenshot.raw of=screenshot.rgba skip=12 bs=1
# convert rgba to png
convert -size 720x1280 -depth 8 screenshot.rgba screenshot.png
If it doesn't work, you may try changing skip=12 to skip=8 and/or 720x1280 to 1280x720.
Hope that helps.
To read adb screencap raw format in python:
from PIL import Image
Image.frombuffer('RGBA', (1920, 1080), raw[12:], 'raw', 'RGBX', 0, 1)
The most important part is skipping the header, as mentioned in @Emmanuel's answer.
Note that (1920, 1080) are your device resolution which can be obtained with
adb shell wm size
Hopefully this will save someone the 12 hours I spent investigating why cv2.matchTemplate found different matches on almost identical images.
In one of my applications, I open binary files, and I got some error reports from users on some files. When they send me the files, if I download them via Gmail on the desktop, the file displays nicely in my app. When I download them with the native Android Gmail app, the file doesn't open.
Here are the first 64 bytes of the original file, which is also how it appears when downloaded from the desktop (displayed as hex):
03 00 08 00 D8 0C 00 00 01 00 1C 00 BC 02 00 00
2D 00 00 00 00 00 00 00 00 01 00 00 D0 00 00 00
00 00 00 00 00 00 00 00 10 00 00 00 25 00 00 00
33 00 00 00 3D 00 00 00 44 00 00 00 49 00 00 00
And here are the first 64 bytes of the file downloaded with the native Gmail app (hex again):
EF BF BD EF BF BD 2D EF BF BD 25 33 3D 44 49 4D
52 63 72 76 EF BF BD EF BF BD EF BF BD EF BF BD
EF BF BD EF BF BD EF BF BD EF BF BD EF BF BD EF
BF BD EF BF BD 29 2E 3E 43 54 59 69 6E 7F EF BF
Is there some sort of compression applied to this file, or is the Gmail app corrupting it? Especially if you look at the end of the first sample, you have the bytes 10, 25, 33, 3D, 44, 49, which also appear in the first line of the second sample, which leads me to think that it's compression of some sort.
I'm not sure of the exact source, but if you look at http://www.cogsci.ed.ac.uk/~richard/utf-8.cgi?input=%F6&mode=char then that pattern is due to something interpreting the file as UTF-8, replacing invalid sequences with the replacement character U+FFFD (whose UTF-8 encoding is EF BF BD), then writing the file back out as UTF-8.
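That round trip can be sketched as follows (this is only a guess at the Gmail app's pipeline; the posted dump suggests some control bytes were dropped as well, which this sketch does not reproduce):

```python
# Decode raw binary as UTF-8 with replacement, then re-encode: every
# invalid sequence becomes U+FFFD, i.e. the bytes EF BF BD.
original = bytes.fromhex("03000800d80c0000")  # first bytes of the good file
mangled = original.decode("utf-8", errors="replace").encode("utf-8")
print(mangled.hex())  # 03000800efbfbd0c0000
# The 0xd8 lead byte with no valid continuation was replaced by ef bf bd,
# while plain ASCII bytes (< 0x80) pass through unchanged.
```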