I am currently developing an application which produces metadata for preview frames coming from the camera. I can see this metadata being produced properly, so I have no problems there.
However, I have to embed this metadata into the frames of interest (the frames are processed by a native algorithm to produce the metadata). I am using ffmpeg with x264 to encode the frames into H.264. I have checked x264.h and some documentation but failed to find what I seek.
My question is: is there any unused portion of the H.264 syntax into which I can embed my metadata in the encoded frames?
I hope I was clear enough. Thanks in advance.
Most video elementary streams have a provision for "user data". In H.264 this is carried in an SEI NAL unit. You can add one before every frame you want to associate it with. I don't think x264 has support for adding user data from outside.
Two choices:
Modify x264 / ffmpeg to add the SEI message wherever you want it, taking input in some form you like.
Create your stream, create your metadata. Then write a small, separate program that reads your metadata, parses the file, and pushes an SEI NAL in front of each frame you want.
For the SEI syntax you should be able to google and find it, but the best place to look is the H.264 standard. An easier way is to just look at the code in x264: it already inserts one user-data SEI at the beginning of the stream (containing the encoding parameters).
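For reference, here is a minimal hedged sketch of what such a small program would write: a user_data_unregistered SEI message (payloadType 5) in front of an Annex-B access unit. The function name, the UUID value, and the payload layout are my own inventions, and bounds checks are omitted:

    /* Hedged sketch: build a user_data_unregistered SEI NAL unit and return
       its length. The caller prepends the result to the access unit it wants
       to tag. Assumes the metadata payload is small enough for the fixed
       scratch buffer. */
    #include <stdint.h>
    #include <string.h>

    static size_t write_sei_user_data(uint8_t *out, const uint8_t *payload,
                                      size_t payload_len, const uint8_t uuid[16])
    {
        uint8_t rbsp[512];
        size_t n = 0, o = 0, size = 16 + payload_len;

        rbsp[n++] = 5;                          /* payloadType: user_data_unregistered */
        while (size >= 255) { rbsp[n++] = 255; size -= 255; }
        rbsp[n++] = (uint8_t)size;              /* payloadSize, ff-coded */
        memcpy(rbsp + n, uuid, 16); n += 16;    /* uuid_iso_iec_11578 */
        memcpy(rbsp + n, payload, payload_len); n += payload_len;
        rbsp[n++] = 0x80;                       /* rbsp_trailing_bits */

        out[o++] = 0; out[o++] = 0; out[o++] = 0; out[o++] = 1;  /* start code */
        out[o++] = 0x06;                        /* nal_ref_idc=0, type=6 (SEI) */
        for (size_t i = 0; i < n; i++) {        /* escape 00 00 0x as 00 00 03 0x */
            if (o >= 2 && out[o-1] == 0 && out[o-2] == 0 && rbsp[i] <= 3)
                out[o++] = 3;
            out[o++] = rbsp[i];
        }
        return o;
    }

Your file writer then just emits one extra NAL unit per tagged frame; decoders are required to skip SEI messages they don't understand.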
I am building an application in which I need to trim videos. It is possible to do this using ffmpeg, but I can't use it because it is licensed under the GPL.
I tried using MediaCodec but couldn't get the code samples I found to work.
How can I trim videos on Android?
I had to develop trim functionality in my app a few months back and found that FFmpeg is very heavy and wasn't as accurate as MediaCodec.
None of the examples helped me, but since I was developing in Kotlin I had to rewrite the code anyway.
Here is the breakdown of how to use MediaCodec:
Pass the file to your MediaCodec class
Extract the video from a file
Create your buffer size
Seek to where you want the file to be trimmed from or to
Mux your audio and video together
We tried to find a way to do the start and finish times together, but we ended up just duplicating the clip first and passing both in with a start and an end time.
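Not the Kotlin I used, but here is a rough sketch of those steps using the NDK's AMediaExtractor/AMediaMuxer C API (the Java MediaExtractor/MediaMuxer calls map onto it almost one-to-one). The file descriptors, trim window, track cap, and 1 MiB sample buffer are all assumptions:

    /* Hedged sketch: copy samples between start_us and end_us from in_fd to
       out_fd without re-encoding. Error handling omitted. */
    #include <stdlib.h>
    #include <media/NdkMediaExtractor.h>
    #include <media/NdkMediaMuxer.h>

    static void trim(int in_fd, int64_t in_len, int out_fd,
                     int64_t start_us, int64_t end_us)
    {
        AMediaExtractor *ex = AMediaExtractor_new();
        AMediaExtractor_setDataSourceFd(ex, in_fd, 0, in_len);

        AMediaMuxer *mx = AMediaMuxer_new(out_fd, AMEDIAMUXER_OUTPUT_FORMAT_MPEG_4);
        size_t tracks = AMediaExtractor_getTrackCount(ex);
        ssize_t map[8];                            /* extractor -> muxer track map */
        for (size_t i = 0; i < tracks && i < 8; i++) {
            AMediaFormat *fmt = AMediaExtractor_getTrackFormat(ex, i);
            AMediaExtractor_selectTrack(ex, i);    /* keep audio and video */
            map[i] = AMediaMuxer_addTrack(mx, fmt);
            AMediaFormat_delete(fmt);
        }
        AMediaMuxer_start(mx);

        /* seek to the nearest sync frame at or before the trim start */
        AMediaExtractor_seekTo(ex, start_us, AMEDIAEXTRACTOR_SEEK_PREVIOUS_SYNC);

        size_t cap = 1 << 20;                      /* assumed max sample size */
        uint8_t *buf = malloc(cap);
        int64_t base = -1;
        AMediaCodecBufferInfo info;
        for (;;) {
            ssize_t n = AMediaExtractor_readSampleData(ex, buf, cap);
            int64_t t = AMediaExtractor_getSampleTime(ex);
            if (n < 0 || t > end_us) break;        /* EOF or past the trim end */
            if (base < 0) base = t;
            info.offset = 0;
            info.size = (int32_t)n;
            info.presentationTimeUs = t - base;    /* restart timestamps at zero */
            info.flags = AMediaExtractor_getSampleFlags(ex);
            AMediaMuxer_writeSampleData(mx,
                map[AMediaExtractor_getSampleTrackIndex(ex)], buf, &info);
            AMediaExtractor_advance(ex);
        }
        free(buf);
        AMediaMuxer_stop(mx);
        AMediaMuxer_delete(mx);
        AMediaExtractor_delete(ex);
    }

Because the seek lands on the previous sync frame, the cut is only as accurate as the keyframe spacing unless you also decode and re-encode the leading group of pictures.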
You'll need to post your code and show where you're having the issue with MediaCodec for people to help you.
I want to build a video editor like this app: Link
My problem is: should I use ffmpeg (or a similar library) to decode the videos for editing and then encode them again, or should I use a completely different approach to edit the videos?
Any help will be appreciated.
Why not make a list of your proposed features and then check whether FFmpeg can do them? That will answer your own question.
You can use FFmpeg to decode various formats into raw data like pixels (for images) and PCM (for audio), then use the audio programming or pixel manipulation skills you already have to modify the data. If you have no such skills then you're limited to making a user interface for FFmpeg, aren't you?
For example: if a user moves a slider to adjust image (video frame) brightness, does your code use a for loop to adjust each pixel's values, or maybe a colorMatrix? How will you show a live preview, given that FFmpeg must first encode the entire video with the new brightness? This information is missing from your question.
Then use FFmpeg again to re-encode to the output format (some formats like MPEG require a paid license for any "paid-for" software that encodes data in the format, so check your rights as an Android developer; maybe Google covered that step for you).
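As a concrete illustration of the for-loop approach just mentioned, here is a hedged sketch that adjusts brightness on one decoded frame, assuming packed 8-bit RGB data (for example, FFmpeg output converted with libswscale); the function name and delta parameter are mine:

    #include <stdint.h>

    /* Add `delta` (e.g. -255..255, taken from the UI slider) to every
       channel of a packed RGB24 frame, clamping to the valid 0..255 range. */
    static void adjust_brightness(uint8_t *rgb, int width, int height, int delta)
    {
        int n = width * height * 3;
        for (int i = 0; i < n; i++) {
            int v = rgb[i] + delta;
            rgb[i] = v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
        }
    }

Running that over a single decoded frame is cheap enough for a live preview; the full re-encode only has to happen when the user commits the edit.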
I need to save a video file generated from two video streams coming from two different sources. I'm using RTSP over TCP/IP, and the videos are encoded with H.264.
I need to first record the video from the first source and then continue with the second source.
So what I tried was to declare two AVFormatContext instances, initialize both with avformat_open_input(&context, "rtsp://......", NULL, &options),
then read frames with av_read_frame(context, &packet)
and write them to the video file with av_write_frame(oc, &packet);
It works fine for saving the video from the first source, but if, for example, I saved y frames from the first context, then when I try reading and saving frames from the second context into the same file, for the first y frames I am trying to save, av_write_frame(oc, &packet2); returns -22 and does not add the frame to the file.
I think the problem is that the context remembers how many frames were read and gives every read packet an identification number to make sure it isn't written twice. When I use a new context those identification numbers reset, but the AVOutputFormat or the AVFormatContext also retains the id of the packet it is expecting to receive, and will not write anything until it receives a packet with that id.
Now I'm wondering how I could solve this inconvenience. I can't find any setter for that id, or any way to reuse the same context. I thought about modifying the ffmpeg sources, but they are pretty complex and I couldn't find what I was looking for.
An alternative would be to save the two videos in two different files, but I don't know how to append them afterwards, as ffmpeg can only append videos encoded with mpeg, and re-encoding the video isn't really an option, as it would take too much time. I also couldn't find any other functional way to append two MP4 videos encoded with H.264.
I'll be happy to hear any kind of usable idea for this problem.
If you are saving raw H.264 streams, why not simply store two separate streams and then concatenate the file chunks on the command line using a system command: system("cat file1 file2 > finalfile")
If your output is one of the following, you can append directly using cat:
Transport stream [ts] with the same codecs
.mpg files
raw H.264 files
raw MPEG-4 files which have exactly the same encoding headers [same dimensions, profile, and toolsets mentioned in the header]
H.263 streams
You cannot directly concatenate MP4 or 3GPP files.
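If you do want both recordings in one MP4 without re-encoding, the usual fix for the -22 described in the question is to keep the one output context and shift the second source's timestamps so that dts keeps increasing monotonically. A hedged sketch, assuming a single stream per source, matching codec parameters, valid pts/dts in every packet, and that last_dts/last_duration were saved (in the output time base) from the final packet of the first source:

    /* Hedged sketch: append packets from the second input context to the
       same output by offsetting their timestamps past the first part. */
    AVPacket pkt;
    int64_t offset = last_dts + last_duration;     /* first free timestamp slot */
    while (av_read_frame(context2, &pkt) >= 0) {
        AVRational in_tb  = context2->streams[pkt.stream_index]->time_base;
        AVRational out_tb = oc->streams[pkt.stream_index]->time_base;
        pkt.pts = av_rescale_q(pkt.pts, in_tb, out_tb) + offset;
        pkt.dts = av_rescale_q(pkt.dts, in_tb, out_tb) + offset;
        av_write_frame(oc, &pkt);                  /* dts is now monotonic */
        av_packet_unref(&pkt);
    }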
I'd like to write an app that merges multiple images into a movie on Android. JMF has a basic implementation (JpegImagesToMovie), but JMF isn't supported on Dalvik.
Is there an alternative library that I can use for this? Or, if there is no library available, does anyone have any pointers on what I need to research to implement it myself?
Rgds, Kevin.
I'm not aware of any pure-Java video encoders, and the built-in video encoder in Android appears to be limited to capturing video from the camera alone, rather than from a custom input source.
You could look at writing a multi-part JPEG writer (quite rare but well supported), or even an MJPEG encoder (the format used by many digicams).
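If you go the MJPEG route, the rawest form of the stream is nothing more than whole JPEG images written back to back; many players and FFmpeg will treat the result as video once you tell them the frame rate. A hedged sketch (the file names are placeholders, and error handling is omitted):

    #include <stdio.h>

    int main(void)
    {
        const char *frames[] = { "img0.jpg", "img1.jpg", "img2.jpg" };
        FILE *out = fopen("movie.mjpeg", "wb");
        char buf[65536];
        for (int i = 0; i < 3; i++) {
            FILE *in = fopen(frames[i], "rb");
            size_t n;
            while ((n = fread(buf, 1, sizeof buf, in)) > 0)
                fwrite(buf, 1, n, out);        /* append the JPEG verbatim */
            fclose(in);
        }
        fclose(out);
        return 0;
    }

Something like ffmpeg -f mjpeg -framerate 25 -i movie.mjpeg -c copy movie.avi can then wrap the result in a real container on a desktop machine.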
Short version: What is the best way to get data encoded in an MP3 (and ideally in AAC/Ogg/WMA) into a Java array or ByteBuffer that I can then manipulate?
I'm putting together a program that has slowing down and speeding up sound files as one of its features. This works fine for WAV files, which are a header plus the exact binary data that needs to be sent to the speaker, and now I need to implement it for MP3 (ideally, this would also support AAC, Ogg, and WMA, but since those are less popular formats this is not required). Android does not expose an interface to decode the MP3 without playing it, so I need to create that interface.
Three options present themselves, though I'm open to others:
1) Write my own decoder. I already have a functional frame detector that I was hoping to use for option (3), and now should only need to implement the Huffman decoding tables.
2) Use JLayer, or an equivalent Java library, to handle the decoding. I'm not entirely clear on what the license ramifications are here.
3) Connect to the libmedia library/MediaPlayerService. This is what SoundPool does, and the amount of use that service gets makes me believe that while it's officially unstable, that implementation isn't going anywhere. This means writing JNI code to connect to the service, but I'm finding that that's a deep rabbit hole. At the surface, I'm having trouble with the sp<> template.
I did that with libmad and the NDK. JLayer is way too slow and the media framework is a moving target. You can find info and source code at http://apistudios.com/hosted/marzec/badlogic/wordpress/?p=231
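For what it's worth, the core of a libmad decode loop looks roughly like this (a hedged sketch modeled on libmad's own minimad example; input buffering and error handling are stripped down, and the emit callback is my own placeholder):

    /* Hedged sketch of a libmad decode loop. `mp3buf`/`mp3len` hold the whole
       file in memory; output is interleaved 16-bit PCM handed to `emit`. */
    #include <mad.h>

    static short fixed_to_s16(mad_fixed_t s)
    {
        /* round and clip a mad_fixed_t sample down to 16 bits */
        s += 1L << (MAD_F_FRACBITS - 16);
        if (s >= MAD_F_ONE) s = MAD_F_ONE - 1;
        if (s < -MAD_F_ONE) s = -MAD_F_ONE;
        return (short)(s >> (MAD_F_FRACBITS + 1 - 16));
    }

    static void decode_mp3(const unsigned char *mp3buf, unsigned long mp3len,
                           void (*emit)(short left, short right))
    {
        struct mad_stream stream;
        struct mad_frame frame;
        struct mad_synth synth;

        mad_stream_init(&stream);
        mad_frame_init(&frame);
        mad_synth_init(&synth);
        mad_stream_buffer(&stream, mp3buf, mp3len);

        for (;;) {
            if (mad_frame_decode(&frame, &stream)) {
                if (MAD_RECOVERABLE(stream.error)) continue;  /* skip bad frame */
                break;                                        /* EOF or fatal */
            }
            mad_synth_frame(&synth, &frame);
            for (unsigned i = 0; i < synth.pcm.length; i++) {
                short l = fixed_to_s16(synth.pcm.samples[0][i]);
                short r = synth.pcm.nchannels == 2
                          ? fixed_to_s16(synth.pcm.samples[1][i]) : l;
                emit(l, r);                                   /* hand PCM to caller */
            }
        }
        mad_synth_finish(&synth);
        mad_frame_finish(&frame);
        mad_stream_finish(&stream);
    }

Once the PCM is in your own buffer like this, the time-stretching code you already have for WAV should apply unchanged.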
I have not tried it, but mp3transform is LGPL.