Android MediaCodec decoder input/output frame count - android

I'm working on video transcoding in Android, and using the standard method as these samples to extract/decode a video. I test the same process on different devices with different video devices, and I found a problem on the frame count of decoder input/output.
For some timecode issues as in this question, I use a queue to record the extracted video samples, and check the queue when I got a decoder frame output, like the following codes:
(I omit the encoding-related codes to make it clearer)
Queue<Long> sample_time_queue = new LinkedList<Long>();
....
// in transcoding loop
if (is_decode_input_done == false)
{
int decode_input_index = decoder.dequeueInputBuffer(TIMEOUT_USEC);
if (decode_input_index >= 0)
{
ByteBuffer decoder_input_buffer = decode_input_buffers[decode_input_index];
int sample_size = extractor.readSampleData(decoder_input_buffer, 0);
if (sample_size < 0)
{
decoder.queueInputBuffer(decode_input_index, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
is_decode_input_done = true;
}
else
{
long sample_time = extractor.getSampleTime();
decoder.queueInputBuffer(decode_input_index, 0, sample_size, sample_time, 0);
sample_time_queue.offer(sample_time);
extractor.advance();
}
}
else
{
DumpLog(TAG, "Decoder dequeueInputBuffer timed out! Try again later");
}
}
....
if (is_decode_output_done == false)
{
int decode_output_index = decoder.dequeueOutputBuffer(decode_buffer_info, TIMEOUT_USEC);
switch (decode_output_index)
{
case MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED:
{
....
break;
}
case MediaCodec.INFO_OUTPUT_FORMAT_CHANGED:
{
....
break;
}
case MediaCodec.INFO_TRY_AGAIN_LATER:
{
DumpLog(TAG, "Decoder dequeueOutputBuffer timed out! Try again later");
break;
}
default:
{
ByteBuffer decode_output_buffer = decode_output_buffers[decode_output_index];
long ptime_us = decode_buffer_info.presentationTimeUs;
boolean is_decode_EOS = ((decode_buffer_info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0);
if (is_decode_EOS)
{
// Decoder gives an EOS output.
is_decode_output_done = true;
....
}
else
{
// The frame time may not be consistent for some videos.
// As a workaround, we use a frame time queue to guard this.
long sample_time = sample_time_queue.poll();
if (sample_time == ptime_us)
{
// Very good, the decoder input/output time is consistent.
}
else
{
// If the decoder input/output frame count is consistent, we can trust the sample time.
ptime_us = sample_time;
}
// process this frame
....
}
decoder.releaseOutputBuffer(decode_output_index, false);
}
}
}
In some cases, the queue can "correct" the PTS if the decoder gives error value (e.g. a lot of 0s). However, there are still some issues about the frame count of decoder input/output.
On an HTC One 801e device, I use the codec OMX.qcom.video.decoder.avc to decode the video (with MIME types video/avc). The sample time and PTS is matched well for the frames, except the last one.
For example, if the extractor feeds 100 frames and then EOS to the decoder, the first 99 decoded frames has the exactly same time values, but the last frame is missing and I get output EOS from the decoder. I test different videos encoded by the built-in camera, by ffmpeg muxer, or by a video processing AP on Windows. All of them have the last one frame disappeared.
On some pads with OMX.MTK.VIDEO.DECODER.AVC codec, things becomes more confused. Some videos has good PTS from the decoder and the input/output frame count is correct (i.e. the queue is empty when the decoding is done.). Some videos has consistent input/output frame count with bad PTS in decoder output (and I can still correct them by the queue). For some videos, a lot of frames are missing during the decoding. For example, the extractor get 210 frames in a 7 second video, but the decoder only output the last 180 frames. It is impossible to recover the PTS using the same workaround.
Is there any way to expect the input/output frame count for a MediaCodec decoder? Or more accurately, to know which frame(s) are dropped by the decoder while the extractor gives it video samples with correct sample time?

Same basic story as in the other question. Pre-4.3, there were no tests confirming that every frame fed to an encoder or decoder came out the other side. I recall that some devices would reliably drop the last frame in certain tests until the codecs were fixed in 4.3.
I didn't search for a workaround at the time, so I don't know if one exists. Delaying before sending EOS might help if it's causing something to shut down early.
I don't believe I ever saw a device drop large numbers of frames. This seems like an unusual case, as it would have been noticeable in any apps that exercised MediaCodec in similar ways even without careful testing.

Related

How to get frame by frame from MP4? (MediaCodec)

Actually I am working with OpenGL and I would like to put all my textures in MP4 in order to compress them.
Then I need to get it from MP4 on my Android
I need somehow decode MP4 and get frame by frame by request.
I found this MediaCodec
https://developer.android.com/reference/android/media/MediaCodec
and this MediaMetadataRetriever
https://developer.android.com/reference/android/media/MediaMetadataRetriever
But I did not see approach how to request frame by frame...
If there is someone who worked with MP4, please give me a way where to go.
P.S. I am working with native way (JNI), so does not matter how to do it.. Java or native, but I need to find the way.
EDIT1
I make some kind of movie (just one 3d model), so I am changing my geometry as well as textures every 32 milliseconds. So, it is seems to me reasonable to use mp4 for tex because of each new frame (32 milliseconds) very similar to privious one...
Now I use 400 frames for one model. For geometry I use .mtr and for tex I use .pkm (because it optimized for android) , so I have around 350 .mtr files(because some files include subindex) and 400 .pkm files ...
This is the reason why I am going to use mp4 for tex. Because one mp4 much more smaller than 400 .pkm
EDIT2
Plase take a look at Edit1
Actually all that I need to know is there API of Android that could read MP4 by frames? Maybe some kind of getNextFrame() method?
Something like this
MP4Player player = new MP4Player(PATH_TO_MY_MP4_FILE);
void readMP4(){
Bitmap b;
while(player.hasNext()){
b = player.getNextFrame();
///.... my code here ...///
}
}
EDIT3
I made such implementation on Java
public static void read(#NonNull final Context iC, #NonNull final String iPath)
{
long time;
int fileCount = 0;
//Create a new Media Player
MediaPlayer mp = MediaPlayer.create(iC, Uri.parse(iPath));
time = mp.getDuration() * 1000;
Log.e("TAG", String.format("TIME :: %s", time));
MediaMetadataRetriever mRetriever = new MediaMetadataRetriever();
mRetriever.setDataSource(iPath);
long a = System.nanoTime();
//frame rate 10.03/sec, 1/10.03 = in microseconds 99700
for (int i = 99700 ; i <= time ; i = i + 99700)
{
Bitmap b = mRetriever.getFrameAtTime(i, MediaMetadataRetriever.OPTION_CLOSEST_SYNC);
if (b == null)
{
Log.e("TAG", String.format("BITMAP STATE :: %s", "null"));
}
else
{
fileCount++;
}
long curTime = System.nanoTime();
Log.e("TAG", String.format("EXECUTION TIME :: %s", curTime - a));
a = curTime;
}
Log.e("TAG", String.format("COUNT :: %s", fileCount));
}
and here execution time
E/TAG: EXECUTION TIME :: 267982039
E/TAG: EXECUTION TIME :: 222928769
E/TAG: EXECUTION TIME :: 289899461
E/TAG: EXECUTION TIME :: 138265423
E/TAG: EXECUTION TIME :: 127312577
E/TAG: EXECUTION TIME :: 251179654
E/TAG: EXECUTION TIME :: 133996500
E/TAG: EXECUTION TIME :: 289730345
E/TAG: EXECUTION TIME :: 132158270
E/TAG: EXECUTION TIME :: 270951461
E/TAG: EXECUTION TIME :: 116520808
E/TAG: EXECUTION TIME :: 209071269
E/TAG: EXECUTION TIME :: 149697230
E/TAG: EXECUTION TIME :: 138347269
This time in nanoseconds == +/- 200 milliseconds... It is very slowly... I need around 30 milliseconds by frame.
So, I think this method is execution on CPU, so question if there a method that executing on GPU?
EDIT4
I found out that there is MediaCodec class
https://developer.android.com/reference/android/media/MediaCodec
also I found similar question here MediaCodec get all frames from video
I understood that there is a way to read by bytes, but not by frames...
So, still question - if there is a way to read mp4 video by frames?
The solution would look something like the ExtractMpegFramesTest, in which MediaCodec is used to generate "external" textures from video frames. In the test code, the frames are rendered to an off-screen pbuffer and then saved as PNG. You would just render them directly.
There are a few problems with this:
MPEG video isn't designed to work well as a random-access database.
A common GOP (group of pictures) structure has one "key frame" (essentially a JPEG image) followed by 14 delta frames, which just hold the difference from the previous decoded frame. So if you want frame N, you may have to decode frames N-14 through N-1 first. Not a problem if you're always moving forward (playing a movie onto a texture) or you only store key frames (at which point you've invented a clumsy database of JPEG images).
As mentioned in comments and answers, you're likely to get some visual artifacts. How bad these look depends on the material and your compression rate. Since you're generating the frames, you may be able to reduce this by ensuring that, whenever there's a big change, the first frame is always a key frame.
The firmware that MediaCodec interfaces with may want several frames before it starts producing output, even if you start at a key frame. Seeking around in a stream has a latency cost. See e.g. this post. (Ever wonder why DVRs have smooth fast-forward, but not smooth fast-backward?)
MediaCodec frames passed through SurfaceTexture become "external" textures. These have some limitations vs. normal textures -- performance may be worse, can't use as color buffer in an FBO, etc. If you're just rendering it once per frame at 30fps this shouldn't matter.
MediaMetadataRetriever's getFrameAtTime() method has less-than-desirable performance for the reasons noted above. You're unlikely to get better results by writing it yourself, although you can save a bit of time by skipping the step where it creates a Bitmap object. Also, you passed OPTION_CLOSEST_SYNC in, but that will only produce the results you want if all your frames are sync frames (again, clumsy database of JPEG images). You need to use OPTION_CLOSEST.
If you're just trying to play a movie on a texture (or your problem can be reduced to that), Grafika has some examples. One that may be relevant is TextureFromCamera, which renders the camera video stream on a GLES rect that can be zoomed and rotated. You can replace the camera input with the MP4 playback code from one of the other demos. This'll work fine if you're only playing forward, but if you want to skip around or go backward you'll have trouble.
The problem you're describing sounds pretty similar to what 2D game developers deal with. Doing what they do is probably the best approach.
I can see why it might seem easy to have all your textures in a single file, but this is a really really bad idea.
MP4 is a video codec it is highly optimised for a list of frames which have a high level of similarity to adjacent frames i.e. motion. It is also optimised to be decompressed in sequential order, so using a 'random access' approach will be very inefficient.
To give a bit more detail video codecs store key frames (one a second, but the rate changes) and delta frames the rest of the time. The key frames are independently compressed just like separate images, but the delta frames stored as the difference from one or more other frames. The algorithm assumes this difference will be fairly minimal, after motion compensation has been performed.
So if you want to access a single delta frame you code will have to decompress a nearby key frame and all the delta frames that connect it to the frame you want, this will be much slower than just using single frame JPEG.
In short, use JPEG or PNG to compress your textures and add them all to a single archive file to keep it tidy.
Yes there is way to extract single frames from mp4 video.
In principle, you seem to look for alternative way to load textures, where usual way is GLUtils.texImage2D (which fills texture from a Bitmap).
First, you should consider what others advice, and expect visual artifacts from compression. But assuming that your textures form related textures (e.g. an explosion), getting these from video stream makes sense. For unrelated images you'll get better results using JPG or PNG. And note that mp4 video doesn't have alpha channel, often used in textures.
For the task, you can't use MediaMetadataRetriever, it won't give you needed accuracy to extract all frames.
You'd have to work with MediaCodec and MediaExtractor classes. Android documentation for MediaCodec is detailed.
Actually you'll need to implement kind of customized video player, and add one key function: frame step.
Close thing to this is Android's MediaPlayer, which is complete player, but 1) lacks frame-step, and 2) is rather closed-source because it's implemented by lot of native C++ libraries which are impossible to extend and hard to study.
I advice this with experience of creating a frame-by-frame video player, and I did it by adopting MediaPlayer-Extended, which is written in plain java (no native code), so you can include this in your project and add function that you need. It works with Android's MediaCodec and MediaExtractor.
Somewhere in MediaPlayer class you'd add function for frameStep, and add another signal + function in PlaybackThread to decode just one next frame (in paused mode). However, the implementation of this would be up to you. Result would be that you let decoder to obtain and process single frame, consume the frame, then repeat with next frame. I did it, so I know that this approach works.
Another half of the task is about obtaining the result. A video player (with MediaCodec) outputs frames into a Surface. Your task would be to get the pixels.
I know about way how to read RGB bitmap from such surface: you need to create OpenGL Pbuffer EGLSurface, let MediaCodec render into this surface (Android's SurfaceTexture), then read pixels from this surface. This is another nontrivial task, you need to create shader to render EOS texture (the surface), and use GLES20.glReadPixels to obtain RGB pixels into a ByteBuffer. You'd then upload this RGB bitmaps into your textures.
However, as you want to load textures, you may find optimized way how to render the video frame directly into your textures, and avoid moving pixels around.
Hope this helps, and good luck in implementation.
Actually I want to post my implementation for current time.
Here h file
#include <jni.h>
#include <memory>
#include <opencv2/opencv.hpp>
#include "looper.h"
#include "media/NdkMediaCodec.h"
#include "media/NdkMediaExtractor.h"
#ifndef NATIVE_CODEC_NATIVECODECC_H
#define NATIVE_CODEC_NATIVECODECC_H
//Originally took from here https://github.com/googlesamples/android-
ndk/tree/master/native-codec
//Convert took from here
https://github.com/kueblert/AndroidMediaCodec/blob/master/nativecodecvideo.cpp
class NativeCodec
{
public:
NativeCodec() = default;
~NativeCodec() = default;
void DecodeDone();
void Pause();
void Resume();
bool createStreamingMediaPlayer(const std::string &filename);
void setPlayingStreamingMediaPlayer(bool isPlaying);
void shutdown();
void rewindStreamingMediaPlayer();
int getFrameWidth() const
{
return m_frameWidth;
}
int getFrameHeight() const
{
return m_frameHeight;
}
void getNextFrame(std::vector<unsigned char> &imageData);
private:
struct Workerdata
{
AMediaExtractor *ex;
AMediaCodec *codec;
bool sawInputEOS;
bool sawOutputEOS;
bool isPlaying;
bool renderonce;
};
void Seek();
ssize_t m_bufidx = -1;
int m_frameWidth = -1;
int m_frameHeight = -1;
cv::Size m_frameSize;
Workerdata m_data = {nullptr, nullptr, false, false, false, false};
};
#endif //NATIVE_CODEC_NATIVECODECC_H
Here cc file
#include "native_codec.h"
#include <cassert>
#include "native_codec.h"
#include <jni.h>
#include <cstdio>
#include <cstring>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <cerrno>
#include <climits>
#include "util.h"
#include <android/log.h>
#include <string>
#include <chrono>
#include <android/asset_manager.h>
#include <android/asset_manager_jni.h>
#include <android/log.h>
#include <string>
#include <chrono>
// for native window JNI
#include <android/native_window_jni.h>
#include <android/asset_manager.h>
#include <android/asset_manager_jni.h>
using namespace std;
using namespace std::chrono;
bool NativeCodec::createStreamingMediaPlayer(const std::string &filename)
{
AMediaExtractor *ex = AMediaExtractor_new();
media_status_t err = AMediaExtractor_setDataSource(ex, filename.c_str());;
if (err != AMEDIA_OK)
{
return false;
}
size_t numtracks = AMediaExtractor_getTrackCount(ex);
AMediaCodec *codec = nullptr;
for (int i = 0; i < numtracks; i++)
{
AMediaFormat *format = AMediaExtractor_getTrackFormat(ex, i);
int format_color;
AMediaFormat_getInt32(format, AMEDIAFORMAT_KEY_COLOR_FORMAT, &format_color);
bool ok = AMediaFormat_getInt32(format, AMEDIAFORMAT_KEY_WIDTH, &m_frameWidth);
ok = ok && AMediaFormat_getInt32(format, AMEDIAFORMAT_KEY_HEIGHT,
&m_frameHeight);
if (ok)
{
m_frameSize = cv::Size(m_frameWidth, m_frameHeight);
} else
{
//Asking format for frame width / height failed.
}
const char *mime;
if (!AMediaFormat_getString(format, AMEDIAFORMAT_KEY_MIME, &mime))
{
return false;
} else if (!strncmp(mime, "video/", 6))
{
// Omitting most error handling for clarity.
// Production code should check for errors.
AMediaExtractor_selectTrack(ex, i);
codec = AMediaCodec_createDecoderByType(mime);
AMediaCodec_configure(codec, format, nullptr, nullptr, 0);
m_data.ex = ex;
m_data.codec = codec;
m_data.sawInputEOS = false;
m_data.sawOutputEOS = false;
m_data.isPlaying = false;
m_data.renderonce = true;
AMediaCodec_start(codec);
}
AMediaFormat_delete(format);
}
return true;
}
void NativeCodec::getNextFrame(std::vector<unsigned char> &imageData)
{
if (!m_data.sawInputEOS)
{
m_bufidx = AMediaCodec_dequeueInputBuffer(m_data.codec, 2000);
if (m_bufidx >= 0)
{
size_t bufsize;
auto buf = AMediaCodec_getInputBuffer(m_data.codec, m_bufidx, &bufsize);
auto sampleSize = AMediaExtractor_readSampleData(m_data.ex, buf, bufsize);
if (sampleSize < 0)
{
sampleSize = 0;
m_data.sawInputEOS = true;
}
auto presentationTimeUs = AMediaExtractor_getSampleTime(m_data.ex);
AMediaCodec_queueInputBuffer(m_data.codec, m_bufidx, 0, sampleSize,
presentationTimeUs,
m_data.sawInputEOS ?
AMEDIACODEC_BUFFER_FLAG_END_OF_STREAM : 0);
AMediaExtractor_advance(m_data.ex);
}
}
if (!m_data.sawOutputEOS)
{
AMediaCodecBufferInfo info;
auto status = AMediaCodec_dequeueOutputBuffer(m_data.codec, &info, 0);
if (status >= 0)
{
if (info.flags & AMEDIACODEC_BUFFER_FLAG_END_OF_STREAM)
{
__android_log_print(ANDROID_LOG_ERROR,
"AMEDIACODEC_BUFFER_FLAG_END_OF_STREAM", "AMEDIACODEC_BUFFER_FLAG_END_OF_STREAM :: %s",
//
"output EOS");
m_data.sawOutputEOS = true;
}
if (info.size > 0)
{
// size_t bufsize;
uint8_t *buf = AMediaCodec_getOutputBuffer(m_data.codec,
static_cast<size_t>(status), /*bufsize*/nullptr);
cv::Mat YUVframe(cv::Size(m_frameSize.width, static_cast<int>
(m_frameSize.height * 1.5)), CV_8UC1, buf);
cv::Mat colImg(m_frameSize, CV_8UC3);
cv::cvtColor(YUVframe, colImg, CV_YUV420sp2BGR, 3);
auto dataSize = colImg.rows * colImg.cols * colImg.channels();
imageData.assign(colImg.data, colImg.data + dataSize);
}
AMediaCodec_releaseOutputBuffer(m_data.codec, static_cast<size_t>(status),
info.size != 0);
if (m_data.renderonce)
{
m_data.renderonce = false;
return;
}
} else if (status < 0)
{
getNextFrame(imageData);
} else if (status == AMEDIACODEC_INFO_OUTPUT_BUFFERS_CHANGED)
{
__android_log_print(ANDROID_LOG_ERROR,
"AMEDIACODEC_INFO_OUTPUT_BUFFERS_CHANGED", "AMEDIACODEC_INFO_OUTPUT_BUFFERS_CHANGED :: %s", //
"output buffers changed");
} else if (status == AMEDIACODEC_INFO_OUTPUT_FORMAT_CHANGED)
{
auto format = AMediaCodec_getOutputFormat(m_data.codec);
__android_log_print(ANDROID_LOG_ERROR,
"AMEDIACODEC_INFO_OUTPUT_FORMAT_CHANGED", "AMEDIACODEC_INFO_OUTPUT_FORMAT_CHANGED :: %s",
//
AMediaFormat_toString(format));
AMediaFormat_delete(format);
} else if (status == AMEDIACODEC_INFO_TRY_AGAIN_LATER)
{
__android_log_print(ANDROID_LOG_ERROR, "AMEDIACODEC_INFO_TRY_AGAIN_LATER",
"AMEDIACODEC_INFO_TRY_AGAIN_LATER :: %s", //
"no output buffer right now");
} else
{
__android_log_print(ANDROID_LOG_ERROR, "UNEXPECTED INFO CODE", "UNEXPECTED
INFO CODE :: %zd", //
status);
}
}
}
void NativeCodec::DecodeDone()
{
if (m_data.codec != nullptr)
{
AMediaCodec_stop(m_data.codec);
AMediaCodec_delete(m_data.codec);
AMediaExtractor_delete(m_data.ex);
m_data.sawInputEOS = true;
m_data.sawOutputEOS = true;
}
}
void NativeCodec::Seek()
{
AMediaExtractor_seekTo(m_data.ex, 0, AMEDIAEXTRACTOR_SEEK_CLOSEST_SYNC);
AMediaCodec_flush(m_data.codec);
m_data.sawInputEOS = false;
m_data.sawOutputEOS = false;
if (!m_data.isPlaying)
{
m_data.renderonce = true;
}
}
void NativeCodec::Pause()
{
if (m_data.isPlaying)
{
// flush all outstanding codecbuffer messages with a no-op message
m_data.isPlaying = false;
}
}
void NativeCodec::Resume()
{
if (!m_data.isPlaying)
{
m_data.isPlaying = true;
}
}
void NativeCodec::setPlayingStreamingMediaPlayer(bool isPlaying)
{
if (isPlaying)
{
Resume();
} else
{
Pause();
}
}
void NativeCodec::shutdown()
{
m_bufidx = -1;
DecodeDone();
}
void NativeCodec::rewindStreamingMediaPlayer()
{
Seek();
}
So, according to this implementation for format conversion (in my case from YUV to BGR) you need to set up OpenCV, for understand how to do it check this two source
https://www.youtube.com/watch?v=jN9Bv5LHXMk
https://www.youtube.com/watch?v=0fdIiOqCz3o
And also for sample I leave here my CMakeLists.txt file
#For add OpenCV take a look at this video
#https://www.youtube.com/watch?v=jN9Bv5LHXMk
#https://www.youtube.com/watch?v=0fdIiOqCz3o
#Look at the video than compare with this file and make the same
set(pathToProject
C:/Users/tetavi/Downloads/Buffer/OneMoreArNew/arcore-android-
sdk/samples/hello_ar_c)
set(pathToOpenCv C:/OpenCV-android-sdk)
cmake_minimum_required(VERSION 3.4.1)
set(CMAKE VERBOSE MAKEFILE on)
set(CMAKE CXX FLAGS "${CMAKE_CXX_FLAGS} -std=gnu++11")
include_directories(${pathToOpenCv}/sdk/native/jni/include)
# Import the ARCore library.
add_library(arcore SHARED IMPORTED)
set_target_properties(arcore PROPERTIES IMPORTED_LOCATION
${ARCORE_LIBPATH}/${ANDROID_ABI}/libarcore_sdk_c.so
INTERFACE_INCLUDE_DIRECTORIES ${ARCORE_INCLUDE}
)
# Import the glm header file from the NDK.
add_library(glm INTERFACE)
set_target_properties(glm PROPERTIES
INTERFACE_INCLUDE_DIRECTORIES
${ANDROID_NDK}/sources/third_party/vulkan/src/libs/glm
)
# This is the main app library.
add_library(hello_ar_native SHARED
src/main/cpp/background_renderer.cc
src/main/cpp/hello_ar_application.cc
src/main/cpp/jni_interface.cc
src/main/cpp/video_render.cc
src/main/cpp/geometry_loader.cc
src/main/cpp/plane_renderer.cc
src/main/cpp/native_codec.cc
src/main/cpp/point_cloud_renderer.cc
src/main/cpp/frame_manager.cc
src/main/cpp/safe_queue.cc
src/main/cpp/stb_image.h
src/main/cpp/util.cc)
add_library(lib_opencv SHARED IMPORTED)
set_target_properties(lib_opencv PROPERTIES IMPORTED_LOCATION
${pathToProject}/app/src/main/jniLibs/${CMAKE_ANDROID_ARCH_ABI}/libopencv_java3.so)
target_include_directories(hello_ar_native PRIVATE
src/main/cpp)
target_link_libraries(hello_ar_native $\{log-lib} lib_opencv
android
log
GLESv2
glm
mediandk
arcore)
Usage:
You need to create stream media player with this method
NaviteCodec::createStreamingMediaPlayer(pathToYourMP4file);
and then just use
NativeCodec::getNextFrame(imageData);
Feel free to ask

Android DSP - Trouble with AudioTrack and Sub-Audible Signals

I'm having some trouble getting the AudioTrack class to do what I want it to do. I'm trying to make an app that first lets the user draw a single-cycle waveform on the screen, and then outputs an arbitrary number of cycles of that waveform at an arbitrary frequency through the headphone out. For the primary use, the frequency will be below the audible range, something like 1/60Hz - 20Hz (If anyone's familiar with Eurorack/modular synths, I'm hoping to use the headphone out as a CV source).
The problem I'm having is that the AudioTrack seems to output a highly inaccurate reproduction of the waveform at the low end of this frequency range, even though it does output an accurate reproduction at the high end. The lower the frequency gets, the more the waveform gets 'squished' to the left on my oscilloscope. The pictures below show this phenomenon on a waveform that is supposed to start each cycle low and increase linearly to a max value, but this phenomenon happens with other waveshapes too.
So far, I have the app set up (1) to create an ArrayList to hold the data points from the user's input, (2) to convert the ArrayList into a float[] to feed to the AudioTrack, and (3) to setup the AudioTrack and write the float[] to it when needed. Since the ArrayList and float[] both maintain an accurate reproduction of the waveform, I'm pretty sure my problem is at (3), or else it's a hardware limitation of my phone/scope.
Here's the relevant code for the AudioTrack.
Method that (re)initializes the AudioTrack:
public void updateTrackForWave(Waveform wave, Context context) {
if (wave.getInterpolatedWaveData() == null) return;
int sampleRate = Integer.parseInt(
((AudioManager) context.getSystemService(Context.AUDIO_SERVICE))
.getProperty(AudioManager.PROPERTY_OUTPUT_SAMPLE_RATE));
if (wave.getOutputChannel() == Waveform.OutputChannelEnum.left) {
if (mTrackL != null) mTrackL.release();
mTrackL = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate,
AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_FLOAT,
Float.BYTES * wave.getInterpolatedWaveData().length,
AudioTrack.MODE_STATIC);
mTrackL.setVolume(AudioTrack.getMaxVolume());
} else if (wave.getOutputChannel() == Waveform.OutputChannelEnum.right) {
if (mTrackR != null) mTrackR.release();
mTrackR = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate,
AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_FLOAT,
Float.BYTES * wave.getInterpolatedWaveData().length,
AudioTrack.MODE_STATIC);
mTrackR.setVolume(AudioTrack.getMaxVolume());
}
}
Method that plays AudioTrack:
private void playWaveOnTrack(Waveform wave, AudioTrack track) {
if (track == null) return;
if (track.getPlayState() == AudioTrack.PLAYSTATE_PLAYING) {
return;
} else {
if (wave.isCycleMode()) track.setLoopPoints(0,
wave.getInterpolatedWaveData().length, -1);
else if (wave.isOneShotMode()) track.setLoopPoints(0,
wave.getInterpolatedWaveData().length, 0);
if (track.getPlayState() == AudioTrack.PLAYSTATE_PAUSED) {
track.play();
} else if (track.getPlayState() == AudioTrack.PLAYSTATE_STOPPED) {
track.write(wave.getInterpolatedWaveData(), 0,
wave.getInterpolatedWaveData().length, AudioTrack.WRITE_BLOCKING);
track.play();
}
}
}
It's worth noting that wave.getInterpolatedWaveData() returns the float[] from (3) above.
Anyways, any help you could spare on this would be greatly appreciated! Not sure if I've made a mistake in the code I have, or if there's some code I should add (maybe an AudioEffect of some kind?), or if I'm asking too much of my phone, or what.
PS I'm new here and to Android programming in general, so please do point out any forum norms I should be following but am not, or any alternative coding approaches I might not know about.
Pictures:
Waveform at 100Hz
Waveform at 10Hz
Waveform at 2Hz
It's a hardware limitation of your phone. In order to avoid a constant DC current flowing through each headphone speaker, they are capacitively coupled -- each channel is in series with one or more capacitors. This gives your output waveform a time constant; the average voltage level of the waveform will always tend towards 0 (GND), which makes it impossible to output DC, or even low-frequency signals.

Truncate video with MediaCodec

I've used Android MediaCodec library to transcode video files (mainly change the resolution Sample code here)
Another thing I want to achieve is to truncate the video - to only take the beginning 15 seconds. The logic is to check videoExtractor.getSampleTime() if it's greater than the 15 seconds, I'll just write an EOS to the decoder buffer.
But I get an exception Caused by: android.media.MediaCodec$CodecException: Error 0xfffffff3
Here is my code:
while ((!videoEncoderDone) || (!audioEncoderDone)) {
while (!videoExtractorDone
&& (encoderOutputVideoFormat == null || muxing)) {
int decoderInputBufferIndex = videoDecoder.dequeueInputBuffer(TIMEOUT_USEC);
if (decoderInputBufferIndex == MediaCodec.INFO_TRY_AGAIN_LATER)
break;
ByteBuffer decoderInputBuffer = videoDecoderInputBuffers[decoderInputBufferIndex];
int size = videoExtractor.readSampleData(decoderInputBuffer, 0);
long presentationTime = videoExtractor.getSampleTime();
if (size >= 0) {
videoDecoder.queueInputBuffer(
decoderInputBufferIndex,
0,
size,
presentationTime,
videoExtractor.getSampleFlags());
}
videoExtractorDone = !videoExtractor.advance();
if (!videoExtractorDone && videoExtractor.getSampleTime() > mVideoDurationLimit * 1000000) {
videoExtractorDone = true;
}
if (videoExtractorDone)
videoDecoder.queueInputBuffer(decoderInputBufferIndex,
0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
break;
}
The full source code can be found here.
I am not sure if this is the source of the error or not, but i think it is not safe write EOS to decoder buffer at arbitrary point.
The reason is when the input video is using H264 Main Profile or above,
pts may not be in increasing order (because the existence of B-frame) so you may miss several frames at the end of the video.
Also, when the last frame you send to the decoder is B-frame, decoder might be expecting the next packet but you send the EOS flag and produce error (not quite sure).
What you can do though, you can send EOS flag to the encoder using videoEncoder.signalEndOfInputStream() after you reach your desired frame, (pts of the output of decoder is guaranted to be in increasing order, at least after android version >= 4.3 ?)

How to extract PCM samples from MediaCodec decoder's output

I'm trying to obtain the PCM samples for further processing from a decoded mp4 buffer. I'm first extracting the audio track from a video file recorded with the phone's camera app, and I've made sure the audio track is being selected when I get the 'audio/mp4' mime key:
MediaExtractor extractor = new MediaExtractor();
try {
extractor.setDataSource(fileUri.getPath());
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
int numTracks = extractor.getTrackCount();
for(int i =0; i<numTracks; ++i) {
MediaFormat format = extractor.getTrackFormat(i);
String mime = format.getString(MediaFormat.KEY_MIME);
//Log.d("mime =",mime);
if(mime.startsWith("audio/")) {
extractor.selectTrack(i);
decoder = MediaCodec.createDecoderByType(mime);
decoder.configure(format, null, null, 0);
//getSampleCryptoInfo(MediaCodec.CryptoInfo info)
break;
}
}
if (decoder == null) {
Log.e("DecodeActivity", "Can't find audio info!");
return;
}
decoder.start();
After that, I iterate through the track, feeding the codec the stream of encoded access units, and pulling the decoded access units into a ByteBuffer (this is code I recycled from a video rendering example posted here https://github.com/vecio/MediaCodecDemo):
ByteBuffer[] inputBuffers = decoder.getInputBuffers();
ByteBuffer[] outputBuffers = decoder.getOutputBuffers();
BufferInfo info = new BufferInfo();
boolean isEOS = false;
while (true) {
if (!isEOS) {
int inIndex = decoder.dequeueInputBuffer(10000);
if (inIndex >= 0) {
ByteBuffer buffer = inputBuffers[inIndex];
int sampleSize = extractor.readSampleData(buffer, 0);
if (sampleSize < 0) {
// We shouldn't stop the playback at this point, just pass the EOS
// flag to decoder, we will get it again from the
// dequeueOutputBuffer
Log.d("DecodeActivity", "InputBuffer BUFFER_FLAG_END_OF_STREAM");
decoder.queueInputBuffer(inIndex, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
isEOS = true;
} else {
decoder.queueInputBuffer(inIndex, 0, sampleSize, extractor.getSampleTime(), 0);
extractor.advance();
}
}
}
int outIndex = decoder.dequeueOutputBuffer(info, 10000);
switch (outIndex) {
case MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED:
Log.d("DecodeActivity", "INFO_OUTPUT_BUFFERS_CHANGED");
outputBuffers = decoder.getOutputBuffers();
break;
case MediaCodec.INFO_OUTPUT_FORMAT_CHANGED:
Log.d("DecodeActivity", "New format " + decoder.getOutputFormat());
break;
case MediaCodec.INFO_TRY_AGAIN_LATER:
Log.d("DecodeActivity", "dequeueOutputBuffer timed out!");
break;
default:
ByteBuffer buffer = outputBuffers[outIndex];
// How to obtain PCM samples from this buffer variable??
decoder.releaseOutputBuffer(outIndex, true);
break;
}
// All decoded frames have been rendered, we can stop playing now
if ((info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
Log.d("DecodeActivity", "OutputBuffer BUFFER_FLAG_END_OF_STREAM");
break;
}
}
The code seems to work with no errors so far, but I'm currently stuck at trying to figure out how to obtain the PCM samples from the ByteBuffer that is taking the value of the output buffer. I guess I could assume that since I'm working with 16-bit stereo audio file, there should be at least two bytes in an interleaved scheme... however I'm not really sure abut this, so to unequivocally retrieve the PCM samples from this byte stream. Does anybody know how get these from the MediaCodec API?
I've read a couple of alternatives using ffmpeg or openSL, but since I am new to Android programming I was hoping to avoid the complications of using c-based APIs and build my first app using only the tools provided by the Android Framework (I'm using KitKat). Any help will be greatly appreciated.
UPDATE: I was able to extract the PCM samples, the way I was assuming to do it and also the way#marcone pointed out. To do so, I added these lines below the buffer assignment:
byte[] b = new byte[info.size-info.offset];
int a = buffer.position();
buffer.get(b);
buffer.position(a);
and finally write the byte array to a file by:
f.write(b,0,info.size-info.offset);
The problem I'm dealing now with is:
The decoded audio samples do not exactly match with the decoding of the mp4 audio track done by iZotope. there is a 48 samples mismatch in the wave files size, and a 2112 samples delay in the decoded signals. My question now is: would all the mp4 decoders yield the same output PCM stream, or is it dependent on the implementation of the decoder?
I found the delays to be caused by the AAC encoding priming and remainder times, as explained here:
https://developer.apple.com/library/mac/documentation/quicktime/qtff/QTFFAppenG/QTFFAppenG.html
In my case, the priming time is always 2112 samples, and the remainder is naturally variable depending on the audio size.
I know the problem is solved here. But the MediaCodec is used Synchronously in the current code, which is deprecated as of now. I learn from this question and made the same thing with Async use of MediaCodec. Just posting the github link so that it might help someone later on.
Github Asynchronous implementation: link
FYI: The audioplayer used is just copy paste from some other thread for the time being. It is depricated. I will update it when i get time. Also the code is in Kotlin.(It is still easy to understand)
Please check out the Async link for official MediaCodec documentation

Using MediaCodec to save series of images as Video

I am trying to use MediaCodec to save a series of Images, saved as Byte Arrays in a file, to a video file. I have tested these images on a SurfaceView (playing them in series) and I can see them fine. I have looked at many examples using MediaCodec, and here is what I understand (please correct me if I am wrong):
Get InputBuffers from MediaCodec object -> fill it with your frame's
image data -> queue the input buffer -> get coded output buffer ->
write it to a file -> increase presentation time and repeat
However, I have tested this a lot and I end up with one of two cases:
All sample projects I tried to imitate have caused Media server to die when calling queueInputBuffer for the second time.
I tried calling codec.flush() at the end (after saving output buffer to file, although none of the examples I saw did this) and the media server did not die, however, I am not able to open the output video file with any media player, so something is wrong.
Here is my code:
MediaCodec codec = MediaCodec.createEncoderByType(MIMETYPE);
MediaFormat mediaFormat = null;
if(CamcorderProfile.hasProfile(CamcorderProfile.QUALITY_720P)){
mediaFormat = MediaFormat.createVideoFormat(MIMETYPE, 1280 , 720);
} else {
mediaFormat = MediaFormat.createVideoFormat(MIMETYPE, 720, 480);
}
mediaFormat.setInteger(MediaFormat.KEY_BIT_RATE, 700000);
mediaFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 10);
mediaFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420SemiPlanar);
mediaFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 5);
codec.configure(mediaFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
codec.start();
ByteBuffer[] inputBuffers = codec.getInputBuffers();
ByteBuffer[] outputBuffers = codec.getOutputBuffers();
boolean sawInputEOS = false;
int inputBufferIndex=-1,outputBufferIndex=-1;
BufferInfo info=null;
//loop to read YUV byte array from file
inputBufferIndex = codec.dequeueInputBuffer(WAITTIME);
if(bytesread<=0)sawInputEOS=true;
if(inputBufferIndex >= 0){
if(!sawInputEOS){
int samplesiz=dat.length;
inputBuffers[inputBufferIndex].put(dat);
codec.queueInputBuffer(inputBufferIndex, 0, samplesiz, presentationTime, 0);
presentationTime += 100;
info = new BufferInfo();
outputBufferIndex = codec.dequeueOutputBuffer(info, WAITTIME);
Log.i("BATA", "outputBufferIndex="+outputBufferIndex);
if(outputBufferIndex >= 0){
byte[] array = new byte[info.size];
outputBuffers[outputBufferIndex].get(array);
if(array != null){
try {
dos.write(array);
} catch (IOException e) {
e.printStackTrace();
}
}
codec.releaseOutputBuffer(outputBufferIndex, false);
inputBuffers[inputBufferIndex].clear();
outputBuffers[outputBufferIndex].clear();
if(sawInputEOS) break;
}
}else{
codec.queueInputBuffer(inputBufferIndex, 0, 0, presentationTime, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
info = new BufferInfo();
outputBufferIndex = codec.dequeueOutputBuffer(info, WAITTIME);
if(outputBufferIndex >= 0){
byte[] array = new byte[info.size];
outputBuffers[outputBufferIndex].get(array);
if(array != null){
try {
dos.write(array);
} catch (IOException e) {
e.printStackTrace();
}
}
codec.releaseOutputBuffer(outputBufferIndex, false);
inputBuffers[inputBufferIndex].clear();
outputBuffers[outputBufferIndex].clear();
break;
}
}
}
}
codec.flush();
try {
fstream2.close();
dos.flush();
dos.close();
} catch (IOException e) {
e.printStackTrace();
}
codec.stop();
codec.release();
codec = null;
return true;
}
My question is, how can I get a working video from a stream of images using MediaCodec. What am I doing wrong?
Another question (if I am not too greedy), I would like to add an Audio track to this video, can it be done with MediaCodec as well, or must I use FFmpeg?
Note: I know about MediaMux in Android 4.3, however, it is not an option for me as my App must work on Android 4.1+.
Update
Thanks to fadden answer, I was able to reach EOS without Media server dying (Above code is after modification). However, the file I am getting is producing gibberish. Here is a snapshot of the video I get (only works as .h264 file).
My Input image format is YUV image (NV21 from camera preview). I can't get it to be any playable format. I tried all COLOR_FormatYUV420 formats and same gibberish output. And I still can't find away (using MediaCodec) to add audio.
I think you have the right general idea. Some things to be aware of:
Not all devices support COLOR_FormatYUV420SemiPlanar. Some only accept planar. (Android 4.3 introduced CTS tests to ensure that the AVC codec supports one or the other.)
It's not the case that queueing an input buffer will immediately result in the generation of one output buffer. Some codecs may accumulate several frames of input before producing output, and may produce output after your input has finished. Make sure your loops take that into account (e.g. your inputBuffers[].clear() will blow up if it's still -1).
Don't try to submit data and send EOS with the same queueInputBuffer call. The data in that frame may be discarded. Always send EOS with a zero-length buffer.
The output of the codecs is generally pretty "raw", e.g. the AVC codec emits an H.264 elementary stream rather than a "cooked" .mp4 file. Many players won't accept this format. If you can't rely on the presence of MediaMuxer you will need to find another way to cook the data (search around on stackoverflow for ideas).
It's certainly not expected that the mediaserver process would crash.
You can find some examples and links to the 4.3 CTS tests here.
Update: As of Android 4.3, MediaCodec and Camera have no ByteBuffer formats in common, so at the very least you will need to fiddle with the chroma planes. However, that sort of problem manifests very differently (as shown in the images for this question).
The image you added looks like video, but with stride and/or alignment issues. Make sure your pixels are laid out correctly. In the CTS EncodeDecodeTest, the generateFrame() method (line 906) shows how to encode both planar and semi-planar YUV420 for MediaCodec.
The easiest way to avoid the format issues is to move the frames through a Surface (like the CameraToMpegTest sample), but unfortunately that's not possible in Android 4.1.

Categories

Resources