How to get frame by frame from MP4? (MediaCodec)

I am working with OpenGL and would like to pack all my textures into an MP4 file in order to compress them.
Then I need to read them back from the MP4 on Android.
I need some way to decode the MP4 and get frames one by one, on request.
I found MediaCodec
https://developer.android.com/reference/android/media/MediaCodec
and MediaMetadataRetriever
https://developer.android.com/reference/android/media/MediaMetadataRetriever
but I do not see how to request frames one by one...
If anyone has worked with MP4, please point me in the right direction.
P.S. I am working natively (JNI), so it does not matter whether the solution is in Java or native code; I just need to find a way.
EDIT1
I am making a kind of movie (just one 3D model), so I change my geometry as well as my textures every 32 milliseconds. It seems reasonable to me to use MP4 for the textures, because each new frame (every 32 milliseconds) is very similar to the previous one...
Right now I use 400 frames for one model. For geometry I use .mtr and for textures I use .pkm (because it is optimized for Android), so I have around 350 .mtr files (because some files include a subindex) and 400 .pkm files...
This is why I want to use MP4 for the textures: one MP4 is much smaller than 400 .pkm files.
EDIT2
Please take a look at EDIT1.
All I really need to know is whether there is an Android API that can read an MP4 frame by frame. Maybe some kind of getNextFrame() method?
Something like this:
MP4Player player = new MP4Player(PATH_TO_MY_MP4_FILE);
void readMP4(){
Bitmap b;
while(player.hasNext()){
b = player.getNextFrame();
///.... my code here ...///
}
}
EDIT3
I wrote this implementation in Java:
public static void read(@NonNull final Context iC, @NonNull final String iPath)
{
long time;
int fileCount = 0;
//Create a new Media Player
MediaPlayer mp = MediaPlayer.create(iC, Uri.parse(iPath));
time = mp.getDuration() * 1000;
Log.e("TAG", String.format("TIME :: %s", time));
MediaMetadataRetriever mRetriever = new MediaMetadataRetriever();
mRetriever.setDataSource(iPath);
long a = System.nanoTime();
//frame rate 10.03/sec, 1/10.03 = in microseconds 99700
for (int i = 99700 ; i <= time ; i = i + 99700)
{
Bitmap b = mRetriever.getFrameAtTime(i, MediaMetadataRetriever.OPTION_CLOSEST_SYNC);
if (b == null)
{
Log.e("TAG", String.format("BITMAP STATE :: %s", "null"));
}
else
{
fileCount++;
}
long curTime = System.nanoTime();
Log.e("TAG", String.format("EXECUTION TIME :: %s", curTime - a));
a = curTime;
}
Log.e("TAG", String.format("COUNT :: %s", fileCount));
}
and here are the execution times:
E/TAG: EXECUTION TIME :: 267982039
E/TAG: EXECUTION TIME :: 222928769
E/TAG: EXECUTION TIME :: 289899461
E/TAG: EXECUTION TIME :: 138265423
E/TAG: EXECUTION TIME :: 127312577
E/TAG: EXECUTION TIME :: 251179654
E/TAG: EXECUTION TIME :: 133996500
E/TAG: EXECUTION TIME :: 289730345
E/TAG: EXECUTION TIME :: 132158270
E/TAG: EXECUTION TIME :: 270951461
E/TAG: EXECUTION TIME :: 116520808
E/TAG: EXECUTION TIME :: 209071269
E/TAG: EXECUTION TIME :: 149697230
E/TAG: EXECUTION TIME :: 138347269
These times are in nanoseconds, so roughly 200 milliseconds per frame. That is very slow; I need around 30 milliseconds per frame.
I think this method runs on the CPU, so the question is whether there is a method that runs on the GPU.
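To put the log above in perspective, the logged intervals can be averaged and converted to milliseconds. This is only a small C++ sketch: the helper name meanMillis and the constant kLoggedNanos are hypothetical, and the numbers are copied from the log output above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: mean of a set of nanosecond intervals, in milliseconds.
double meanMillis(const std::vector<std::int64_t> &nanos)
{
    assert(!nanos.empty());
    std::int64_t sum = 0;
    for (std::int64_t n : nanos)
    {
        sum += n;
    }
    return static_cast<double>(sum) / static_cast<double>(nanos.size()) / 1e6;
}

// The "EXECUTION TIME" values from the log above, in nanoseconds.
const std::vector<std::int64_t> kLoggedNanos = {
    267982039, 222928769, 289899461, 138265423, 127312577,
    251179654, 133996500, 289730345, 132158270, 270951461,
    116520808, 209071269, 149697230, 138347269};
```

Calling meanMillis(kLoggedNanos) gives a value a little under 200 ms per call, several times more than the roughly 32 ms available per frame.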
EDIT4
I found out that there is a MediaCodec class
https://developer.android.com/reference/android/media/MediaCodec
and I also found a similar question here: MediaCodec get all frames from video
I understood that there is a way to read by bytes, but not by frames...
So the question remains: is there a way to read an MP4 video frame by frame?

The solution would look something like the ExtractMpegFramesTest, in which MediaCodec is used to generate "external" textures from video frames. In the test code, the frames are rendered to an off-screen pbuffer and then saved as PNG. You would just render them directly.
There are a few problems with this:
MPEG video isn't designed to work well as a random-access database.
A common GOP (group of pictures) structure has one "key frame" (essentially a JPEG image) followed by 14 delta frames, which just hold the difference from the previous decoded frame. So if you want frame N, you may have to decode frames N-14 through N-1 first. Not a problem if you're always moving forward (playing a movie onto a texture) or you only store key frames (at which point you've invented a clumsy database of JPEG images).
As mentioned in comments and answers, you're likely to get some visual artifacts. How bad these look depends on the material and your compression rate. Since you're generating the frames, you may be able to reduce this by ensuring that, whenever there's a big change, the first frame is always a key frame.
The firmware that MediaCodec interfaces with may want several frames before it starts producing output, even if you start at a key frame. Seeking around in a stream has a latency cost. See e.g. this post. (Ever wonder why DVRs have smooth fast-forward, but not smooth fast-backward?)
MediaCodec frames passed through SurfaceTexture become "external" textures. These have some limitations vs. normal textures -- performance may be worse, can't use as color buffer in an FBO, etc. If you're just rendering it once per frame at 30fps this shouldn't matter.
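The cost of random access described in the first point can be put into numbers. The sketch below assumes the simplest possible stream: a key frame at every multiple of the GOP size, and delta frames that depend only on the frame before them. Real streams (for example ones with B-frames) can be more complicated, and the helper name framesToDecode is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: number of frames that must be decoded to display
// `target`, assuming a key frame at every multiple of `gopSize` and delta
// frames that each depend only on the previous frame.
int framesToDecode(int target, int gopSize)
{
    assert(target >= 0 && gopSize > 0);
    const int keyFrame = target - (target % gopSize); // nearest key frame at or before target
    return target - keyFrame + 1;                     // the key frame plus every delta up to target
}
```

With a GOP of 15 (one key frame followed by 14 delta frames), framesToDecode(30, 15) is 1 because frame 30 is itself a key frame, while framesToDecode(29, 15) is 15: the key frame at 15 plus the 14 deltas after it.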
MediaMetadataRetriever's getFrameAtTime() method has less-than-desirable performance for the reasons noted above. You're unlikely to get better results by writing it yourself, although you can save a bit of time by skipping the step where it creates a Bitmap object. Also, you passed OPTION_CLOSEST_SYNC in, but that will only produce the results you want if all your frames are sync frames (again, clumsy database of JPEG images). You need to use OPTION_CLOSEST.
If you're just trying to play a movie on a texture (or your problem can be reduced to that), Grafika has some examples. One that may be relevant is TextureFromCamera, which renders the camera video stream on a GLES rect that can be zoomed and rotated. You can replace the camera input with the MP4 playback code from one of the other demos. This'll work fine if you're only playing forward, but if you want to skip around or go backward you'll have trouble.
The problem you're describing sounds pretty similar to what 2D game developers deal with. Doing what they do is probably the best approach.

I can see why it might seem easy to have all your textures in a single file, but this is a really, really bad idea.
MP4 holds compressed video, and video codecs are highly optimised for a sequence of frames that are very similar to their neighbours, i.e. motion. They are also optimised to be decompressed in sequential order, so a 'random access' approach will be very inefficient.
To give a bit more detail, video codecs store key frames (about one a second, but the rate varies) and delta frames the rest of the time. The key frames are compressed independently, just like separate images, but the delta frames are stored as the difference from one or more other frames. The algorithm assumes this difference will be fairly small once motion compensation has been performed.
So if you want to access a single delta frame, your code has to decompress a nearby key frame and all the delta frames that connect it to the frame you want. This will be much slower than just using single-frame JPEGs.
In short, use JPEG or PNG to compress your textures and add them all to a single archive file to keep it tidy.

Yes, there is a way to extract single frames from an MP4 video.
In principle, you seem to be looking for an alternative way to load textures, where the usual way is GLUtils.texImage2D (which fills a texture from a Bitmap).
First, you should consider what others advise, and expect visual artifacts from compression. But assuming that your textures are related to each other (e.g. an explosion), getting them from a video stream makes sense. For unrelated images you'll get better results using JPG or PNG. Also note that MP4 video doesn't have an alpha channel, which is often used in textures.
For this task you can't use MediaMetadataRetriever; it won't give you the accuracy needed to extract every frame.
You'd have to work with the MediaCodec and MediaExtractor classes. The Android documentation for MediaCodec is detailed.
In practice you'll need to implement a kind of customized video player and add one key function: frame step.
The closest thing to this is Android's MediaPlayer, which is a complete player, but 1) it lacks frame step, and 2) it is effectively closed, because it is implemented by a lot of native C++ libraries that are impossible to extend and hard to study.
I advise this from experience of creating a frame-by-frame video player. I did it by adapting MediaPlayer-Extended, which is written in plain Java (no native code), so you can include it in your project and add the function that you need. It works with Android's MediaCodec and MediaExtractor.
Somewhere in the MediaPlayer class you'd add a frameStep function, and add another signal + function in PlaybackThread to decode just the next frame (in paused mode). The implementation of this is up to you. The result is that you let the decoder obtain and process a single frame, consume the frame, then repeat with the next frame. I did it, so I know that this approach works.
The other half of the task is obtaining the result. A video player (with MediaCodec) outputs frames into a Surface. Your task is to get the pixels.
I know of one way to read an RGB bitmap from such a surface: you need to create an OpenGL pbuffer EGLSurface, let MediaCodec render into this surface (Android's SurfaceTexture), then read pixels from this surface. This is another nontrivial task: you need to create a shader to render the OES texture (the surface), and use GLES20.glReadPixels to read the RGB pixels into a ByteBuffer. You'd then upload these RGB bitmaps into your textures.
However, since you want to load textures, you may find an optimized way to render the video frame directly into your textures and avoid moving pixels around.
Hope this helps, and good luck with the implementation.

I want to post my current implementation.
Here is the header file:
#include <jni.h>
#include <memory>
#include <string>
#include <vector>
#include <opencv2/opencv.hpp>
#include "looper.h"
#include "media/NdkMediaCodec.h"
#include "media/NdkMediaExtractor.h"
#ifndef NATIVE_CODEC_NATIVECODECC_H
#define NATIVE_CODEC_NATIVECODECC_H
// Originally taken from https://github.com/googlesamples/android-ndk/tree/master/native-codec
// Conversion taken from https://github.com/kueblert/AndroidMediaCodec/blob/master/nativecodecvideo.cpp
class NativeCodec
{
public:
NativeCodec() = default;
~NativeCodec() = default;
void DecodeDone();
void Pause();
void Resume();
bool createStreamingMediaPlayer(const std::string &filename);
void setPlayingStreamingMediaPlayer(bool isPlaying);
void shutdown();
void rewindStreamingMediaPlayer();
int getFrameWidth() const
{
return m_frameWidth;
}
int getFrameHeight() const
{
return m_frameHeight;
}
void getNextFrame(std::vector<unsigned char> &imageData);
private:
struct Workerdata
{
AMediaExtractor *ex;
AMediaCodec *codec;
bool sawInputEOS;
bool sawOutputEOS;
bool isPlaying;
bool renderonce;
};
void Seek();
ssize_t m_bufidx = -1;
int m_frameWidth = -1;
int m_frameHeight = -1;
cv::Size m_frameSize;
Workerdata m_data = {nullptr, nullptr, false, false, false, false};
};
#endif //NATIVE_CODEC_NATIVECODECC_H
Here is the .cc file:
#include "native_codec.h"
#include <cassert>
#include <jni.h>
#include <cstdio>
#include <cstring>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <cerrno>
#include <climits>
#include "util.h"
#include <android/log.h>
#include <string>
#include <chrono>
#include <android/asset_manager.h>
#include <android/asset_manager_jni.h>
// for native window JNI
#include <android/native_window_jni.h>
using namespace std;
using namespace std::chrono;
bool NativeCodec::createStreamingMediaPlayer(const std::string &filename)
{
AMediaExtractor *ex = AMediaExtractor_new();
media_status_t err = AMediaExtractor_setDataSource(ex, filename.c_str());
if (err != AMEDIA_OK)
{
// Do not leak the extractor when the file cannot be opened.
AMediaExtractor_delete(ex);
return false;
}
size_t numtracks = AMediaExtractor_getTrackCount(ex);
AMediaCodec *codec = nullptr;
for (size_t i = 0; i < numtracks; i++)
{
AMediaFormat *format = AMediaExtractor_getTrackFormat(ex, i);
int format_color;
AMediaFormat_getInt32(format, AMEDIAFORMAT_KEY_COLOR_FORMAT, &format_color);
bool ok = AMediaFormat_getInt32(format, AMEDIAFORMAT_KEY_WIDTH, &m_frameWidth);
ok = ok && AMediaFormat_getInt32(format, AMEDIAFORMAT_KEY_HEIGHT,
&m_frameHeight);
if (ok)
{
m_frameSize = cv::Size(m_frameWidth, m_frameHeight);
} else
{
//Asking format for frame width / height failed.
}
const char *mime;
if (!AMediaFormat_getString(format, AMEDIAFORMAT_KEY_MIME, &mime))
{
AMediaFormat_delete(format);
return false;
} else if (!strncmp(mime, "video/", 6))
{
// Omitting most error handling for clarity.
// Production code should check for errors.
AMediaExtractor_selectTrack(ex, i);
codec = AMediaCodec_createDecoderByType(mime);
AMediaCodec_configure(codec, format, nullptr, nullptr, 0);
m_data.ex = ex;
m_data.codec = codec;
m_data.sawInputEOS = false;
m_data.sawOutputEOS = false;
m_data.isPlaying = false;
m_data.renderonce = true;
AMediaCodec_start(codec);
}
AMediaFormat_delete(format);
}
// Succeed only if a video track was found and a decoder was created.
return codec != nullptr;
}
void NativeCodec::getNextFrame(std::vector<unsigned char> &imageData)
{
if (!m_data.sawInputEOS)
{
m_bufidx = AMediaCodec_dequeueInputBuffer(m_data.codec, 2000);
if (m_bufidx >= 0)
{
size_t bufsize;
auto buf = AMediaCodec_getInputBuffer(m_data.codec, m_bufidx, &bufsize);
auto sampleSize = AMediaExtractor_readSampleData(m_data.ex, buf, bufsize);
if (sampleSize < 0)
{
sampleSize = 0;
m_data.sawInputEOS = true;
}
auto presentationTimeUs = AMediaExtractor_getSampleTime(m_data.ex);
AMediaCodec_queueInputBuffer(m_data.codec, m_bufidx, 0, sampleSize,
presentationTimeUs,
m_data.sawInputEOS ?
AMEDIACODEC_BUFFER_FLAG_END_OF_STREAM : 0);
AMediaExtractor_advance(m_data.ex);
}
}
if (!m_data.sawOutputEOS)
{
AMediaCodecBufferInfo info;
auto status = AMediaCodec_dequeueOutputBuffer(m_data.codec, &info, 0);
if (status >= 0)
{
if (info.flags & AMEDIACODEC_BUFFER_FLAG_END_OF_STREAM)
{
__android_log_print(ANDROID_LOG_ERROR,
"AMEDIACODEC_BUFFER_FLAG_END_OF_STREAM", "AMEDIACODEC_BUFFER_FLAG_END_OF_STREAM :: %s",
//
"output EOS");
m_data.sawOutputEOS = true;
}
if (info.size > 0)
{
// size_t bufsize;
uint8_t *buf = AMediaCodec_getOutputBuffer(m_data.codec,
static_cast<size_t>(status), /*bufsize*/nullptr);
cv::Mat YUVframe(cv::Size(m_frameSize.width, static_cast<int>
(m_frameSize.height * 1.5)), CV_8UC1, buf);
cv::Mat colImg(m_frameSize, CV_8UC3);
cv::cvtColor(YUVframe, colImg, CV_YUV420sp2BGR, 3);
auto dataSize = colImg.rows * colImg.cols * colImg.channels();
imageData.assign(colImg.data, colImg.data + dataSize);
}
AMediaCodec_releaseOutputBuffer(m_data.codec, static_cast<size_t>(status),
info.size != 0);
if (m_data.renderonce)
{
m_data.renderonce = false;
return;
}
} else if (status == AMEDIACODEC_INFO_OUTPUT_BUFFERS_CHANGED)
{
// The specific info codes are negative, so they must be tested before any generic retry.
__android_log_print(ANDROID_LOG_ERROR,
"AMEDIACODEC_INFO_OUTPUT_BUFFERS_CHANGED", "AMEDIACODEC_INFO_OUTPUT_BUFFERS_CHANGED :: %s",
"output buffers changed");
getNextFrame(imageData);
} else if (status == AMEDIACODEC_INFO_OUTPUT_FORMAT_CHANGED)
{
auto format = AMediaCodec_getOutputFormat(m_data.codec);
__android_log_print(ANDROID_LOG_ERROR,
"AMEDIACODEC_INFO_OUTPUT_FORMAT_CHANGED", "AMEDIACODEC_INFO_OUTPUT_FORMAT_CHANGED :: %s",
AMediaFormat_toString(format));
AMediaFormat_delete(format);
getNextFrame(imageData);
} else if (status == AMEDIACODEC_INFO_TRY_AGAIN_LATER)
{
// No output buffer is ready yet: feed more input and try again.
getNextFrame(imageData);
} else
{
// Unexpected status: log it and give up instead of retrying forever.
__android_log_print(ANDROID_LOG_ERROR, "UNEXPECTED INFO CODE",
"UNEXPECTED INFO CODE :: %zd", status);
}
}
}
void NativeCodec::DecodeDone()
{
if (m_data.codec != nullptr)
{
AMediaCodec_stop(m_data.codec);
AMediaCodec_delete(m_data.codec);
AMediaExtractor_delete(m_data.ex);
m_data.sawInputEOS = true;
m_data.sawOutputEOS = true;
}
}
void NativeCodec::Seek()
{
AMediaExtractor_seekTo(m_data.ex, 0, AMEDIAEXTRACTOR_SEEK_CLOSEST_SYNC);
AMediaCodec_flush(m_data.codec);
m_data.sawInputEOS = false;
m_data.sawOutputEOS = false;
if (!m_data.isPlaying)
{
m_data.renderonce = true;
}
}
void NativeCodec::Pause()
{
if (m_data.isPlaying)
{
// flush all outstanding codecbuffer messages with a no-op message
m_data.isPlaying = false;
}
}
void NativeCodec::Resume()
{
if (!m_data.isPlaying)
{
m_data.isPlaying = true;
}
}
void NativeCodec::setPlayingStreamingMediaPlayer(bool isPlaying)
{
if (isPlaying)
{
Resume();
} else
{
Pause();
}
}
void NativeCodec::shutdown()
{
m_bufidx = -1;
DecodeDone();
}
void NativeCodec::rewindStreamingMediaPlayer()
{
Seek();
}
So, in this implementation the format conversion (in my case from YUV to BGR) requires OpenCV. To understand how to set it up, check these two sources:
https://www.youtube.com/watch?v=jN9Bv5LHXMk
https://www.youtube.com/watch?v=0fdIiOqCz3o
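For readers who want to see what the YUV to BGR step does without pulling in OpenCV, here is a minimal sketch of the same kind of conversion in plain C++. It rests on assumptions: the buffer is taken to be NV21 (a full-resolution Y plane followed by interleaved V,U pairs), which is what the YUV420sp conversion code above expects, and it uses the common BT.601 integer approximation, so results may differ from OpenCV by a rounding step. Which layout a decoder actually emits is device-dependent. The names nv21ToBgr and clampToByte are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Clamp an intermediate value to the 0..255 range of one colour channel.
static std::uint8_t clampToByte(int v)
{
    return static_cast<std::uint8_t>(std::min(255, std::max(0, v)));
}

// Convert a semi-planar YUV 4:2:0 frame to packed BGR.
// Assumed layout (NV21): width*height Y bytes, then interleaved V,U pairs,
// one pair per 2x2 block of pixels. Width and height are assumed to be even.
std::vector<std::uint8_t> nv21ToBgr(const std::uint8_t *yuv, int width, int height)
{
    assert(yuv != nullptr && width > 0 && height > 0);
    assert(width % 2 == 0 && height % 2 == 0);
    const std::size_t w = static_cast<std::size_t>(width);
    const std::size_t h = static_cast<std::size_t>(height);
    std::vector<std::uint8_t> bgr(w * h * 3);
    const std::uint8_t *vu = yuv + w * h; // chroma plane starts after the Y plane
    for (std::size_t row = 0; row < h; ++row)
    {
        for (std::size_t col = 0; col < w; ++col)
        {
            const std::size_t vuIndex = (row / 2) * w + (col / 2) * 2;
            const int c = yuv[row * w + col] - 16; // luma
            const int e = vu[vuIndex] - 128;       // V
            const int d = vu[vuIndex + 1] - 128;   // U
            std::uint8_t *out = &bgr[(row * w + col) * 3];
            out[0] = clampToByte((298 * c + 516 * d + 128) / 256);           // B
            out[1] = clampToByte((298 * c - 100 * d - 208 * e + 128) / 256); // G
            out[2] = clampToByte((298 * c + 409 * e + 128) / 256);           // R
        }
    }
    return bgr;
}
```

The returned vector has the same packed BGR layout as the imageData filled by getNextFrame above, three bytes per pixel.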
As a sample, I also leave my CMakeLists.txt file here:
# To add OpenCV, take a look at these videos
# https://www.youtube.com/watch?v=jN9Bv5LHXMk
# https://www.youtube.com/watch?v=0fdIiOqCz3o
# Watch the videos, then compare with this file and do the same
set(pathToProject
C:/Users/tetavi/Downloads/Buffer/OneMoreArNew/arcore-android-sdk/samples/hello_ar_c)
set(pathToOpenCv C:/OpenCV-android-sdk)
cmake_minimum_required(VERSION 3.4.1)
set(CMAKE_VERBOSE_MAKEFILE on)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=gnu++11")
include_directories(${pathToOpenCv}/sdk/native/jni/include)
# Import the ARCore library.
add_library(arcore SHARED IMPORTED)
set_target_properties(arcore PROPERTIES IMPORTED_LOCATION
${ARCORE_LIBPATH}/${ANDROID_ABI}/libarcore_sdk_c.so
INTERFACE_INCLUDE_DIRECTORIES ${ARCORE_INCLUDE}
)
# Import the glm header file from the NDK.
add_library(glm INTERFACE)
set_target_properties(glm PROPERTIES
INTERFACE_INCLUDE_DIRECTORIES
${ANDROID_NDK}/sources/third_party/vulkan/src/libs/glm
)
# This is the main app library.
add_library(hello_ar_native SHARED
src/main/cpp/background_renderer.cc
src/main/cpp/hello_ar_application.cc
src/main/cpp/jni_interface.cc
src/main/cpp/video_render.cc
src/main/cpp/geometry_loader.cc
src/main/cpp/plane_renderer.cc
src/main/cpp/native_codec.cc
src/main/cpp/point_cloud_renderer.cc
src/main/cpp/frame_manager.cc
src/main/cpp/safe_queue.cc
src/main/cpp/stb_image.h
src/main/cpp/util.cc)
add_library(lib_opencv SHARED IMPORTED)
set_target_properties(lib_opencv PROPERTIES IMPORTED_LOCATION
${pathToProject}/app/src/main/jniLibs/${CMAKE_ANDROID_ARCH_ABI}/libopencv_java3.so)
target_include_directories(hello_ar_native PRIVATE
src/main/cpp)
target_link_libraries(hello_ar_native ${log-lib} lib_opencv
android
log
GLESv2
glm
mediandk
arcore)
Usage:
First create the streaming media player with this method
NativeCodec::createStreamingMediaPlayer(pathToYourMP4file);
and then just call
NativeCodec::getNextFrame(imageData);
Feel free to ask if you have questions.

Related

can't debunk eglSwapBuffers function

I am trying to trace a call path from user space into kernel space to find somewhere in kernel space where I can hook in and pull some information for my CPU driver, while also trying to understand the user-space side a little. I am looking to detect frame buffer swaps so that I can track FPS within the kernel (hopefully). I am working with an Odroid XU3 running Android 4.4.4 and a 3.10.9 kernel.
From what I can tell, the buffer swap in the EGL library happens in eglApi.cpp, in the function EGLBoolean eglSwapBuffers(EGLDisplay dpy, EGLSurface draw) pasted below. My problem is that I cannot understand how this function works. It appears to call itself recursively, because following the return s->cnx->egl.eglSwapBuffers(dp->disp.dpy, s->surface) points me back to the same function, due to
struct egl_t {
#include "EGL/egl_entries.in"
};
and from the source then
#define EGL_ENTRY(_r, _api, ...) #_api,
EGL_ENTRY(EGLBoolean, eglSwapBuffers, EGLDisplay, EGLSurface)
The complete function (minus trace stuff) pasted from the source file eglApi.cpp
EGLBoolean eglSwapBuffers(EGLDisplay dpy, EGLSurface draw)
{
ATRACE_CALL();
clearError();
const egl_display_ptr dp = validate_display(dpy);
if (!dp) return EGL_FALSE;
SurfaceRef _s(dp.get(), draw);
if (!_s.get())
return setError(EGL_BAD_SURFACE, EGL_FALSE);
#if EGL_TRACE
...
#endif
egl_surface_t const * const s = get_surface(draw);
if (CC_UNLIKELY(dp->traceGpuCompletion)) {
EGLSyncKHR sync = eglCreateSyncKHR(dpy, EGL_SYNC_FENCE_KHR, NULL);
if (sync != EGL_NO_SYNC_KHR) {
FrameCompletionThread::queueSync(sync);
}
}
if (CC_UNLIKELY(dp->finishOnSwap)) {
uint32_t pixel;
egl_context_t * const c = get_context( egl_tls_t::getContext() );
if (c) {
// glReadPixels() ensures that the frame is complete
s->cnx->hooks[c->version]->gl.glReadPixels(0,0,1,1,
GL_RGBA,GL_UNSIGNED_BYTE,&pixel);
}
}
return s->cnx->egl.eglSwapBuffers(dp->disp.dpy, s->surface);
}
I hope I am missing something blatantly obvious and that someone can point out where this function performs the buffer swap, so I can delve down the rabbit hole towards my safe place in kernel space.
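One possible reading, offered as an assumption rather than a confirmed account of the Android sources: an entries file like this is usually included several times with EGL_ENTRY redefined each time, so the same list can produce both an array of name strings (the #_api, form quoted above) and a struct of function pointers. If egl_t is built that way, s->cnx->egl.eglSwapBuffers is an indirect call through a pointer stored in that struct, presumably to the vendor driver's implementation, and not a recursive call. The generic C++ sketch below shows the pattern; every name in it (DEMO_ENTRIES, demo_table_t, driverSwapBuffers and so on) is hypothetical.

```cpp
#include <cassert>

// Stand-in for the entries file: one ENTRY line per API function.
#define DEMO_ENTRIES(ENTRY)      \
    ENTRY(int, swapBuffers, int) \
    ENTRY(int, makeCurrent, int)

// Expansion 1: each entry becomes a function-pointer member of a table.
#define AS_POINTER(_r, _api, ...) _r (*_api)(__VA_ARGS__);
struct demo_table_t
{
    DEMO_ENTRIES(AS_POINTER)
};

// Expansion 2: the same entries become a list of names (what #_api produces).
#define AS_NAME(_r, _api, ...) #_api,
static const char *const demo_names[] = {DEMO_ENTRIES(AS_NAME)};

// Hypothetical "driver" implementations that a loader would normally look up
// by name in a vendor library.
static int driverSwapBuffers(int surface) { return surface + 1000; }
static int driverMakeCurrent(int surface) { return surface; }

static demo_table_t g_table = {driverSwapBuffers, driverMakeCurrent};

// Public wrapper with the same name as a table member. The call below goes
// through the pointer stored in the table, not back into this function.
int swapBuffers(int surface)
{
    return g_table.swapBuffers(surface);
}
```

In this sketch swapBuffers(5) ends up in driverSwapBuffers, even though the wrapper and the struct member share a name.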

How to play decoded in-memory PCM with Oboe properly?

I use Oboe to play sounds in my NDK library, and I use OpenSL with Android extensions to decode WAV files into PCM. The decoded signed 16-bit PCM samples are stored in memory (std::forward_list<int16_t>) and then sent into the Oboe stream via a callback. The sound I hear from my phone matches the original WAV file in volume, but its quality does not: it bursts and crackles.
I am guessing that I send the PCM into the audio stream in the wrong order or format (sampling rate?). How can I use OpenSL decoding with an Oboe audio stream?
To decode files to PCM, I use AndroidSimpleBufferQueue as a sink, and AndroidFD with AAssetManager as a source:
// Loading asset
AAsset* asset = AAssetManager_open(manager, path, AASSET_MODE_UNKNOWN);
off_t start, length;
int fd = AAsset_openFileDescriptor(asset, &start, &length);
AAsset_close(asset);
// Creating audio source
SLDataLocator_AndroidFD loc_fd = { SL_DATALOCATOR_ANDROIDFD, fd, start, length };
SLDataFormat_MIME format_mime = { SL_DATAFORMAT_MIME, NULL, SL_CONTAINERTYPE_UNSPECIFIED };
SLDataSource audio_source = { &loc_fd, &format_mime };
// Creating audio sink
SLDataLocator_AndroidSimpleBufferQueue loc_bq = { SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE, 1 };
SLDataFormat_PCM pcm = {
.formatType = SL_DATAFORMAT_PCM,
.numChannels = 2,
.samplesPerSec = SL_SAMPLINGRATE_44_1,
.bitsPerSample = SL_PCMSAMPLEFORMAT_FIXED_16,
.containerSize = SL_PCMSAMPLEFORMAT_FIXED_16,
.channelMask = SL_SPEAKER_FRONT_LEFT | SL_SPEAKER_FRONT_RIGHT,
.endianness = SL_BYTEORDER_LITTLEENDIAN
};
SLDataSink sink = { &loc_bq, &pcm };
Then I register a callback, enqueue buffers and move PCM from the buffer to storage until it is done.
NOTE: the WAV file is also 2-channel signed 16-bit 44.1 kHz PCM.
My Oboe stream configuration is the same:
AudioStreamBuilder builder;
builder.setChannelCount(2);
builder.setSampleRate(44100);
builder.setCallback(this);
builder.setFormat(AudioFormat::I16);
builder.setPerformanceMode(PerformanceMode::LowLatency);
builder.setSharingMode(SharingMode::Exclusive);
Audio rendering works like this:
// Oboe stream callback
audio_engine::onAudioReady(AudioStream* self, void* audio_data, int32_t num_frames) {
auto stream = static_cast<int16_t*>(audio_data);
sound->render(stream, num_frames);
}
// Sound::render method
sound::render(int16_t* audio_data, int32_t num_frames) {
auto iter = pcm_data.begin();
std::advance(iter, cur_frame);
const int32_t rem_size = std::min(num_frames, size - cur_frame);
for(int32_t i = 0; i < rem_size; ++i, std::next(iter), ++cur_frame) {
audio_data[i] += *iter;
}
}

It looks like your render() method is confusing samples and frames.
A frame is a set of simultaneous samples.
In a stereo stream, each frame has TWO samples.
I think your iterator works on a sample basis. In other words, each step of the iterator moves to the next sample, not the next frame. Try this (untested) code.
void sound::render(int16_t* audio_data, int32_t num_frames) {
auto iter = pcm_data.begin();
const int samples_per_frame = 2; // stereo
std::advance(iter, cur_sample);
const int32_t num_samples = std::min(num_frames * samples_per_frame,
total_samples - cur_sample);
// ++iter moves the iterator; std::next(iter) only returns an advanced copy
for(int32_t i = 0; i < num_samples; ++i, ++iter, ++cur_sample) {
audio_data[i] += *iter;
}
}

In short: I was essentially experiencing an underrun, because I used std::forward_list to store the PCM. In such a case (using iterators to retrieve PCM), one has to use a container whose iterator implements LegacyRandomAccessIterator (e.g. std::vector).
I was sure that the linear complexity of std::advance and std::next would not make any difference in my sound::render method. However, when I tried raw pointers and pointer arithmetic (thus constant complexity) with the debugging methods suggested in the comments (extracting the PCM from the WAV with Audacity, then loading this asset with AAssetManager directly into memory), I realized that the amount of "corruption" in the output sound was directly proportional to the position argument in std::advance(iter, position) in the render method.
So, if the amount of sound corruption is directly proportional to the complexity of std::advance (and also std::next), then I have to make that complexity constant by using std::vector as the container. Using the answer from @philburk, I got this working result:
class sound {
private:
const int samples_per_frame = 2; // stereo
std::vector<int16_t> pcm_data;
...
public:
void render(int16_t* audio_data, int32_t num_frames) {
auto iter = std::next(pcm_data.begin(), cur_sample);
const int32_t s = std::min(num_frames * samples_per_frame,
total_samples - cur_sample);
for(int32_t i = 0; i < s; ++i, std::advance(iter, 1), ++cur_sample) {
audio_data[i] += *iter;
}
}
};
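For anyone who wants to try the idea outside of an Oboe callback, here is a self-contained sketch of the same approach. It is not the class above: PcmSource, its constructor and the returned frame count are hypothetical additions, made so that the frame/sample arithmetic can be checked on a desktop compiler. The cursor is kept in samples and indexes the std::vector directly, so every callback costs the same no matter how far playback has progressed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the sound class: PCM kept in a std::vector so
// that jumping to the current playback position is a constant-time index.
class PcmSource
{
public:
    PcmSource(std::vector<std::int16_t> samples, int channels)
        : pcm_(std::move(samples)), channels_(channels)
    {
        assert(channels_ > 0);
    }

    // Mixes up to num_frames frames into audio_data and returns how many
    // frames were actually rendered. One frame is `channels_` samples.
    std::int32_t render(std::int16_t *audio_data, std::int32_t num_frames)
    {
        const std::size_t wanted =
            static_cast<std::size_t>(num_frames) * static_cast<std::size_t>(channels_);
        const std::size_t count = std::min(wanted, pcm_.size() - cursor_);
        for (std::size_t i = 0; i < count; ++i)
        {
            audio_data[i] = static_cast<std::int16_t>(audio_data[i] + pcm_[cursor_ + i]);
        }
        cursor_ += count;
        return static_cast<std::int32_t>(count / static_cast<std::size_t>(channels_));
    }

private:
    std::vector<std::int16_t> pcm_;
    int channels_;
    std::size_t cursor_ = 0; // position in samples, not frames
};
```

A stereo source holding six samples renders two frames on the first request for two frames and the single remaining frame on the next request.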

Android MediaCodec decoder input/output frame count

I'm working on video transcoding in Android, and using the standard method as these samples to extract/decode a video. I test the same process on different devices with different video devices, and I found a problem on the frame count of decoder input/output.
Because of some timecode issues, as in this question, I use a queue to record the extracted video samples and check the queue when I get a decoder frame output, as in the following code:
(I omitted the encoding-related code to make it clearer.)
Queue<Long> sample_time_queue = new LinkedList<Long>();
....
// in transcoding loop
if (is_decode_input_done == false)
{
int decode_input_index = decoder.dequeueInputBuffer(TIMEOUT_USEC);
if (decode_input_index >= 0)
{
ByteBuffer decoder_input_buffer = decode_input_buffers[decode_input_index];
int sample_size = extractor.readSampleData(decoder_input_buffer, 0);
if (sample_size < 0)
{
decoder.queueInputBuffer(decode_input_index, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
is_decode_input_done = true;
}
else
{
long sample_time = extractor.getSampleTime();
decoder.queueInputBuffer(decode_input_index, 0, sample_size, sample_time, 0);
sample_time_queue.offer(sample_time);
extractor.advance();
}
}
else
{
DumpLog(TAG, "Decoder dequeueInputBuffer timed out! Try again later");
}
}
....
if (is_decode_output_done == false)
{
int decode_output_index = decoder.dequeueOutputBuffer(decode_buffer_info, TIMEOUT_USEC);
switch (decode_output_index)
{
case MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED:
{
....
break;
}
case MediaCodec.INFO_OUTPUT_FORMAT_CHANGED:
{
....
break;
}
case MediaCodec.INFO_TRY_AGAIN_LATER:
{
DumpLog(TAG, "Decoder dequeueOutputBuffer timed out! Try again later");
break;
}
default:
{
ByteBuffer decode_output_buffer = decode_output_buffers[decode_output_index];
long ptime_us = decode_buffer_info.presentationTimeUs;
boolean is_decode_EOS = ((decode_buffer_info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0);
if (is_decode_EOS)
{
// Decoder gives an EOS output.
is_decode_output_done = true;
....
}
else
{
// The frame time may not be consistent for some videos.
// As a workaround, we use a frame time queue to guard this.
long sample_time = sample_time_queue.poll();
if (sample_time == ptime_us)
{
// Very good, the decoder input/output time is consistent.
}
else
{
// If the decoder input/output frame count is consistent, we can trust the sample time.
ptime_us = sample_time;
}
// process this frame
....
}
decoder.releaseOutputBuffer(decode_output_index, false);
}
}
}
In some cases, the queue can "correct" the PTS if the decoder gives wrong values (e.g. a lot of 0s). However, there are still some issues with the frame count of the decoder input/output.
On an HTC One 801e device, I use the codec OMX.qcom.video.decoder.avc to decode the video (with MIME type video/avc). The sample time and PTS match well for all frames except the last one.
For example, if the extractor feeds 100 frames and then EOS to the decoder, the first 99 decoded frames have exactly the same time values, but the last frame is missing and I get the output EOS from the decoder. I tested different videos encoded by the built-in camera, by the ffmpeg muxer, or by a video processing app on Windows. All of them have the last frame missing.
On some tablets with the OMX.MTK.VIDEO.DECODER.AVC codec, things become more confusing. Some videos have good PTS from the decoder and the input/output frame count is correct (i.e. the queue is empty when decoding is done). Some videos have a consistent input/output frame count with bad PTS in the decoder output (and I can still correct them with the queue). For some videos, a lot of frames go missing during decoding. For example, the extractor gets 210 frames in a 7-second video, but the decoder only outputs the last 180 frames. It is impossible to recover the PTS using the same workaround.
Is there any way to predict the input/output frame count for a MediaCodec decoder? Or more precisely, to know which frame(s) are dropped by the decoder while the extractor gives it video samples with correct sample times?
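For the case where the decoder's PTS values can be trusted, the bookkeeping can at least report which samples never came out. The sketch below is written in C++ rather than Java to keep it self-contained, and it is only an illustration: droppedSampleTimes is a hypothetical helper, and it matches timestamps by value instead of by queue position because a decoder may emit frames in a different order than they were queued. It cannot help in the case described above where the PTS themselves are wrong.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical helper: returns the queued sample times that never showed up
// as an output presentation time. Only meaningful when the decoder reports
// trustworthy PTS values.
std::vector<std::int64_t> droppedSampleTimes(const std::vector<std::int64_t> &queued,
                                             const std::vector<std::int64_t> &decoded)
{
    std::multiset<std::int64_t> pending(queued.begin(), queued.end());
    for (std::int64_t pts : decoded)
    {
        auto it = pending.find(pts);
        if (it != pending.end())
        {
            pending.erase(it); // erase a single match, not every equal value
        }
    }
    return std::vector<std::int64_t>(pending.begin(), pending.end());
}
```

If four samples are queued and only three matching timestamps are decoded, the function returns the one timestamp that is missing, in ascending order when several are missing.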

Same basic story as in the other question. Pre-4.3, there were no tests confirming that every frame fed to an encoder or decoder came out the other side. I recall that some devices would reliably drop the last frame in certain tests until the codecs were fixed in 4.3.
I didn't search for a workaround at the time, so I don't know if one exists. Delaying before sending EOS might help if it's causing something to shut down early.
I don't believe I ever saw a device drop large numbers of frames. This seems like an unusual case, as it would have been noticeable in any apps that exercised MediaCodec in similar ways even without careful testing.

Save recorded audio to file - OpenSL ES - Android

I'm trying to record from the microphone, add some effects, and then save the result to a file.
I started with the native-audio example included in the Android NDK.
I've managed to add some reverb and play it back, but I haven't found any examples or help on how to save it to a file.
Any and all help is welcome.

OpenSL is not a framework for file formats and access. If you want a raw PCM file, simply open it for writing and put all buffers from OpenSL callback into the file. But if you want encoded audio, you need your own codec and format handler. You can use ffmpeg libraries, or built-in stagefright.
Update: write playback buffers to a local raw PCM file
We start with native-audio-jni.c
#include <stdio.h>
FILE* rawFile = NULL;
int bClosing = 0;
...
void bqPlayerCallback(SLAndroidSimpleBufferQueueItf bq, void *context)
{
    assert(bq == bqPlayerBufferQueue);
    assert(NULL == context);
    // for streaming playback, replace this test by logic to find and fill the next buffer
    if (--nextCount > 0 && NULL != nextBuffer && 0 != nextSize) {
        SLresult result;
        // enqueue another buffer
        result = (*bqPlayerBufferQueue)->Enqueue(bqPlayerBufferQueue, nextBuffer, nextSize);
        // the most likely other result is SL_RESULT_BUFFER_INSUFFICIENT,
        // which for this code example would indicate a programming error
        assert(SL_RESULT_SUCCESS == result);
        (void)result;
        // AlexC: here we write:
        if (rawFile) {
            fwrite(nextBuffer, nextSize, 1, rawFile);
        }
    }
    // it is important to do this in a callback, to be on the correct thread;
    // the NULL check avoids calling fclose(NULL) if the file never opened
    // or was already closed by an earlier callback
    if (bClosing && rawFile) {
        fclose(rawFile);
        rawFile = NULL;
    }
    // AlexC: end of changes
}
...
void Java_com_example_nativeaudio_NativeAudio_startRecording(JNIEnv* env, jclass clazz)
{
bClosing = 0;
rawFile = fopen("/sdcard/rawFile.pcm", "wb");
...
void Java_com_example_nativeaudio_NativeAudio_shutdown(JNIEnv* env, jclass clazz)
{
bClosing = 1;
...
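A raw PCM file carries no information about sample rate, channel count, or sample size, so most players will not open it directly. If a WAV file is acceptable, prefixing the same data with a 44-byte RIFF header is enough. The following is a minimal sketch of building that header; makeWavHeader is a hypothetical helper, not part of OpenSL ES or the NDK sample, and the values you pass must match whatever format your buffers were actually configured with.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: builds the canonical 44-byte RIFF/WAVE header for
// uncompressed PCM. All multi-byte fields are little-endian.
std::vector<uint8_t> makeWavHeader(uint32_t dataBytes, uint32_t sampleRate,
                                   uint16_t channels, uint16_t bitsPerSample) {
    assert(channels > 0 && bitsPerSample > 0 && bitsPerSample % 8 == 0);
    std::vector<uint8_t> h;
    h.reserve(44);
    auto tag = [&h](const char* s) { h.insert(h.end(), s, s + 4); };
    auto u16 = [&h](uint16_t v) {
        h.push_back(static_cast<uint8_t>(v & 0xFF));
        h.push_back(static_cast<uint8_t>((v >> 8) & 0xFF));
    };
    auto u32 = [&h](uint32_t v) {
        for (int i = 0; i < 4; ++i)
            h.push_back(static_cast<uint8_t>((v >> (8 * i)) & 0xFF));
    };
    const uint16_t blockAlign = static_cast<uint16_t>(channels * bitsPerSample / 8);
    const uint32_t byteRate = sampleRate * blockAlign;

    tag("RIFF"); u32(36 + dataBytes); tag("WAVE");
    tag("fmt "); u32(16);             // size of the fmt chunk body
    u16(1);                           // format tag 1 = integer PCM
    u16(channels); u32(sampleRate); u32(byteRate);
    u16(blockAlign); u16(bitsPerSample);
    tag("data"); u32(dataBytes);      // PCM samples follow immediately
    return h;
}
```

Because the header needs the total data size, one option is to write 44 placeholder bytes when the file is opened, count the bytes passed to fwrite, and then fseek back to the start and write the real header just before fclose.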

Pass the raw buffer from C to Java and encode it to mp3 there with MediaRecorder. I don't know whether you can set the audio source from a raw buffer, but maybe...

How to get MJPG stream video from android IPWebcam using opencv

I am using the IP Webcam app on Android and receiving its stream on my PC over WiFi. I want to use OpenCV in Visual Studio (C++) to read that video stream. There is an option to get an MJPG stream from the following URL: http://MyIP:port/videofeed
How do I get it using OpenCV?

Old question, but I hope this can help someone (same as my answer here)
OpenCV expects a filename extension for its VideoCapture argument,
even though one isn't always necessary (like in your case).
You can "trick" it by passing in a dummy parameter which ends in the
mjpg extension:
So perhaps try:
VideoCapture vc;
vc.open("http://MyIP:port/videofeed/?dummy=param.mjpg");

Install IP Camera Adapter and configure it to capture the video stream. Then install ManyCam and you'll see "MPEG Camera" in the camera section. (You'll see the same instructions if you follow the link on how to set up IP Webcam for Skype.)
Now you can access your MJPG stream just like a webcam through OpenCV. I tried this with OpenCV 2.2 + Qt and it works well.
Hope this helps.

I made a dirty patch to get OpenCV working with Android IP Webcam:
In the file OpenCV-2.3.1/modules/highgui/src/cap_ffmpeg_impl.hpp
In the function bool CvCapture_FFMPEG::open( const char* _filename )
replace:
int err = av_open_input_file(&ic, _filename, NULL, 0, NULL);
by
AVInputFormat* iformat = av_find_input_format("mjpeg");
int err = av_open_input_file(&ic, _filename, iformat, 0, NULL);
ic->iformat = iformat;
and comment out:
err = av_seek_frame(ic, video_stream, 10, 0);
if (err < 0)
{
filename=(char*)malloc(strlen(_filename)+1);
strcpy(filename, _filename);
// reopen videofile to 'seek' back to first frame
reopen();
}
else
{
// seek seems to work, so we don't need the filename,
// but we still need to seek back to filestart
filename=NULL;
int64_t ts = video_st->first_dts;
int flags = AVSEEK_FLAG_FRAME | AVSEEK_FLAG_BACKWARD;
av_seek_frame(ic, video_stream, ts, flags);
}
That should work. Hope it helps.

This is the solution (I'm using IP Webcam on Android):
CvCapture* capture = 0;
capture = cvCaptureFromFile("http://IP:Port/videofeed?dummy=param.mjpg");
I am not able to comment, so I'm posting a new answer. The original answer has an error: it uses a / before dummy. Thanks for the solution.
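Since the exact shape of the URL matters here, it may be worth building it in one place. Below is a small sketch of the dummy-parameter trick as a helper; withMjpgHint is a hypothetical name, and the parameter name dummy=param.mjpg is simply the one used in this thread. It appends the parameter straight after the path without inserting a slash, and uses & instead of ? when the URL already has a query string.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: appends a dummy query parameter ending in ".mjpg"
// so that the URL looks like it has a file extension.
std::string withMjpgHint(const std::string& url) {
    assert(!url.empty());
    // Start a query string with '?', or extend an existing one with '&'.
    const char sep = (url.find('?') == std::string::npos) ? '?' : '&';
    return url + sep + "dummy=param.mjpg";
}
```

The result can be passed to cvCaptureFromFile() or VideoCapture::open() in place of the hand-written string.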

Working example for me
// OpenCVTest.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include "opencv2/highgui/highgui.hpp"
/**
* @function main
*/
int main( int argc, const char** argv )
{
    // Open the video stream once, before the loop
    CvCapture* capture = cvCaptureFromFile("http://192.168.1.129:8080/webcam.mjpeg");
    if( !capture ) { return -1; }
    // create a window to display the frames
    cvNamedWindow("Sample Program", CV_WINDOW_AUTOSIZE);
    while (true)
    {
        // Read the next frame; the capture owns this image, so do not release it
        IplImage* frame = cvQueryFrame( capture );
        if( !frame ) { break; }
        // display the frame
        cvShowImage("Sample Program", frame);
        int c = cvWaitKey(10);
        if( (char)c == 27 ) { break; } // Esc quits
    }
    // clean up and release resources
    cvReleaseCapture(&capture);
    cvDestroyWindow("Sample Program");
    return 0;
}
Broadcast MJPEG from a webcam with VLC, as described at http://tumblr.martinml.com/post/2108887785/how-to-broadcast-a-mjpeg-stream-from-your-webcam-with
