Update #6 Discovered I was accessing RGB values improperly. I assumed I was accessing data from an Int[], but was instead accessing byte information from a Byte[]. I changed to accessing from an Int[] and got the following image:
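For reference, the change amounts to viewing the mapped RGBA bytes as ints instead of bytes. A rough sketch (this assumes the buffer is in native little-endian order, so R ends up in the low byte of each packed pixel):
//rough sketch of the Int[] access described above (uses java.nio.IntBuffer)
IntBuffer pixelBuffer = mapBuffer.asIntBuffer();
int[] rgbIntArray = new int[pixelBuffer.remaining()];
pixelBuffer.get(rgbIntArray);
//per pixel, mask each channel out of the packed int
int pixel = rgbIntArray[index];
int R = pixel & 0xff;
int G = (pixel >> 8) & 0xff;
int B = (pixel >> 16) & 0xff;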
Update #5 Adding code used to get RGBA ByteBuffer for reference
private void screenScrape() {
Log.d(TAG, "In screenScrape");
//read pixels from frame buffer into PBO (GL_PIXEL_PACK_BUFFER)
mSurface.queueEvent(new Runnable() {
@Override
public void run() {
Log.d(TAG, "In Screen Scrape 1");
//generate and bind buffer ID
GLES30.glGenBuffers(1, pboIds);
checkGlError("Gen Buffers");
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, pboIds.get(0));
checkGlError("Bind Buffers");
//creates and initializes data store for PBO. Any pre-existing data store is deleted
GLES30.glBufferData(GLES30.GL_PIXEL_PACK_BUFFER, (mWidth * mHeight * 4), null, GLES30.GL_STATIC_READ);
checkGlError("Buffer Data");
//glReadPixelsPBO(0,0,w,h,GLES30.GL_RGB,GLES30.GL_UNSIGNED_SHORT_5_6_5,0);
glReadPixelsPBO(0, 0, mWidth, mHeight, GLES30.GL_RGBA, GLES30.GL_UNSIGNED_BYTE, 0);
checkGlError("Read Pixels");
//GLES30.glReadPixels(0,0,w,h,GLES30.GL_RGBA,GLES30.GL_UNSIGNED_BYTE,intBuffer);
}
});
//map PBO data into client address space
mSurface.queueEvent(new Runnable() {
@Override
public void run() {
Log.d(TAG, "In Screen Scrape 2");
//read pixels from PBO into a byte buffer for processing. Unmap buffer for use in next pass
mapBuffer = ((ByteBuffer) GLES30.glMapBufferRange(GLES30.GL_PIXEL_PACK_BUFFER, 0, 4 * mWidth * mHeight, GLES30.GL_MAP_READ_BIT)).order(ByteOrder.nativeOrder());
checkGlError("Map Buffer");
GLES30.glUnmapBuffer(GLES30.GL_PIXEL_PACK_BUFFER);
checkGlError("Unmap Buffer");
isByteBufferEmpty(mapBuffer, "MAP BUFFER");
convertColorSpaceByteArray(mapBuffer);
mapBuffer.clear();
}
});
}
Update #4 For reference, here is the original image to compare against.
Update #3 This is the output image after interleaving all U/V data into a single array and passing it to the Image object at inputImagePlanes[1]; inputImagePlanes[2] is unused.
The next image is the same interleaved UV data, but loaded into inputImagePlanes[2] instead of inputImagePlanes[1].
Update #2 This is the output image after padding the U/V buffers with a zero byte in between each byte of 'real' data: uArray[uvByteIndex] = (byte) 0;
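Roughly, the write loop looked like this for that test (sketched from memory; uArray and vArray are assumed to be sized frameSize / 2 so there is room for the spacer bytes):
//sketch of the zero-padding test described above
uArray[uvByteIndex] = (byte) ((U < 16) ? 16 : ((U > 239) ? 239 : U));
vArray[uvByteIndex] = (byte) ((V < 16) ? 16 : ((V > 239) ? 239 : V));
uvByteIndex++;
//spacer byte between each byte of 'real' data
uArray[uvByteIndex] = (byte) 0;
vArray[uvByteIndex] = (byte) 0;
uvByteIndex++;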
Update #1 As suggested by a comment, here are the row and pixel strides I get from calling getPixelStride and getRowStride
Y Plane Pixel Stride = 1, Row Stride = 960
U Plane Pixel Stride = 2, Row Stride = 960
V Plane Pixel Stride = 2, Row Stride = 960
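(These values come from the encoder's input Image planes, queried roughly like this:)
//sketch: query the stride values from the codec's input Image
Image inputImage = mCodec.getInputImage(inputBufferIndex);
Image.Plane[] planes = inputImage.getPlanes();
for (int i = 0; i < planes.length; i++) {
    Log.d(TAG, "Plane " + i + " Pixel Stride = " + planes[i].getPixelStride()
            + ", Row Stride = " + planes[i].getRowStride());
}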
The goal of my application is to read pixels out from the screen, compress them, and then send that h264 stream over WiFi to be played by a receiver.
Currently I'm using the MediaMuxer class to convert the raw h264 stream to an MP4, and then save it to file. However the end result video is messed up and I can't figure out why. Let's walk through some of the processing and see if we can find anything that jumps out.
Step 1 Set up the encoder. I'm currently taking screen images once every 2 seconds, and using "video/avc" for MIME_TYPE
//create codec for compression
try {
mCodec = MediaCodec.createEncoderByType(MIME_TYPE);
} catch (IOException e) {
Log.d(TAG, "FAILED: Initializing Media Codec");
}
//set up format for codec
MediaFormat mFormat = MediaFormat.createVideoFormat(MIME_TYPE, mWidth, mHeight);
mFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420Flexible);
mFormat.setInteger(MediaFormat.KEY_BIT_RATE, 16000000);
mFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 1/2);
mFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 5);
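(Not shown above: the codec still has to be configured as an encoder with that format and started before use. Roughly:)
//sketch: configure the codec as an encoder with the format above, then start it
mCodec.configure(mFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
mCodec.start();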
Step 2 Read pixels out from screen. This is done using openGL ES, and the pixels are read out in RGBA format. (I've confirmed this part to be working)
Step 3 Convert the RGBA pixels to YUV420 (IYUV) format. This is done using the following method. Note that I have 2 methods for encoding called at the end of this method.
private void convertColorSpaceByteArray(ByteBuffer rgbBuffer) {
long startTime = System.currentTimeMillis();
Log.d(TAG, "In convertColorspace");
final int frameSize = mWidth * mHeight;
final int chromaSize = frameSize / 4;
byte[] rgbByteArray = new byte[rgbBuffer.remaining()];
rgbBuffer.get(rgbByteArray);
byte[] yuvByteArray = new byte[inputBufferSize];
Log.d(TAG, "Input Buffer size = " + inputBufferSize);
byte[] yArray = new byte[frameSize];
byte[] uArray = new byte[(frameSize / 4)];
byte[] vArray = new byte[(frameSize / 4)];
isByteBufferEmpty(rgbBuffer, "RGB BUFFER");
int yIndex = 0;
int uIndex = frameSize;
int vIndex = frameSize + chromaSize;
int yByteIndex = 0;
int uvByteIndex = 0;
int R, G, B, Y, U, V;
int index = 0;
//this loop controls the rows
for (int i = 0; i < mHeight; i++) {
//this loop controls the columns
for (int j = 0; j < mWidth; j++) {
R = (rgbByteArray[index] & 0xff0000) >> 16;
G = (rgbByteArray[index] & 0xff00) >> 8;
B = (rgbByteArray[index] & 0xff);
Y = ((66 * R + 129 * G + 25 * B + 128) >> 8) + 16;
U = ((-38 * R - 74 * G + 112 * B + 128) >> 8) + 128;
V = ((112 * R - 94 * G - 18 * B + 128) >> 8) + 128;
//clamp and load in the Y data
yuvByteArray[yIndex++] = (byte) ((Y < 16) ? 16 : ((Y > 235) ? 235 : Y));
yArray[yByteIndex] = (byte) ((Y < 16) ? 16 : ((Y > 235) ? 235 : Y));
yByteIndex++;
if (i % 2 == 0 && index % 2 == 0) {
//clamp and load in the U & V data
yuvByteArray[uIndex++] = (byte) ((U < 16) ? 16 : ((U > 239) ? 239 : U));
yuvByteArray[vIndex++] = (byte) ((V < 16) ? 16 : ((V > 239) ? 239 : V));
uArray[uvByteIndex] = (byte) ((U < 16) ? 16 : ((U > 239) ? 239 : U));
vArray[uvByteIndex] = (byte) ((V < 16) ? 16 : ((V > 239) ? 239 : V));
uvByteIndex++;
}
index++;
}
}
encodeVideoFromImage(yArray, uArray, vArray);
encodeVideoFromBuffer(yuvByteArray);
}
Step 4 Encode the data! I currently have two different ways of doing this, and each has a different output. One uses a ByteBuffer returned from MediaCodec.getInputBuffer();, the other uses an Image returned from MediaCodec.getInputImage();
Encoding using ByteBuffer
private void encodeVideoFromBuffer(byte[] yuvData) {
Log.d(TAG, "In encodeVideo");
int inputSize = 0;
//create index for input buffer
inputBufferIndex = mCodec.dequeueInputBuffer(0);
//create the input buffer for submission to encoder
ByteBuffer inputBuffer = mCodec.getInputBuffer(inputBufferIndex);
//clear, then copy yuv buffer into the input buffer
inputBuffer.clear();
inputBuffer.put(yuvData);
//flip buffer before reading data out of it
inputBuffer.flip();
mCodec.queueInputBuffer(inputBufferIndex, 0, inputBuffer.remaining(), presentationTime, 0);
presentationTime += MICROSECONDS_BETWEEN_FRAMES;
sendToWifi();
}
And the associated output image (note: I took a screenshot of the MP4)
Encoding using Image
private void encodeVideoFromImage(byte[] yToEncode, byte[] uToEncode, byte[]vToEncode) {
Log.d(TAG, "In encodeVideo");
int inputSize = 0;
//create index for input buffer
inputBufferIndex = mCodec.dequeueInputBuffer(0);
//create the input buffer for submission to encoder
Image inputImage = mCodec.getInputImage(inputBufferIndex);
Image.Plane[] inputImagePlanes = inputImage.getPlanes();
ByteBuffer yPlaneBuffer = inputImagePlanes[0].getBuffer();
ByteBuffer uPlaneBuffer = inputImagePlanes[1].getBuffer();
ByteBuffer vPlaneBuffer = inputImagePlanes[2].getBuffer();
yPlaneBuffer.put(yToEncode);
uPlaneBuffer.put(uToEncode);
vPlaneBuffer.put(vToEncode);
yPlaneBuffer.flip();
uPlaneBuffer.flip();
vPlaneBuffer.flip();
mCodec.queueInputBuffer(inputBufferIndex, 0, inputBufferSize, presentationTime, 0);
presentationTime += MICROSECONDS_BETWEEN_FRAMES;
sendToWifi();
}
And the associated output image (note: I took a screenshot of the MP4)
Step 5 Convert H264 Stream to MP4. Finally I grab the output buffer from the codec, and use MediaMuxer to convert the raw h264 stream to an MP4 that I can play and test for correctness
private void sendToWifi() {
Log.d(TAG, "In sendToWifi");
MediaCodec.BufferInfo mBufferInfo = new MediaCodec.BufferInfo();
//Check to see if encoder has output before proceeding
boolean waitingForOutput = true;
boolean outputHasChanged = false;
int outputBufferIndex = 0;
while (waitingForOutput) {
//access the output buffer from the codec
outputBufferIndex = mCodec.dequeueOutputBuffer(mBufferInfo, -1);
if (outputBufferIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
outputFormat = mCodec.getOutputFormat();
outputHasChanged = true;
Log.d(TAG, "OUTPUT FORMAT HAS CHANGED");
}
if (outputBufferIndex >= 0) {
waitingForOutput = false;
}
}
//this buffer now contains the compressed YUV data, ready to be sent over WiFi
ByteBuffer outputBuffer = mCodec.getOutputBuffer(outputBufferIndex);
//adjust output buffer position and limit. As of API 19, this is not automatic
if(mBufferInfo.size != 0) {
outputBuffer.position(mBufferInfo.offset);
outputBuffer.limit(mBufferInfo.offset + mBufferInfo.size);
}
////////////////////////////////FOR DEBUG/////////////////////////////
if (muxerNotStarted && outputHasChanged) {
//set up track
mTrackIndex = mMuxer.addTrack(outputFormat);
mMuxer.start();
muxerNotStarted = false;
}
if (!muxerNotStarted) {
mMuxer.writeSampleData(mTrackIndex, outputBuffer, mBufferInfo);
}
////////////////////////////END DEBUG//////////////////////////////////
//release the buffer
mCodec.releaseOutputBuffer(outputBufferIndex, false);
muxerPasses++;
}
If you've made it this far you're a gentleman (or lady!) and a scholar! Basically I'm stumped as to why my image is not coming out properly. I'm relatively new to video processing so I'm sure I'm just missing something.
If you're API 19+, might as well stick with encoding method #2, getImage()/encodeVideoFromImage(), since that is more modern.
Focusing on that method: One problem was, you had an unexpected image format. With COLOR_FormatYUV420Flexible, you know you're going to have 8-bit U and V components, but you won't know in advance where they go. That's why you have to query the Image.Plane formats. Could be different on every device.
In this case, the UV format turned out to be interleaved (very common on Android devices). If you're using Java, and you supply each array (U/V) separately, with the "stride" requested ("spacer" byte in-between each sample), I believe one array ends up clobbering the other, because these are actually "direct" ByteBuffers, and they were intended to be used from native code, like in this answer. The solution I explained was to copy an interleaved array into the third (V) plane, and ignore the U plane. On the native side, these two planes actually overlap each other in memory (except for the first and last byte), so filling one causes the implementation to fill both.
If you use the second (U) plane instead, you'll find things work, but the colors look funny. That's also because of the overlapping arrangement of these two planes; what that does, effectively, is shift every array element by one byte (which puts U's where V's should be, and vice versa.)
...In other words, this solution is actually a bit of a hack. Probably the only way to do this correctly, and have it work on all devices, is to use native code (as in the answer I linked above).
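For illustration, the interleave-and-fill-one-plane approach looks roughly like this (a sketch only; uvInterleaved is a hypothetical scratch array, and whether U or V comes first, and which plane to fill, depends on the device's plane layout):
//sketch: build one interleaved chroma array and write it into a single plane
byte[] uvInterleaved = new byte[uArray.length + vArray.length];
for (int i = 0; i < uArray.length; i++) {
    uvInterleaved[2 * i] = uArray[i];
    uvInterleaved[2 * i + 1] = vArray[i];
}
ByteBuffer chromaBuffer = inputImagePlanes[2].getBuffer();
//the plane buffer can be slightly shorter than the full array because of the overlap
chromaBuffer.put(uvInterleaved, 0, Math.min(uvInterleaved.length, chromaBuffer.remaining()));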
Once the color plane problem is fixed, that leaves all the funny overlapping text and vertical striations. These were actually caused by your interpretation of the RGB data, which had the wrong stride.
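To make that concrete: with GL_RGBA / GL_UNSIGNED_BYTE every pixel occupies 4 bytes, so the byte array has to be indexed with a 4-byte pixel stride, roughly like this (sketch, assuming a tightly packed buffer with no row padding):
//sketch: index the RGBA byte array with a 4-byte pixel stride
int byteIndex = (i * mWidth + j) * 4;
int R = rgbByteArray[byteIndex]     & 0xff;
int G = rgbByteArray[byteIndex + 1] & 0xff;
int B = rgbByteArray[byteIndex + 2] & 0xff;
//rgbByteArray[byteIndex + 3] is alpha and can be ignored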
And, once that is fixed, you have a decent-looking picture. It's been mirrored vertically; I don't know the root cause of that, but I suspect it's an OpenGL issue.
I have a method that generates an audio signal through a smartphone's speaker, and I set the duration to 1 second. I want to know whether the sound is really played for one second or not.
For that I created a playSound() method and used getTimeInMillis() to measure the time.
The result varied from 11 ms to 679 ms. It seems to only measure the execution time of the method, not the whole playback time (the time until the sound started by playSound() stops).
Is there a way to measure that time in Java?
void genTone() throws IOException{
double instfreq=0, numerator;
for (int i=0;i<numSample; i++ )
{
numerator=(double)(i)/(double)numSample;
instfreq =freq1+(numerator*(freq2-freq1));
if ((i % 1000) == 0) {
Log.e("Current Freq:", String.format("Freq is: %f at loop %d of %d", instfreq, i, numSample));
}
sample[i]=Math.sin(2*Math.PI*i/(sampleRate/instfreq));
}
int idx = 0;
for (final double dVal : sample) {
// scale to maximum amplitude
final short val = (short) ((dVal * 32767)); // max positive sample for signed 16 bit integers is 32767
// in 16 bit wave PCM, first byte is the low order byte (PCM: pulse-code modulation)
generatedSnd[idx++] = (byte) (val & 0x00ff);
generatedSnd[idx++] = (byte) ((val & 0xff00) >>> 8);
}
DataOutputStream dd=new DataOutputStream( new FileOutputStream(Environment.getExternalStorageDirectory().getAbsolutePath()+"/tessst.wav" ));
dd.writeBytes("RIFF");
dd.writeInt(48000); // Final file size not known yet, write 0
dd.writeBytes("WAVE");
dd.writeBytes("fmt ");
dd.writeInt(Integer.reverseBytes(16)); // Sub-chunk size, 16 for PCM
dd.writeShort(Short.reverseBytes((short) 1)); // AudioFormat, 1 for PCM
dd.writeShort(Short.reverseBytes( (short) myChannels));// Number of channels, 1 for mono, 2 for stereo
dd.writeInt(Integer.reverseBytes(sampleRate)); // Sample rate
dd.writeInt(Integer.reverseBytes( (sampleRate*myBitsPerSample*myChannels/8))); // Byte rate, SampleRate*NumberOfChannels*BitsPerSample/8
dd.writeShort(Short.reverseBytes((short) (myChannels*myBitsPerSample/8))); // Block align, NumberOfChannels*BitsPerSample/8
dd.writeShort(Short.reverseBytes( myBitsPerSample)); // Bits per sample
dd.writeBytes("data");
dd.write(generatedSnd,0,numSample);
dd.close();
}
void playSound(){
AudioTrack audioTrack= null;
try{
audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC,sampleRate, AudioFormat.CHANNEL_CONFIGURATION_MONO, AudioFormat.ENCODING_PCM_16BIT, generatedSnd.length, AudioTrack.MODE_STATIC);
audioTrack.write(generatedSnd, 0, generatedSnd.length);
audioTrack.play();
}
catch(Exception e)
{
System.out.print(e);
}
}
void excute() throws IOException
{
long time=Calendar.getInstance().getTimeInMillis();
playSound();
time=Calendar.getInstance().getTimeInMillis()-time;
Log.e("time", String.format(" "+time+" ms"));
}
I generate PCM data and want to loop the sound.
I followed the documentation, but Eclipse keeps telling me:
08-05 15:46:26.675: E/AudioTrack(27686): setLoop invalid value: loopStart 0, loopEnd 44100, loopCount -1, framecount 11025, user 11025
here is my code:
void genTone() {
// fill out the array
for (int i = 1; i < numSamples - 1; i = i + 2) {
sample[i] = Math.sin(2 * Math.PI * i / (sampleRate / -300));
}
// convert to 16 bit pcm sound array
// assumes the sample buffer is normalised.
int idx = 0;
for (double dVal : sample) {
short val = (short) (dVal * 32767);
generatedSnd[idx++] = (byte) (val & 0x00ff);
generatedSnd[idx++] = (byte) ((val & 0xff00) >>> 8);
}
//write it to audio Track.
audioTrack.write(generatedSnd, 0, numSamples);
audioTrack.setLoopPoints(0, numSamples, -1);
//from 0.0 ~ 1.0
audioTrack.setStereoVolume((float)0.5, (float)1); //change amplitude
}
public void buttonPlay(View v) {
audioTrack.reloadStaticData();
audioTrack.play();
}
please help ~~
From the documentation: "endInFrames loop end marker expressed in frames"
The log print indicates that your track contains 11025 frames, which is less than the 44100 that you're trying to specify as the end marker (for 16-bit stereo PCM audio, the frame size would be 4 bytes).
Another thing worth noting is that "the track must be stopped or paused for the position to be changed".
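For illustration, a valid call might look like this (a sketch; channelCount is an assumed variable, adjust for the actual track configuration):
//sketch: setLoopPoints() works in frames, not samples or bytes
int bytesPerFrame = 2 * channelCount;              //16-bit PCM: 2 bytes per sample per channel
int totalFrames = generatedSnd.length / bytesPerFrame;
audioTrack.write(generatedSnd, 0, generatedSnd.length);
//the track must be stopped or paused before the loop points are changed
audioTrack.setLoopPoints(0, totalFrames, -1);      //-1 loops indefinitely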
I tried to follow this link:
http://mobilengineering.blogspot.com/2012/06/audio-mix-and-record-in-android.html?showComment=1369622288028#c2333829870074273419
But after mixing the audio files, the resulting file (mixed.wav) on the SD card cannot be played, and I do not know why.
Can you help me? Thank you very much.
This is my code:
public class MainActivity extends Activity {
public static final int FREQUENCY = 44100;
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
try {
mixSound();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
private void mixSound() throws IOException {
AudioTrack audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC, 44100, AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT, 44100, AudioTrack.MODE_STREAM);
InputStream in1 = getResources().openRawResource(R.raw.media_b);
InputStream in2 = getResources().openRawResource(R.raw.media_c);
byte[] arrayMusic1 = null;
arrayMusic1 = new byte[in1.available()];
arrayMusic1 = createMusicArray(in1);
in1.close();
byte[] arrayMusic2 = null;
arrayMusic2 = new byte[in2.available()];
arrayMusic2 = createMusicArray(in2);
in2.close();
byte[] output = new byte[arrayMusic1.length];
audioTrack.play();
for (int i = 0; i < output.length; i++) {
float samplef1 = arrayMusic1[i] / 128.0f;
float samplef2 = arrayMusic2[i] / 128.0f;
float mixed = samplef1 + samplef2;
// reduce the volume a bit:
mixed *= 0.8;
// hard clipping
if (mixed > 1.0f) mixed = 1.0f;
if (mixed < -1.0f) mixed = -1.0f;
byte outputSample = (byte) (mixed * 128.0f);
output[i] = outputSample;
}
audioTrack.write(output, 0, output.length);
convertByteToFile(output);
}
public static byte[] createMusicArray(InputStream is) throws IOException {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buff = new byte[10240];
int i = Integer.MAX_VALUE;
while ((i = is.read(buff, 0, buff.length)) > 0) {
baos.write(buff, 0, i);
}
return baos.toByteArray(); // be sure to close InputStream in calling function
}
public static void convertByteToFile(byte[] fileBytes) throws FileNotFoundException {
BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(Environment.getExternalStorageDirectory().getPath()+"/mixed.wav"));
try {
bos.write(fileBytes);
bos.flush();
bos.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
What you're outputting is just the PCM data. A valid WAV file also needs a header:
Offset Size Name Description
------------------------------------------------------------------------
0 4 ChunkID Contains the letters "RIFF" in ASCII form
(0x52494646 big-endian form).
4 4 ChunkSize 36 + SubChunk2Size, or more precisely:
4 + (8 + SubChunk1Size) + (8 + SubChunk2Size)
This is the size of the rest of the chunk
following this number. This is the size of the
entire file in bytes minus 8 bytes for the
two fields not included in this count:
ChunkID and ChunkSize.
8 4 Format Contains the letters "WAVE"
(0x57415645 big-endian form).
12 4 Subchunk1ID Contains the letters "fmt "
(0x666d7420 big-endian form).
16 4 Subchunk1Size 16 for PCM. This is the size of the
rest of the Subchunk which follows this number.
20 2 AudioFormat PCM = 1 (i.e. Linear quantization)
Values other than 1 indicate some
form of compression.
22 2 NumChannels Mono = 1, Stereo = 2, etc.
24 4 SampleRate 8000, 44100, etc.
28 4 ByteRate == SampleRate * NumChannels * BitsPerSample/8
32 2 BlockAlign == NumChannels * BitsPerSample/8
The number of bytes for one sample including
all channels. I wonder what happens when
this number isn't an integer?
34 2 BitsPerSample 8 bits = 8, 16 bits = 16, etc.
2 ExtraParamSize if PCM, then doesn't exist
X ExtraParams space for extra parameters
36 4 Subchunk2ID Contains the letters "data"
(0x64617461 big-endian form).
40 4 Subchunk2Size == NumSamples * NumChannels * BitsPerSample/8
This is the number of bytes in the data.
You can also think of this as the size
of the read of the subchunk following this
number.
After this you write the PCM data.
(Reference).
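For example, a minimal 44-byte header for 16-bit PCM could be written like this before the sample data (a sketch; writeWavFile is a hypothetical helper, and it needs java.io.FileOutputStream, java.io.IOException, java.nio.ByteBuffer and java.nio.ByteOrder):
//sketch: write a minimal 44-byte WAV header for 16-bit PCM, then the sample data
public static void writeWavFile(String path, byte[] pcmData, int sampleRate, int channels) throws IOException {
    int byteRate = sampleRate * channels * 2;          //16 bits = 2 bytes per sample
    ByteBuffer header = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
    header.put("RIFF".getBytes("US-ASCII"));
    header.putInt(36 + pcmData.length);                //ChunkSize
    header.put("WAVE".getBytes("US-ASCII"));
    header.put("fmt ".getBytes("US-ASCII"));
    header.putInt(16);                                 //Subchunk1Size for PCM
    header.putShort((short) 1);                        //AudioFormat: 1 = PCM
    header.putShort((short) channels);
    header.putInt(sampleRate);
    header.putInt(byteRate);
    header.putShort((short) (channels * 2));           //BlockAlign
    header.putShort((short) 16);                       //BitsPerSample
    header.put("data".getBytes("US-ASCII"));
    header.putInt(pcmData.length);                     //Subchunk2Size
    FileOutputStream out = new FileOutputStream(path);
    out.write(header.array());
    out.write(pcmData);
    out.close();
}
With a header like that in place, convertByteToFile() could pass the mixed samples through such a helper instead of dumping the raw bytes.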
My app calculates the noise level and the frequency peak of the input sound.
I used FFT to process an array of shorts (short[] buffer), and this is the code:
bufferSize = 1024, sampleRate = 44100
int bufferSize = AudioRecord.getMinBufferSize(sapleRate,
channelConfiguration, audioEncoding);
AudioRecord audioRecord = new AudioRecord(
MediaRecorder.AudioSource.DEFAULT, sapleRate,
channelConfiguration, audioEncoding, bufferSize);
and this is converting code :
short[] buffer = new short[blockSize];
try {
audioRecord.startRecording();
} catch (IllegalStateException e) {
Log.e("Recording failed", e.toString());
}
while (started) {
int bufferReadResult = audioRecord.read(buffer, 0, blockSize);
/*
* Noise level meter begins here
*/
// Compute the RMS value. (Note that this does not remove DC).
double rms = 0;
for (int i = 0; i < buffer.length; i++) {
rms += buffer[i] * buffer[i];
}
rms = Math.sqrt(rms / buffer.length);
mAlpha = 0.9; mGain = 0.0044;
/*Compute a smoothed version for less flickering of the
// display.*/
mRmsSmoothed = mRmsSmoothed * mAlpha + (1 - mAlpha) * rms;
double rmsdB = 20.0 * Math.log10(mGain * mRmsSmoothed);
Now I want to know whether this algorithm works correctly, or if I'm missing something.
Also, if it is correct and I have the sound level in dB displayed on the phone, how can I test it?
I'd appreciate any help. Thanks in advance :)
The code looks correct but you should probably handle the case where the buffer initially contains zeroes, which could cause Math.log10 to fail, e.g. change:
double rmsdB = 20.0 * Math.log10(mGain * mRmsSmoothed);
to:
double rmsdB = mGain * mRmsSmoothed > 0.0 ?
20.0 * Math.log10(mGain * mRmsSmoothed) :
-999.99; // choose some appropriate large negative value here for case where you have no input signal
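Put together, the level computation with that guard might look like this as a small helper (a sketch; mAlpha, mGain and mRmsSmoothed are the fields from the question):
//sketch: RMS -> exponential smoothing -> dB, guarded against log10 of zero
private double updateLevelDb(short[] buffer) {
    double sumSquares = 0;
    for (int i = 0; i < buffer.length; i++) {
        sumSquares += (double) buffer[i] * buffer[i];
    }
    double rms = Math.sqrt(sumSquares / buffer.length);
    mRmsSmoothed = mRmsSmoothed * mAlpha + (1 - mAlpha) * rms;
    double scaled = mGain * mRmsSmoothed;
    return scaled > 0.0 ? 20.0 * Math.log10(scaled) : -999.99;
}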