I'm reading the YUV values from an Android Image using the camera2 API, so I have the three planes.
for (int x = 0; x < imageSheaf[0].Width; x++)
{
    for (int y = 0; y < imageSheaf[0].Height; y++)
    {
        imageYuv[x, y] = new yuv();
    }
}

for (int j = 0; bufferY.HasRemaining; j++)
{
    for (int i = 0; i < rowStrideY / 2; i += 2)
    {
        if (i > width / 2 - 1 || j > height / 2 - 1)
            Log.Info("Processing", "Out of Bounds");
        imageYuv[i, j].y = bufferY.Get();
        bufferY.Get(); // skip a pixel due to 4:2:0 sub sampling
    }
    for (int i = 0; i < rowStrideY / 2; i++) // skip a line due to 4:2:0 sub sampling
    {
        bufferY.Get();
        bufferY.Get();
    }
    if (!bufferY.HasRemaining)
        Log.Debug("Processing", "finished");
}

for (int j = 0; bufferU.HasRemaining; j++)
{
    for (int i = 0; i < rowStrideU; i++)
    {
        if (!bufferU.HasRemaining)
            Log.Debug("Processing", "finished");
        imageYuv[i, j].u = bufferU.Get();
    }
    if (!bufferU.HasRemaining)
        Log.Debug("Processing", "finished");
}

for (int j = 0; bufferV.HasRemaining; j++)
{
    for (int i = 0; i < rowStrideV; i++)
    {
        if (!bufferV.HasRemaining)
            Log.Debug("Processing", "finished");
        imageYuv[i, j].v = bufferV.Get();
    }
    if (!bufferV.HasRemaining)
        Log.Debug("Processing", "finished");
}
This is the code that I'm using to get the Y, U and V values from the byte buffers.
The ImageFormat is YUV_420_888. My understanding is that 4:2:0 subsampling means there are four Y pixels for every U or V pixel.
My issue is that the byte buffers for the U and V planes are larger than they should be, causing array-out-of-bounds exceptions:
[Processing] RowstrideY = 720
[Processing] RowstrideU = 368
[Processing] RowstrideV = 368
[Processing] y.remaining = 345600, u.remaining = 88312, v.remaining = 88312
(the size of the image is 720x480)
YUV420 has 8 bits per pixel for Y, and 8 bits per four-pixel group for U and V. So at 720x480, you'd expect the U-V plane to be 360x240.
However, the actual hardware may have additional alignment or stride restrictions. In this case, it appears the hardware requires the stride to be a multiple of 16, so it increases it from 360 to 368.
You'd expect that to turn into a length of 368*240=88320, but remember, the last eight bytes on every line are simply padding. So the buffer can actually be (368*239)+360 = 88312 bytes without omitting any data. If you're getting array-bounds exceptions it's because you're attempting to read the end-of-row pad bytes from the last line, but that's not allowed. The API only guarantees that you will be able to read the data.
The motivation for this is that, if the padding on the last line happened to cross a page boundary, the system would need to allocate an additional unnecessary page for each buffer.
You can modify your code to copy the data bytes from each row, then have a second loop that just consumes the padding bytes (if any) at the end of the row.
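For example, here is a minimal sketch of that row-by-row copy. The question's code is C#/Xamarin, but the same pattern applies; this is shown in Java against the android.media.Image plane API, assumes a pixel stride of 1, and uses planeWidth, planeHeight and planeData as stand-ins for your own variables:
ByteBuffer buffer = plane.getBuffer();
int rowStride = plane.getRowStride();
byte[] row = new byte[rowStride];

for (int y = 0; y < planeHeight; y++) {
    // Only the last row may be short: it is guaranteed to contain planeWidth
    // data bytes, but not the trailing padding.
    int length = (y == planeHeight - 1) ? planeWidth : rowStride;
    buffer.get(row, 0, length);
    for (int x = 0; x < planeWidth; x++) {
        planeData[y * planeWidth + x] = row[x]; // bytes past planeWidth are padding
    }
}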
I'm trying to convert a YUV image to grayscale, so basically I just need the Y values.
To do so I wrote this little piece of code (with frame being the YUV image):
imageConversionTime = System.currentTimeMillis();
size = frame.getSize();
byte nv21ByteArray[] = frame.getImage();
int lol;
for (int i = 0; i < size.width; i++) {
    for (int j = 0; j < size.height; j++) {
        lol = size.width * j + i;
        yMatrix.put(j, i, nv21ByteArray[lol]);
    }
}
bitmap = Bitmap.createBitmap(size.width, size.height, Bitmap.Config.ARGB_8888);
Utils.matToBitmap(yMatrix, bitmap);
imageConversionTime = System.currentTimeMillis() - imageConversionTime;
However, this takes about 13500 ms. I need it to be A LOT faster (the equivalent takes 8.5 ms in Python on my computer). I work on a Motorola Moto E 4G 2nd generation; not super powerful, but it should be enough for converting images, right?
Any suggestions?
Thanks in advance.
First of all, I would assign size.width and size.height to local variables; I don't think the compiler will hoist those field accesses out of the loops by default, but I am not sure about this.
Furthermore, write the result into a plain int[] instead of going through a Mat.
Then you could do something like this:
int[] grayScalePixels = new int[size.width * size.height];
int cntPixels = 0;
In your inner loop, set:
int y = nv21ByteArray[lol] & 0xFF; // luma byte as an unsigned value
grayScalePixels[cntPixels] = 0xFF000000 | (y << 16) | (y << 8) | y; // pack into an opaque grey ARGB pixel
cntPixels++;
To get your final image do the following:
Bitmap grayScaleBitmap = Bitmap.createBitmap(grayScalePixels, size.width, size.height, Bitmap.Config.ARGB_8888);
Hope it works properly (I have not tested it, but at least the principle should be applicable: relying on a plain array instead of a Mat).
Probably 2 years too late but anyways ;)
To convert to gray scale, all you need to do is set the u/v values to 128 and leave the y values as is. Note that this code is for YUY2 format. You can refer to this document for other formats.
private void convertToBW(byte[] ptrIn, String filePath) {
    // change all u and v values to 127 (cause 128 will cause byte overflow)
    byte[] ptrOut = Arrays.copyOf(ptrIn, ptrIn.length);
    for (int i = 0, ptrInLength = ptrOut.length; i < ptrInLength; i++) {
        if (i % 2 != 0) {
            ptrOut[i] = (byte) 127;
        }
    }
    convertToJpeg(ptrOut, filePath);
}
For NV21/NV12, the interleaved chroma plane sits after the Y plane, which occupies the first two thirds of the buffer, so I think the loop would change to something like:
for (int i = ptrOut.length * 2 / 3, ptrInLength = ptrOut.length; i < ptrInLength; i++) {
    ptrOut[i] = (byte) 127;
}
(Note: I didn't try this myself.)
Also, I would suggest profiling your Utils.matToBitmap and Bitmap.createBitmap calls separately.
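For instance, a rough way to time those two steps separately (a sketch reusing the question's own variables; the numbers just go to logcat):
long t0 = System.currentTimeMillis();
bitmap = Bitmap.createBitmap(size.width, size.height, Bitmap.Config.ARGB_8888);
long t1 = System.currentTimeMillis();
Utils.matToBitmap(yMatrix, bitmap);
long t2 = System.currentTimeMillis();
Log.d("Timing", "createBitmap: " + (t1 - t0) + " ms, matToBitmap: " + (t2 - t1) + " ms");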
I am building an app that needs to be able to display a real-time spectral analyzer. I was able to successfully make a version of it on iOS (screenshot omitted).
I am using the Wendykierp JTransforms library to perform the FFT calculations, and I have managed to capture audio data and execute the FFT functions. See below:
short sData[] = new short[BufferElements2Rec];
int result = audioRecord.read(sData, 0, BufferElements2Rec);
try
{
    // Initiate FFT
    DoubleFFT_1D fft = new DoubleFFT_1D(sData.length);

    // Convert sample data from short[] to double[]
    double[] fftSamples = new double[sData.length];
    for (int i = 0; i < sData.length; i++) {
        // IMPORTANT: We cannot simply cast the short value to double.
        // A short is only 2 bytes (values -32768 to 32767),
        // so we divide by 32768 to normalize to [-1, 1) before casting to double.
        fftSamples[i] = (double) sData[i] / 32768;
    }

    // Perform FFT calcs
    fft.realForward(fftSamples);

    // TODO - Convert FFT data into 20 "bands"
}
catch (Exception e)
{
}
In iOS, I was using a library (Tempi-FFT) which had built-in functionality for calculating magnitude and frequency, and for providing averaged data for any given number of bands (I am using 20 bands, as in the iOS version mentioned above). It seems I don't have that luxury with this library, and I need to calculate this myself.
Looking for any good examples or tutorials on how to interpret the data returned by the FFT calculations. Here is some sample data I am receiving:
-11387.0, 183.0, -384.9121475854448, -224.66315714636642, -638.0173005872095, -236.2318653974911, -1137.1498541119106, -437.71599514435786, 1954.683405957685, -2142.742125980924 ...
Looking for a simple explanation of how to interpret this data. Here are some other questions I have looked at that I either could not understand or that did not explain how to compute a given number of bands:
Power Spectral Density from jTransforms DoubleFFT_1D
How to develop a Spectrum Analyser from a realtime audio?
Your question can be split into two parts: finding the magnitude of all frequencies (interpreting the output) and averaging the frequencies into bands.
Finding the magnitude of all frequencies:
I won't go into the intricacies of the Fast Fourier Transform/Discrete Fourier Transform (if you would like to gain a basic understanding see this video), but know that there is a real and an imaginary part of each output.
The documentation of the realForward function describes where both the imaginary and the real parts are located in the output array (I'm assuming you have an even sample size):
a[2*k] = Re[k], 0 <= k < n / 2
a[2*k+1] = Im[k], 0 < k < n / 2
a[1] = Re[n/2]
a is equivalent to your fftSamples, which means we can translate this documentation into code as follows (I've changed Re and Im to realPart and imaginaryPart respectively):
int n = fftSamples.length;
double[] realPart = new double[n / 2];
double[] imaginaryPart = new double[n / 2];
for (int k = 0; k < n / 2; k++) {
    realPart[k] = fftSamples[2 * k];
    imaginaryPart[k] = fftSamples[2 * k + 1];
}
// Per the layout above, fftSamples[1] actually holds Re[n/2] (the Nyquist bin),
// not Im[0]; Im[0] is always zero for a real input, so clear it here.
// (This sketch simply drops the Nyquist bin rather than growing the arrays.)
imaginaryPart[0] = 0;
Now we have the real and imaginary parts of each frequency. We could plot these on an x-y coordinate plane using the real part as the x value and the imaginary part as the y value. This creates a triangle, and the length of the triangle's hypotenuse is the magnitude of the frequency. We can use the Pythagorean theorem to get this magnitude:
double[] spectrum = new double[n / 2];
for (int k = 1; k < n / 2; k++) {
    spectrum[k] = Math.sqrt(Math.pow(realPart[k], 2) + Math.pow(imaginaryPart[k], 2));
}
spectrum[0] = realPart[0];
Note that the 0th index of the spectrum doesn't have an imaginary part. This is the DC component of the signal (we won't use this).
Now we have an array with the magnitudes of each frequency across your spectrum. (If your sampling frequency is 44100 Hz, the array covers frequencies from 0 Hz up to the Nyquist frequency of 22050 Hz; the width of each bin is sampleRate / n, so with 441 values in the array each index represents a 50 Hz step.)
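In code, the index-to-frequency mapping is simply the following (a sketch; sampleRate stands for whatever rate you passed to AudioRecord, e.g. 44100):
double binWidthHz = (double) sampleRate / n;  // n = fftSamples.length
double frequencyOfBinK = k * binWidthHz;      // frequency represented by spectrum[k]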
Averaging the frequencies into bands:
Now that we've converted the FFT output to data that we can use, we can move on to the second part of your question: finding the averages of different bands of frequencies. This is relatively simple. We just need to split the array into different bands and find the average of each band. This can be generalized like so:
int NUM_BANDS = 20; // This can be any positive integer.
double[] bands = new double[NUM_BANDS];
int samplesPerBand = (n / 2) / NUM_BANDS;
for (int i = 0; i < NUM_BANDS; i++) {
    // Add up each part
    double total = 0;
    for (int j = samplesPerBand * i; j < samplesPerBand * (i + 1); j++) {
        total += spectrum[j];
    }
    // Take average
    bands[i] = total / samplesPerBand;
}
Final Code:
And that's it! You now have an array called bands with the average magnitude of each band of frequencies. The code above is purposefully not optimized in order to show how each step works. Here is a shortened and optimized version:
int numFrequencies = fftSamples.length / 2;
double[] spectrum = new double[numFrequencies];
for (int k = 1; k < numFrequencies; k++) {
    spectrum[k] = Math.sqrt(Math.pow(fftSamples[k * 2], 2) + Math.pow(fftSamples[k * 2 + 1], 2));
}
spectrum[0] = fftSamples[0];

int NUM_BANDS = 20; // This can be any positive integer.
double[] bands = new double[NUM_BANDS];
int samplesPerBand = numFrequencies / NUM_BANDS;
for (int i = 0; i < NUM_BANDS; i++) {
    // Add up each part
    double total = 0;
    for (int j = samplesPerBand * i; j < samplesPerBand * (i + 1); j++) {
        total += spectrum[j];
    }
    // Take average
    bands[i] = total / samplesPerBand;
}
// Use bands in view!
This has been a really long answer, and I haven't tested the code yet (though I do plan to). Feel free to comment if you find any mistakes.
I have a function which receives camera frames and makes contrast/brightness adjustments to them. When I have...
void applyContrastBrightnessToFrame(Mat &frame, float contrast, int brightness)
{
    for (int i = 0; i < frame.rows; i++) {
        uchar *basePixel = frame.ptr(i);
        for (int j = 0; j != frame.cols * frame.channels(); j += frame.channels()) {
            int channelsToBlend = min(3, frame.channels()); // never adjust alpha channel
            for (int c = 0; c < channelsToBlend; c++) {
                basePixel[j + c] = saturate_cast<uchar>(basePixel[j + c] * contrast + brightness);
            }
        }
    }
}
It works perfectly.
But when I convert the image to HLS so that I can do these adjustments without ruining the saturation, pixel manipulations fail...
void applyContrastBrightnessToFrame(Mat &frame, float contrast, int brightness)
{
    cvtColor(frame, frame, CV_RGBA2RGB);
    cvtColor(frame, frame, CV_RGB2HLS);
    assert(frame.channels() == 3);
    for (int i = 0; i < frame.rows; i++) {
        uchar *basePixel = frame.ptr(i);
        for (int j = 0; j != frame.cols * frame.channels(); j += frame.channels()) {
            int lumaChannel = 1;
            // all pixel manipulations fail....
            basePixel[j + lumaChannel] = 0; // setting to a constant
            saturate_cast<uchar>(basePixel[j + lumaChannel] + brightness); // adjusting
        }
    }
    cvtColor(frame, frame, CV_HLS2RGB);
    cvtColor(frame, frame, CV_BGR2RGBA);
    assert(frame.channels() == 4);
}
Here's what I know: The conversions are successful. When I capture an image from the camera and run it through the same function, the pixel manipulations succeed - this is especially weird since the processing of frames and captured images is identical.
What could be going wrong?
I can see that you are trying to alter brightness/contrast of a frame, pixel-wise.
So instead of iterating through every pixel from all channels of the frame, you can first split the HLS channels, perform operations and merge them back.
void applyContrastBrightnessToFrame(Mat &frame, float contrast, int brightness)
{
    cvtColor(frame, frame, CV_RGBA2RGB);
    cvtColor(frame, frame, CV_RGB2HLS);
    vector<Mat> hlsChannels(3);
    split(frame, hlsChannels);
    hlsChannels[1] += brightness; // add brightness to the lightness channel (index 1)
    merge(hlsChannels, frame);
    cvtColor(frame, frame, CV_HLS2RGB);
    cvtColor(frame, frame, CV_BGR2RGBA);
}
You can also try looping over the pixels in the lightness channel alone.
Hope this helps!
I am working with Android RenderScript to analyze preview frames received from the Camera2 API. I intend to analyze each pixel and, based on some rules (dependent on the intensity and location of the pixel), update a counter. I intend to use a forEach kernel, but how do I get the pixel coordinates?
An example Java loop would be:
for (int i = 0; i < 240; i++)
{
    for (int j = 0; j < 320; j++)
    {
        tempPixelIntensity = image.getPixel(i, j);
        x = i;
        y = j;
        if (tempPixelIntensity == zzz && x < zzz && y < zzz)
        {
            counter++;
        }
    }
}
How would I go about doing the same in RenderScript? Thanks.
You might try something like this:
#pragma version(1)
#pragma rs java_package_name(com.example.app) // use your own package name here
#pragma rs_fp_relaxed // needed for some GPUs

uint32_t counter;

void RS_KERNEL process(uchar tempPixelIntensity, uint32_t x, uint32_t y)
{
    if (tempPixelIntensity == zzz && x < zzz && y < zzz)
    {
        rsAtomicInc(&counter);
    }
}
RS kernels are SPMD (single program multiple data). So you write only the inner part of your loop for a single pixel element and the framework does the looping.
On the Java side you will do something like:
Type.Builder tb = new Type.Builder(rs, Element.U8(rs));
tb.setX(320);
tb.setY(240);
Allocation input = Allocation.createTyped(rs, tb.create(), Allocation.USAGE_SCRIPT);
script.forEach_process(input);
So the dimensions of the input allocation determines the bounds that the kernel will operate over. In this case x will vary from [0,319] and y will vary from [0,239]. The x,y parameters to the kernel are special parameters that are filled in by the RS runtime, similarly the tempPixelIntensity value will be populated by the value of the input allocation pixel at the given x,y coordinate.
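To feed real frame data into that allocation before launching the kernel, something like the following should work (a sketch, not part of the original answer; previewBytes and getLatestLumaPlane are hypothetical stand-ins for however you obtain one 320x240 luma plane from the camera frame):
byte[] previewBytes = getLatestLumaPlane(); // hypothetical helper returning 320 * 240 luma bytes
input.copyFrom(previewBytes);               // upload the pixel data into the Allocation
script.forEach_process(input);              // the runtime supplies x and y for every element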
In my Android project, here is my code:
for (int x = 0; x < targetBitArray.length; x += weight) {
    for (int y = 0; y < targetBitArray[x].length; y += weight) {
        targetBitArray[x][y] = bmp.getPixel(x, y) == mSearchColor;
    }
}
but this code wastes a lot of time.
So I need to find a way that is faster than Bitmap.getPixel().
I'm trying to get the pixel color from a byte array converted from the bitmap, but I haven't been able to get it to work.
How can I replace Bitmap.getPixel()?
Each Bitmap.getPixel invocation has significant overhead, so you need to reduce the number of calls in order to improve the performance of your code.
My suggestion is:
Read the image data row-by-row with Bitmap.getPixels method into a local array
Iterate along your local array
e.g.
int[] rowData = new int[bitmapWidth];
for (int row = 0; row < bitmapHeight; row++) {
    // Load one row of pixels
    bitmap.getPixels(rowData, 0, bitmapWidth, 0, row, bitmapWidth, 1);
    for (int column = 0; column < bitmapWidth; column++) {
        targetBitArray[column][row] = rowData[column] == mSearchColor;
    }
}
This should be a great improvement to the performance of your code.
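Wrapped into a self-contained helper, the same row-by-row idea might look like this (a sketch, not tested; it keeps the question's [column][row] indexing and ignores the weight stride from the original loop):
static boolean[][] matchColor(Bitmap bitmap, int searchColor) {
    int width = bitmap.getWidth();
    int height = bitmap.getHeight();
    boolean[][] result = new boolean[width][height];
    int[] rowData = new int[width];
    for (int row = 0; row < height; row++) {
        // One getPixels call per row instead of one getPixel call per pixel
        bitmap.getPixels(rowData, 0, width, 0, row, width, 1);
        for (int column = 0; column < width; column++) {
            result[column][row] = rowData[column] == searchColor;
        }
    }
    return result;
}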