I'm getting a strange glitch in an FFT graph for white noise:
I've checked with a reference program, and the white noise file seems to be fine.
Is it a bug in the implementation?
#include <math.h>

#define TWOPI 6.28318530717959

void four1(float data[], int nn, int isign) {
    int n, mmax, m, j, istep, i;
    float wtemp, wr, wpr, wpi, wi, theta;
    float tempr, tempi;

    n = nn << 1;
    j = 1;
    for (i = 1; i < n; i += 2) {
        if (j > i) {
            tempr = data[j];
            data[j] = data[i];
            data[i] = tempr;
            tempr = data[j + 1];
            data[j + 1] = data[i + 1];
            data[i + 1] = tempr;
        }
        m = n >> 1;
        while (m >= 2 && j > m) {
            j -= m;
            m >>= 1;
        }
        j += m;
    }
    mmax = 2;
    while (n > mmax) {
        istep = 2 * mmax;
        theta = TWOPI / (isign * mmax);
        wtemp = sin(0.5 * theta);
        wpr = -2.0 * wtemp * wtemp;
        wpi = sin(theta);
        wr = 1.0;
        wi = 0.0;
        for (m = 1; m < mmax; m += 2) {
            for (i = m; i <= n; i += istep) {
                j = i + mmax;
                tempr = wr * data[j] - wi * data[j + 1];
                tempi = wr * data[j + 1] + wi * data[j];
                data[j] = data[i] - tempr;
                data[j + 1] = data[i + 1] - tempi;
                data[i] += tempr;
                data[i + 1] += tempi;
            }
            wr = (wtemp = wr) * wpr - wi * wpi + wr;
            wi = wi * wpr + wtemp * wpi + wi;
        }
        mmax = istep;
    }
}
Apart from a few minor changes, this code appears to be taken out of the 2nd edition of Numerical Recipes in C. The documentation for this function (taken from the book) states:
Replaces data[1..2*nn] by its discrete Fourier transform, if isign is input as 1; or replaces data[1..2*nn] by nn times its inverse discrete Fourier transform, if isign is input as −1.
data is a complex array of length nn or, equivalently, a real array of length 2*nn. nn MUST be an integer power of 2 (this is not checked for!).
This implementation yields correct results given an input array that uses 1-based indexing. You can adopt the same convention by allocating a C array of size 2*nn+1 and filling it starting at index 1. Alternatively, you can pass an array of size 2*nn that has been filled starting at index 0 and call four1(data - 1, nn, isign) (note the -1 offset on the data pointer).
I'm trying to render a video frame using the Android NDK.
I'm using Google's Native-Codec NDK sample code, modified so that I can manually display each video frame (non-tunneled).
So I added this code to get the output buffer, which is in YUV:
ANativeWindow_setBuffersGeometry(mWindow, bufferWidth, bufferHeight,
                                 WINDOW_FORMAT_RGBA_8888);

uint8_t *decodedBuff = AMediaCodec_getOutputBuffer(d->codec, status, &bufSize);
auto format = AMediaCodec_getOutputFormat(d->codec);
LOGV("VOUT: format %s", AMediaFormat_toString(format));

AMediaFormat *myFormat = format;
int32_t w, h;
AMediaFormat_getInt32(myFormat, AMEDIAFORMAT_KEY_HEIGHT, &h);
AMediaFormat_getInt32(myFormat, AMEDIAFORMAT_KEY_WIDTH, &w);

err = ANativeWindow_lock(mWindow, &buffer, nullptr);
and this code to convert the YUV to RGB and display it using the native window:
if (err == 0) {
    LOGV("ANativeWindow_lock()");

    int width = w;
    int height = h;
    int const frameSize = width * height;
    int *line = reinterpret_cast<int *>(buffer.bits);

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            /* accessing YUV420SP elements */
            int indexY = y * width + x;
            int indexU = frameSize + (y / 2) * width + (x / 2) * 2;
            int indexV = frameSize + (y / 2) * width + (x / 2) * 2 + 1;

            /* TODO: the intermediate int conversions aren't really required;
               there's room for better work here. */
            int Y = 0xFF & decodedBuff[indexY];
            int U = 0xFF & decodedBuff[indexU];
            int V = 0xFF & decodedBuff[indexV];

            /* constants picked up from http://www.fourcc.org/fccyvrgb.php */
            int R = (int) (Y + 1.402f * (V - 128));
            int G = (int) (Y - 0.344f * (U - 128) - 0.714f * (V - 128));
            int B = (int) (Y + 1.772f * (U - 128));

            /* clamping values to [0, 255] */
            R = R < 0 ? 0 : (R > 255 ? 255 : R);
            G = G < 0 ? 0 : (G > 255 ? 255 : G);
            B = B < 0 ? 0 : (B > 255 ? 255 : B);

            line[buffer.stride * y + x] = 0xff000000 + (B << 16) + (G << 8) + R;
        }
    }
    ANativeWindow_unlockAndPost(mWindow);
}
Finally, I was able to display the video on my device. Now my problem is that the video does not scale to fit the surface view :(
Your thoughts are very much appreciated.
I am using the Google Maps API to draw lines on the map in my application. I am loading the nodes of the lines from a database using the following code:
// Add polyline "walks voda"
List<WalkLine> dbwalknodes = dbclass.queryWalksFromDatabase(this); // list of latlng
for (int i = 0; i < dbwalknodes.size() - 1; i++) {
    WalkLine source = dbwalknodes.get(i);
    WalkLine destination = dbwalknodes.get(i + 1);
    Polyline line = mMap.addPolyline(new PolylineOptions()
            .add(new LatLng(source.getLat(), source.getLon()),
                 new LatLng(destination.getLat(), destination.getLon()))
            .width(16)
            .color(Color.parseColor("#1b9e77"))
            .geodesic(true));
    line.setZIndex(1000);
}
Do you have any idea how to make the lines smoother where they bend than in the picture below? Is it possible?
https://www.dropbox.com/s/6waic988mj90kdk/2014-10-22%2012.48.04.png?dl=0
You should not create a polyline for every pair of points; it should be one connected polyline with multiple points, something like this:
public void drawRoute(List<LatLng> location) {
    polylineOptions = new PolylineOptions().width(MAPS_PATH_WIDTH).color(routeColor).addAll(location);
    polyLine = map.addPolyline(polylineOptions);
    polyLine.setPoints(location);
}
This will make it much smoother.
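With the walk data from the question, that means flattening the WalkLine rows into one LatLng list first. A minimal sketch, reusing the question's WalkLine and dbclass names:

List<WalkLine> dbwalknodes = dbclass.queryWalksFromDatabase(this);
List<LatLng> points = new ArrayList<>();
for (WalkLine node : dbwalknodes) {
    points.add(new LatLng(node.getLat(), node.getLon()));
}
drawRoute(points); // one polyline instead of size() - 1 separate segments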
Use the following code, based on the B-spline algorithm; it worked for me on Android.
public List<LatLng> bspline(List<LatLng> poly) {
    // Close the ring if it is open; otherwise drop the duplicated endpoint.
    if (poly.get(0).latitude != poly.get(poly.size() - 1).latitude
            || poly.get(0).longitude != poly.get(poly.size() - 1).longitude) {
        poly.add(new LatLng(poly.get(0).latitude, poly.get(0).longitude));
    } else {
        poly.remove(poly.size() - 1);
    }

    // Duplicate the neighboring points so the spline covers the whole ring.
    poly.add(0, new LatLng(poly.get(poly.size() - 1).latitude, poly.get(poly.size() - 1).longitude));
    poly.add(new LatLng(poly.get(1).latitude, poly.get(1).longitude));

    Double[] lats = new Double[poly.size()];
    Double[] lons = new Double[poly.size()];
    for (int i = 0; i < poly.size(); i++) {
        lats[i] = poly.get(i).latitude;
        lons[i] = poly.get(i).longitude;
    }

    double ax, ay, bx, by, cx, cy, dx, dy, lat, lon;
    float t;
    int i;
    List<LatLng> points = new ArrayList<>();

    // For every point, evaluate the cubic B-spline segment at several t values.
    for (i = 2; i < lats.length - 2; i++) {
        for (t = 0; t < 1; t += 0.2) {
            ax = (-lats[i - 2] + 3 * lats[i - 1] - 3 * lats[i] + lats[i + 1]) / 6;
            ay = (-lons[i - 2] + 3 * lons[i - 1] - 3 * lons[i] + lons[i + 1]) / 6;
            bx = (lats[i - 2] - 2 * lats[i - 1] + lats[i]) / 2;
            by = (lons[i - 2] - 2 * lons[i - 1] + lons[i]) / 2;
            cx = (-lats[i - 2] + lats[i]) / 2;
            cy = (-lons[i - 2] + lons[i]) / 2;
            dx = (lats[i - 2] + 4 * lats[i - 1] + lats[i]) / 6;
            dy = (lons[i - 2] + 4 * lons[i - 1] + lons[i]) / 6;
            lat = ax * Math.pow(t + 0.1, 3) + bx * Math.pow(t + 0.1, 2) + cx * (t + 0.1) + dx;
            lon = ay * Math.pow(t + 0.1, 3) + by * Math.pow(t + 0.1, 2) + cy * (t + 0.1) + dy;
            points.add(new LatLng(lat, lon));
        }
    }
    return points;
}
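The returned points can then be drawn as one polyline; a hypothetical usage sketch reusing the styling from the question:

List<LatLng> smoothed = bspline(points);
mMap.addPolyline(new PolylineOptions()
        .addAll(smoothed)
        .width(16)
        .color(Color.parseColor("#1b9e77")));

Note that bspline() treats its input as a closed ring (it appends the first point at the end if needed), so for an open walk route you may want to drop that closing step.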
I have a problem implementing an FFT algorithm in Android.
Let's say that I have a wav file 8000 bytes long.
I am aware that you have to select a size for the FFT algorithm (which also has to be a power of 2). My problem is that I am not really sure how to proceed from there.
Let's say that I have chosen an FFT size of N = 1024.
I basically have two options in mind:
1) Apply the FFT algorithm directly to the whole array of 8000 bytes.
2) Divide the 8000-byte wav file into chunks of 1024 bytes (padding the last chunk with 0's until there are exactly 8 chunks), then apply the FFT to each of these chunks, and finally concatenate the chunks again into one single array.
(8000 rounded up to the next power of two would be 8192.)
I think it's option 2, but I am not completely sure.
Here is the FFT class that I am using:
package com.example.acoustics;

public class FFT {
    int n, m;

    // Lookup tables. Only need to recompute when size of FFT changes.
    double[] cos;
    double[] sin;

    public FFT(int n) {
        this.n = n;
        this.m = (int) (Math.log(n) / Math.log(2));

        // Make sure n is a power of 2
        if (n != (1 << m))
            throw new RuntimeException("FFT length must be power of 2");

        // Precompute tables
        cos = new double[n / 2];
        sin = new double[n / 2];
        for (int i = 0; i < n / 2; i++) {
            cos[i] = Math.cos(-2 * Math.PI * i / n);
            sin[i] = Math.sin(-2 * Math.PI * i / n);
        }
    }
    /***************************************************************
     * fft.c
     * Douglas L. Jones
     * University of Illinois at Urbana-Champaign
     * January 19, 1992
     * http://cnx.rice.edu/content/m12016/latest/
     *
     * fft: in-place radix-2 DIT DFT of a complex input
     *
     * input:
     * n: length of FFT: must be a power of two
     * m: n = 2**m
     * input/output
     * x: double array of length n with real part of data
     * y: double array of length n with imag part of data
     *
     * Permission to copy and use this program is granted
     * as long as this header is included.
     ****************************************************************/
    public void fft(double[] x, double[] y) {
        int i, j, k, n1, n2, a;
        double c, s, t1, t2;

        // Bit-reverse
        j = 0;
        n2 = n / 2;
        for (i = 1; i < n - 1; i++) {
            n1 = n2;
            while (j >= n1) {
                j = j - n1;
                n1 = n1 / 2;
            }
            j = j + n1;
            if (i < j) {
                t1 = x[i];
                x[i] = x[j];
                x[j] = t1;
                t1 = y[i];
                y[i] = y[j];
                y[j] = t1;
            }
        }

        // FFT
        n1 = 0;
        n2 = 1;
        for (i = 0; i < m; i++) {
            n1 = n2;
            n2 = n2 + n2;
            a = 0;
            for (j = 0; j < n1; j++) {
                c = cos[a];
                s = sin[a];
                a += 1 << (m - i - 1);
                for (k = j; k < n; k = k + n2) {
                    t1 = c * x[k + n1] - s * y[k + n1];
                    t2 = s * x[k + n1] + c * y[k + n1];
                    x[k + n1] = x[k] - t1;
                    y[k + n1] = y[k] - t2;
                    x[k] = x[k] + t1;
                    y[k] = y[k] + t2;
                }
            }
        }
    }
}
I think you can use the entire array with the FFT; there is no problem with that. You can use 2^13 = 8192 and pad the array with zeros; this is called zero padding and is used in more than one FFT implementation. If your procedure works well, there is no problem with running it on the entire array. But if you compute the FFT on sections of size 1024, you get a segmented Fourier transform that does not describe the spectrum of the whole signal, because the FFT uses every position of the input array to compute each value of the transformed array; so, for example, you would not get the correct value at position one unless you use the entire signal.
This is my analysis of your question. I am not a hundred percent sure, but my knowledge of Fourier series tells me that this is roughly what happens if you compute a segmented Fourier transform instead of transforming the entire series.
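As a concrete sketch of option 1 with the FFT class posted above (the samples array holding the decoded wav data is assumed, not taken from the question):

int n = 8192;                // next power of two >= 8000
double[] re = new double[n]; // Java zero-initializes arrays,
double[] im = new double[n]; // so the tail is already zero-padded
for (int i = 0; i < samples.length; i++) {
    re[i] = samples[i];      // real part = signal, imaginary part stays 0
}
new FFT(n).fft(re, im);      // in-place transform of the whole signal
// magnitude of bin k: Math.hypot(re[k], im[k])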
I have created a method which performs Sobel edge detection.
I use the camera YUV byte array to perform the detection on.
Now my problem is that I only get about 5 fps, which is really low.
I know it can be done faster, because there are other apps on the market that manage a good frame rate at good quality.
I pass images at 800x400 resolution.
Can anyone check whether my algorithm can be made shorter or more performant?
I already ported the algorithm to native code, but there seems to be no difference in fps.
public void process() {
    progress = 0;
    index = 0;

    // pixel count: every pixel that still has a full 3x3 neighborhood below it
    size = width * (height - 2) - 2;

    // pixel loop
    while (size > 0) {
        // get Y matrix values from YUV
        ay = input[index];
        by = input[index + 1];
        cy = input[index + 2];
        gy = input[index + doubleWidth];
        hy = input[index + doubleWidth + 1];
        iy = input[index + doubleWidth + 2];

        // get X matrix values from YUV
        ax = input[index];
        cx = input[index + 2];
        dx = input[index + width];
        fx = input[index + width + 2];
        gx = input[index + doubleWidth];
        ix = input[index + doubleWidth + 2];

        //  1  2  1
        //  0  0  0
        // -1 -2 -1
        sumy = ay + (by * 2) + cy - gy - (2 * hy) - iy;

        // -1  0  1
        // -2  0  2
        // -1  0  1
        sumx = -ax + cx - (2 * dx) + (2 * fx) - gx + ix;

        total[index] = (int) Math.sqrt(sumx * sumx + sumy * sumy);
        // Math.atan2(sumx, sumy);
        if (max < total[index])
            max = total[index];

        // sum = -a - (2*b) - c + g + (2*h) + i;
        if (total[index] < 0)
            total[index] = 0;

        // zero out values above 255
        if (total[index] > 255)
            total[index] = 0;

        sum = (int) (total[index]);
        output[index] = 0xff000000 | (sum << 16) | (sum << 8) | sum;

        size--;
        // next
        index++;
    }
    // ratio = max / 255;
}
Thanks in advance!
Greetings
So I have two things:
1) I would consider losing the Math.sqrt() expression: if you are only interested in edge detection, I see no need for it, as the sqrt function is monotonic and really costly to calculate.
2) I would consider another algorithm; in particular, I have had good results with a separated convolution filter (http://www.songho.ca/dsp/convolution/convolution.html#separable_convolution), as this might bring down the number of arithmetic operations (which are probably your bottleneck). A minimal sketch follows this list.
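As a rough illustration of the separable idea (untested; in and out are hypothetical grayscale input and ARGB output buffers): the Sobel kernels factor into 1-D passes, Gx = [1, 2, 1]^T x [-1, 0, 1] and Gy = [1, 0, -1]^T x [1, 2, 1], so one shared horizontal pass feeds both vertical passes:

static void sobelSeparable(byte[] in, int[] out, int w, int h) {
    int[] dX = new int[w * h]; // rows convolved with [-1, 0, 1]
    int[] sm = new int[w * h]; // rows convolved with [ 1, 2, 1]
    for (int y = 0; y < h; y++) {
        for (int x = 1; x < w - 1; x++) {
            int i = y * w + x;
            int a = in[i - 1] & 0xff, b = in[i] & 0xff, c = in[i + 1] & 0xff;
            dX[i] = c - a;
            sm[i] = a + 2 * b + c;
        }
    }
    for (int y = 1; y < h - 1; y++) {
        for (int x = 1; x < w - 1; x++) {
            int i = y * w + x;
            int gx = dX[i - w] + 2 * dX[i] + dX[i + w]; // columns with [1, 2, 1]
            int gy = sm[i - w] - sm[i + w];             // columns with [1, 0, -1]
            int mag = Math.abs(gx) + Math.abs(gy);      // cheap magnitude, no sqrt
            int v = mag > 255 ? 255 : mag;
            out[i] = 0xff000000 | (v << 16) | (v << 8) | v;
        }
    }
}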
I hope this helps, or at least sparks some inspiration. Good luck.
If you are using your algorithm in real time, call it less often, maybe every ~20 frames instead of every frame.
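For example (frameCount is a hypothetical field, initialized to 0):

if (frameCount++ % 20 != 0) return; // process only every 20th frame
process();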
Do more work per iteration. At 800x400, your algorithm runs 318,398 iterations. Each iteration pulls from the input array in a way that is (to the processor) effectively random, which causes caching issues. Try pulling ay, ay2, by, by2, cy, cy2 and doing twice the calculations per loop; you'll notice that the variables of the next iteration relate to those of the previous one (ay is now ay2, etc.).
Here's a rewrite of your algorithm that does twice the work per iteration. It saves a bit of redundant memory access and, following the advice in the other answer, swaps Math.sqrt for an integer fastSqrt (see below).
public void process() {
    progress = 0;
    index = 0;

    // pixel count
    size = width * (height - 2) - 2;

    // do FIRST iteration outside of loop
    // grab input, avoiding redundant memory accesses
    ay = ax = input[index];
    by = ay2 = ax2 = input[index + 1];
    cy = by2 = cx = input[index + 2];
    cy2 = cx2 = input[index + 3];
    gy = gx = input[index + doubleWidth];
    hy = gy2 = gx2 = input[index + doubleWidth + 1];
    iy = hy2 = ix = input[index + doubleWidth + 2];
    iy2 = ix2 = input[index + doubleWidth + 3];
    dx = input[index + width];
    dx2 = input[index + width + 1];
    fx = input[index + width + 2];
    fx2 = input[index + width + 3];

    sumy = ay + (by * 2) + cy - gy - (2 * hy) - iy;
    sumy2 = ay2 + (by2 * 2) + cy2 - gy2 - (2 * hy2) - iy2;
    sumx = -ax + cx - (2 * dx) + (2 * fx) - gx + ix;
    sumx2 = -ax2 + cx2 - (2 * dx2) + (2 * fx2) - gx2 + ix2;

    // integer square root instead of Math.sqrt
    total[index] = fastSqrt(sumx * sumx + sumy * sumy);
    total[index + 1] = fastSqrt(sumx2 * sumx2 + sumy2 * sumy2);
    max = Math.max(max, Math.max(total[index], total[index + 1]));

    // skip the test for a negative value, it can never happen
    if (total[index] > 255) total[index] = 0;
    if (total[index + 1] > 255) total[index + 1] = 0;

    sum = (int) (total[index]);
    sum2 = (int) (total[index + 1]);
    output[index] = 0xff000000 | (sum << 16) | (sum << 8) | sum;
    output[index + 1] = 0xff000000 | (sum2 << 16) | (sum2 << 8) | sum2;

    size -= 2;
    index += 2;

    while (size > 0) {
        // grab input, reusing values loaded in the previous iteration
        ay = ax = cy;
        by = ay2 = ax2 = cy2;
        cy = by2 = cx = input[index + 2];
        cy2 = cx2 = input[index + 3];
        gy = gx = iy;
        hy = gy2 = gx2 = iy2;
        iy = hy2 = ix = input[index + doubleWidth + 2];
        iy2 = ix2 = input[index + doubleWidth + 3];
        dx = fx;
        dx2 = fx2;
        fx = input[index + width + 2];
        fx2 = input[index + width + 3];

        sumy = ay + (by * 2) + cy - gy - (2 * hy) - iy;
        sumy2 = ay2 + (by2 * 2) + cy2 - gy2 - (2 * hy2) - iy2;
        sumx = -ax + cx - (2 * dx) + (2 * fx) - gx + ix;
        sumx2 = -ax2 + cx2 - (2 * dx2) + (2 * fx2) - gx2 + ix2;

        // integer square root instead of Math.sqrt
        total[index] = fastSqrt(sumx * sumx + sumy * sumy);
        total[index + 1] = fastSqrt(sumx2 * sumx2 + sumy2 * sumy2);
        max = Math.max(max, Math.max(total[index], total[index + 1]));

        // skip the test for a negative value, it can never happen
        if (total[index] > 255) total[index] = 0;
        if (total[index + 1] > 255) total[index + 1] = 0;

        sum = (int) (total[index]);
        sum2 = (int) (total[index + 1]);
        output[index] = 0xff000000 | (sum << 16) | (sum << 8) | sum;
        output[index + 1] = 0xff000000 | (sum2 << 16) | (sum2 << 8) | sum2;

        size -= 2;
        index += 2;
    }
}
// A faster integer-only implementation of square root (bit-by-bit method).
public static int fastSqrt(int x) {
    int result = 0;
    int bit = 1 << 30; // the highest power of four that fits in an int
    while (bit > x) bit >>= 2;
    while (bit != 0) {
        if (x >= result + bit) {
            x -= result + bit;
            result = (result >> 1) + bit;
        } else {
            result >>= 1;
        }
        bit >>= 2;
    }
    return result;
}
Please note: the above code was not tested; it was written inside the browser window and may contain syntax errors.
EDIT: You could try using a fast integer-only square root function to avoid Math.sqrt:
http://atoms.alife.co.uk/sqrt/index.html
I am currently working on an Android application that processes camera frames retrieved from Camera.PreviewCallback.onPreviewFrame(). These frames are encoded in YUV420SP format and provided as a byte array.
I need to downsize the full frame and its contents, let's say by a factor of 2, from 640x480 px to 320x240. I guess, for downsizing the luminance part, I could just run a loop copying every second value from the byte[] frame to a new, smaller array, but what about the chrominance part? Does anyone know more about the structure of a YUV420SP frame?
Many thanks in advance!
Here is code to get a half-size RGBA image from YUV420SP bytes:
//byte[] data;
int frameSize = getFrameWidth() * getFrameHeight();
int[] rgba = new int[frameSize / 4];

for (int i = 0; i < getFrameHeight() / 2; i++) {
    for (int j = 0; j < getFrameWidth() / 2; j++) {
        // average the four Y samples of the 2x2 block
        int y1 = (0xff & ((int) data[2 * i * getFrameWidth() + j * 2]));
        int y2 = (0xff & ((int) data[2 * i * getFrameWidth() + j * 2 + 1]));
        int y3 = (0xff & ((int) data[(2 * i + 1) * getFrameWidth() + j * 2]));
        int y4 = (0xff & ((int) data[(2 * i + 1) * getFrameWidth() + j * 2 + 1]));
        int y = (y1 + y2 + y3 + y4) / 4;

        // one interleaved chroma pair is shared by the whole 2x2 block
        int u = (0xff & ((int) data[frameSize + i * getFrameWidth() + j * 2 + 0]));
        int v = (0xff & ((int) data[frameSize + i * getFrameWidth() + j * 2 + 1]));

        y = y < 16 ? 16 : y;

        int r = Math.round(1.164f * (y - 16) + 1.596f * (v - 128));
        int g = Math.round(1.164f * (y - 16) - 0.813f * (v - 128) - 0.391f * (u - 128));
        int b = Math.round(1.164f * (y - 16) + 2.018f * (u - 128));

        r = r < 0 ? 0 : (r > 255 ? 255 : r);
        g = g < 0 ? 0 : (g > 255 ? 255 : g);
        b = b < 0 ? 0 : (b > 255 ? 255 : b);

        rgba[i * getFrameWidth() / 2 + j] = 0xff000000 + (b << 16) + (g << 8) + r;
    }
}

Bitmap bmp = Bitmap.createBitmap(getFrameWidth() / 2, getFrameHeight() / 2, Bitmap.Config.ARGB_8888);
bmp.setPixels(rgba, 0 /* offset */, getFrameWidth() / 2 /* stride */, 0, 0, getFrameWidth() / 2, getFrameHeight() / 2);
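For reference, the YUV420SP layout that this indexing relies on is a full-resolution Y plane followed by a half-resolution plane of interleaved chroma pairs, one pair per 2x2 block of pixels. A sketch of the index arithmetic for a w x h frame (variable names are mine):

int yIndex = y * w + x;            // Y plane occupies bytes [0, w*h)
int uvBase = w * h + (y / 2) * w;  // chroma rows start at offset w*h
int first  = uvBase + (x / 2) * 2; // first byte of the chroma pair
int second = first + 1;            // second byte of the chroma pair
// Whether the pair is (U, V) or (V, U) depends on the variant: NV12 stores
// U first, while NV21 (the Android camera preview default) stores V first.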