I have an Android Project with OpenCV4.0.1 and TFLite installed.
I want to run inference with a pretrained MobileNetV2 on a cv::Mat that I extracted and cropped from a CameraBridgeViewBase (Android style).
But it's proving difficult.
I followed this example.
It runs inference on a ByteBuffer variable called "imgData" (line 71, class: org.tensorflow.lite.examples.classification.tflite.Classifier).
That imgData appears to be filled in the method "convertBitmapToByteBuffer" of the same class (line 185), pixel by pixel, from a bitmap that is cropped shortly before.
private int[] intValues = new int[224 * 224];
Mat _croppedFace = new Mat(); // Cropped image from the CvCameraViewFrame.rgba() method.
float[][] outputVal = new float[1][1]; // Output value from my trained MobileNetV2 model (I changed the output during training, tested in Python).
// Following: https://stackoverflow.com/questions/13134682/convert-mat-to-bitmap-opencv-for-android
Bitmap bitmap = Bitmap.createBitmap(_croppedFace.cols(), _croppedFace.rows(), Bitmap.Config.ARGB_8888);
Utils.matToBitmap(_croppedFace, bitmap);
convertBitmapToByteBuffer(bitmap); // Same call as in the example.
// runInference();
_tflite.run(imgData, outputVal);
But it seems the input shape of my network is not correct, even though I'm following the MobileNet example, because my network is a MobileNetV2.
I've solved the error, but I'm sure it isn't the best way to do it.
The Keras MobileNetV2 input_shape is (nBatches, 224, 224, nChannels).
I just want to predict a single image, so nBatches == 1, and I'm working in RGB mode, so nChannels == 3.
// Nasty nasty, but works. nBatches == 2? -- _croppedFace.shape() == (224, 224), 3 channels.
float[][][][] _inputValue = new float[2][_croppedFace.cols()][_croppedFace.rows()][3];
// Fill _inputValue
for (int i = 0; i < _croppedFace.cols(); ++i)
    for (int j = 0; j < _croppedFace.rows(); ++j)
        for (int z = 0; z < 3; ++z)
            _inputValue[0][i][j][z] = (float) _croppedFace.get(i, j)[z] / 255; // DL works better with values in [0, 1].
/*
Output val has this shape, but I don't really know why.
I'm sure one of those 2s is for nClasses (I'm working with 2 classes),
but I don't really know why the other one is there (most likely it's the batch
dimension, matching the nBatches == 2 used for the input above).
*/
float[][] outputVal = new float[2][2];
// TensorFlow Lite interpreter
_tflite.run(_inputValue, outputVal);
In Python the prediction has the same shape:
Python prediction:
[[XXXXXX, YYYYY]] <- As expected for the last layer that I made; this is just a prototype NN.
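For anyone who wants to avoid the fake second batch, here is a rough, untested sketch of the same idea that resizes the crop to the 224x224 input the model expects and uses a single batch (variable names mirror the ones above):

Mat resized = new Mat();
Imgproc.resize(_croppedFace, resized, new Size(224, 224));

// Single-batch input: (1, 224, 224, 3), values normalized to [0, 1].
float[][][][] inputValue = new float[1][224][224][3];
for (int row = 0; row < 224; ++row) {
    for (int col = 0; col < 224; ++col) {
        double[] pixel = resized.get(row, col); // Mat.get() takes (row, col)
        for (int ch = 0; ch < 3; ++ch) {
            inputValue[0][row][col][ch] = (float) (pixel[ch] / 255.0);
        }
    }
}

float[][] outputVal = new float[1][2]; // [nBatches][nClasses]
_tflite.run(inputValue, outputVal);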
Hope this helps someone, and I also hope someone can improve this answer, because it is not very optimized.
I'm currently facing a problem I simply don't understand.
I employ ARCore for an inside-out tracking task. Since I need to do some additional image processing, I use Unity's capability to load a native C++ plugin. At the very end of each frame I pass the image in YUV_420_888 format as a raw byte array to my native plugin.
A texture handle is created right at the beginning of the component's initialization:
private void CreateTextureAndPassToPlugin()
{
    Texture2D tex = new Texture2D(640, 480, TextureFormat.RGBA32, false);
    tex.filterMode = FilterMode.Point;
    tex.Apply();

    debug_screen_.GetComponent<Renderer>().material.mainTexture = tex;

    // Pass texture pointer to the plugin
    SetTextureFromUnity(tex.GetNativeTexturePtr(), tex.width, tex.height);
}
Since I only need the grayscale image, I basically ignore the UV part of the image and use only the Y values, as shown in the following:
uchar *p_out;
int channels = 4;

for (int r = 0; r < image_matrix->rows; r++) {
    p_out = image_matrix->ptr<uchar>(r);
    for (int c = 0; c < image_matrix->cols * channels; c++) {
        unsigned int idx = r * y_row_stride + c;
        p_out[c] = static_cast<uchar>(image_data[idx]);
        p_out[c + 1] = static_cast<uchar>(image_data[idx]);
        p_out[c + 2] = static_cast<uchar>(image_data[idx]);
        p_out[c + 3] = static_cast<uchar>(255);
    }
}
Then, each frame, the image data is uploaded into a GL texture:
GLuint gltex = (GLuint)(size_t)(g_TextureHandle);
glBindTexture(GL_TEXTURE_2D, gltex);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 640, 480, GL_RGBA, GL_UNSIGNED_BYTE, current_image.data);
I know that I use way too much memory by creating and passing the texture as RGBA, but since GL_R8 is not supported by OpenGL ES 3 and GL_ALPHA always led to internal OpenGL errors, I just pass the greyscale value to each color component.
However in the end the texture is rendered as can be seen in the following image:
At first I thought the reason for this might lie in the other channels having the same values; however, setting all channels other than the first one to arbitrary values does not have any impact.
Am I missing something OpenGL texture creation wise?
YUV_420_888 is a multiplane texture, where the luminance plane only contains a single channel per pixel.
for (int c = 0; c < image_matrix->cols * channels; c++) {
unsigned int idx = r * y_row_stride + c;
Your loop bound assumes c counts in multiples of 4 channels, which is right for the output surface, but you then also use it when computing the input surface index. The input plane you are reading from only contains one channel, so idx is wrong.
In general you are also overwriting the same memory multiple times: the loop increments c by one each iteration, but you then write to c, c+1, c+2, and c+3, so you overwrite three of the values you wrote the previous iteration.
Shorter answer - your OpenGL ES code is fine, but I think you're filling the texture with bad data.
Untested, but I think you need:
for (int c = 0; c < image_matrix->cols * channels; c += channels) {
unsigned int idx = (r * y_row_stride) + (c / channels);
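Putting the two fixes together, the whole loop would look roughly like this (untested, same variable names as your code):

int channels = 4;
for (int r = 0; r < image_matrix->rows; r++) {
    uchar *p_out = image_matrix->ptr<uchar>(r);
    for (int c = 0; c < image_matrix->cols * channels; c += channels) {
        // One luminance sample per output pixel, replicated into R, G and B.
        unsigned int idx = (r * y_row_stride) + (c / channels);
        uchar y = static_cast<uchar>(image_data[idx]);
        p_out[c]     = y;
        p_out[c + 1] = y;
        p_out[c + 2] = y;
        p_out[c + 3] = static_cast<uchar>(255);
    }
}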
I'm trying to convert a YUV image to grayscale, so basically I just need the Y values.
To do so I wrote this little piece of code (with frame being the YUV image):
imageConversionTime = System.currentTimeMillis();
size = frame.getSize();
byte[] nv21ByteArray = frame.getImage();
int lol;
for (int i = 0; i < size.width; i++) {
    for (int j = 0; j < size.height; j++) {
        lol = size.width * j + i;
        yMatrix.put(j, i, nv21ByteArray[lol]);
    }
}
bitmap = Bitmap.createBitmap(size.width, size.height, Bitmap.Config.ARGB_8888);
Utils.matToBitmap(yMatrix, bitmap);
imageConversionTime = System.currentTimeMillis() - imageConversionTime;
However, this takes about 13500 ms. I need it to be A LOT faster (on my computer it takes 8.5 ms in Python). (I work on a Motorola Moto E 4G 2nd generation; not super powerful, but it should be enough for converting images, right?)
Any suggestions?
Thanks in advance.
First of all, I would assign size.width and size.height to local variables. I don't think the compiler will optimize this by default, but I am not sure.
Furthermore, create a plain array representing the result instead of using a Mat.
Then you could do something like this:
int[] grayScalePixels = new int[size.width * size.height];
int cntPixels = 0;
In your inner loop set
grayScalePixels[cntPixels] = nv21ByteArray[lol];
cntPixels++;
To get your final image do the following:
Bitmap grayScaleBitmap = Bitmap.createBitmap(grayScalePixels, size.width, size.height, Bitmap.Config.ARGB_8888);
Hope it works properly (I have not tested it, but at least the principle shown should be applicable: relying on a plain array instead of a Mat).
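An even faster route (an untested sketch, assuming an NV21 frame where the first width*height bytes are the Y plane) is to skip the per-pixel loop entirely and copy the whole Y plane into a single-channel Mat in one call:

int width = size.width;
int height = size.height;
// The Y plane is the first width*height bytes of the NV21 buffer.
byte[] yPlane = java.util.Arrays.copyOfRange(nv21ByteArray, 0, width * height);

Mat yMatrix = new Mat(height, width, CvType.CV_8UC1);
yMatrix.put(0, 0, yPlane); // one bulk copy instead of width*height put() calls

Bitmap bitmap = Bitmap.createBitmap(width, height, Bitmap.Config.ARGB_8888);
Utils.matToBitmap(yMatrix, bitmap);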
Probably 2 years too late but anyways ;)
To convert to grayscale, all you need to do is set the U/V values to 128 and leave the Y values as they are. Note that this code is for the YUY2 format. You can refer to this document for other formats.
private void convertToBW(byte[] ptrIn, String filePath) {
    // change all u and v values to 127 (cause 128 will cause byte overflow)
    byte[] ptrOut = Arrays.copyOf(ptrIn, ptrIn.length);
    for (int i = 0, ptrInLength = ptrOut.length; i < ptrInLength; i++) {
        if (i % 2 != 0) {
            ptrOut[i] = (byte) 127;
        }
    }
    convertToJpeg(ptrOut, filePath);
}
For NV21/NV12, where the interleaved U/V data occupies the last third of the buffer, I think the loop would change to:
for (int i = (ptrOut.length / 3) * 2, ptrInLength = ptrOut.length; i < ptrInLength; i++) {
    ptrOut[i] = (byte) 127;
}
Note: (didn't try this myself)
Also, I would suggest profiling your utils method and the createBitmap call separately.
I have a problem detecting speed traffic signs with opencv 2.4 for Android.
I do the following:
"capture frame -> convert it to HSV -> extract red areas -> detect signs with ellipse detection"
So far the ellipse detection works perfectly, as long as the picture is good quality.
But as you can see in the pictures below, the red extraction does not work well, in my opinion because of the poor quality of the picture frames.
Converting original image to HSV:
Imgproc.cvtColor(this.source, this.source, Imgproc.COLOR_RGB2HSV, 3);
Extracting red colors:
Core.inRange(this.source, new Scalar(this.h,this.s,this.v), new Scalar(230,180,180), this.source);
So my question is: is there another way of detecting a traffic sign like this, or of extracting the red areas out of it, which by the way can be very faint, like in the last picture?
This is the original image:
This is converted to HSV; as you can see, the red areas look the same color as the nearby trees. That's supposed to be how I recognize the red, but I can't.
Converted to HSV:
This is with the red colors extracted. If the colors were correct I should get an almost perfect circle/ellipse around the sign, but it is incomplete due to the false colors.
Result after extraction:
Ellipse method:
private void findEllipses(Mat input){
    Mat thresholdOutput = new Mat();
    int thresh = 150;
    List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
    MatOfInt4 hierarchy = new MatOfInt4();

    Imgproc.threshold(source, thresholdOutput, thresh, 255, Imgproc.THRESH_BINARY);
    //Imgproc.Canny(source, thresholdOutput, 50, 180);
    Imgproc.findContours(source, contours, hierarchy, Imgproc.RETR_LIST, Imgproc.CHAIN_APPROX_SIMPLE);

    RotatedRect minEllipse[] = new RotatedRect[contours.size()];
    for (int i = 0; i < contours.size(); i++) {
        MatOfPoint2f temp = new MatOfPoint2f(contours.get(i).toArray());
        if (temp.size().height > minEllipseSize && temp.size().height < maxEllipseSize) {
            double a = Imgproc.fitEllipse(temp).size.height;
            double b = Imgproc.fitEllipse(temp).size.width;
            if (Math.abs(a - b) < 10)
                minEllipse[i] = Imgproc.fitEllipse(temp);
        }
    }

    detectedObjects.clear();
    for (int i = 0; i < contours.size(); i++) {
        Scalar color = new Scalar(180, 255, 180);
        if (minEllipse[i] != null) {
            detectedObjects.add(new DetectedObject(minEllipse[i].center));
            DetectedObject detectedObj = new DetectedObject(minEllipse[i].center);
            Core.ellipse(source, minEllipse[i], color, 2, 8);
        }
    }
}
Problematic sign:
You can find a review of traffic sign detection methods here and here.
You'll see that there are 2 ways you can achieve this:
Color-based (like what you're doing now)
Shape-based
In my experience, I found that shape-based methods work pretty well, because the color may change a lot under different lighting conditions, camera quality, etc.
Since you need to detect speed traffic signs, which I assume are always circular, you can use an ellipse detector to find all circular objects in your image, and then apply some validation to determine if it's a traffic sign or not.
Why ellipse detection?
Well, since you're looking for perspective-distorted circles, you are in fact looking for ellipses. Real-time ellipse detection is an interesting (although limited) research topic. I'll point you to two papers with C++ source code available (which you can use in your app through native JNI calls):
L. Libuda, I. Grothues, K.-F. Kraiss, "Ellipse detection in digital image data using geometric features", in: J. Braz, A. Ranchordas, H. Araújo, J. Jorge (Eds.), Advances in Computer Graphics and Computer Vision, volume 4 of Communications in Computer and Information Science, Springer Berlin Heidelberg, 2007, pp. 229-239. link, code
M. Fornaciari, A. Prati, R. Cucchiara, "A fast and effective ellipse detector for embedded vision applications", Pattern Recognition, 2014. link, code
UPDATE
I tried method 2) without any preprocessing. You can see that at least the sign with the red border is detected very well:
Referring to your text:
This is converted to HSV; as you can see, the red areas look the same color
as the nearby trees. That's supposed to be how I recognize the red, but I can't.
I want to show you my result of basically what you did (these simple operations should be easily transferable to Android OpenCV):
// convert to HSV
cv::Mat hsv;
cv::cvtColor(input,hsv,CV_BGR2HSV);
std::vector<cv::Mat> channels;
cv::split(hsv,channels);
// opencv = hue values are divided by 2 to fit 8 bit range
float red1 = 25/2.0f;
// red has one part at the beginning and one part at the end of the range (I assume 0° to 25° and 335° to 360°)
float red2 = (360-25)/2.0f;
// compute both thresholds
cv::Mat thres1 = channels[0] < red1;
cv::Mat thres2 = channels[0] > red2;
// choose some minimum saturation
cv::Mat saturationThres = channels[1] > 50;
// combine the results
cv::Mat redMask = (thres1 | thres2) & saturationThres;
// display result
cv::imshow("red", redMask);
These are my results:
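Since the question is about Android, here is a rough, untested Java translation of the same thresholding using Core.inRange, with the same assumptions as above (red below 25° or above 335°, saturation above 50, and an RGB input Mat):

Mat hsv = new Mat();
Imgproc.cvtColor(input, hsv, Imgproc.COLOR_RGB2HSV);

Mat lowRed = new Mat();
Mat highRed = new Mat();
Mat redMask = new Mat();

// Hue in OpenCV is stored as 0..179 (degrees divided by 2).
Core.inRange(hsv, new Scalar(0, 50, 0), new Scalar(25 / 2.0, 255, 255), lowRed);            // ~0°..25°
Core.inRange(hsv, new Scalar((360 - 25) / 2.0, 50, 0), new Scalar(179, 255, 255), highRed); // ~335°..360°
Core.bitwise_or(lowRed, highRed, redMask);

Putting the minimum saturation directly into the inRange bounds replaces the separate saturationThres mask from the C++ version.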
Regarding your result, please keep in mind that findContours alters the input image, so maybe you did extract the ellipse but just don't see it in the image anymore if you saved the image AFTER findContours.
private void findEllipses(Mat input){
    Mat thresholdOutput = new Mat();
    int thresh = 150;
    List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
    MatOfInt4 hierarchy = new MatOfInt4();

    Imgproc.threshold(source, thresholdOutput, thresh, 255, Imgproc.THRESH_BINARY);
    //Imgproc.Canny(source, thresholdOutput, 50, 180);
    Imgproc.findContours(source, contours, hierarchy, Imgproc.RETR_LIST, Imgproc.CHAIN_APPROX_SIMPLE);
    // source = thresholdOutput;

    RotatedRect minEllipse[] = new RotatedRect[contours.size()];
    for (int i = 0; i < contours.size(); i++) {
        MatOfPoint2f temp = new MatOfPoint2f(contours.get(i).toArray());
        if (temp.size().height > minEllipseSize && temp.size().height < maxEllipseSize) {
            double a = Imgproc.fitEllipse(temp).size.height;
            double b = Imgproc.fitEllipse(temp).size.width;
            if (Math.abs(a - b) < 10)
                minEllipse[i] = Imgproc.fitEllipse(temp);
        }
    }

    detectedObjects.clear();
    for (int i = 0; i < contours.size(); i++) {
        Scalar color = new Scalar(180, 255, 180);
        if (minEllipse[i] != null) {
            detectedObjects.add(new DetectedObject(minEllipse[i].center));
            DetectedObject detectedObj = new DetectedObject(minEllipse[i].center);
            Core.ellipse(source, minEllipse[i], color, 2, 8);
        }
    }
}
Have you tried using OpenCV ORB? It works really well.
I created a Haar cascade for a traffic sign (a roundabout in my case) and used OpenCV ORB to match features and remove any false positives.
For image recognition I used Google's TensorFlow, and the results were spectacular.
OK, so I've been racking my brain on this one all day, trying to figure out how I can convert a Bitmap from a canvas to a 1 bpp (bit per pixel) BMP file in Android and physically save it as such.
So far I've iterated through the bitmap and created an int[] of the resulting pixel values as 1s or 0s. However, my next question is: what do I do with that?
What I tried to do was something like this:
int[] bits = // populated earlier
byte[] bmp = new byte[bits.length / 8];
int byteindex = 0;
int bitindex = 0;

for (int i = 0; i < bits.length; i++) {
    if (bits[i] == 1)
        // set to 1
    else
        // set to 0

    if (bitindex++ == 8) {
        bitindex = 0;
        byteindex++;
    }
}
OutputStream out = new FileOutputStream("/mnt/sdcard/dynbmp.bmp");
out.write(bmp);
out.close();
I get a file out of it, but it's obviously not a valid BMP file. Who knows what it is. You'll have to forgive me for my lack of bit/byte and imaging knowledge, but where am I screwing up? Do I have the idea completely wrong? Am I missing some header info or something?
Yes, you are missing several things. It's a little bit more complicated... Look here:
http://en.wikipedia.org/wiki/BMP_file_format
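To make that concrete, here is a rough, untested sketch of a minimal 1 bpp BMP writer (assuming a simple black/white palette and the int[] of 0/1 values from the question; all offsets and field sizes come from the format description linked above):

import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Writes a 1 bpp BMP from an int[] of 0/1 values (row-major, top-left origin).
static void write1BppBmp(int[] bits, int width, int height, String path) throws IOException {
    int rowSize = ((width + 31) / 32) * 4;  // each row is padded to a 4-byte boundary
    int imageSize = rowSize * height;
    int dataOffset = 14 + 40 + 8;           // file header + info header + 2-entry palette
    int fileSize = dataOffset + imageSize;

    ByteBuffer buf = ByteBuffer.allocate(fileSize).order(ByteOrder.LITTLE_ENDIAN);

    // BITMAPFILEHEADER (14 bytes)
    buf.put((byte) 'B').put((byte) 'M');
    buf.putInt(fileSize);
    buf.putInt(0);           // reserved
    buf.putInt(dataOffset);  // offset to pixel data

    // BITMAPINFOHEADER (40 bytes)
    buf.putInt(40);          // header size
    buf.putInt(width);
    buf.putInt(height);      // positive height = rows stored bottom-up
    buf.putShort((short) 1); // planes
    buf.putShort((short) 1); // bits per pixel
    buf.putInt(0);           // BI_RGB (no compression)
    buf.putInt(imageSize);
    buf.putInt(2835);        // horizontal resolution (~72 dpi)
    buf.putInt(2835);        // vertical resolution (~72 dpi)
    buf.putInt(2);           // colors in palette
    buf.putInt(2);           // important colors

    // Palette: index 0 = black, index 1 = white (stored as B, G, R, reserved)
    buf.put(new byte[]{0, 0, 0, 0});
    buf.put(new byte[]{(byte) 255, (byte) 255, (byte) 255, 0});

    // Pixel data: bottom-up rows, most significant bit first within each byte.
    for (int y = height - 1; y >= 0; y--) {
        byte[] row = new byte[rowSize];
        for (int x = 0; x < width; x++) {
            if (bits[y * width + x] != 0) {
                row[x / 8] |= (byte) (0x80 >> (x % 8));
            }
        }
        buf.put(row);
    }

    try (FileOutputStream out = new FileOutputStream(path)) {
        out.write(buf.array());
    }
}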
We have been dealing with OpenCV for two weeks, trying to make it work on Android.
Do you know where we can find an Android implementation of optical flow? It would be nice if it were implemented using OpenCV.
openFrameworks has OpenCV baked in, as well as many other interesting libraries. It has a very elegant structure, and I have used it with Android to turn the phone into a virtual mouse using motion estimation from the camera.
See the ports to Android here: http://openframeworks.cc/setup/android-studio/
It seems they recently added support for Android Studio; otherwise, Eclipse works great.
Try this
@Override
public Mat onCameraFrame(CvCameraViewFrame inputFrame) {
mRgba = inputFrame.rgba();
if (mMOP2fptsPrev.rows() == 0) {
//Log.d("Baz", "First time opflow");
// first time through the loop so we need prev and this mats
// plus prev points
// get this mat
Imgproc.cvtColor(mRgba, matOpFlowThis, Imgproc.COLOR_RGBA2GRAY);
// copy that to prev mat
matOpFlowThis.copyTo(matOpFlowPrev);
// get prev corners
Imgproc.goodFeaturesToTrack(matOpFlowPrev, MOPcorners, iGFFTMax, 0.05, 20);
mMOP2fptsPrev.fromArray(MOPcorners.toArray());
// get safe copy of this corners
mMOP2fptsPrev.copyTo(mMOP2fptsSafe);
}
else
{
//Log.d("Baz", "Opflow");
// we've been through before so
// this mat is valid. Copy it to prev mat
matOpFlowThis.copyTo(matOpFlowPrev);
// get this mat
Imgproc.cvtColor(mRgba, matOpFlowThis, Imgproc.COLOR_RGBA2GRAY);
// get the corners for this mat
Imgproc.goodFeaturesToTrack(matOpFlowThis, MOPcorners, iGFFTMax, 0.05, 20);
mMOP2fptsThis.fromArray(MOPcorners.toArray());
// retrieve the corners from the prev mat
// (saves calculating them again)
mMOP2fptsSafe.copyTo(mMOP2fptsPrev);
// and save this corners for next time through
mMOP2fptsThis.copyTo(mMOP2fptsSafe);
}
/*
Parameters:
prevImg first 8-bit input image
nextImg second input image
prevPts vector of 2D points for which the flow needs to be found; point coordinates must be single-precision floating-point numbers.
nextPts output vector of 2D points (with single-precision floating-point coordinates) containing the calculated new positions of input features in the second image; when OPTFLOW_USE_INITIAL_FLOW flag is passed, the vector must have the same size as in the input.
status output status vector (of unsigned chars); each element of the vector is set to 1 if the flow for the corresponding features has been found, otherwise, it is set to 0.
err output vector of errors; each element of the vector is set to an error for the corresponding feature, type of the error measure can be set in flags parameter; if the flow wasn't found then the error is not defined (use the status parameter to find such cases).
*/
Video.calcOpticalFlowPyrLK(matOpFlowPrev, matOpFlowThis, mMOP2fptsPrev, mMOP2fptsThis, mMOBStatus, mMOFerr);
cornersPrev = mMOP2fptsPrev.toList();
cornersThis = mMOP2fptsThis.toList();
byteStatus = mMOBStatus.toList();
y = byteStatus.size() - 1;
for (x = 0; x < y; x++) {
if (byteStatus.get(x) == 1) {
pt = cornersThis.get(x);
pt2 = cornersPrev.get(x);
Core.circle(mRgba, pt, 5, colorRed, iLineThickness - 1);
Core.line(mRgba, pt, pt2, colorRed, iLineThickness);
}
}
return mRgba;
}
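For completeness, the snippet above assumes member fields roughly like the following (a sketch: the types are implied by the calls above, and the initial values are only illustrative):

private Mat mRgba;
private Mat matOpFlowThis = new Mat();
private Mat matOpFlowPrev = new Mat();
private MatOfPoint MOPcorners = new MatOfPoint();
private MatOfPoint2f mMOP2fptsThis = new MatOfPoint2f();
private MatOfPoint2f mMOP2fptsPrev = new MatOfPoint2f();
private MatOfPoint2f mMOP2fptsSafe = new MatOfPoint2f();
private MatOfByte mMOBStatus = new MatOfByte();
private MatOfFloat mMOFerr = new MatOfFloat();
private List<Point> cornersThis, cornersPrev;
private List<Byte> byteStatus;
private Point pt, pt2;
private Scalar colorRed = new Scalar(255, 0, 0, 255);
private int iGFFTMax = 40;       // max corners for goodFeaturesToTrack
private int iLineThickness = 3;
private int x, y;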