I've seen some questions here that are related to my error, such as this and this, and I know that I can't call Imgproc.matchTemplate() if the image and the template don't have the same data type. But I'm still confused about how to find out what type of Mat I'm using.
Below is my code, which I adapted from the example here:
for (int i = 0; i < 24; i++) {
arrDraw[i] = getResources().getIdentifier("let" + i, "drawable", getPackageName());
}
Mat mImage = input.submat(bigRect);
for (int i = 0; i < 24; i++) {
Mat mTemplate = Utils.loadResource(this, arrDraw[i], Highgui.CV_LOAD_IMAGE_COLOR);
Mat mResult = new Mat(mImage.rows(), mImage.cols(), CvType.CV_32FC1);
Imgproc.matchTemplate(mImage, mTemplate, mResult, match_method);
Core.normalize(mResult, mResult, 0, 1, Core.NORM_MINMAX, -1, new Mat());
... // further process
}
So basically I'm trying to take mImage from a submat of the input frame, run template matching against 24 other pictures, and decide which one gives the best value (either lowest or highest, depending on the match method). Yet I get this error:
OpenCV Error: Assertion failed ((img.depth() == CV_8U || img.depth() == CV_32F) && img.type() == templ.type()) in void cv::matchTemplate(cv::InputArray, cv::InputArray, cv::OutputArray, int), file /home/reports/ci/slave_desktop/50-SDK/opencv/modules/imgproc/src/templmatch.cpp, line 249
I tried to initialize mImage and mTemplate with the same type first, but still no luck. Any advice? Thanks in advance.
The error is telling you that image and template have different types.
Assertion failed ... img.type() == templ.type() ....
I'd be willing to bet (a small amount) that mTemplate is CV_8UC3 BGR ordered.
From the code you posted it's not possible to tell what mImage's type is, though if it's extracted from a camera frame, and you did something like:
public Mat onCameraFrame(CvCameraViewFrame inputFrame) {
Mat input = inputFrame.rgba();
....
}
then it's likely to be CV_8UC4, RGBA ordered, which is not the same type.
Also note that submat() keeps the type of its input: a color Mat is still a 2D matrix, just with 3 or 4 channels per element, so a submat of a CV_8UC4 frame is still CV_8UC4.
I'd suggest that you try dumping the type() and depth() of both image and template before your matchTemplate( ... ) call.
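To make the dumped numbers readable, here is a minimal plain-Java sketch that decodes the integer returned by Mat.type() into the familiar name. It assumes OpenCV's usual encoding (depth in the low 3 bits, channels minus one above them); the class and method names are made up for illustration and are not part of OpenCV. I believe the Java bindings also offer CvType.typeToString(int) for the same purpose.

```java
// Plain-Java sketch: decode an OpenCV Mat type code into a readable name.
// MatTypeNames, depth, channels and describe are hypothetical helper names.
final class MatTypeNames {
    // Depth names indexed by depth code (type & 7), as in OpenCV's core headers.
    private static final String[] DEPTHS = {
        "CV_8U", "CV_8S", "CV_16U", "CV_16S", "CV_32S", "CV_32F", "CV_64F"
    };

    // The depth code lives in the low three bits of the type code.
    static int depth(int type) {
        return type & 7;
    }

    // The channel count minus one is stored above the depth bits.
    static int channels(int type) {
        return (type >> 3) + 1;
    }

    static String describe(int type) {
        int d = depth(type);
        // Depth code 7 means different things across OpenCV versions, so it is not named here.
        String name = d < DEPTHS.length ? DEPTHS[d] : "CV_DEPTH" + d;
        return name + "C" + channels(type);
    }

    public static void main(String[] args) {
        System.out.println(describe(16)); // CV_8UC3
        System.out.println(describe(24)); // CV_8UC4
    }
}
```

With this, logging describe(mImage.type()) and describe(mTemplate.type()) right before matchTemplate shows at a glance whether the two types differ.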
The question comes first: I am looking for a FAST approach to match images.
Now, the use case: I am developing a detector to detect orbs on a 6x5 Match-3 game board on the Android platform. I have an array of orb icons with transparent backgrounds, but the orbs on the screen (screenshot) have a different background color and probably a different size too. I have to compare each orb on the screen with my array of icons (69 icons specifically), so that's 69x30=2070 comparisons. I tried a lazy implementation and grouped similar icons together to reduce the number of comparisons, but it still takes a long time (up to 10s) to compute. I also tried checking the channels and depth of the images, resizing the images to the same size and tweaking the threshold value, but still no luck.
I have tried Histogram Matching (separate channels, grayscale), Template Matching (CCOEFF, SQDIFF, CCORR), AKAZE, ORB (unbounded, bounded) and PHash, all using OpenCV. Histogram matching and PHash give me erroneous results (too many false positives), Template Matching takes 10s+ (too slow for a user to wait), while AKAZE and ORB give better results than all the other methods but still need 6s+ per try. Is there any other method that can help me cut the computation time down to somewhere near 1s and give better results, considering the worst case is 2070 comparisons?
References I have read that compare the performance of different feature matching algorithms:
A comparative analysis of SIFT, SURF, KAZE, AKAZE, ORB, and BRISK. It shows that ORB and BRISK are on average better than the other approaches compared, while AKAZE is moderately good in most cases. I deleted my histogram comparison code as it was not really helpful, but you may find the rest below.
Mat source = Utils.loadResource(this, R.drawable.orb_icon, Imgcodecs.CV_LOAD_IMAGE_UNCHANGED);
Mat tmp = new Mat();
Bitmap cropped_img = Bitmap.createBitmap(screenshot, x, y, width, height);
Utils.bitmapToMat(cropped_img, tmp);
//template matching code
int r_rows = source.rows() - tmp.rows() + 1;
int r_cols = source.cols() - tmp.cols() + 1;
Mat result = new Mat();
result.create(r_rows, r_cols, CvType.CV_32F);
Imgproc.matchTemplate(source, tmp, result, Imgproc.TM_CCOEFF_NORMED);
Core.MinMaxLocResult mmr = Core.minMaxLoc(result);
double maxVal = mmr.maxVal;
return maxVal;
//AKAZE
MatOfKeyPoint kp1 = new MatOfKeyPoint();
MatOfKeyPoint kp2 = new MatOfKeyPoint();
Mat desc1 = new Mat();
Mat desc2 = new Mat();
AKAZE akaze = AKAZE.create();
DescriptorMatcher matcher = DescriptorMatcher.create(DescriptorMatcher.BRUTEFORCE_HAMMING);
akaze.detectAndCompute(source, new Mat(), kp1, desc1);
akaze.detectAndCompute(tmp, new Mat(), kp2, desc2);
List<MatOfDMatch> knnMatches = new ArrayList<>();
matcher.knnMatch(desc1, desc2, knnMatches, 2);
float threshold = 0.7f;
int count = 0;
for(int i=0; i<knnMatches.size(); i++) {
if(knnMatches.get(i).rows() > 1) {
DMatch[] matches = knnMatches.get(i).toArray();
if(matches[0].distance < threshold * matches[1].distance) {
count++;
}
}
}
//ORB
ORB orb = ORB.create();
DescriptorMatcher matcher = DescriptorMatcher.create(DescriptorMatcher.BRUTEFORCE_HAMMING);
MatOfKeyPoint kp1 = new MatOfKeyPoint();
MatOfKeyPoint kp2 = new MatOfKeyPoint();
Mat desc1 = new Mat();
Mat desc2 = new Mat();
orb.detectAndCompute(source, new Mat(), kp1, desc1);
orb.detectAndCompute(tmp, new Mat(), kp2, desc2);
List<MatOfDMatch> knnMatches = new ArrayList<>();
matcher.knnMatch(desc1, desc2, knnMatches, 2);
float threshold = 0.8f;
int count = 0;
for(int i=0; i<knnMatches.size(); i++) {
if(knnMatches.get(i).rows() > 1) {
DMatch[] matches = knnMatches.get(i).toArray();
if(matches[0].distance < threshold * matches[1].distance) {
count++;
}
}
}
//PHash
Mat hash_source = new Mat();
Mat hash_tmp = new Mat();
Img_hash.pHash(tmp, hash_tmp);
Img_hash.pHash(source, hash_source);
double hashDistance = Core.norm(hash_source, hash_tmp, Core.NORM_HAMMING); // compare the two hashes, not the raw images
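For reference, the Hamming distance between two perceptual hashes is simply the number of bits in which they differ. Below is a minimal plain-Java sketch, assuming each hash has already been copied into a byte[] (pHash in OpenCV's img_hash module produces 8 bytes per image). The class and method names are hypothetical.

```java
// Plain-Java sketch: Hamming distance between two equal-length hashes.
// HashDistance and hamming are hypothetical names, not OpenCV API.
final class HashDistance {
    static int hamming(byte[] a, byte[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("hash lengths differ");
        }
        int bits = 0;
        for (int i = 0; i < a.length; i++) {
            // XOR leaves a 1 wherever the two bytes disagree; mask off sign extension.
            bits += Integer.bitCount((a[i] ^ b[i]) & 0xFF);
        }
        return bits;
    }
}
```

A distance of 0 means identical hashes. The threshold below which two images count as "the same" is application specific and has to be tuned on real samples.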
Edit: As suggested, below are the game board, icon image, and orb screenshot samples.
[Image: icon vs. orb screenshot]
You can also see the simulated result of each approach (a smaller icon overlaid on each orb on the board):
[Images: Histogram Matching, Template Matching, and AKAZE (similar to ORB)]
I have since moved the variable initialization out of my comparison function into the base class, computed the keypoints and PHash of the source icon images once at class initialization, and run the detect-and-compute calls in batches using a List to reduce individual function calls. The image matching still takes 4s+. Time consumption is reduced, but accuracy is still a major problem. You can see my heap stack below.
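For readers unfamiliar with the filter used in the AKAZE and ORB snippets above, it is the nearest-neighbor ratio test: a match counts only if its best distance is clearly smaller than the second-best one. Here is a plain-Java sketch of just that counting step, assuming the two distances per query descriptor have already been pulled out of the knnMatch result. The class and method names are hypothetical.

```java
// Plain-Java sketch of the ratio test used to count good matches.
// RatioTest and countGoodMatches are hypothetical names.
final class RatioTest {
    // distances[i] holds {best, secondBest} for query descriptor i;
    // rows with fewer than two neighbors are skipped, as in the snippets above.
    static int countGoodMatches(float[][] distances, float ratio) {
        int count = 0;
        for (float[] d : distances) {
            if (d.length > 1 && d[0] < ratio * d[1]) {
                count++;
            }
        }
        return count;
    }
}
```

A lower ratio (0.7 rather than 0.8) keeps fewer but more distinctive matches.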
I have an Android project with OpenCV 4.0.1 and TFLite installed.
I want to run inference with a pretrained MobileNetV2 on a cv::Mat which I extracted and cropped from a CameraBridgeViewBase (Android style).
But it's kind of difficult.
I followed this example.
It runs the inference on a ByteBuffer variable called "imgData" (line 71, class: org.tensorflow.lite.examples.classification.tflite.Classifier).
That imgData appears to be filled in the method called "convertBitmapToByteBuffer" of the same class (line 185), which adds pixel by pixel from a bitmap that seems to have been cropped shortly before.
private int[] intValues = new int[224 * 224];
Mat _croppedFace = new Mat(); // Cropped image from CvCameraViewFrame.rgba() method.
float[][] outputVal = new float[1][1]; // Output value from my MobileNetV2 trained model (I changed the output in training, tested in Python)
// Following: https://stackoverflow.com/questions/13134682/convert-mat-to-bitmap-opencv-for-android
Bitmap bitmap = Bitmap.createBitmap(_croppedFace.cols(), _croppedFace.rows(), Bitmap.Config.ARGB_8888);
Utils.matToBitmap(_croppedFace, bitmap);
convertBitmapToByteBuffer(bitmap); // This call should be used as the example one.
// runInference();
_tflite.run(imgData, outputVal);
But it looks like the input_shape of my NN is not correct, even though I'm following the MobileNet example because my NN is a MobileNetV2.
I've solved the error, but I'm sure that it isn't the best way to do it.
The Keras MobileNetV2 input_shape is: (nBatches, 224, 224, nChannels).
I just want to predict a single image, so nBatches == 1, and I'm working in RGB mode, so nChannels == 3.
// Nasty nasty, but works. nBatches == 2? -- _croppedFace.shape() == (244, 244), 3 channels.
float[][][][] _inputValue = new float[2][_croppedFace.rows()][_croppedFace.cols()][3];
// Fill the _inputValue. Mat.get(row, col) takes the row index first.
for (int row = 0; row < _croppedFace.rows(); ++row)
for (int col = 0; col < _croppedFace.cols(); ++col)
for (int z = 0; z < 3; ++z)
_inputValue[0][row][col][z] = (float) _croppedFace.get(row, col)[z] / 255; // DL works better with 0:1 values.
/*
The output has this shape, but I don't really know why.
I'm sure that one of those 2's is nClasses (I'm working with 2 classes),
but I don't really know why it needs the other one.
*/
float[][] outputVal = new float[2][2];
// Tensorflow lite interpreter
_tflite.run(_inputValue , outputVal);
In Python the output has the same shape:
Python prediction:
[[XXXXXX, YYYYY]] <- Sure for the last layer that I made, this is just a prototype NN.
I hope this helps someone, and also that someone can improve the answer, because this is not very optimized.
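One possible alternative to reading the Mat element by element, sketched here in plain Java: fill the input tensor from the packed ARGB int array that Bitmap.getPixels produces (the intValues buffer in the example). This assumes a single-image batch, a [batch][height][width][channels] layout and a model trained on values scaled to 0..1. The class and method names are made up for illustration.

```java
// Plain-Java sketch: packed ARGB pixels -> [1][height][width][3] float tensor.
// InputTensor and fromArgb are hypothetical names, not TFLite or Android API.
final class InputTensor {
    static float[][][][] fromArgb(int[] pixels, int width, int height) {
        if (pixels.length != width * height) {
            throw new IllegalArgumentException("pixel count does not match width * height");
        }
        float[][][][] out = new float[1][height][width][3];
        for (int row = 0; row < height; row++) {
            for (int col = 0; col < width; col++) {
                int p = pixels[row * width + col]; // row-major pixel order
                out[0][row][col][0] = ((p >> 16) & 0xFF) / 255f; // red
                out[0][row][col][1] = ((p >> 8) & 0xFF) / 255f;  // green
                out[0][row][col][2] = (p & 0xFF) / 255f;         // blue
            }
        }
        return out;
    }
}
```

If the model was trained with a different normalization (for example -1..1), the scaling lines would need to change accordingly.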
tl;dr My KNearest training data and real data don't have the same dimensions, which causes my app to crash. I suspect that either my preProces method or the way I instantiate my training data (drawable resource => bitmap => OpenCV matrix) is the reason for the failure. Does anyone know a solution?
I've been trying to get a working demo of a simple OCR app with OpenCV for Android. I use the built-in KNearest to recognize the characters. Before a KNearest object is capable of detecting anything, it has to be trained. For the training I use several character outlines.
This is one of them (it's a zero).
The training seems to work; unsurprisingly, it is able to detect the expected values of the training images. I wish it did that with other images as well (or at least did not crash my app). This is what I did to train the KNearest model:
Map<Character, Integer> images = new HashMap<>();
images.put('0', R.drawable.training0);
// Prepare two sets of data, the images and their values.
Mat trainingImages = new Mat();
Mat trainingLabels = new Mat();
for (int i = 0; i < 50; i++) {
for (Map.Entry<Character, Integer> entry : images.entrySet()) {
Bitmap bitmapImage = BitmapFactory.decodeResource(
this.getResources(), entry.getValue());
Mat matImage = new Mat();
Utils.bitmapToMat(bitmapImage, matImage);
trainingLabels.push_back(new MatOfInt(entry.getKey() - '0'));
trainingImages.push_back(
preProces(
matImage, new Rect(0, 0, matImage.width(), matImage.height())));
}
}
mKNearest.train(trainingImages, Ml.ROW_SAMPLE, trainingLabels);
The preProces method does nothing more than normalizing a matrix. This is what my preProces method looks like:
private Mat preProces(Mat image, Rect poi) {
Mat cutout = new Mat(image, poi);
Mat resized = new Mat(10, 10, CvType.CV_32F);
Mat converted = new Mat();
Imgproc.resize(cutout, resized, resized.size());
resized.reshape(1, 1).convertTo(converted, CvType.CV_32F);
return converted;
}
Segmenting the image to find (possible) characters was not that difficult; I was able to draw rectangles around the (possible) characters. Once that is done, I pass every point of interest through my preProces method before passing it into the mKNearest.findNearest(...) method. This is when the crash happens. The training data and the real data don't seem to have the same dimensions, something the preProces method should solve.
My guess is that either my preProces method fails, or that loading drawable resources as bitmaps and then converting them to matrices is the reason why it fails. I'd like to know if any of you have had similar problems and how you solved them.
Update: It seems there is quite a bit of noise in the matrices which were created from a bitmap. Could this be the problem, and if so, how does one remove the noise?
It seems the answer to this question was pretty simple. I used Imgproc.Canny() to detect the edges of the real data but not of the training data. The problem was solved once I passed the training data through Imgproc.Canny().
...
Bitmap bitmapImage = BitmapFactory.decodeResource(
this.getResources(), entry.getValue());
Mat matImage = new Mat();
Utils.bitmapToMat(bitmapImage, matImage);
// This was all I had to add to the training data preparation.
Mat cannyImage = new Mat();
Imgproc.Canny(matImage, cannyImage, 1.0, 255.0);
trainingLabels.push_back(new MatOfInt(entry.getKey() - '0'));
trainingImages.push_back(
preProces(
cannyImage, new Rect(0, 0, cannyImage.width(), cannyImage.height())));
}
...
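A likely explanation for why this fixes the dimension mismatch (my reading, not something stated in the post): Utils.bitmapToMat yields a four-channel RGBA Mat, while Canny outputs a single channel, so after the 10x10 resize and reshape(1, 1) in preProces the two paths produce rows of different length. The arithmetic, as a tiny plain-Java sketch with hypothetical names:

```java
// Plain-Java sketch: row length produced by resize + reshape(1, 1).
// FeatureLength and flattenedColumns are hypothetical names.
final class FeatureLength {
    // reshape(1, 1) turns every channel of every pixel into one column.
    static int flattenedColumns(int rows, int cols, int channels) {
        return rows * cols * channels;
    }

    public static void main(String[] args) {
        System.out.println(flattenedColumns(10, 10, 4)); // RGBA bitmap path: 400
        System.out.println(flattenedColumns(10, 10, 1)); // Canny path: 100
    }
}
```

KNearest needs every sample row, in training and at query time, to have the same number of columns, so both paths have to go through the same preprocessing.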
The dst = signum(src) function sets the values of all positive elements in src to 1, and the values of all negative elements to -1.
However, it seems that it is not possible to implement signum() by applying the OpenCV function threshold(). I do not want to traverse src either, because that is inefficient.
I don't know which language you are using, but in OpenCV C++ a signum function can be implemented as follows:
Mat signum(Mat src)
{
Mat dst = (src >= 0) & 1;
dst.convertTo(dst,CV_32F, 2.0, -1.0);
return dst;
}
Of course, the returned matrix has to have a floating-point or signed type in order to store the value -1.
Update:
The previous implementation returns only 1 or -1 depending on the input values, but according to the definition of signum, 0 should remain 0 in the output. So, taking a cue from this answer, the standard signum function can be implemented with OpenCV as follows:
Mat signum(Mat src)
{
Mat z = Mat::zeros(src.size(), src.type());
Mat a = (z < src) & 1;
Mat b = (src < z) & 1;
Mat dst;
addWeighted(a,1.0,b,-1.0,0.0,dst, CV_32F);
return dst;
}
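To sanity-check the matrix version on small inputs, an element-wise reference can help. The plain-Java sketch below mirrors the same two-mask idea (positive mask minus negative mask) on a float array. It is only a reference for comparing results, not a replacement for the vectorized OpenCV code, and the names are hypothetical.

```java
// Plain-Java reference for signum via two masks: (x > 0) - (x < 0).
// Signum and signum are hypothetical names used for illustration.
final class Signum {
    static float[] signum(float[] src) {
        float[] dst = new float[src.length];
        for (int i = 0; i < src.length; i++) {
            int positive = src[i] > 0 ? 1 : 0; // plays the role of (z < src) & 1
            int negative = src[i] < 0 ? 1 : 0; // plays the role of (src < z) & 1
            dst[i] = positive - negative;      // zeros stay zero
        }
        return dst;
    }
}
```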
We have been working with OpenCV for two weeks, trying to make it work on Android.
Do you know where we can find an Android implementation of optical flow? It would be nice if it were implemented using OpenCV.
openFrameworks has OpenCV baked in, as well as many other interesting libraries. It has a very elegant structure, and I have used it on Android to turn the phone into a virtual mouse using motion estimation from the camera.
See the Android port here: http://openframeworks.cc/setup/android-studio/
It seems they recently added support for Android Studio; otherwise Eclipse works great.
Try this
@Override
public Mat onCameraFrame(CvCameraViewFrame inputFrame) {
mRgba = inputFrame.rgba();
if (mMOP2fptsPrev.rows() == 0) {
//Log.d("Baz", "First time opflow");
// first time through the loop so we need prev and this mats
// plus prev points
// get this mat
Imgproc.cvtColor(mRgba, matOpFlowThis, Imgproc.COLOR_RGBA2GRAY);
// copy that to prev mat
matOpFlowThis.copyTo(matOpFlowPrev);
// get prev corners
Imgproc.goodFeaturesToTrack(matOpFlowPrev, MOPcorners, iGFFTMax, 0.05, 20);
mMOP2fptsPrev.fromArray(MOPcorners.toArray());
// get safe copy of this corners
mMOP2fptsPrev.copyTo(mMOP2fptsSafe);
}
else
{
//Log.d("Baz", "Opflow");
// we've been through before so
// this mat is valid. Copy it to prev mat
matOpFlowThis.copyTo(matOpFlowPrev);
// get this mat
Imgproc.cvtColor(mRgba, matOpFlowThis, Imgproc.COLOR_RGBA2GRAY);
// get the corners for this mat
Imgproc.goodFeaturesToTrack(matOpFlowThis, MOPcorners, iGFFTMax, 0.05, 20);
mMOP2fptsThis.fromArray(MOPcorners.toArray());
// retrieve the corners from the prev mat
// (saves calculating them again)
mMOP2fptsSafe.copyTo(mMOP2fptsPrev);
// and save this corners for next time through
mMOP2fptsThis.copyTo(mMOP2fptsSafe);
}
/*
Parameters:
prevImg first 8-bit input image
nextImg second input image
prevPts vector of 2D points for which the flow needs to be found; point coordinates must be single-precision floating-point numbers.
nextPts output vector of 2D points (with single-precision floating-point coordinates) containing the calculated new positions of input features in the second image; when OPTFLOW_USE_INITIAL_FLOW flag is passed, the vector must have the same size as in the input.
status output status vector (of unsigned chars); each element of the vector is set to 1 if the flow for the corresponding features has been found, otherwise, it is set to 0.
err output vector of errors; each element of the vector is set to an error for the corresponding feature, type of the error measure can be set in flags parameter; if the flow wasn't found then the error is not defined (use the status parameter to find such cases).
*/
Video.calcOpticalFlowPyrLK(matOpFlowPrev, matOpFlowThis, mMOP2fptsPrev, mMOP2fptsThis, mMOBStatus, mMOFerr);
cornersPrev = mMOP2fptsPrev.toList();
cornersThis = mMOP2fptsThis.toList();
byteStatus = mMOBStatus.toList();
// iterate over every tracked point, including the last one
y = byteStatus.size();
for (x = 0; x < y; x++) {
if (byteStatus.get(x) == 1) {
pt = cornersThis.get(x);
pt2 = cornersPrev.get(x);
Core.circle(mRgba, pt, 5, colorRed, iLineThickness - 1);
Core.line(mRgba, pt, pt2, colorRed, iLineThickness);
}
}
return mRgba;
}
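If you want a single motion estimate rather than drawn lines, one option is to average the displacement of the successfully tracked points. Below is a minimal plain-Java sketch, assuming the previous points, the tracked points and the status flags have been copied into plain arrays. The class and method names are made up for illustration.

```java
// Plain-Java sketch: mean displacement of the points whose status flag is 1.
// FlowSummary and meanFlow are hypothetical names, not OpenCV API.
final class FlowSummary {
    // prev[i] and next[i] are {x, y} pairs; status[i] == 1 means the flow was found.
    static double[] meanFlow(double[][] prev, double[][] next, byte[] status) {
        double dx = 0;
        double dy = 0;
        int tracked = 0;
        for (int i = 0; i < status.length; i++) {
            if (status[i] == 1) {
                dx += next[i][0] - prev[i][0];
                dy += next[i][1] - prev[i][1];
                tracked++;
            }
        }
        // With no tracked points there is no motion estimate; report zero.
        if (tracked == 0) {
            return new double[] {0, 0};
        }
        return new double[] {dx / tracked, dy / tracked};
    }
}
```

A median instead of a mean would be more robust against a few badly tracked corners.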