I am stuck on this problem. I know there are many questions about it on Stack Overflow, but none of them gives the expected result in my case.
The Context:
I am using OpenCV for Android along with Tesseract to read the MRZ area of a passport. When the camera starts, I pass each input frame to an AsyncTask; the frame is processed and the MRZ area is extracted successfully. I then pass the extracted MRZ area to a function prepareForOCR(inputImage), which takes the MRZ area as a gray Mat and outputs a bitmap of the thresholded image that I pass to Tesseract.
The problem:
The problem is in thresholding the image. I use adaptive thresholding with blockSize = 13 and C = 15, but the result is not consistent: it varies with the lighting and the general conditions under which the frame was taken.
What I have tried:
First, I resize the image to a fixed size (871, 108) so the input image is always the same and not dependent on which phone is used.
After resizing, I try different blockSize and C values:
//toOcr contains the extracted MRZ area
Bitmap toOCRBitmap = Bitmap.createBitmap(bitmap);
Mat inputFrame = new Mat();
Mat toOcr = new Mat();
Utils.bitmapToMat(toOCRBitmap, inputFrame);
Imgproc.cvtColor(inputFrame, inputFrame, Imgproc.COLOR_BGR2GRAY);
TesseractResult lastResult = null;
for (int B = 11; B < 70; B++) {
    for (int C = 11; C < 70; C++) {
        if (IsPrime(B) && IsPrime(C)) {
            Imgproc.adaptiveThreshold(inputFrame, toOcr, 255, Imgproc.ADAPTIVE_THRESH_GAUSSIAN_C, Imgproc.THRESH_BINARY, B, C);
            Bitmap toOcrBitmap = OpenCVHelper.getBitmap(toOcr);
            TesseractResult result = TesseractInstance.extractFrame(toOcrBitmap, "ocrba");
            if (result.getMeanConfidence() > 70) {
                if (MrzParser.tryParse(result.getText())) {
                    Log.d("Main2Activity", "Best result with " + B + " : " + C);
                    return result;
                }
            }
        }
    }
}
With the code above, the thresholded result is a black-on-white image that gives a confidence greater than 70. I can't post the whole image for privacy reasons, but here is a clipped one and a dummy passport one.
Using the MrzParser.tryParse function, which checks each character's position and validity within the MRZ, I am able to correct some occurrences, such as a name containing an 8 instead of a B, and get a good result. However, it takes a long time, which is expected because I am thresholding almost 255 images in the loop and calling Tesseract on each of them.
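For illustration, one validity check a parser like this can apply is the MRZ check digit: each character is mapped to a number (digits keep their value, A to Z map to 10 to 35, the filler < is 0), multiplied by the repeating weights 7, 3, 1, and the sum is taken modulo 10. The sketch below is in Python for brevity and is not the MrzParser used above; the function names are hypothetical.

```python
def mrz_char_value(ch):
    """Map one MRZ character to its numeric value."""
    if len(ch) == 1 and ch in "0123456789":
        return int(ch)
    if len(ch) == 1 and "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0  # filler character
    raise ValueError("not an MRZ character: %r" % (ch,))


def mrz_check_digit(field):
    """Weighted sum with repeating weights 7, 3, 1, taken modulo 10."""
    weights = (7, 3, 1)
    total = sum(mrz_char_value(ch) * weights[i % 3] for i, ch in enumerate(field))
    return total % 10


def field_is_valid(field, check_char):
    """Hypothetical helper: True when the OCR'd check digit matches the field."""
    if len(check_char) != 1 or check_char not in "0123456789":
        return False
    return mrz_check_digit(field) == int(check_char)
```

A field where OCR confused 8 and B fails the check, so a check like this can reject a thresholding attempt cheaply before any further parsing.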
I already tried collecting the C and B values that occur most often, but the results differ.
The question:
Is there a way to choose C and blockSize values so that the thresholding always gives the same result, perhaps by adding more OpenCV steps on the input image, such as increasing the contrast? I have searched the web for two weeks and can't find a viable solution; the loop above is the only approach that gives accurate results.
You can use a clustering algorithm to cluster the pixels based on color. The characters are dark and there is a good contrast in the MRZ region, so a clustering method will most probably give you a good segmentation if you apply it to the MRZ region.
Here I demonstrate it with MRZ regions obtained from sample images that can be found on the internet.
I use color images, apply some smoothing, convert to Lab color space, then cluster the a, b channel data using k-means (k=2). The code is in Python but you can easily adapt it to Java. Due to the randomized nature of the k-means algorithm, the segmented characters will have label 0 or 1. You can easily sort it out by inspecting the cluster centers. The cluster center corresponding to the characters should have a dark value in the color space you are using.
I just used the Lab color space here. You can use RGB, HSV or even GRAY and see which one is better for you.
After segmenting like this, I think you can even find good values for B and C of your adaptive-threshold using the properties of the stroke width of the characters (if you think the adaptive-threshold gives a better quality output).
import cv2
import numpy as np
im = cv2.imread('mrz1.png')
# smooth, then convert to Lab
lab = cv2.cvtColor(cv2.GaussianBlur(im, (3, 3), 1), cv2.COLOR_BGR2Lab)
# keep only the a and b channels of the Lab image, as float32 for kmeans
im32f = np.array(lab[:, :, 1:3], dtype=np.float32)
k = 2 # 2 clusters
term_crit = (cv2.TERM_CRITERIA_EPS, 30, 0.1)
ret, labels, centers = cv2.kmeans(im32f.reshape([im.shape[0]*im.shape[1], -1]),
k, None, term_crit, 10, 0)
# segmented image
labels = labels.reshape([im.shape[0], im.shape[1]]) * 255
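Separately from the OpenCV code above, the label ambiguity can be resolved by comparing the cluster centers and treating the darker one as the characters. The following is a minimal pure-Python sketch on gray values only, with a deterministic initialization (unlike cv2.kmeans); the function names are hypothetical.

```python
def two_means_1d(values, iterations=20):
    """Cluster scalar values into 2 groups; returns (labels, centers)."""
    # Deterministic start: the smallest and the largest value.
    centers = [float(min(values)), float(max(values))]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest center.
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            new_centers.append(sum(members) / len(members) if members else centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def character_mask(values):
    """255 where a pixel belongs to the darker cluster, 0 elsewhere."""
    labels, centers = two_means_1d(values)
    dark = 0 if centers[0] <= centers[1] else 1
    return [255 if lab == dark else 0 for lab in labels]
```

The same comparison works on the centers returned by cv2.kmeans, provided the clustered data includes a channel that encodes brightness.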
Some results:
Related
Hello Stack Overflow community, I would appreciate some guidance on the following question. I want to make an application that takes a photo when it detects a sheet with 3 marks (black squares in the corners), similar to what a QR code has. I have read a little about OpenCV, which I think could help me, but I am not yet clear on how to use it.
Here is my example:
Once you obtain your binary image, you can find contours and filter them using contour approximation and contour area. If the approximated contour has four vertices, we treat it as a square, and if its area is within a lower and upper range, we have detected a mark. We keep a counter of the marks, and if there are three marks in the image, we can take the photo. Here's the visualization of the process.
We apply Otsu's threshold to obtain a binary image with the objects to detect in white.
From here we find contours using cv2.findContours and filter using contour approximation cv2.approxPolyDP in addition to contour area cv2.contourArea.
Detected marks highlighted in teal
I implemented it in Python, but you can adapt the same approach.
Code
import cv2
# Load image, grayscale, Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Find contours and filter using contour approximation and contour area
marks = 0
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    area = cv2.contourArea(c)
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.04 * peri, True)
    if len(approx) == 4 and area > 250 and area < 400:
        x,y,w,h = cv2.boundingRect(c)
        cv2.rectangle(image, (x, y), (x + w, y + h), (200,255,12), 2)
        marks += 1
# Sheet has 3 marks
if marks == 3:
    print('Take photo')
cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.waitKey()
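To make the filter rule explicit without OpenCV, here is a small pure-Python sketch of the same test: four vertices and an area inside a range. It computes the area of the approximated polygon with the shoelace formula, whereas the code above uses cv2.contourArea on the raw contour, so treat it as an illustration of the idea only. The function names are hypothetical, and the 250 to 400 range is the one from the code above and depends on the image resolution.

```python
def polygon_area(points):
    """Shoelace formula for a simple polygon given as [(x, y), ...]."""
    total = 0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def is_mark(approx_points, min_area=250, max_area=400):
    """A mark is a 4-vertex polygon whose area lies strictly inside the range."""
    return (len(approx_points) == 4
            and min_area < polygon_area(approx_points) < max_area)


def count_marks(polygons):
    """Count how many polygons pass the mark test."""
    return sum(1 for p in polygons if is_mark(p))


def square(x, y, side):
    """Helper for the example: axis-aligned square as four corner points."""
    return [(x, y), (x + side, y), (x + side, y + side), (x, y + side)]
```

For example, count_marks over three 18x18 squares, one triangle, and one 100x100 square counts only the three small squares.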
Not sure if this is the right way to ask, but please help. I have an image of a dented car. I have to process it and highlight the dents and return the number of dents. I was able to do it reasonably well with the following result:
The matlab code is:
img2=rgb2gray(i1);
imshow(img2);
img3=imtophat(img2,strel('disk',15));
img4=imadjust(img3);
layer=img4(:,:,1);
img5=layer>100 & layer<250;
img6=imfill(img5,'holes');
img7=bwareaopen(img6,5);
[L,ans]=bwlabeln(img7);
imshow(img7);
I=imread(i1);
Ians=CarDentIdentification(I);
However, when I try to do this using opencv, I get this:
With the following code:
Imgproc.cvtColor(source, middle, Imgproc.COLOR_RGB2GRAY);
Imgproc.equalizeHist(middle, middle);
Imgproc.threshold(middle, middle, 150, 255, Imgproc.THRESH_OTSU);
Please tell me how can I obtain better results in opencv, and also how to count the dents? I tried findcontour() but it gives a very large number. I tried on other images as well, but I'm not getting proper results.
Please help.
According to the MATLAB documentation, imtophat performs top-hat filtering: it computes the morphological opening of the image (using imopen) and then subtracts the result from the original image.
You could do this in OpenCV with the following steps:
Step 1: Get the disk structuring element
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
Step 2: Compute opening of the image and then subtract the result from the original image
tophat = cv2.morphologyEx(v, cv2.MORPH_TOPHAT, kernel)
This gives following result -
Step 3 - Now you could just manually threshold it or use Otsu -
ret, thresh = cv2.threshold(tophat, 17, 255, 0)
which gives you the following image -
Since the OP wants the code in Java, here is the probable code in Java:
private Mat topHat(Mat image)
{
    // 15x15 elliptical (disk-like) structuring element with the default, centered anchor
    Mat element = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(15, 15));
    Mat dst = new Mat();
    // Top-hat: the image minus its morphological opening
    Imgproc.morphologyEx(image, dst, Imgproc.MORPH_TOPHAT, element);
    return dst;
}
Make sure you do this on a grayscale image (CvType.CV_8UC1); then you can threshold it suitably.
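To see what top-hat filtering does without any library, here is a minimal one-dimensional sketch in plain Python: erosion is a sliding minimum, dilation is a sliding maximum, opening is erosion followed by dilation, and top-hat is the signal minus its opening. The border handling (windows truncated at the ends) is a simplifying assumption and differs from OpenCV's default border mode.

```python
def erode(signal, radius):
    """Sliding minimum over a window of 2 * radius + 1 samples."""
    n = len(signal)
    return [min(signal[max(0, i - radius):min(n, i + radius + 1)])
            for i in range(n)]


def dilate(signal, radius):
    """Sliding maximum over a window of 2 * radius + 1 samples."""
    n = len(signal)
    return [max(signal[max(0, i - radius):min(n, i + radius + 1)])
            for i in range(n)]


def top_hat(signal, radius):
    """Signal minus its morphological opening (erode, then dilate)."""
    opened = dilate(erode(signal, radius), radius)
    return [s - o for s, o in zip(signal, opened)]
```

A narrow bright spike on a slowly varying background survives the subtraction while the background goes to zero, which is why a fixed threshold becomes workable after this step.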
I want to remove the black borders around a license plate. I am using OpenCV on Android.
Please reply with code that I can use to remove the borders.
I have also attached the image.
You can perform Difference of Gaussians (DoG) to detect the high-frequency details in your image. By high frequency in an image I mean distinct edges and corners.
Here is the code as requested. The explanations are placed as comments by the side:
import cv2
img = cv2.imread('number_plate.jpg') #---Reading the image---
img1 = img.copy() #----The final contour will be drawn on the copy of the original image---
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) #---converting to gray scale---
Before performing DoG, I enhanced the grayscale image by applying adaptive histogram equalization:
clahe = cv2.createCLAHE(clipLimit=3.0, tileGridSize=(8,8))
enhanced = clahe.apply(gray_img)
cv2.imshow('enhanced_gray_img', enhanced)
Now I performed Gaussian blur using two separate kernels and subtracted the resulting images as follows:
blur1 = cv2.GaussianBlur(enhanced, (15, 15), 0)
blur2 = cv2.GaussianBlur(enhanced, (25, 25), 0)
difference = blur2 - blur1
cv2.imshow('Difference_of_Gaussians', difference)
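One detail worth knowing about the subtraction step: both blurred images are 8-bit unsigned, and subtracting NumPy uint8 arrays with the minus operator wraps around modulo 256, while cv2.subtract saturates at 0. Whether the wraparound is intended here is not stated, so the sketch below only emulates the two behaviours on plain Python lists to show how they differ; the function names are hypothetical.

```python
def wrap_subtract_u8(a, b):
    """Emulates uint8 array subtraction in NumPy: result is modulo 256."""
    return [(x - y) % 256 for x, y in zip(a, b)]


def saturating_subtract_u8(a, b):
    """Emulates cv2.subtract on 8-bit images: negative results clamp to 0."""
    return [max(x - y, 0) for x, y in zip(a, b)]


def above_threshold(values, thresh=127):
    """Binary threshold like cv2.threshold(..., 127, 255, 0)."""
    return [255 if v > thresh else 0 for v in values]
```

With wraparound, a pixel where the second operand is only slightly larger than the first becomes a value near 255 and passes a threshold of 127; with saturation it becomes 0.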
Then I performed binary threshold on the image above and found contours. I drew the contour having the largest area:
ret, th = cv2.threshold(difference, 127, 255, 0) #---performed binary threshold---
#---Find contours; [-2:] works whether findContours returns 2 or 3 values---
contours, hierarchy = cv2.findContours(th, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2:]
max_area = 0 #----Variable to keep track of the largest area----
c = 0 #----Index of the contour having the largest area---
for i in range(len(contours)):
    area = cv2.contourArea(contours[i])
    if area > max_area:
        max_area = area
        c = i
rep = cv2.drawContours(img1, contours, c, (0,255,0), 3) #----Draw the contour having the largest area on the image---
cv2.imshow('Final_Image.jpg', rep)
cv2.waitKey(0) #---keep the windows open until a key is pressed---
And voila!!! There you go.
Now you can obtain bounding rectangles for the contours you found and feed those coordinates as regions to the OCR to extract the text present.
I'm new to OpenCV and have been working through the samples provided for Android.
My goal is to detect color blobs, so I started with the color-blob-detection sample.
I'm converting color image to grayscale and then thresholding using a binary threshold.
The background is white, blobs are black. I want to detect those black blobs. Also, I would like to draw their contour in color but I'm not able to do it because image is black and white.
I've managed to accomplish this in grayscale, but I don't like how the contours are drawn: it's as if the color tolerance is too high and the contour is bigger than the actual blob (maybe the blobs are too small?). I guess this 'tolerance' has something to do with setHsvColor, but I don't quite understand that method.
Thanks in advance! Best Regards
UPDATE MORE INFO
The image I want to track is of ink splits. Imagine a white piece of paper with black ink splits. Right now I'm doing it in real-time (camera view). The actual app would take a picture and analyse that picture.
As I said above, I took the color-blob-detection sample (Android) from the OpenCV GitHub repo and added this code to the onCameraFrame method to convert the frame to black and white in real time. The conversion is done so that it doesn't matter whether the ink is black, blue, or red:
mRgba = inputFrame.rgba();
/**************************************************************************/
/** BLACK AND WHITE **/
// Convert to Grey
Imgproc.cvtColor(inputFrame.gray(), mRgba, Imgproc.COLOR_GRAY2RGBA, 4);
Mat blackAndWhiteMat = new Mat ( H, W, CvType.CV_8U, new Scalar(1));
double umbral = 100.0;
Imgproc.threshold(mRgba, blackAndWhiteMat , umbral, 255, Imgproc.THRESH_BINARY);
// convert back to bitmap for displaying
Bitmap resultBitmap = Bitmap.createBitmap(mRgba.cols(), mRgba.rows(), Bitmap.Config.ARGB_8888);
blackAndWhiteMat.convertTo(blackAndWhiteMat, CvType.CV_8UC1);
Utils.matToBitmap(blackAndWhiteMat, resultBitmap);
/**************************************************************************/
This may not be the best way but it works.
Now I want to detect black blobs (ink splits). I guess they are detected because the Logcat (log entry of sample app) throws the number of contours detected, but I'm not able to see them because the image is black and white and I want the contour to be red, for example.
Here's an example image:-
And here is what I get using RGB (color-blob-detection as is, not black and white image). Notice how small blobs are not detected. (Is it possible to detect them? or are they too small?)
Thanks for your help! If you need more info I would gladly update this question
UPDATE: GitHub repo of color-blob-detection sample (second image)
GitHub Repo of openCV sample for Android
The solution is based on a combination of adaptive Image thresholding and use of the connected-component algorithm.
Assumption: the paper is the brightest area of the image, whereas the ink spots on the paper are the darkest regions.
from random import Random
import numpy as np
import cv2
def random_color(random):
    """
    Return a random color
    """
    icolor = random.randint(0, 0xFFFFFF)
    return [icolor & 0xff, (icolor >> 8) & 0xff, (icolor >> 16) & 0xff]
#Read as Grayscale
img = cv2.imread('1-input.jpg', 0)
cimg = cv2.cvtColor(img,cv2.COLOR_GRAY2BGR)
# Median blur to remove noisy regions; comment it out to see its effect.
img = cv2.medianBlur(img,5)
#Find average intensity to distinguish paper region
avgPixelIntensity = cv2.mean( img )
print("Average intensity of image: ", avgPixelIntensity[0])
# Generate mask to distinguish paper region
#0.8 - used to ignore ill-illuminated region of paper
mask = cv2.inRange(img, avgPixelIntensity[0]*0.8, 255)
mask = 255 - mask
cv2.imwrite('2-maskedImg.jpg', mask)
#Approach 1
# You need to choose 4 or 8 for connectivity type(border pixels)
connectivity = 8
# Perform the operation (the label type must be CV_32S or CV_16U)
output = cv2.connectedComponentsWithStats(mask, connectivity, cv2.CV_32S)
# The first cell is the number of labels
num_labels = output[0]
# The second cell is the label matrix
labels = output[1]
# The third cell is the stat matrix
stats = output[2]
# The fourth cell is the centroid matrix
centroids = output[3]
# Labels are 32-bit integers; cast to 8-bit for saving (assumes fewer than 256 labels)
cv2.imwrite("3-connectedcomponent.jpg", labels.astype(np.uint8))
print("Number of labels", num_labels, labels)
# create the random number generator
random = Random()
# Label 0 is the background, so start from 1
for i in range(1, num_labels):
    x = int(stats[i, cv2.CC_STAT_LEFT])
    y = int(stats[i, cv2.CC_STAT_TOP])
    w = int(stats[i, cv2.CC_STAT_WIDTH])
    h = int(stats[i, cv2.CC_STAT_HEIGHT])
    print(x, y, w, h)
    cv2.rectangle(cimg, (x, y), (x + w, y + h), random_color(random), 2)
cv2.imwrite("4-OutputImage.jpg", cimg)
The Input Image
Masked Image from thresholding and invert operation.
Use of connected component.
Overlaying output of connected component on input image.
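The two steps of this approach can also be sketched in plain Python to show what happens underneath: pixels darker than 0.8 times the mean intensity are marked, and the marked pixels are grouped into 8-connected components with a breadth-first search. This is an illustrative sketch on a small nested list, not a replacement for cv2.connectedComponentsWithStats; the function names are hypothetical.

```python
from collections import deque


def dark_mask(gray, factor=0.8):
    """1 where a pixel is darker than factor * mean intensity, else 0."""
    pixels = [p for row in gray for p in row]
    cutoff = factor * sum(pixels) / len(pixels)
    return [[1 if p < cutoff else 0 for p in row] for row in gray]


def count_components(mask):
    """Count 8-connected groups of non-zero pixels using BFS."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] == 0 or seen[sy][sx]:
                continue
            count += 1  # found an unvisited blob; flood it
            seen[sy][sx] = True
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count
```

Note that the count here excludes the background, whereas the number of labels returned by OpenCV includes the background as label 0.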
I am writing an Android application that must paint determined parts of a loaded bitmap image according to received events.
I need to paint (or change the current color) of a single part of a bitmap image, without changing the rest of the image.
Let's say I have a car, which is divided by many parts: door, windows, wheels, etc.
Each time an event (received from the network) arrives, I need to change the color of that particular part with the color specified by the event data.
What would be the best technique to achieve that?
I first thought of FloodFill, as suggested in many threads on SO, but given that the messages are received quite fast (several per second) I fear it would drag performance down, as it seems to be a very CPU-intensive algorithm.
I also thought about having multiple segments of the same image, each colored with a different color and show the right one at the right time, but the car has at least 10 different parts and each one could be painted with 4-6 colors, so I would end up with dozens of images and that would be impractical to handle, not to mention the waste of memory.
So, is there any other approach?
The fastest way to do it is with a shader. You'll need to use OpenGL ES 2 for that (some Androids only support ES 1). You'll need a temporary bitmap the same size as the image you want to change. Set it as the target. In the shader, retrieve a pixel from the sampler which is bound to the image you want to change. If it's within a small tolerance of the colour you want to change, set gl_FragColor to the new colour, otherwise just set gl_FragColor to the colour you retrieved from the sampler. You'll need to pass the desired colour and the new colour into the shader as vec4s with al_set_shader_float_vector. The fastest way to do this is to keep 2 bitmaps and swap between them as the "main one" that you're using each time a colour changes.
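The per-pixel rule that the shader applies can be written down in a few lines. The sketch below is plain Python rather than GLSL, purely to make the rule concrete: a pixel within a small tolerance of the target colour is replaced, and any other pixel is passed through unchanged. The function name and the default tolerance are assumptions for illustration.

```python
def replace_colour(pixels, target, replacement, tolerance=10):
    """pixels: list of (r, g, b, a) tuples with components in 0..255.

    Returns a new list in which every pixel whose components are all
    within `tolerance` of `target` is replaced by `replacement`.
    """
    def close_to_target(pixel):
        return all(abs(c - t) <= tolerance for c, t in zip(pixel, target))

    return [replacement if close_to_target(p) else p for p in pixels]
```

This approach assumes each part of the image has its own distinct colour, since the match is made on colour alone and not on position.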
If you can't use a shader, then you'll have to lock the bitmap and replace the colour. Use al_lock_bitmap to lock it, then you can use al_get_pixel and al_put_pixel to change colours. Then al_unlock_bitmap when you're done. You can also avoid using al_get_pixel/al_put_pixel and access the memory manually which will be faster. If you lock the bitmap with the format ALLEGRO_PIXEL_FORMAT_ABGR_8888_LE then the memory is laid out like so:
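int w = al_get_bitmap_width(bitmap);
int h = al_get_bitmap_height(bitmap);
/* lock for reading and writing in the byte layout described above */
ALLEGRO_LOCKED_REGION *locked_region = al_lock_bitmap(bitmap, ALLEGRO_PIXEL_FORMAT_ABGR_8888_LE, ALLEGRO_LOCK_READWRITE);
for (int y = 0; y < h; y++) {
    /* data is a void pointer, so cast before doing byte arithmetic */
    unsigned char *p = (unsigned char *)locked_region->data + locked_region->pitch * y;
    for (int x = 0; x < w; x++) {
        unsigned char r = p[0];
        unsigned char g = p[1];
        unsigned char b = p[2];
        unsigned char a = p[3];
        /* change r, g, b, a here if they match */
        p[0] = r;
        p[1] = g;
        p[2] = b;
        p[3] = a;
        p += 4;
    }
}
al_unlock_bitmap(bitmap);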
It's recommended that you lock the image in the format it was created in. That means pick an easy one like the one I mentioned, or else the inner part of the loop gets more complicated. The ABGR_8888 part of the pixel format describes the layout of the data. ABGR tells the order of the components. If you were to read a pixel into a single storage unit (an int in this case, but it works the same with a short), the bit pattern would be AAAAAAAABBBBBBBBGGGGGGGGRRRRRRRR. However, when you read a byte at a time, most machines are little endian, which means the least significant byte comes first. That's why p[0] is red in my sample code. The 8888 part tells how many bits each component has.
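The relationship between the byte order in memory and the ABGR bit pattern can be checked with a few lines of Python using the standard struct module. This is only an illustration of the layout and is unrelated to the Allegro API; the function names are hypothetical.

```python
import struct


def pixel_bytes(r, g, b, a):
    """Bytes in memory order for a little-endian ABGR_8888 pixel: R first."""
    return bytes([r, g, b, a])


def as_uint32_le(raw):
    """Read the four bytes as one little-endian 32-bit unsigned integer."""
    return struct.unpack("<I", raw)[0]


def bit_pattern(raw):
    """32-character binary string, most significant bit first."""
    return format(as_uint32_le(raw), "032b")
```

For a pixel with r=0x11, g=0x22, b=0x33, a=0x44, the first byte in memory is 0x11 (red), while the 32-bit value read as an integer is 0x44332211, that is, alpha in the top eight bits and red in the bottom eight.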