I recently started using JavaCV on Android for camera preview image processing.
Basically, I take the camera preview, do some processing, convert it to HSV to modify some colors, and then convert it to RGBA to fill a bitmap.
Everything works, but quite slowly. To find the slowest part I took some measurements and, to my surprise, it turned out to be this line:
cvCvtColor( hsvimage, imageBitmap, CV_HSV2RGB); //<-- 50msecs
where hsvimage is a 3-channel IplImage and imageBitmap is a 4-channel image. (The conversion is correct and leaves the alpha channel at 255, giving an opaque bitmap as expected.)
Just for comparison, the following two lines take only 3 msec:
cvCvtColor(yuvimage, bgrimage, CV_YUV2BGR_NV21);
cvCvtColor(bgrimage, hsvimage, CV_BGR2HSV);
(yuvimage is a 1-channel IplImage; bgrimage and hsvimage are 3-channel IplImages.)
It seems that the first conversion (HSV2RGB) is not as well optimized as the others. I also tested it with a 3-channel destination image, just in case, with the same results.
I would like to find a way to make it as fast as BGR2HSV. Possible ways:
Find out whether there is another "equivalent" constant to CV_HSV2RGB which is faster.
Get direct access to the H-S-V byte arrays and write my own "fast" conversion in C.
Any idea for solving this issue is welcome.
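As a rough illustration of the second option, the per-pixel arithmetic could look like the sketch below. It is written in plain Java rather than C for readability, and it assumes OpenCV's 8-bit HSV layout (H in 0..179, S and V in 0..255) with interleaved byte arrays. The class and method names are made up for this example, and whether it would actually beat cvCvtColor on a given device is untested.

```java
final class HsvToRgba {
    /**
     * Converts interleaved 8-bit HSV pixels to interleaved RGBA pixels.
     * Assumes the OpenCV 8-bit convention: H in 0..179, S and V in 0..255.
     */
    static void convert(byte[] hsv, byte[] rgba, int pixelCount) {
        for (int i = 0; i < pixelCount; i++) {
            int h = hsv[3 * i] & 0xFF;
            int s = hsv[3 * i + 1] & 0xFF;
            int v = hsv[3 * i + 2] & 0xFF;

            float sf = s / 255f;           // saturation in [0, 1]
            float h6 = (h % 180) / 30f;    // hue position in [0, 6)
            int sector = (int) h6;         // which of the six hue sectors
            float f = h6 - sector;         // position inside the sector

            int p = Math.round(v * (1f - sf));
            int q = Math.round(v * (1f - sf * f));
            int t = Math.round(v * (1f - sf * (1f - f)));

            int r, g, b;
            switch (sector) {
                case 0:  r = v; g = t; b = p; break;
                case 1:  r = q; g = v; b = p; break;
                case 2:  r = p; g = v; b = t; break;
                case 3:  r = p; g = q; b = v; break;
                case 4:  r = t; g = p; b = v; break;
                default: r = v; g = p; b = q; break;
            }
            rgba[4 * i] = (byte) r;
            rgba[4 * i + 1] = (byte) g;
            rgba[4 * i + 2] = (byte) b;
            rgba[4 * i + 3] = (byte) 255;  // opaque alpha
        }
    }

    public static void main(String[] args) {
        byte[] hsv = {60, (byte) 255, (byte) 255};   // pure green in this convention
        byte[] rgba = new byte[4];
        convert(hsv, rgba, 1);
        System.out.println((rgba[0] & 0xFF) + " " + (rgba[1] & 0xFF) + " " + (rgba[2] & 0xFF));
    }
}
```

A version tuned for speed would likely replace the float arithmetic with integer math or a lookup table, but that is a separate optimization step.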
All this is happening with a small 320*240 image and running on a Xiaomi Redmi Note 4. Most of the operations such as converting color from RGB to HSV take less than 1 msec. Canny takes 5msec, Floodfill takes about 5 or 6 msec. It is only this conversion HSV2RGB which gives such strange results.
Will try to use OpenCV directly (not JavaCV) to see if this behaviour disappears.
I was using an old JavaCV version (0.11). I have now updated to 1.3 and the results are nearly the same:
long startTime=System.currentTimeMillis();
cvCvtColor(hsvimage, imageBitmap, CV_HSV2RGB);
Log.w(LOG_TAG, "Time:" + String.valueOf(System.currentTimeMillis() - startTime)); //<-- From 45 to 50msec
Log.w(LOG_TAG,"Channels:"+imageBitmap.nChannels()); // <-- returns 4
I can fill a 32bit/pixel android bitmap with the result
Mat mim4C= new Mat(imageBitmap);
Mat mhsvimage = new Mat(hsvimage);
long startTime = System.currentTimeMillis();
cvtColor(mhsvimage, mim4C, CV_HSV2RGB);
Log.w(LOG_TAG, "Time:" + String.valueOf(System.currentTimeMillis() - startTime)); //<-- From 45 to 50msec
IplImage iim4C=new IplImage(mim4C);
Log.w(LOG_TAG,"Channels:"+iim4C.nChannels()); // <-- returns 3!!!
In this second case, if I try to fill a 32 bits/pixel Android bitmap (after converting mim4C back to an IplImage), it crashes because the image has only 3 channels.
The Problem
I have been working on implementing a super-resolution model with TensorFlow Lite. I have an empty bitmap 4x the size of the input bitmap (which is bmp):
Bitmap out = Bitmap.createBitmap(bmp.getWidth() * 4, bmp.getHeight() * 4, Bitmap.Config.ARGB_8888);
And I converted both bitmaps to TensorImages
TensorImage originalImage = TensorImage.fromBitmap(bmp);
TensorImage superImage = TensorImage.fromBitmap(out);
However, when I run the model (InterpreterApi tflite):
tflite.run(originalImage.getBuffer(), superImage.getBuffer());
The bitmap from superImage has not changed, and it holds the blank bitmap I made at the start.
Things I've tried
I looked at basic examples and documentation. Most are geared toward classification, but they all seemed to do it this way.
I fed the input bitmap to the output, and my app showed the input, so I know that the file picking and preview work.
I tested different data types to store the output, and they either left it blank or weren't compatible with TensorFlow.
What I think
I suspect the problem has something to do with tflite.run() changing a separate instance of superImage, and I get left with the old one. I may also need a different data format that I haven't tried yet.
Thank you for your time.
I am kind of stuck with this problem. I know there are many questions about it on Stack Overflow, but in my case nothing gives the expected result.
The Context:
I am using Android OpenCV along with Tesseract to read the MRZ area of a passport. When the camera starts, I pass each input frame to an AsyncTask. The frame is processed and the MRZ area is extracted successfully. I then pass the extracted MRZ area to a function prepareForOCR(inputImage), which takes the MRZ area as a gray Mat and outputs a bitmap with the thresholded image that I pass to Tesseract.
The problem:
The problem is the thresholding step. I use adaptive thresholding with blockSize = 13 and C = 15, but the result varies with the lighting of the image and the general conditions under which the frame is taken.
What I have tried:
First, I resize the image to a fixed size (871, 108) so that the input image is always the same and does not depend on which phone is used.
After resizing, I try different blockSize and C values:
//toOcr contains the extracted MRZ area
Bitmap toOCRBitmap = Bitmap.createBitmap(bitmap);
Mat inputFrame = new Mat();
Mat toOcr = new Mat();
Utils.bitmapToMat(toOCRBitmap, inputFrame);
Imgproc.cvtColor(inputFrame, inputFrame, Imgproc.COLOR_BGR2GRAY);
TesseractResult lastResult = null;
for (int B = 11; B < 70; B++) {
    for (int C = 11; C < 70; C++) {
        if (IsPrime(B) && IsPrime(C)) {
            Imgproc.adaptiveThreshold(inputFrame, toOcr, 255, Imgproc.ADAPTIVE_THRESH_GAUSSIAN_C, Imgproc.THRESH_BINARY, B, C);
            Bitmap toOcrBitmap = OpenCVHelper.getBitmap(toOcr);
            TesseractResult result = TesseractInstance.extractFrame(toOcrBitmap, "ocrba");
            if (result.getMeanConfidence() > 70) {
                if (MrzParser.tryParse(result.getText())) {
                    Log.d("Main2Activity", "Best result with " + B + " : " + C);
                    return result;
                }
            }
        }
    }
}
Using the code above, the thresholded result is a black-on-white image that gives a confidence greater than 70. I can't post the whole image for privacy reasons, but here's a clipped one and a dummy passport one.
Using the MrzParser.tryParse function, which checks each character's position and validity within the MRZ, I am able to correct some errors, such as a name containing an 8 instead of a B, and get a good result. But it takes a long time, which is to be expected because I am thresholding up to 225 images in the loop (15 prime values for each of B and C), plus a Tesseract call for each.
I already tried collecting the C and B values that occur most often, but the results are different.
The question:
Is there a way to choose a C and blockSize value so that the thresholding always gives the same result, perhaps by adding more OpenCV calls to preprocess the input image, such as increasing the contrast? I have searched the web for 2 weeks and can't find a viable solution; the loop above is the only approach that gives accurate results.
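For reference, the role of the two parameters can be seen in a minimal plain-Java sketch of adaptive thresholding. It assumes the simpler mean variant (a box average rather than the Gaussian weighting used in the question) with binary output and replicated borders. The class and method names are invented for this illustration and are not part of OpenCV.

```java
final class AdaptiveMeanThreshold {
    /**
     * Binary adaptive threshold with a mean (box) neighbourhood.
     * A pixel becomes 255 if it is brighter than (local mean - c), else 0.
     * Pixels outside the image are replaced by the nearest edge pixel.
     */
    static int[][] apply(int[][] gray, int blockSize, double c) {
        if (blockSize < 3 || blockSize % 2 == 0) {
            throw new IllegalArgumentException("blockSize must be odd and >= 3");
        }
        int rows = gray.length;
        int cols = gray[0].length;
        int radius = blockSize / 2;
        int[][] out = new int[rows][cols];
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                long sum = 0;
                for (int dy = -radius; dy <= radius; dy++) {
                    int yy = Math.min(rows - 1, Math.max(0, y + dy));
                    for (int dx = -radius; dx <= radius; dx++) {
                        int xx = Math.min(cols - 1, Math.max(0, x + dx));
                        sum += gray[yy][xx];
                    }
                }
                double localMean = sum / (double) (blockSize * blockSize);
                out[y][x] = gray[y][x] > localMean - c ? 255 : 0;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] img = {{200, 200, 200}, {200, 50, 200}, {200, 200, 200}};
        int[][] out = apply(img, 3, 15);
        System.out.println(out[1][1] + " " + out[0][0]);
    }
}
```

In this model blockSize sets how large a neighbourhood defines "local" brightness, and C shifts the threshold below the local mean, so uneven lighting changes which pairs of values separate the characters cleanly.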
You can use a clustering algorithm to cluster the pixels based on color. The characters are dark and there is a good contrast in the MRZ region, so a clustering method will most probably give you a good segmentation if you apply it to the MRZ region.
Here I demonstrate it with MRZ regions obtained from sample images that can be found on the internet.
I use color images, apply some smoothing, convert to Lab color space, then cluster the a, b channel data using kmeans (k=2). The code is in python but you can easily adapt it to java. Due to the randomized nature of the kmeans algorithm, the segmented characters will have label 0 or 1. You can easily sort it out by inspecting cluster centers. The cluster-center corresponding to characters should have a dark value in the color space you are using.
I just used the Lab color space here. You can use RGB, HSV or even GRAY and see which one is better for you.
After segmenting like this, I think you can even find good values for B and C of your adaptive-threshold using the properties of the stroke width of the characters (if you think the adaptive-threshold gives a better quality output).
import cv2
import numpy as np
im = cv2.imread('mrz1.png')
# convert to Lab
lab = cv2.cvtColor(cv2.GaussianBlur(im, (3, 3), 1), cv2.COLOR_BGR2Lab)
# take the a, b channels of the Lab image as float32 samples for kmeans
im32f = np.array(lab[:, :, 1:3], dtype=np.float32)
k = 2 # 2 clusters
term_crit = (cv2.TERM_CRITERIA_EPS, 30, 0.1)
ret, labels, centers = cv2.kmeans(im32f.reshape([im.shape[0]*im.shape[1], -1]),
k, None, term_crit, 10, 0)
# segmented image
labels = labels.reshape([im.shape[0], im.shape[1]]) * 255
Some results:
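Since the question is about Java, here is a minimal plain-Java sketch of the same idea: two-cluster k-means followed by picking the darker cluster. To stay self-contained it makes two simplifying assumptions that differ from the Python code: it clusters 1-D gray values instead of the a, b channels, and it starts from the minimum and maximum values instead of random centers. The class and method names are hypothetical.

```java
final class TwoMeansSegmentation {
    /**
     * Runs k-means with k = 2 on gray values and returns a mask that is
     * true for pixels assigned to the darker of the two clusters.
     */
    static boolean[] darkClusterMask(int[] gray, int iterations) {
        if (gray.length == 0 || iterations < 1) {
            throw new IllegalArgumentException("need at least one pixel and one iteration");
        }
        int min = gray[0];
        int max = gray[0];
        for (int g : gray) {
            min = Math.min(min, g);
            max = Math.max(max, g);
        }
        double[] centers = {min, max};   // deterministic starting centers
        int[] labels = new int[gray.length];
        for (int it = 0; it < iterations; it++) {
            double[] sum = new double[2];
            int[] count = new int[2];
            // assignment step: each pixel goes to the nearest center
            for (int i = 0; i < gray.length; i++) {
                int label = Math.abs(gray[i] - centers[0]) <= Math.abs(gray[i] - centers[1]) ? 0 : 1;
                labels[i] = label;
                sum[label] += gray[i];
                count[label]++;
            }
            // update step: each center moves to the mean of its pixels
            for (int k = 0; k < 2; k++) {
                if (count[k] > 0) {
                    centers[k] = sum[k] / count[k];
                }
            }
        }
        // the cluster with the lower center is taken to be the characters
        int dark = centers[0] <= centers[1] ? 0 : 1;
        boolean[] mask = new boolean[gray.length];
        for (int i = 0; i < gray.length; i++) {
            mask[i] = labels[i] == dark;
        }
        return mask;
    }

    public static void main(String[] args) {
        boolean[] mask = darkClusterMask(new int[] {10, 20, 15, 200, 210, 190, 12}, 10);
        System.out.println(java.util.Arrays.toString(mask));
    }
}
```

With the OpenCV Java bindings the clustering itself would normally be delegated to the library, so only the "inspect the centers and pick the dark one" step carries over directly.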
I am working on an Android project in which I use OpenCV to detect faces in all the images in the gallery. The face detection runs in a service, which keeps working until all the images are processed. It stores the detected faces in internal storage and also shows them in a grid view if the activity is open.
My code is:
CascadeClassifier mJavaDetector = null;

public void getFaces() {
    for (int i = 0; i < size; i++) {
        try {
            File file = new File(urls.get(i));
            defaultBitmap = BitmapFactory.decodeFile(file, bitmapFatoryOptions);
            mJavaDetector = new CascadeClassifier(FaceDetector.class.getResource("lbpcascade_frontalface").getPath());
            Mat image = new Mat(defaultBitmap.getWidth(), defaultBitmap.getHeight(), CvType.CV_8UC1);
            MatOfRect faceDetections = new MatOfRect();
            mJavaDetector.detectMultiScale(image, faceDetections, 1.1, 10, 0, new Size(20, 20), new Size(image.width(), image.height()));
        } catch (Exception e) {
        }
    }
}
Everything works, but face detection is very slow. When I debugged the code, I found that the line taking the time is:
mJavaDetector.detectMultiScale(image,faceDetections,1.1, 10, 0, new Size(20,20), new Size(image.width(), image.height()));
I have checked multiple posts about this problem but didn't find a solution.
Please tell me what I should do to solve this problem.
Any help would be greatly appreciated. Thank you.
You should pay attention to the parameters of detectMultiScale():
scaleFactor – Parameter specifying how much the image size is reduced at each image scale. This parameter is used to create a scale pyramid. It is necessary because the model has a fixed size, set during training. Without a pyramid, the only size that could be detected would be this fixed one (which can also be read from the XML). With a multi-scale representation, however, face detection becomes scale-invariant, i.e., large and small faces are detected using the same detection window.
scaleFactor depends on the size of your trained detector, but in practice you should set it as high as possible while still getting "good" results, so it should be determined empirically.
Your value of 1.1 can be a good choice. It means a relatively small step is used for resizing (the size is reduced by 10% each time), which increases the chance that a size matching the model is found. If your trained detector has the size 10x10, then you can detect faces of size 11x11, 12x12 and so on. However, a factor of 1.1 requires roughly twice as many layers in the pyramid (and about 2x the computation time) as 1.2 does.
minNeighbors – Parameter specifying how many neighbours each candidate rectangle should have to retain it.
The cascade classifier works with a sliding-window approach: you slide a window over the image, then resize and search again until you cannot resize any further. In every iteration the positive outputs of the cascade classifier are stored, but unfortunately they include many false positives. To eliminate false positives and get the proper face rectangle out of the detections, a neighbourhood approach is applied. 3-6 is a good value for it. If the value is too high, you can lose true positives too.
minSize – Given the sliding-window approach described under minNeighbors, this is the smallest window that the cascade can detect. Objects smaller than that are ignored. Usually cv::Size(20, 20) is enough for face detection.
maxSize – Maximum possible object size. Objects bigger than that are ignored.
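The "roughly double" claim for scaleFactor can be checked with a little arithmetic. The sketch below uses a simplified model, not OpenCV's actual loop: it just counts how many window sizes fit between minSize and maxSize when each step multiplies the size by scaleFactor. The class name and the 20 to 480 pixel range are assumptions chosen for illustration.

```java
final class PyramidLayers {
    /**
     * Counts the scale steps needed to grow a window from minSize up to
     * maxSize when each step multiplies the size by scaleFactor.
     */
    static int countLayers(double minSize, double maxSize, double scaleFactor) {
        if (scaleFactor <= 1.0) {
            throw new IllegalArgumentException("scaleFactor must be > 1");
        }
        int layers = 0;
        for (double size = minSize; size <= maxSize; size *= scaleFactor) {
            layers++;
        }
        return layers;
    }

    public static void main(String[] args) {
        // Window sizes from 20 px up to 480 px (example values only).
        System.out.println(countLayers(20, 480, 1.1)); // prints 34
        System.out.println(countLayers(20, 480, 1.2)); // prints 18
    }
}
```

In this model, raising minSize or lowering maxSize removes layers at either end of the range, and the layers for the smallest windows are the ones with the most window positions to evaluate.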
Finally you can try different classifiers based on different features (such as Haar, LBP, HoG). Usually, LBP classifiers are a few times faster than Haar's, but also less accurate.
And it is also strongly recommended to look over these questions:
Recommended values for OpenCV detectMultiScale() parameters
OpenCV detectMultiScale() minNeighbors parameter
Instead of reading images as a Bitmap and then converting them to a Mat with Utils.bitmapToMat(defaultBitmap, image), you can read them directly with Mat image = Highgui.imread(imagepath); You can check here for the imread() function.
Also, the line below takes a long time because the detector is looking for faces as small as Size(20, 20), which is pretty small. Check this video for a visualization of face detection using OpenCV.
mJavaDetector.detectMultiScale(image,faceDetections,1.1, 10, 0, new Size(20,20), new Size(image.width(), image.height()));
In order to minimize the memory usage of bitmaps, yet still try to maximize the quality of them, I would like to ask a simple question:
Is there a way for me to check if a given image file (.png file) has transparency using the API, without checking every pixel in it?
If the image doesn't have any transparency, it would be best to use a different bitmap format that uses only the RGB values.
The problem is that Android also doesn't have a format for just the 3 colors, only RGB_565, which is said to degrade the quality of the image and to need the dithering feature enabled.
Is there also a way to read only the RGB values and be able to show them?
For me, bitmap.hasAlpha() works fine as a first check of whether the bitmap has alpha values. After that, I would suggest running through the pixels and creating a second bitmap with no alpha.
Let's start a bit off-topic
the problem is that android also doesn't have a format for just the 3 colors . only RGB_565 , which they say that degrade the quality of the image and that should have dithering feature enabled.
The reason for that problem is not really Android-specific. It's about performance while drawing images. You get the best performance if the pixel data fits exactly into one 32-bit memory cell.
So the most obvious good pixel format is the ARGB_8888 format which uses exactly 32bit (24 for the color 8 for alpha). While drawing you don't need to do anything but to loop over the image data and each cell you read can be drawn directly. The only downside is the required memory to work with such images, both when they just sit in memory and while displaying them since the graphic hardware has to transfer more data.
The second best option is to use a format where several pixels fit into 1 cell. Using 2 pixels in 32bit you have 16bit per pixel left and one of the formats using 16bit is the 565 format. 5bit red, 6bit green, 5bit blue. While drawing this you can still work on memory cells separately and all you have to do is to split 1 cell in parts. Due to the smaller memory size required for images, drawing can sometimes be even faster than using 32bit colors. Since in the beginning of android memory was a much bigger problem they chose this format to be the default.
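To make the 565 layout concrete, here is a small plain-Java sketch of packing an 8-bit-per-channel color into 16 bits and expanding it again. The class and method names are invented for this example; the expansion uses bit replication, which is one common choice and not necessarily what Android does internally.

```java
final class Rgb565 {
    /** Packs 8-bit r, g, b into 16 bits: 5 bits red, 6 bits green, 5 bits blue. */
    static int pack(int r, int g, int b) {
        return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3);
    }

    /** Expands a 565 pixel back to 8-bit channels by replicating the high bits. */
    static int[] unpack(int pixel) {
        int r5 = (pixel >> 11) & 0x1F;
        int g6 = (pixel >> 5) & 0x3F;
        int b5 = pixel & 0x1F;
        return new int[] {
            (r5 << 3) | (r5 >> 2),
            (g6 << 2) | (g6 >> 4),
            (b5 << 3) | (b5 >> 2)
        };
    }

    /** Stores two 565 pixels in one 32-bit cell. */
    static int packPair(int first, int second) {
        return (second << 16) | first;
    }

    public static void main(String[] args) {
        int[] back = unpack(pack(200, 100, 50));
        // The low bits of each channel are lost, so the round trip is not exact.
        System.out.println(back[0] + " " + back[1] + " " + back[2]); // prints 206 101 49
    }
}
```

The dropped low bits are the source of the quality loss and banding mentioned for RGB_565.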
And the worst category of formats are those where pixels don't fit into those cells. If you take just the 3 colors you get 24 bits, and a pixel then has to be spread over 2 cells in 2 out of 4 cases (only 1 pixel in 4 starts on a cell boundary). For example, the second pixel would use the remaining 8 bits of the first cell and the first 16 bits of the next cell. The extra work required to handle 24-bit colors is so big that the format is not used. And when drawing images you usually have alpha at some point anyway; if not, you simply use 32 bits but ignore the alpha bits.
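The alignment problem can be made concrete by counting bit positions. The sketch below assumes 24-bit pixels packed back to back into 32-bit cells with no padding; the class and method names are hypothetical.

```java
final class PackedRgb24 {
    /** True if the 24 bits of this pixel are split across two 32-bit cells. */
    static boolean straddlesCells(int pixelIndex) {
        int firstBit = pixelIndex * 24;
        int lastBit = firstBit + 23;
        return firstBit / 32 != lastBit / 32;
    }

    /** True if the pixel begins exactly at the start of a 32-bit cell. */
    static boolean startsOnCellBoundary(int pixelIndex) {
        return (pixelIndex * 24) % 32 == 0;
    }

    public static void main(String[] args) {
        // The pattern repeats every 4 pixels (4 * 24 bits = 3 cells).
        for (int i = 0; i < 4; i++) {
            System.out.println("pixel " + i
                    + " straddles=" + straddlesCells(i)
                    + " aligned=" + startsOnCellBoundary(i));
        }
    }
}
```

Under this model two of every four pixels span two cells, and three of every four do not start on a cell boundary, so almost every read needs extra shifting and masking.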
So the 16-bit approach looks ugly and the 24-bit approach does not make sense. And since the memory limitations of Android are not as tight as they were and the hardware got faster, Android has switched its default to 32 bits (explained in even more detail in http://www.curious-creature.org/2010/12/08/bitmap-quality-banding-and-dithering/).
Back to your real question
is there a way for me to check if a given image file (png file) has transparency using the API , without checking every pixel in it?
I don't know. But JPEG images don't support alpha and PNG images usually have alpha. You could simply abuse the file extension to get it right in most cases.
But I would suggest you don't bother with all that and simply use ARGB_8888 and apply the nice image loading techniques detailed in the Android Training documentation about Displaying Bitmaps Efficiently.
The reason people run into memory problems is usually either that they have way more images loaded in memory than they currently display or they use giant images that can't be displayed on the small screen of a phone. And in my opinion it makes more sense to add good memory management than complicating your code to downgrade the image quality.
There is a way to check if a PNG file has transparency, or at least if it supports it:
public final static int COLOR_GREY = 0;
public final static int COLOR_TRUE = 2;
public final static int COLOR_INDEX = 3;
public final static int COLOR_GREY_ALPHA = 4;
public final static int COLOR_TRUE_ALPHA = 6;
private final static int DECODE_BUFFER_SIZE = 16 * 1024;
private final static int HEADER_DECODE_BUFFER_SIZE = 1024;
/** Given an InputStream of a PNG file, returns true iff its header indicates that it has transparency. */
private static boolean isPngInputStreamContainTransparency(final InputStream pngInputStream) {
    try {
        // skip: png signature, header chunk declaration, width, height, bitDepth:
        pngInputStream.skip(12 + 4 + 4 + 4 + 1);
        final byte colorType = (byte) pngInputStream.read();
        switch (colorType) {
            // color types with a dedicated alpha channel
            case COLOR_GREY_ALPHA:
            case COLOR_TRUE_ALPHA:
                return true;
            // color types without an alpha channel
            case COLOR_INDEX:
            case COLOR_GREY:
            case COLOR_TRUE:
                return false;
        }
        return true;
    } catch (final Exception e) {
        return false;
    }
}
Other than that, I don't know if such a thing is possible.
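For experimenting outside Android, the same header check can be written as a small self-contained class. This is only a sketch under a few assumptions: it validates the PNG signature and the IHDR chunk name before trusting the color type, it treats any I/O problem as "no alpha", and the class and method names are made up. It only reports whether the header declares an alpha channel; PNG files of the other color types can still carry transparency through a tRNS chunk, which this check does not look at.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

final class PngAlphaCheck {
    private static final int COLOR_GREY_ALPHA = 4;
    private static final int COLOR_TRUE_ALPHA = 6;
    private static final int[] SIGNATURE = {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

    /** Returns the IHDR color type, or -1 if the stream does not start like a PNG. */
    private static int readColorType(InputStream in) throws IOException {
        // 8 signature + 4 chunk length + 4 chunk type + 4 width + 4 height
        // + 1 bit depth + 1 color type = 26 bytes
        byte[] header = new byte[26];
        new DataInputStream(in).readFully(header);
        for (int i = 0; i < SIGNATURE.length; i++) {
            if ((header[i] & 0xFF) != SIGNATURE[i]) {
                return -1;
            }
        }
        if (header[12] != 'I' || header[13] != 'H' || header[14] != 'D' || header[15] != 'R') {
            return -1;
        }
        return header[25] & 0xFF;
    }

    /** True if the PNG header declares a dedicated alpha channel. */
    static boolean hasAlphaChannel(InputStream in) {
        try {
            int colorType = readColorType(in);
            return colorType == COLOR_GREY_ALPHA || colorType == COLOR_TRUE_ALPHA;
        } catch (IOException e) {
            return false; // too short or unreadable: treat as "no alpha"
        }
    }
}
```

On Android this could be fed from a FileInputStream before decoding, to decide which Bitmap.Config to request.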
I've found the following links, which could be helpful for checking whether a PNG file has transparency. Sadly, it's a solution only for PNG files. Other formats (like WebP, BMP, ...) need a different parser.
I am trying to grab consecutive frames on Android using the OpenCV VideoCapture class. I want to implement optical flow on Android, for which I need 2 frames. I first implemented optical flow in C, where I grabbed the frames using cvQueryFrame, and everything worked fine. But on Android, when I call
Log.i(TAG, "first frame retrived");
Log.i(TAG, "2nd frame retrived");
and then subtract the matrices using Imgproc.subtract(mRgba,mRgba2,output) and display the output, it gives me a black image, indicating that mRgba and mRgba2 are image frames with the same data. Can anyone help me grab two different images? According to the OpenCV documentation, mRgba and mRgba2 should be different.
This question is an exact duplicate of
read successive frames OpenCV using cvQueryframe
You have to copy the image to another memory block, because the capture always returns the same pointer.
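The effect can be reproduced without OpenCV. In the plain-Java sketch below, ReusingCapture is a hypothetical stand-in for a capture source that hands out the same internal buffer on every call; it is not an OpenCV class. With the OpenCV Java bindings, the copy would typically be made with Mat.clone() or Mat.copyTo() on the first frame before the second one is grabbed.

```java
import java.util.Arrays;

final class FrameCopyDemo {
    /** Hypothetical capture source that reuses one buffer for every frame. */
    static final class ReusingCapture {
        private final int[] buffer = new int[4];
        private int frameNumber = 0;

        int[] nextFrame() {
            frameNumber++;
            Arrays.fill(buffer, frameNumber); // overwrite the shared buffer
            return buffer;                    // same array object every time
        }
    }

    public static void main(String[] args) {
        ReusingCapture aliased = new ReusingCapture();
        int[] first = aliased.nextFrame();
        int[] second = aliased.nextFrame();
        // Both names refer to one buffer, so the difference is zero.
        System.out.println(second[0] - first[0]); // prints 0

        ReusingCapture copied = new ReusingCapture();
        int[] firstCopy = copied.nextFrame().clone(); // copy to a separate block
        int[] next = copied.nextFrame();
        System.out.println(next[0] - firstCopy[0]);   // prints 1
    }
}
```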