Tesseract getutf8text performance - android

I have been working on an app that uses the Tesseract API for OCR. A SurfaceView shows the camera output (Camera2 API), and an ImageReader instance receives the images from the camera. The capture session uses setRepeatingRequest, so new images become available very frequently. When I call getUTF8Text() to extract the readable text from an image, the camera preview shown on the SurfaceView lags.
Are there any settings in the Tesseract API that speed up the getUTF8Text() call, or is there anything else I can do to keep the SurfaceView preview from lagging?
Any help or guidance is appreciated!

Most of the things that you can do to speed up performance occur separately from the Tesseract API itself:
Run the OCR on a separate, non-UI thread
Grab a new image for OCR only after OCR has finished on the previous image. Try capture instead of setRepeatingRequest.
Downsample the image before OCR, so that it's smaller
Experiment with different Tesseract page segmentation modes to see what's the fastest on your data
Re-train the Tesseract trained data file to use fewer characters and a smaller dictionary, depending on what your app is used for
Modify Tesseract to perform only recognition pass #1
Don't forget to consider OpenCV or other approaches altogether
You didn't say what Tesseract API settings you're using now, and you didn't describe what your app does in a general sense, so it's hard to tell you where to start, but these points should get you started.
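The first two points (OCR off the UI thread, and no new frame until the previous one is done) can be sketched in plain Java roughly as below. This is only a minimal sketch under assumptions, not part of the Tesseract or Camera2 API: the class name FrameDroppingOcrWorker is made up for illustration, and the recognizer function is a stand-in for whatever wraps your Tesseract text-extraction call.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;
import java.util.function.Function;

/**
 * Hypothetical helper: runs a slow recognizer on the given executor and
 * drops any frame that arrives while the previous one is still in progress.
 */
final class FrameDroppingOcrWorker {
    private final Executor executor;
    private final Function<byte[], String> recognizer; // stand-in for the OCR call
    private final Consumer<String> onText;             // receives each recognized text
    private final AtomicBoolean busy = new AtomicBoolean(false);

    FrameDroppingOcrWorker(Executor executor,
                           Function<byte[], String> recognizer,
                           Consumer<String> onText) {
        this.executor = executor;
        this.recognizer = recognizer;
        this.onText = onText;
    }

    /**
     * Offers a frame for recognition. Returns true if the frame was accepted,
     * false if it was dropped because recognition is still running.
     */
    boolean offer(byte[] frame) {
        if (!busy.compareAndSet(false, true)) {
            return false; // still busy: skip this frame instead of queueing it
        }
        try {
            executor.execute(() -> {
                try {
                    onText.accept(recognizer.apply(frame));
                } finally {
                    busy.set(false); // ready for the next frame
                }
            });
        } catch (RuntimeException e) {
            busy.set(false); // the executor refused the task
            throw e;
        }
        return true;
    }
}
```

On Android you would probably pass Executors.newSingleThreadExecutor() as the executor and call offer() from the ImageReader callback. Note that onText runs on the worker thread, so results have to be posted back to the UI thread before touching any views.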

There are a few other things which you can try.
Initialize Tesseract with OEM_TESSERACT_ONLY.
Instead of using full-blown training data, use a faster alternative from https://github.com/tesseract-ocr/tessdata_fast.
Move the recognition off the UI thread onto a background computation thread.

Related

Detecting pathways in a video using BoofCV on Android

For my application I have been looking into using BoofCV to detect if I am on a pathway or not. The pathway is just gravel so it is the color of a standard roadway. I'm not sure exactly what image processing technique to use. The BoofCV demo app has a lot of features, but I would like to know which one is appropriate for what I'm trying to do.
Ultimately I'd like to have a toast appear on the screen when I am on a pathway.

From your question, I'm guessing that you're using a regular camera as real-time input from a moving object. In that case you may need to:
Calibrate and stabilize your input frames (since your pathway is made of gravel). BoofCV provides libraries for this.
Adjust exposure, contrast or brightness (for night/low light vision cameras or low contrast frames).
Use BoofCV's Binary Image Ops, according to your app's needs (Image Thresholding, Binary Labeling etc).
Use a classifier for 2 classes ("inside pathway", "outside pathway").
Process your output and feed the results back to your "decision operator", so it can make a choice and guide your moving object.
More details about your project would help in giving a better answer.
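To make the thresholding and two-class steps a bit more concrete, here is a rough sketch in plain Java. It deliberately does not use the BoofCV API; the class PathwayHeuristic, the intensity band and the minimum fraction are all hypothetical and only illustrate the idea of "threshold, then decide between two classes". A real classifier would need much more than a gray-value band.

```java
/** Hypothetical two-class decision built on a simple intensity threshold. */
final class PathwayHeuristic {

    /** Marks each pixel 1 if its gray value lies in [lo, hi], else 0. */
    static int[] thresholdBand(int[] gray, int lo, int hi) {
        int[] mask = new int[gray.length];
        for (int i = 0; i < gray.length; i++) {
            mask[i] = (gray[i] >= lo && gray[i] <= hi) ? 1 : 0;
        }
        return mask;
    }

    /** Fraction of mask pixels that are set, between 0.0 and 1.0. */
    static double fractionSet(int[] mask) {
        if (mask.length == 0) {
            return 0.0;
        }
        int count = 0;
        for (int m : mask) {
            count += m;
        }
        return (double) count / mask.length;
    }

    /** "Inside pathway" if enough pixels fall into the assumed gravel-gray band. */
    static boolean looksLikePathway(int[] gray, int lo, int hi, double minFraction) {
        return fractionSet(thresholdBand(gray, lo, hi)) >= minFraction;
    }
}
```

The boolean result is what you would use to decide whether to show the toast.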

First steps in creating a chroma key effect using android camera

I'd like to create a chroma key effect using the android camera. I don't need a step by step, but I'd like to know the best way to hijack the android camera and apply the filters. I've checked out the API and haven't found anything super definitive on how to manipulate data coming from the camera. At first I looked into using a surface texture, but I'm not fully aware how that helps or how to even use it. Then I checked out using a GLSurfaceView, which may be the right direction, but not really sure.
Also, to add to my question, how would I handle both preview and saving of the image? Would I process the image at minimum, twice? Once while previewing and once while saving? I think that's probably the best solution.
Lastly, would it make sense to create a C/C++ wrapper to handle the processing to optimize speed?
Any help at all would be greatly appreciated. A link to some examples would also be greatly appreciated.
Thanks.

The only realistic option is to use OpenGL ES with a fragment shader (this requires at least OpenGL ES 2.0) and do the chroma key effect on the GPU. The shader itself is quite simple, and examples are easy to find online.
But to do that, you need to display the camera preview with a callback. You will have to implement Camera.PreviewCallback, create a buffer for the image data and use the setPreviewCallbackWithBuffer method. You can get the basic idea from my answer to a similar question. Note that there is a significant performance problem with this kind of custom camera preview, but it might work on hardware that supports ES 2.0.
To display the preview with OpenGL, you will need to extend GLSurfaceView and also implement GLSurfaceView.Renderer. Then you bind the camera preview frame as a texture with glTexImage2D to a simple rectangle, and the rest is handled by shaders. See how to use shaders in ES here, or if you have no experience with shaders, this tutorial might be a good start.
To the other question: you could save the current image from the preview, but the preview has lower resolution than a taken picture, so you will probably want to take a picture and then process it separately (you could use the same shader for it).
As for C++, it's a lot of additional effort with a questionable payoff, but it can improve performance if done right. See this article on a similar topic; it describes how to use the NDK to process the camera preview and display it in OpenGL. But if you were thinking of doing the chroma key effect itself in C++, it would be significantly slower than shaders.
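For reference, the per-pixel rule that such a shader applies is small. The sketch below shows it on the CPU in plain Java purely to illustrate the logic; it is not the GPU implementation recommended above and will be far slower on real frames. The class name ChromaKeySketch, the RGB-distance test and the tolerance value are assumptions for illustration (real keyers often compare colors in another color space).

```java
/** Hypothetical CPU illustration of a chroma key rule; a shader would do this per fragment. */
final class ChromaKeySketch {

    /**
     * Returns a copy of the ARGB pixels in which every pixel whose RGB distance
     * to keyRgb is at most tolerance has its alpha set to 0 (fully transparent).
     */
    static int[] apply(int[] argb, int keyRgb, int tolerance) {
        int kr = (keyRgb >> 16) & 0xFF;
        int kg = (keyRgb >> 8) & 0xFF;
        int kb = keyRgb & 0xFF;
        int tol2 = tolerance * tolerance;

        int[] out = new int[argb.length];
        for (int i = 0; i < argb.length; i++) {
            int p = argb[i];
            int dr = ((p >> 16) & 0xFF) - kr;
            int dg = ((p >> 8) & 0xFF) - kg;
            int db = (p & 0xFF) - kb;
            int dist2 = dr * dr + dg * dg + db * db;
            // Close to the key color: clear the alpha byte. Otherwise keep the pixel.
            out[i] = (dist2 <= tol2) ? (p & 0x00FFFFFF) : p;
        }
        return out;
    }
}
```

In a fragment shader the same comparison would run once per fragment on the camera texture, writing an alpha of 0 for pixels near the key color.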

You can check this library: https://github.com/cyberagent/android-gpuimage.
It provides a framework for doing image processing on the device's GPU using GL shaders.
There is also a sample showing how to use the library with a camera.

There is a Chroma-Key-Project on Google Code: http://code.google.com/p/chroma-key-project/ It includes a way to upload pictures that are taken using chroma-key:
After an exhaustive search online, I have failed to find any open source projects working with Chroma-keying for Android devices. The aim of this project is to provide a useful Chroma-key library, that will make it easy to implement applications and games that can take pictures in front of a Green or Blue screen, and apply the pictures on a chosen background. Furthermore, the application will also allow the user to upload the picture using Intent.

How to measure the time it takes the Android camera to capture a single frame in real time?

I'm doing some performance testing of real-time capabilities on Android, because I need to show whether acquiring a frame using the OpenCV Java methods is more efficient and faster than using the Android SDK methods. So I want to know, if possible, how I can measure the exact time the Android camera sensor takes to capture a single frame.
My initial code just recorded the time before and after takePicture, subtracted the two, saved the result in an array, and then computed averages, etc. But my code did not work because it was inside a while loop.
I'm not interested in showing the images in a preview; I want to show the time it takes the sensor to acquire or capture a frame. I know that the frame is acquired in raw form and that the Android SDK then provides the tools to transform it into a JPEG image, but that does not interest me.
P.S. I run the same tests with the Java methods and then with the C methods: Android-OpenCV Java methods versus Android SDK Java methods, and likewise in C, the OpenCV C functions for Android versus native C functions on Android.
The reference links are:
Capturing Multiple Photos
But I'm not sure.
Thanks all

Simulating an Android Camera

I am testing imaging algorithms using an Android phone's camera as input, and need a way to consistently test the algorithms. Ideally I want to take a pre-recorded video feed and have the phone 'pretend' that the video feed is a live video from the camera.
My ideal solution would be where the app running the algorithms has no knowledge that the video is pre-recorded. I do not want to load the video file directly into the app, but rather read it in as sensor data if at all possible.
Is this approach possible? If so, any pointers in the right direction would be extremely helpful, as Google searches have failed me so far.
Thanks!
Edit: To clarify, my understanding is that the Camera class uses a camera service to read video from the hardware. Rather than do something application-side, I would like to create a custom camera service that reads from a video file instead of the hardware. Is that doable?

When you are processing a live Android video feed, you need to build your own custom camera application that feeds you individual frames via the PreviewCallback interface that Android provides.
Simulating this is a little tricky, since the preview frames will generally be in the NV21 format. If you are using a pre-recorded video, I don't think there is any clear way of reading frames one by one unless you try the getFrameAtTime method, which gives you Bitmaps in an entirely different format.
This leads me to suggest that you could probably test with the Bitmaps from the getFrameAtTime method (though I'm really not sure what you are trying to do here). For this code to then work on a live camera preview, you would need to convert your NV21 frames from the PreviewCallback interface into the same format as the Bitmaps from getFrameAtTime, or adapt your algorithm to process NV21 frames directly. NV21 is a pretty neat format, storing luminance and color data separately, but it can be tricky to use.
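Since the NV21 layout is what makes it tricky, here is a small sketch of how a frame is indexed. It assumes even width and height and no row padding: the first width*height bytes are the luminance (Y) plane, followed by interleaved V,U byte pairs, one pair per 2x2 block of pixels. The class Nv21Layout and its method names are made up for illustration.

```java
import java.util.Arrays;

/** Hypothetical helper describing how bytes are laid out in an NV21 frame. */
final class Nv21Layout {

    /** Total frame size: a full-resolution Y plane plus half as many chroma bytes. */
    static int expectedLength(int width, int height) {
        return width * height * 3 / 2;
    }

    /** Luminance of pixel (x, y) as an unsigned value 0..255. */
    static int luma(byte[] nv21, int width, int x, int y) {
        return nv21[y * width + x] & 0xFF;
    }

    /**
     * Offset of the V byte shared by the 2x2 block containing (x, y).
     * The matching U byte is at the returned offset + 1.
     */
    static int chromaOffset(int width, int height, int x, int y) {
        return width * height + (y / 2) * width + (x / 2) * 2;
    }

    /** Copies out the Y plane, which is already a usable grayscale image. */
    static byte[] lumaPlane(byte[] nv21, int width, int height) {
        return Arrays.copyOf(nv21, width * height);
    }
}
```

If your algorithm only needs grayscale, working on the Y plane directly avoids any color conversion.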

Real Time Image Processing in Android using the NDK

Using an Android (2.3.3) phone, I can use the camera to retrieve a preview with the onPreviewFrame(byte[] data, Camera camera) method to get the YUV image.
For some image processing, I need to convert this data to an RGB image and show it on the device. Using the basic Java / Android method, this runs at a horrible rate of less than 5 fps...
Now, using the NDK, I want to speed things up. The problem is: How do I convert the YUV array to an RGB array in C? And is there a way to display it (using OpenGL perhaps?) in the native code? Real-time should be possible (the Qualcomm AR demos showed us that).
I cannot use the setTargetDisplay and put an overlay on it!
I know Java, recently started with the Android SDK, and have zero experience in C.

Have you considered using OpenCV's Android port? It can do a lot more than just color conversion, and it's quite fast.

A Google search returned this page for a C implementation of YUV->RGB565. The author even included the JNI wrapper for it.

You can also succeed by staying with Java. I did this for the image detection of the androangelo-app.
I used the sample code which you find here by searching "decodeYUV".
For processing the frames, the essential part to consider is the image-size.
Depending on the device you may get quite large images, e.g. for the Galaxy S2 the smallest supported preview size is 640*480. That is a large number of pixels.
What I did was use only every second row and every second column after the YUV-to-RGB decoding. Processing the resulting 320*240 image works quite well and allowed me to reach frame rates of 20 fps (including some noise reduction, a color conversion from RGB to HSV, and circle detection).
In addition, you should carefully check the size of the image buffer provided to the setPreview function. If it is too small, garbage collection will spoil everything.
For the result you can check the calibration-screen of the androangelo-app. There I had an overlay of the detected image over the camera-preview.
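A rough sketch of the subsampling idea follows. It is a variant of what I described: instead of decoding the full frame and then skipping pixels, it decodes only every step-th row and column, which saves the conversion work as well. The integer coefficients are the ones commonly seen in "decodeYUV" sample code; the class name Nv21Subsampler is made up, and the frame is assumed to be NV21 with even width and height.

```java
/** Hypothetical NV21 to ARGB decoder that only converts every step-th pixel. */
final class Nv21Subsampler {

    /** Decodes an NV21 frame into ARGB, keeping every step-th row and column. */
    static int[] decode(byte[] nv21, int width, int height, int step) {
        int outW = (width + step - 1) / step;
        int outH = (height + step - 1) / step;
        int frameSize = width * height;
        int[] argb = new int[outW * outH];

        for (int oy = 0; oy < outH; oy++) {
            int y = oy * step;
            for (int ox = 0; ox < outW; ox++) {
                int x = ox * step;
                int luma = Math.max((nv21[y * width + x] & 0xFF) - 16, 0);
                // Each 2x2 block of pixels shares one V,U pair after the Y plane.
                int uv = frameSize + (y / 2) * width + (x / 2) * 2;
                int v = (nv21[uv] & 0xFF) - 128;
                int u = (nv21[uv + 1] & 0xFF) - 128;

                // Fixed-point YUV to RGB conversion (values scaled by 1024).
                int y1192 = 1192 * luma;
                int r = clamp(y1192 + 1634 * v);
                int g = clamp(y1192 - 833 * v - 400 * u);
                int b = clamp(y1192 + 2066 * u);

                argb[oy * outW + ox] = 0xFF000000
                        | ((r << 6) & 0xFF0000)
                        | ((g >> 2) & 0xFF00)
                        | ((b >> 10) & 0xFF);
            }
        }
        return argb;
    }

    /** Clamps a fixed-point channel value to the 18-bit range used above. */
    private static int clamp(int value) {
        return Math.max(0, Math.min(value, 262143));
    }
}
```

With step = 2, a 640*480 preview frame yields a 320*240 ARGB array, the size I mentioned above.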
