I have used Tesseract OCR in my Android project to recognize text from an image taken with the camera, but the results are not accurate. I want to preprocess the image using OpenCV. For the captured image, which is decoded in Bitmap.Config.ARGB_8888 format, I want to achieve the following:
Detect the objects in the resized image.
Once the object is identified, compute its border w.r.t. the original image (this is for removing the camera-angle effect).
Extract the object from original image, by applying perspective transform.
Apply white balance to remove lighting effects.
In the example provided with the tess-two API, Leptonica is used for image manipulation, such as drawing bounding boxes around the words. In my case, however, I want to use OpenCV. Your guidance will be highly appreciated.
That's a lot to ask for, and depending on the object it may be impossible. You should check out the tutorials on 2D feature detection and object detection (http://docs.opencv.org/doc/tutorials/features2d/table_of_content_features2d/table_of_content_features2d.html and http://docs.opencv.org/doc/tutorials/objdetect/table_of_content_objdetect/table_of_content_objdetect.html) to see if there is something you can use.
White balance does not do anything about lighting; you should use adaptive thresholding or some kind of high-pass filtering instead.
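A minimal sketch of that suggestion with OpenCV for Java, assuming the ARGB_8888 camera Bitmap is converted to a Mat first. The method name and the block size / constant (11 and 2) are made-up starting values you would tune for your images:

```java
import android.graphics.Bitmap;
import org.opencv.android.Utils;
import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;

// Sketch: normalize uneven lighting before OCR with adaptive thresholding.
Bitmap preprocessForOcr(Bitmap bitmap) {
    Mat rgba = new Mat();
    Utils.bitmapToMat(bitmap, rgba);

    Mat gray = new Mat();
    Imgproc.cvtColor(rgba, gray, Imgproc.COLOR_RGBA2GRAY);

    Mat binary = new Mat();
    // Threshold each pixel against the mean of its 11x11 neighbourhood minus 2,
    // which compensates for gradual lighting changes across the page.
    Imgproc.adaptiveThreshold(gray, binary, 255,
            Imgproc.ADAPTIVE_THRESH_MEAN_C, Imgproc.THRESH_BINARY, 11, 2);

    Bitmap out = Bitmap.createBitmap(bitmap.getWidth(), bitmap.getHeight(),
            Bitmap.Config.ARGB_8888);
    Utils.matToBitmap(binary, out);
    return out;
}
```

Feeding the resulting binary Bitmap to Tesseract instead of the raw capture usually helps more than white balancing.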
I am trying to implement something like the technique described in this (old) paper to use the phone camera's video frames to create an illusion of environment mapping in an AR app.
I want to take the camera frame, divide it into sub-areas and then use those as faces on the cube map. The division of the camera frame would look something like this:
Now the X area is easy: I can use glCopyTexImage2D to copy that square area to my cubemap texture. But I need help with the trapezoid-shaped areas around X (forget about the triangles for now).
How can I take those trapezoidal areas and distort them into square textures? I think I need the inverse of the perspective projection that happens later, so that the two cancel each other out in the final render when I render the cubemap as a skybox around my camera (does that explain what I want?).
Before doing this I tried a simpler step: putting the square X area on every side of the cubemap, just to see if glCopyTexImage2D can even be used for this. It can, but the results are not oriented correctly; some faces are "upside down" when I render the cubemap as a skybox. A related question, then: how can I rotate them before using them as textures?
I also thought about solving the problem from the other side and modifying the "texture coordinates" to make the necessary adjustments, but that also does not seem easy since the lookup in the fragment shader with "textureCube" is more complicated than a normal texture lookup.
Any ideas?
I'm trying to do this in my AR app on Android with OpenGL ES 2.0 but I guess more general OpenGL advice might also be useful.
Update
I have come to the conclusion that this is not worth pursuing anymore. The paper makes it look nice, but my experiments with a phone camera have shown a major contradiction. If you want to reflect the environment in an object rendered in AR, the camera view is very limited. When the camera is far away from the tracked object you have enough environment information for a good reflection, but you will barely see it because the camera is far away. But when you bring the camera closer to see the awesome reflection in detail, the tracked object will fill most of the camera's field of view and you barely have any environment to reflect anymore. So in either case you lose and the result is not worth the effort.
It seems that you need to create a mesh with the UV mapping described in the article, render it with the camera frame as its texture into another texture, and then use that as the cubemap.
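As a CPU-side illustration of the same trapezoid-to-square mapping (not the render-to-texture path itself), here is a hedged sketch with OpenCV for Java. The four source corner coordinates are made up for the example; in practice they come from how you divide the camera frame:

```java
import org.opencv.core.Mat;
import org.opencv.core.MatOfPoint2f;
import org.opencv.core.Point;
import org.opencv.core.Size;
import org.opencv.imgproc.Imgproc;

// Sketch: warp one trapezoidal sub-area of the camera frame into a square image
// that could then be uploaded as a cube-map face.
Mat trapezoidToSquare(Mat cameraFrame) {
    MatOfPoint2f src = new MatOfPoint2f(
            new Point(100, 80),   // top-left of the trapezoid (assumed)
            new Point(540, 80),   // top-right (assumed)
            new Point(620, 400),  // bottom-right (assumed)
            new Point(20, 400));  // bottom-left (assumed)
    int size = 256;               // side length of the square output
    MatOfPoint2f dst = new MatOfPoint2f(
            new Point(0, 0), new Point(size - 1, 0),
            new Point(size - 1, size - 1), new Point(0, size - 1));

    Mat transform = Imgproc.getPerspectiveTransform(src, dst);
    Mat square = new Mat();
    Imgproc.warpPerspective(cameraFrame, square, transform, new Size(size, size));
    return square;
}
```

The GPU route suggested above does the same mapping implicitly: the quad's texture coordinates are the trapezoid corners, and rendering it into an FBO-attached cube-map face produces the "un-distorted" square.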
I want to make an app that shows something above the longest straight line in an image.
I know I should convert the RGB image to grayscale.
I also know I should use an edge-detection algorithm (Sobel, Canny, ...).
Sobel Edge Detection in Android
But I don't know how to find the longest straight line in the image. The line may be part of a rectangle or any other shape; I just want to find the position of the longest line in the image, with no gradient (or only a small amount of gradient).
How can I implement this with no external library (or only lightweight libraries)?
The Hough Transform is the most commonly used algorithm to find lines in an image. Once you run the transform and find lines, it's just a matter of sorting them by length and then crawling along the lines to check for the constraints your application might have.
RANSAC is also a very quick and reliable solution for finding lines once you have the edge image.
Both these algorithms are fairly easy to implement on your own if you don't want to use an external library.
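Since you want to avoid external libraries, here is a rough, dependency-free Java sketch of the standard Hough line transform. It assumes you already have a binary edge image (e.g. from Sobel or Canny); the 1-degree angle resolution and the helper name are arbitrary choices:

```java
// Sketch: find the most strongly supported line in a binary edge image.
// edges[y][x] is non-zero where an edge pixel was detected.
// Returns { thetaDegrees, rho, votes } for the winning (theta, rho) line.
static int[] houghStrongestLine(int[][] edges, int width, int height) {
    int maxRho = (int) Math.hypot(width, height);
    int thetaSteps = 180;                       // 1-degree resolution (assumed)
    int[][] acc = new int[thetaSteps][2 * maxRho + 1];
    double[] sin = new double[thetaSteps], cos = new double[thetaSteps];
    for (int t = 0; t < thetaSteps; t++) {
        double theta = Math.toRadians(t);
        sin[t] = Math.sin(theta);
        cos[t] = Math.cos(theta);
    }
    // Every edge pixel votes for all lines (theta, rho) passing through it.
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            if (edges[y][x] != 0)
                for (int t = 0; t < thetaSteps; t++) {
                    int rho = (int) Math.round(x * cos[t] + y * sin[t]) + maxRho;
                    acc[t][rho]++;
                }
    // The accumulator cell with the most votes is the best-supported line.
    int bestT = 0, bestR = 0, bestVotes = -1;
    for (int t = 0; t < thetaSteps; t++)
        for (int r = 0; r < acc[t].length; r++)
            if (acc[t][r] > bestVotes) { bestVotes = acc[t][r]; bestT = t; bestR = r - maxRho; }
    return new int[] { bestT, bestR, bestVotes };
}
```

Note that the vote count measures how many edge pixels support the line, not the segment length, so you would still crawl along the winning line to find the actual segment endpoints, as described above.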
I have recently been working on object detection/recognition in images captured from an Android device camera.
The objects I am trying to detect are all kinds of buttons that look like this:
Picture of buttons
So far I have tried OpenCV and also the Metaio SDK. Results:
OpenCV was always detecting something, but gave lots of false positives; also, it is too much work to collect all the pictures needed for what I have in mind. I have tried three approaches with OpenCV:
Feature detection (SURF, ORB and so on) -> way too slow, and my objects do not have enough features.
Template matching -> seems to work only when the template is an exact crop of the scene image.
Training classifiers -> this worked best so far, but it is too much work for my goal and still gives too many false detections.
The Metaio SDK worked OK when I took my reference images (the icon part of each button) from a picture like the one shown above, printed the full image, and pointed my Android device camera at the printout. But when I tried the real buttons (not a picture of them), almost nothing was detected anymore. The Metaio documentation says that reference images need to have lots of features and color differences, and should not consist only of white text. Well, as you can see, my reference images are exactly the opposite of what they should be, but that's just how the buttons look ;)
So, my question is: do any of you have a suggestion for what else I could try in order to detect and recognize each of those buttons when I point my Android camera at them?
As a suggestion, can you try the following approach:
Class-Specific Hough Forests for Object Detection
They provide a C code implementation. Compile it, run it, and look at the results; then replace the positive and negative training images with your own, according to the following rules:
For your case, you will need to define the following three areas:
target region (the image you provided is a good representation of a target region)
nearby working area (this area carries information about the target's relative location); I would recommend an area 3-5 times the size of the target region, around the target, as a good working area
everything outside the above can be used as negative images
then,
Use "many" positive images (100-1000) at different viewing angles (-30 - +30 degrees) and various distances.
You will have to make assumptions at which viewing angles and distances your users will use the application. The more strict they are the better performance you will get. A simple "hint" camera overlay can give a good idea to people what you expect the working area to be.
Use few times (3-5) more different negative image set which includes pictures of things that might be in the camera but should not contribute any target position information.
Do not use big images, somewhere around 100-300px in width should be enough
Assemble the database, and modify the configuration file that the code comes with. Run the program, see if performance is OK for your needs.
The program will return a voting map cloud of the object you are looking fore. Add gaussian blur to it, and apply some threshold to it (you will have to make another assumption for this threshold value).
Extracted mask will define the area you are looking for. The size of the masked region can give you good estimate of the object scale. Given this information it will be much easier to select proper template and perform template matching.
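A minimal sketch of that blur/threshold/mask step with OpenCV for Java, assuming the voting map has already been loaded into a single-channel float Mat. The method name, the 15x15 blur kernel and the 0.5 relative threshold are assumptions you would tune:

```java
import java.util.ArrayList;
import java.util.List;
import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.MatOfPoint;
import org.opencv.core.Rect;
import org.opencv.core.Size;
import org.opencv.imgproc.Imgproc;

// Sketch: smooth the voting map, threshold it, and take the bounding box of the
// strongest region as the target location; its size approximates the object scale.
Rect extractTargetRegion(Mat votingMap) {
    Mat smoothed = new Mat();
    Imgproc.GaussianBlur(votingMap, smoothed, new Size(15, 15), 0); // assumed kernel size

    // Threshold relative to the strongest vote; 0.5 is an assumed factor.
    Core.MinMaxLocResult mm = Core.minMaxLoc(smoothed);
    Mat mask = new Mat();
    Imgproc.threshold(smoothed, mask, 0.5 * mm.maxVal, 255, Imgproc.THRESH_BINARY);
    mask.convertTo(mask, CvType.CV_8U);

    // Keep the largest connected blob as the detection.
    List<MatOfPoint> contours = new ArrayList<>();
    Imgproc.findContours(mask, contours, new Mat(),
            Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);
    Rect best = null;
    double bestArea = 0;
    for (MatOfPoint c : contours) {
        Rect r = Imgproc.boundingRect(c);
        if (r.area() > bestArea) { bestArea = r.area(); best = r; }
    }
    return best; // may be null if nothing passed the threshold
}
```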
(Also, some further thoughts.) You can try a small trick: use the goodFeaturesToTrack function with the mask you got to obtain a set of locations, and compare them with the corresponding locations on a template. Construct an SSD (sum of squared differences) and solve it for rotation, scale and translation parameters by minimizing the alignment error (but I am not sure whether this approach will work).
I need an algorithm that can detect contrast edges in a photo.
Users will roughly paint a mask over an object in an image with their fingers on an Android phone, and then I want to refine that selection mask with code that detects the edges and adjusts the mask to those edges.
Please read about the Hough transform and also Canny edge detection, and then you can decide which fits better.
This is not an easy thing to do.
I would suggest you first try to understand how they work in MATLAB, which is a very C-like language.
http://docs.opencv.org/doc/tutorials/imgproc/imgtrans/canny_detector/canny_detector.html
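A minimal Canny sketch with OpenCV for Java, assuming the photo is already in an RGBA Mat (e.g. converted from an Android Bitmap). The method name and the 50/150 hysteresis thresholds are assumptions that usually need per-image tuning:

```java
import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.imgproc.Imgproc;

// Sketch: produce a binary edge map that the user-painted mask can be adjusted to.
Mat detectEdges(Mat image) {
    Mat gray = new Mat();
    Imgproc.cvtColor(image, gray, Imgproc.COLOR_RGBA2GRAY);
    Imgproc.GaussianBlur(gray, gray, new Size(5, 5), 0); // reduce noise before edge detection
    Mat edges = new Mat();
    Imgproc.Canny(gray, edges, 50, 150);                 // low/high thresholds (assumed)
    return edges; // non-zero pixels lie on detected edges
}
```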
I am trying to implement some photo effects. I have tried different effects, such as a sketch/painting effect and an emboss effect, and now I am trying to implement an oil painting effect.
I found this link:
http://supercomputingblog.com/graphics/oil-painting-algorithm/
but at my level this is too hard to understand. Please help me with this, or share any other reference link for it.
Download the JHLabs library for Android from one of the following links:
https://code.google.com/p/android-jhlabs/
https://code.google.com/p/android-jhlabs/downloads/list
It provides effects for oil painting, emboss and many more.
You can create a pencil-sketch effect with the DoG filter followed by the grayscale filter provided in the library.
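If you would rather implement the effect yourself, one common oil-painting algorithm (which I believe is what the supercomputingblog link describes) is a per-pixel neighbourhood vote: bucket each neighbour by quantized intensity, find the most frequent bucket, and output the average colour of the neighbours in that bucket. A rough, unoptimized Java sketch over an ARGB pixel array; the radius and number of intensity levels are assumed tuning values:

```java
// Sketch: basic oil-painting filter on an int[] of ARGB pixels (width w, height h).
static int[] oilPaint(int[] src, int w, int h, int radius, int levels) {
    int[] out = new int[src.length];
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int[] count = new int[levels];
            int[] sumR = new int[levels], sumG = new int[levels], sumB = new int[levels];
            for (int dy = -radius; dy <= radius; dy++) {
                for (int dx = -radius; dx <= radius; dx++) {
                    // Clamp to the image border.
                    int nx = Math.min(Math.max(x + dx, 0), w - 1);
                    int ny = Math.min(Math.max(y + dy, 0), h - 1);
                    int c = src[ny * w + nx];
                    int r = (c >> 16) & 0xFF, g = (c >> 8) & 0xFF, b = c & 0xFF;
                    int bin = ((r + g + b) / 3) * (levels - 1) / 255; // quantized intensity
                    count[bin]++; sumR[bin] += r; sumG[bin] += g; sumB[bin] += b;
                }
            }
            // Most frequent intensity bucket wins; output its average colour.
            int best = 0;
            for (int i = 1; i < levels; i++) if (count[i] > count[best]) best = i;
            int r = sumR[best] / count[best], g = sumG[best] / count[best], b = sumB[best] / count[best];
            out[y * w + x] = 0xFF000000 | (r << 16) | (g << 8) | b;
        }
    }
    return out;
}
```

A radius of 3-5 and around 20 levels are reasonable starting points; larger radii give bolder "strokes" but get slow quickly, so for camera-sized images you would normally downscale first or move this to native code.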
I think oil paintings are best simulated using a technique called "Stroke Based Rendering" (SBR), pioneered by Aaron Hertzmann. It has been around for a long time, and whether you do this on Android or any other OS doesn't make much of a difference.
What you need is a function that takes a rectangle and an orientation and places a brush stroke on the current canvas. The brush itself is defined as a set of two grayscale texture images: one for the opacity and one for the height. You need the height one so that you can build a bump map alongside the canvas (the rendered image). Now, the tough part is getting good texture maps for your brushes so that the result looks realistic. That's where you need to experiment quite a bit and see what you like best; everybody has their own idea of what looks good.
To define the rectangle and its orientation, you can use image moments. The end result is that your brush strokes will kinda follow the contours of objects, which is usually what artists do (not always though).
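A hedged sketch of getting a stroke centre and orientation from image moments with OpenCV for Java; the helper name is made up, and the patch is assumed to be a non-empty single-channel Mat cut from the region you want to paint over:

```java
import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;
import org.opencv.imgproc.Moments;

// Sketch: derive a brush-stroke centre and orientation from the image moments
// of a local grayscale patch, so strokes roughly follow the local structure.
double[] strokeFromMoments(Mat patch) {
    Moments m = Imgproc.moments(patch);
    double cx = m.m10 / m.m00;                               // centroid x
    double cy = m.m01 / m.m00;                               // centroid y
    // Orientation of the principal axis from the second-order central moments.
    double angle = 0.5 * Math.atan2(2 * m.mu11, m.mu20 - m.mu02);
    return new double[] { cx, cy, angle };                   // angle in radians
}
```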
In any case, this methodology is explained in more detail here (this is a link to my blog):
http://3dstereophoto.blogspot.com/2018/07/non-photorealistic-rendering-software.html
You can try the software called "The Painter", which I wrote (free, and works on Windows 64-bit), to see what can be done using SBR. Maybe it's not what you want at all. Here's the link to the software (it also includes toon shading and watercolor rendering):
http://3dstereophoto.blogspot.com/p/painting-software.html
Again, this is a link to my blog, which deals primarily with 3D photography. I happen to also like painting a lot.