How do I take a picture, scan it, then populate an EditText? - android

I know a few apps do this, but I have no idea where to start looking for examples or tutorials.
As an example, think of a check. You want to take a picture, have the app scan the check, and then input the data into various EditText boxes for the user to look over and approve before sending.
TurboTax currently does this with your W2 on Android. Any help or a pointer in the right direction would be appreciated.
Where should I learn how to do this?

You might consider commercial software libraries, like that from A2iA Corporation (http://www.a2ia.com/en/a2ia-checkreader-0), or the open source Tesseract library (http://code.google.com/p/tesseract-ocr/) for basic OCR operations.
Assuming you want to read the MICR (magnetic ink character recognition) line at the bottom of the check, A2iA is a good start; I'm unsure whether Tesseract will work well with the E13B font used there.
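To sketch the overall flow (camera capture, OCR, then populating the EditText), here is a minimal example using the stock camera intent and the tess-two Android wrapper for Tesseract. It assumes tess-two is on your classpath and that an eng.traineddata file has already been copied into a tessdata folder on device storage; the layout and view IDs are hypothetical. Note that the "data" extra only returns a small thumbnail, so for usable OCR you would pass an EXTRA_OUTPUT URI to receive the full-resolution image; parsing the recognized text into separate fields (amount, date, etc.) is up to you.

    // Minimal sketch: camera intent -> Tesseract OCR (tess-two) -> EditText.
    // Assumes tess-two is on the classpath and eng.traineddata has been
    // copied to <datapath>/tessdata/ beforehand.
    import android.app.Activity;
    import android.content.Intent;
    import android.graphics.Bitmap;
    import android.os.Bundle;
    import android.provider.MediaStore;
    import android.widget.EditText;

    import com.googlecode.tesseract.android.TessBaseAPI;

    public class ScanActivity extends Activity {
        private static final int REQUEST_CAPTURE = 1;
        private EditText amountField;

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.activity_scan);   // hypothetical layout
            amountField = (EditText) findViewById(R.id.amount);

            // Launch the stock camera app; this returns only a thumbnail,
            // use EXTRA_OUTPUT for a full-size image in a real app.
            startActivityForResult(
                    new Intent(MediaStore.ACTION_IMAGE_CAPTURE), REQUEST_CAPTURE);
        }

        @Override
        protected void onActivityResult(int requestCode, int resultCode, Intent data) {
            if (requestCode == REQUEST_CAPTURE && resultCode == RESULT_OK) {
                Bitmap photo = (Bitmap) data.getExtras().get("data");

                TessBaseAPI tess = new TessBaseAPI();
                // First argument is the parent of the tessdata directory.
                tess.init(getFilesDir().getAbsolutePath(), "eng");
                tess.setImage(photo);
                String recognized = tess.getUTF8Text();
                tess.end();

                // Let the user review and correct before submitting.
                amountField.setText(recognized);
            }
            super.onActivityResult(requestCode, resultCode, data);
        }
    }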

Related

Question regarding Arabic on Google ML Kit

I have a question about harakah (vowel diacritics) in Arabic: the result I get from ML Kit doesn't include harakah in its text, only Arabic without it.
I am building a feature for my Flutter app that needs to recognize Arabic words from gestures the user paints. I am using google_mlkit_digital_ink_recognition. The example works great, and I downloaded the Arabic model, but the problem is that it cannot read Arabic words with harakah that I input (paint) on the screen. I tried reading the documentation and searching for this, but no dice.
The result I get is only Arabic without harakah, and the feature I plan to build needs it. The backend will send a text (1 to 4 Arabic letters, with or without harakah), the user has to paint it on the screen, ML Kit recognizes the strokes the user made, and then the app decides whether what the user painted is correct. Is there a way for ML Kit digital ink to input and output Arabic with harakah using this plugin? Is there an alternative or a better way to handle this? Do I need to build and train a custom model, or am I missing something that needs to be done first?
There are some apps on Google Play that are a bit similar; I think I can achieve a similar result with a clipper and path, but they don't have harakah either, the app needs to know whether the drawing is correct or not, and the text I need to match comes from the backend, which may vary and be more complicated. There is another alternative I thought of: instead of text, the backend sends me an image, I use scribble to generate an image of what the user paints, then match it against the backend image with image_compare. I am not sure about it, though, since I haven't tried it yet; the chance of matching correctly might be low, and it would burden the other team, since they would need to make the images one by one instead of just sending text. Right now the fastest way I can think of is using ML Kit, since I need to deliver this feature late this month or early next month. I hope you can help, thanks.
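For reference, not a confirmed fix: the Flutter plugin wraps Android's ML Kit digital ink API. Below is a minimal sketch of that underlying API, together with one pragmatic workaround under the assumption that the Arabic model omits harakah: normalize both the backend target and the recognition result by stripping the harakah code points (U+064B through U+0652) before comparing, so at least the base letters can be validated.

    // Sketch only: compare ML Kit digital-ink output against a backend target,
    // ignoring harakah. Assumes the "ar" model was already downloaded via
    // RemoteModelManager; the Ink is built from the app's drawing canvas strokes.
    import com.google.mlkit.vision.digitalink.DigitalInkRecognition;
    import com.google.mlkit.vision.digitalink.DigitalInkRecognitionModel;
    import com.google.mlkit.vision.digitalink.DigitalInkRecognitionModelIdentifier;
    import com.google.mlkit.vision.digitalink.DigitalInkRecognizer;
    import com.google.mlkit.vision.digitalink.DigitalInkRecognizerOptions;
    import com.google.mlkit.vision.digitalink.Ink;

    public final class ArabicInkChecker {

        // Remove Arabic vowel diacritics (fathah..sukun, U+064B-U+0652).
        static String stripHarakah(String s) {
            return s.replaceAll("[\\u064B-\\u0652]", "");
        }

        static void check(Ink ink, String backendTarget) throws Exception {
            DigitalInkRecognitionModel model = DigitalInkRecognitionModel
                    .builder(DigitalInkRecognitionModelIdentifier.fromLanguageTag("ar"))
                    .build();
            DigitalInkRecognizer recognizer = DigitalInkRecognition.getClient(
                    DigitalInkRecognizerOptions.builder(model).build());

            recognizer.recognize(ink).addOnSuccessListener(result -> {
                String best = result.getCandidates().isEmpty()
                        ? "" : result.getCandidates().get(0).getText();
                // Compare base letters only, since the model output may lack harakah.
                boolean correct = stripHarakah(best)
                        .equals(stripHarakah(backendTarget));
                // ... report `correct` back to the UI here.
            });
        }
    }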

Font Recognition From free Hand drawing

I have been working on an application that involves font recognition based on a user's freehand drawing of characters in an Android Canvas.
In this application the user is asked to enter some predefined characters in a predefined order (A, a, B, c). Based on this, is there any way to show the font that most closely matches the user's handwriting?
I have researched this topic and found some papers and articles, but most of them recognize fonts from a captured image. In that case they have a lot of problems segmenting paragraphs, individual letters, and so on. But in my scenario I know which letter the user is drawing.
I have some knowledge of OpenCV and machine learning. I need help on how to proceed with this problem.
It is not exactly clear to me what you want to accomplish with your application, but I assume you are trying to output the font from a database of fonts that best matches a user's handwriting.
In machine learning this would be a classification problem, where the number of classes equals the number of different fonts in your database.
You could solve this with the help of a convolutional neural network (CNN), which is widely used for image and video recognition tasks. If you've never implemented a CNN before, I would suggest you look at these resources to learn about Torch, which is an easy-to-start-with toolkit for implementing CNNs. (Of course there are more frameworks, such as TensorFlow, Caffe, Lasagne, ...)
Torch Homepage
Deep learning with Torch: 60 minutes blitz
Torch Cheatsheet
The main obstacle you will face is that neural networks need thousands of images (>100,000) to train properly and achieve satisfying results. Furthermore, you need not only the images but also a correct label for each image. That is, for a training image such as a handwritten character, the label would be the font from your database that it matches most closely.
I would suggest that you read about so-called transfer learning, which can give you an initial boost, as you do not need to set up a CNN model completely by yourself. In addition, people have pre-trained such models for related tasks, so you save time by not having to train for many hours on a GPU (see CUDA).
A great resource to start with is the paper: How transferable are features in deep neural networks?, which could be helpful for the stated reasons.
To get tons of training and testing data you can look up the following open datasets that provide all types of characters that can be helpful for your task:
Artificial Characters Data Set
UJI Pen Characters Data Set
The Chars74K dataset
Hand written - Datasets
A New Benchmark Dataset for Handwritten Character Recognition
For access to a lot of fonts and maybe even the possibility to create further datasets on your own you can have a look at Google Fonts.
You might find this article very interesting : https://erikbern.com/2016/01/21/analyzing-50k-fonts-using-deep-neural-networks/
Seems like a pretty straightforward supervised deep learning problem.
Generate a ton of randomly deformed samples for letters of each target font type, and train a convnet on that set (see the sketch below).
The ideal would be a huge set of labeled handwriting-to-font data, but that seems unlikely to exist.
You could also take a bunch of handwritten samples and progressively transform them to look more like the font of your choice, to use as a dataset.
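One hedged way to generate such a deformed-sample set offline, sketched in plain desktop Java (AWT/ImageIO) rather than Android; the font names and output paths are placeholders:

    // Sketch: generate randomly deformed training samples for each font.
    // Desktop Java (AWT/ImageIO); fonts and paths here are stand-ins.
    import java.awt.Color;
    import java.awt.Font;
    import java.awt.Graphics2D;
    import java.awt.geom.AffineTransform;
    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.util.Random;
    import javax.imageio.ImageIO;

    public class SampleGenerator {
        public static void main(String[] args) throws Exception {
            String[] fonts = {"Serif", "SansSerif", "Monospaced"}; // stand-ins
            Random rng = new Random(42);

            for (String fontName : fonts) {
                for (char c = 'A'; c <= 'Z'; c++) {
                    for (int i = 0; i < 100; i++) {        // 100 variants each
                        BufferedImage img =
                                new BufferedImage(64, 64, BufferedImage.TYPE_BYTE_GRAY);
                        Graphics2D g = img.createGraphics();
                        g.setColor(Color.WHITE);
                        g.fillRect(0, 0, 64, 64);

                        // Small random rotation, shear and shift to mimic
                        // handwriting variability.
                        AffineTransform t = new AffineTransform();
                        t.translate(16 + rng.nextGaussian() * 3,
                                    48 + rng.nextGaussian() * 3);
                        t.rotate(rng.nextGaussian() * 0.15);
                        t.shear(rng.nextGaussian() * 0.1, 0);
                        g.setTransform(t);

                        g.setColor(Color.BLACK);
                        g.setFont(new Font(fontName, Font.PLAIN, 40));
                        g.drawString(String.valueOf(c), 0, 0);
                        g.dispose();

                        // Directory name doubles as the class label.
                        File out = new File("data/" + fontName + "/" + c + "_" + i + ".png");
                        out.getParentFile().mkdirs();
                        ImageIO.write(img, "png", out);
                    }
                }
            }
        }
    }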
This is a good place to start: https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py
Digit recognition with convnets.
This is quite a bit of work though if you haven't worked with that stuff before.
I would suggest using the OCR library Tesseract. It is very well developed and mature. It also has support for training on other languages, which you can use to train it over a set of fonts.
Approach
Training:
Take images of all 26 letters for each of the n fonts. Train Tesseract over all the A's (one per font), then all the B's, and so on.
Testing:
Take a sentence and separate all the characters.
For each character, get a certainty score (supported by the library) from Tesseract. Note: for the character 'a', use the model trained on all the 'a's from the different fonts.
Across all characters, find the best font using some metric (average, median, etc.). For example, you can sum the certainty score each font received over all characters and use the font that got the maximum total.
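A rough sketch of that testing loop using the tess-two Android wrapper, assuming you have produced one traineddata file per font (e.g. fontA.traineddata) under <datapath>/tessdata/ as described above:

    // Sketch: score drawn characters against per-font Tesseract models and
    // pick the font with the highest summed confidence. Assumes one
    // <fontName>.traineddata file per font inside <datapath>/tessdata/.
    import android.graphics.Bitmap;
    import com.googlecode.tesseract.android.TessBaseAPI;
    import java.util.List;

    public final class FontScorer {

        public static String bestFont(String datapath,
                                      List<String> fontNames,
                                      List<Bitmap> characters) {
            String best = null;
            int bestScore = Integer.MIN_VALUE;

            for (String font : fontNames) {
                TessBaseAPI tess = new TessBaseAPI();
                tess.init(datapath, font);       // loads <font>.traineddata

                int score = 0;
                for (Bitmap ch : characters) {
                    tess.setImage(ch);
                    tess.getUTF8Text();          // run recognition first
                    score += tess.meanConfidence();
                }
                tess.end();

                if (score > bestScore) {
                    bestScore = score;
                    best = font;
                }
            }
            return best;                         // font with max summed certainty
        }
    }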

Algorithm for very simple voice/speech recognition

I'm writing a game for Google Glass, but unfortunately the SpeechRecognizer API isn't available in the current builds of the Google Glass GDK.
So I've been thinking about implementing an algorithm for a very simple voice recognition.
Let's say I want to recognize only: "Yes" and "No".
Do you know any example code or helpful resources for implementing this?
Is it so hard that I should drop the idea and go with a big framework like CMUSphinx?
What about recognizing: up, down, right, left or numbers from 1 to 10 ?
As far as I know, a common approach is to transform the signal to the frequency domain with a fast Fourier transform (FFT) and analyze it there. You also need a dictionary of spoken words to correlate against in the frequency domain.
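To make that concrete, here is a toy sketch in plain Java: compute a magnitude spectrum of a recorded PCM frame with a naive DFT (a real app would use an FFT library) and pick whichever stored template spectrum ("yes" or "no") correlates best. This is nowhere near a robust recognizer, just the shape of the idea.

    // Toy sketch of frequency-domain template matching for two words.
    // A naive O(n^2) DFT is used for brevity; real code would use an FFT.
    // Assumes the recording and templates use equal-length frames.
    public final class TinyWordMatcher {

        // Magnitude spectrum of a 16-bit PCM frame.
        static double[] spectrum(short[] pcm) {
            int n = pcm.length;
            double[] mag = new double[n / 2];
            for (int k = 0; k < n / 2; k++) {
                double re = 0, im = 0;
                for (int t = 0; t < n; t++) {
                    double angle = 2 * Math.PI * k * t / n;
                    re += pcm[t] * Math.cos(angle);
                    im -= pcm[t] * Math.sin(angle);
                }
                mag[k] = Math.hypot(re, im);
            }
            return mag;
        }

        // Cosine similarity between a recording's spectrum and a template's.
        static double similarity(double[] a, double[] b) {
            double dot = 0, na = 0, nb = 0;
            for (int i = 0; i < a.length; i++) {
                dot += a[i] * b[i];
                na += a[i] * a[i];
                nb += b[i] * b[i];
            }
            return dot / (Math.sqrt(na) * Math.sqrt(nb) + 1e-12);
        }

        // templates[0] = spectrum of a recorded "yes", templates[1] = "no".
        static String classify(short[] pcm, double[][] templates) {
            double[] s = spectrum(pcm);
            return similarity(s, templates[0]) >= similarity(s, templates[1])
                    ? "yes" : "no";
        }
    }

In practice you would split the signal into windowed frames and compare sequences of spectra (e.g. with dynamic time warping) rather than one global spectrum, which is roughly the point where frameworks like CMUSphinx start to pay off.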
Please see these links:
CMU Sphinx has a Java implementation.
David Wagner has a good article and a MATLAB implementation.
P.S. Oh, and if you speak Russian, do read this article: very simple, with Java examples.
P.P.S. Honestly, I have never used this framework, but if you only have superficial knowledge of speech recognition, the robust and easiest way is to use an existing complete solution such as a framework or library; otherwise you will need to spend time acquiring the necessary background. In that case, you can read this article.

Object Detection for android with tesseract or OpenCV

I have successfully integrated Tesseract into my Android app, and it reads whatever image I capture, but with very low accuracy. Most of the time I do not get the correct text, because some text around the region of interest also gets captured.
All I want is to read all the text from a rectangular area, accurately, without capturing the edges of the rectangle. I have done some research and posted on Stack Overflow about this twice, but still have not gotten a satisfying result!
Here are the two posts I made:
https://stackoverflow.com/questions/16663504/extract-text-from-a-captured-image?noredirect=1#comment23973954_16663504
Extracting information from captured image in android
I am not sure whether to go ahead with Tesseract or use OpenCV.
Including the many links and answers from others, I think it's good to take a step back and note that there are actually two fundamental steps to optical character recognition (OCR):
Text Detection: This is the title and focus of your question, and it is concerned with localizing regions in an image that contain text.
Text Recognition: This is where the actual recognition happens, where the localized image regions from detection get segmented character-by-character and classified. This is also where tools like Tesseract come into play.
Now, there are also two general settings in which OCR is applied:
Controlled: These are images taken from a scanner or similar device, where the target is a document and things like perspective, scale, font, orientation, and background consistency are pretty docile.
Uncontrolled/Scene: These are the more natural and in-the-wild photos, e.g. those taken from a camera, where you are trying to recognize a street sign, shop name, etc.
Tesseract as-is is most applicable to the "controlled" setting. And in general, but for scene OCR especially, "re-training" Tesseract will not directly improve detection, but may improve recognition.
If you are looking to improve scene text detection, see this work; and if you are looking at improving scene text recognition, see this work. Since you asked about detection, the detection reference uses maximally stable extremal regions (MSER), which has a plethora of implementation resources, e.g. see here.
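To give a feel for the detection step, here is a hedged sketch using the MSER implementation in OpenCV's Java binding to propose candidate text regions, which you could then crop and hand to Tesseract for recognition (the size/aspect-ratio filtering heuristics are left out):

    // Sketch: MSER-based candidate text-region detection with OpenCV's Java API.
    // On Android you would initialize via OpenCVLoader instead of loadLibrary.
    import java.util.ArrayList;
    import java.util.List;
    import org.opencv.core.Core;
    import org.opencv.core.Mat;
    import org.opencv.core.MatOfPoint;
    import org.opencv.core.MatOfRect;
    import org.opencv.core.Rect;
    import org.opencv.features2d.MSER;
    import org.opencv.imgcodecs.Imgcodecs;

    public class TextRegionDetector {
        public static void main(String[] args) {
            System.loadLibrary(Core.NATIVE_LIBRARY_NAME);

            Mat gray = Imgcodecs.imread("capture.jpg", Imgcodecs.IMREAD_GRAYSCALE);

            MSER mser = MSER.create();
            List<MatOfPoint> regions = new ArrayList<>();
            MatOfRect boxes = new MatOfRect();
            mser.detectRegions(gray, regions, boxes);

            // Each rect is a candidate region; filter by size/aspect ratio,
            // then crop and pass the surviving patches to Tesseract.
            for (Rect r : boxes.toArray()) {
                Mat candidate = gray.submat(r);
                // ... OCR `candidate` here.
            }
        }
    }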
There's also a text detection project here specifically for Android too:
https://github.com/dreamdragon/text-detection
As many have noted, keep in mind that recognition is still an open research challenge.
The solution to improving the OCR output is to
either use more training data to train it better,
or filter its input using some linear filters (grayscaling, contrast enhancement, blurring), as in the sketch below.
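As a hedged illustration of the second option, a typical preprocessing chain with OpenCV's Java binding might look like the following; the kernel size and threshold parameters are illustrative and need tuning per use case.

    // Sketch: simple filter cleanup before OCR using OpenCV's Java API.
    // Parameter values (kernel size, block size, constant) are illustrative.
    import org.opencv.core.Mat;
    import org.opencv.core.Size;
    import org.opencv.imgproc.Imgproc;

    public final class OcrPreprocessor {
        public static Mat clean(Mat bgr) {
            Mat gray = new Mat();
            Imgproc.cvtColor(bgr, gray, Imgproc.COLOR_BGR2GRAY);    // grayscale

            Mat blurred = new Mat();
            Imgproc.GaussianBlur(gray, blurred, new Size(3, 3), 0); // denoise

            Mat binary = new Mat();
            Imgproc.adaptiveThreshold(blurred, binary, 255,         // high contrast
                    Imgproc.ADAPTIVE_THRESH_GAUSSIAN_C,
                    Imgproc.THRESH_BINARY, 15, 10);
            return binary;
        }
    }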
In the chat we posted a number of links describing filtering techniques used in OCR, but sample code wasn't posted.
Some of the links posted were
Improving input for OCR
How to train Tesseract
Text enhancement using asymmetric filters <-- this paper is easy to find on Google and should be read in full, as it clearly illustrates and demonstrates the steps needed before OCR-processing the image.
OCR Classification

How can I implement a captcha in Android?

Can anyone give me some suggestions for implementing a captcha in Android?
A little bit of shameless self-promotion, but I recently had this same question, and after combing through StackOverflow.com I realized that all of the solutions out there are remote-API based (which would require my simple app to request Internet permissions), and either were super difficult to implement on Android or were limited by usage caps that you could not get raised.
So I started a project and put it on GitHub: Click here
Current features are limited, but I hope they will grow with the open-source community:
Simple Text Captcha, editable length.
Simple Math Captcha, editable math operators.
The biggest feature of all is ease of use!
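For a sense of what such a library does internally, here is a hedged sketch of a self-contained text captcha drawn with the stock Android Canvas API, no Internet permission required; the distortion parameters are arbitrary choices:

    // Sketch: generate a simple on-device text captcha bitmap with Android Canvas.
    import android.graphics.Bitmap;
    import android.graphics.Canvas;
    import android.graphics.Color;
    import android.graphics.Paint;
    import java.util.Random;

    public final class SimpleCaptcha {
        // Ambiguous glyphs (I, O, 0, 1) are left out on purpose.
        private static final String ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789";

        public final String answer;   // what the user must type
        public final Bitmap image;    // what the user sees

        public SimpleCaptcha(int length) {
            Random rng = new Random();
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < length; i++) {
                sb.append(ALPHABET.charAt(rng.nextInt(ALPHABET.length())));
            }
            answer = sb.toString();

            image = Bitmap.createBitmap(60 * length, 100, Bitmap.Config.ARGB_8888);
            Canvas canvas = new Canvas(image);
            canvas.drawColor(Color.WHITE);

            Paint paint = new Paint(Paint.ANTI_ALIAS_FLAG);
            paint.setTextSize(56);

            for (int i = 0; i < length; i++) {
                paint.setColor(Color.rgb(rng.nextInt(160), rng.nextInt(160), rng.nextInt(160)));
                canvas.save();
                // Jitter position and rotation per character to resist OCR bots.
                canvas.rotate(rng.nextInt(40) - 20, 30 + 60 * i, 50);
                canvas.drawText(String.valueOf(answer.charAt(i)),
                        15 + 60 * i, 70 + rng.nextInt(10) - 5, paint);
                canvas.restore();
            }
            // A few random strike-through lines as extra noise.
            for (int i = 0; i < 3; i++) {
                paint.setColor(Color.GRAY);
                canvas.drawLine(rng.nextInt(image.getWidth()), rng.nextInt(100),
                        rng.nextInt(image.getWidth()), rng.nextInt(100), paint);
            }
        }
    }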
Take a look at this project; they have some Android examples:
https://labs.ericsson.com/apis/captcha/downloads
https://labs.ericsson.com/apis/captcha/documentation
or maybe use this
http://simplecaptcha.sourceforge.net/
to get some ideas.
