How do I build a native Android SDK for image-to-text recognition? (I have had good results with some web service APIs, but this time I want to build the app without any Internet connection, APIs, or web services. Just an offline OCR app.)
So my questions are:
How do I crop each word contained in the image?
How do I compare the cropped text against letters and other characters?
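For reference, a classic from-scratch approach to these two steps is projection-based segmentation followed by template matching. The sketch below is only an illustration under strong assumptions: the image is already binarized into a boolean grid, the text is a single horizontal line, and every glyph has a pixel-exact template of the same size. All class and method names here are hypothetical, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of segmentation + template matching; not a library API.
class OcrSketch {

    // Converts rows of '#' (ink) and '.' (background) into a boolean image.
    static boolean[][] parse(String... rows) {
        boolean[][] img = new boolean[rows.length][rows[0].length()];
        for (int y = 0; y < rows.length; y++)
            for (int x = 0; x < rows[y].length(); x++)
                img[y][x] = rows[y].charAt(x) == '#';
        return img;
    }

    static boolean columnHasInk(boolean[][] img, int x) {
        for (boolean[] row : img) if (row[x]) return true;
        return false;
    }

    // Returns [start, end) column ranges separated by at least minGap empty columns.
    // A small minGap splits characters; a larger one splits words.
    static List<int[]> segmentColumns(boolean[][] img, int minGap) {
        List<int[]> segments = new ArrayList<>();
        int start = -1;   // first ink column of the current segment, or -1 if none
        int lastInk = -1; // last ink column seen in the current segment
        for (int x = 0; x < img[0].length; x++) {
            if (columnHasInk(img, x)) {
                if (start < 0) start = x;
                lastInk = x;
            } else if (start >= 0 && x - lastInk >= minGap) {
                segments.add(new int[]{start, lastInk + 1});
                start = -1;
            }
        }
        if (start >= 0) segments.add(new int[]{start, lastInk + 1});
        return segments;
    }

    // Copies the columns [from, to) of every row into a new image.
    static boolean[][] crop(boolean[][] img, int from, int to) {
        boolean[][] out = new boolean[img.length][to - from];
        for (int y = 0; y < img.length; y++)
            System.arraycopy(img[y], from, out[y], 0, to - from);
        return out;
    }

    // Number of differing pixels; glyphs of different sizes never match.
    static int distance(boolean[][] a, boolean[][] b) {
        if (a.length != b.length || a[0].length != b[0].length) return Integer.MAX_VALUE;
        int diff = 0;
        for (int y = 0; y < a.length; y++)
            for (int x = 0; x < a[0].length; x++)
                if (a[y][x] != b[y][x]) diff++;
        return diff;
    }

    // Picks the label of the closest template, or '?' if no template has the same size.
    static char classify(boolean[][] glyph, char[] labels, boolean[][][] templates) {
        char best = '?';
        int bestDistance = Integer.MAX_VALUE;
        for (int i = 0; i < templates.length; i++) {
            int d = distance(glyph, templates[i]);
            if (d < bestDistance) { bestDistance = d; best = labels[i]; }
        }
        return best;
    }

    // Segments the line into characters and classifies each one.
    static String recognize(boolean[][] img, char[] labels, boolean[][][] templates) {
        StringBuilder sb = new StringBuilder();
        for (int[] seg : segmentColumns(img, 1))
            sb.append(classify(crop(img, seg[0], seg[1]), labels, templates));
        return sb.toString();
    }
}
```

Real photos need much more than this (skew correction, scaling glyphs to a common size, a trained classifier instead of exact templates), which is why the existing engines are usually the practical choice.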
You said you didn't want to use an API; however, I suggest you use the recently released OCR API by Google:
https://developers.google.com/vision/text-overview
Just add the following line to your dependencies:
compile 'com.google.android.gms:play-services-vision:9.2.0'
Note: On first use it has to download some files from a Google server before it can work, so make sure to check .isOperational() before running detection. Afterwards you can use it without an Internet connection.
I guess you can use the Tesseract OCR tool, an open source alternative sponsored by Google. Integrating it in Android is simple via Tesseract Android Tools.
Have a look at the tess-two project on GitHub; it's very easy to use and gives good OCR results.
You can use ML Kit for Image to Text Recognition:
https://firebase.google.com/docs/ml-kit/android/recognize-text
I am exploring the APIs provided by Google. First, I experimented with the Google Cloud Vision API from Python in PyCharm, trying to perform optical character recognition on various texts.
So I wrote a basic Python program in PyCharm that calls this API. I gave it as input an image that included text (e.g. a photo of an ice-cream bucket), and it returned the text written on the bucket as output.
Now I want to test the barcode scanner of the Google Mobile Vision API. Ideally, I would like to call the Google Mobile Vision API from a Python program in PyCharm, give it an image/photo of a barcode as input, and get the details stored in the barcode as output.
My question is whether this can be (easily) done with PyCharm, or whether I should download Android Studio to do this simple task.
In other words, can I easily call a mobile API from an IDE for desktop applications like PyCharm, rather than from a mobile development IDE like Android Studio?
It may be a very basic question, but I do not know if I am missing something important.
The Mobile Vision API is designed only for Android and iOS. As far as I know, PyCharm does not work well with Java, so I would say that you would have to create an Android/iOS project in order to test it. (Trying to make it work with Python would be a lot harder than simply installing Android Studio and cloning a mock project.)
I am new to speech recognition and Android, and I have a use case where I need to build an Android app that takes commands (a limited set, fewer than 100) from users and executes some logic. I have googled a bit and found that the following can be done:
Use the Google Cloud Speech API
Use Android's inbuilt speech-to-text capability (is it different from the Google Cloud Speech API? If so, how?). Also, what are the pros and cons of using the offline mode of Android speech-to-text?
Use open source speech recognition libraries like Kaldi or CMU Sphinx (it looked like they need a lot of effort in collecting and training the data)
Can someone please suggest which of the above might best suit my use case?
I have a limited set of commands, and speed matters the most to me.
I am really confused, which is why I am posting this question. Thanks in advance.
Use the Google Cloud Speech API
Very expensive, since you have to pay for every request.
Use Android's inbuilt speech-to-text capability (is it different from the Google Cloud Speech API? If so, how?). Also, what are the pros and cons of using the offline mode of Android speech-to-text?
The inbuilt API is OK to use. It is different from the cloud API, and it is free. It does not work offline transparently for the user, though. On the downside, it is slow and you cannot configure the vocabulary, so it will decode all words instead of a particular set of commands and will often confuse the required commands with other words in noisy conditions.
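If you go with a recognizer whose vocabulary you cannot restrict, one possible workaround is to snap whatever transcript it returns onto your fixed command list. The sketch below does this with plain Levenshtein edit distance. The class name, the sample commands, and the maxDistance cutoff are all assumptions made for illustration; it does not call any speech API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: maps a free-form transcript to the closest known command.
class CommandMatcher {

    // Standard Levenshtein distance (insertions, deletions, substitutions), two-row DP.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = cur; cur = tmp;
        }
        return prev[b.length()];
    }

    // Returns the closest command, or null if nothing is within maxDistance edits.
    // Commands are assumed to be stored in lower case.
    static String closestCommand(String transcript, List<String> commands, int maxDistance) {
        String normalized = transcript.trim().toLowerCase(Locale.ROOT);
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String command : commands) {
            int d = editDistance(normalized, command);
            if (d < bestDistance) { bestDistance = d; best = command; }
        }
        return bestDistance <= maxDistance ? best : null;
    }

    public static void main(String[] args) {
        List<String> commands = Arrays.asList("turn on lights", "turn off lights", "play music");
        System.out.println(closestCommand("Tern off lights", commands, 3));
    }
}
```

With fewer than 100 commands the linear scan is cheap. Rejecting transcripts that are too far from every command (the null case) matters as much as the matching itself, since the recognizer will happily transcribe background speech.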
Use open source speech recognition libraries like Kaldi or CMU Sphinx (it looked like they need a lot of effort in collecting and training the data)
Proper development is always an effort.
I'm trying to develop an Android application in Android Studio that recognizes Arabic text in an image. I tried Tesseract OCR, but unfortunately the results were not accurate at all, so I would like to try the ABBYY Cloud OCR SDK. However, I have not been able to find any useful tutorials or examples of how to use it with Android. Can someone recommend some tutorials/examples or guide me on how to start using it?
Details on integrating the ABBYY OCR SDK are available on GitHub.
I just read an article saying that Google created speech recognition that works offline on Android.
Is there a way to use that in my project, which is not based on Android?
I'm sure I'm not the first one to think about using it in non-Android projects, but I couldn't find anyone who has done it and published it on the web.
I am thinking of capturing some text from documents using my Android phone and was looking for an ideal OCR app on Android. I happened to read today that Google introduced OCR for scanning documents that can then be edited in Google Docs. I was wondering if I could use that OCR for things other than converting documents to Google Docs: say, taking a picture of a certificate and capturing the names and dates of birth of the candidates, or taking a photo of a license plate and getting the info as text that can be stored.
If anyone has an idea of how to achieve this on Android using Google's OCR, that would be great to know. I did read about Tesseract/Tesjeract, but it seems very difficult to implement what I want using it. Maybe I didn't fully understand how to use it through Java. Here's the link to the new app that uses OCR to scan documents - Google Docs on Android
We tried the Google Docs API a while ago, but it is very weak in terms of accuracy. It looks like it is based on some outdated version of Tesseract. I suppose you would get better accuracy if you tried Tesseract directly. However, you will need to handle special preprocessing of images taken by a camera yourself, since they introduce additional challenges. The Google Docs API does not do that.
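As one example of such preprocessing, camera images are commonly binarized before OCR. Below is a minimal sketch of global thresholding with Otsu's method. It assumes the image has already been converted to a grid of 8-bit grayscale values (0 to 255) with dark text on a light background. The class and method names are hypothetical, and real camera input usually also needs deskewing, denoising, and often a local rather than global threshold.

```java
// Hypothetical preprocessing helper: Otsu threshold + binarization on a grayscale grid.
class Binarizer {

    // Returns the threshold t (0..255) that maximizes the between-class variance
    // when pixels <= t form one class and pixels > t form the other.
    static int otsuThreshold(int[][] gray) {
        int[] hist = new int[256];
        long total = 0;
        long sumAll = 0;
        for (int[] row : gray) {
            for (int v : row) {
                hist[v]++;
                total++;
                sumAll += v;
            }
        }
        long weightDark = 0; // number of pixels <= t
        long sumDark = 0;    // sum of their values
        double bestVariance = -1;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            weightDark += hist[t];
            sumDark += (long) t * hist[t];
            if (weightDark == 0) continue;       // no dark pixels yet
            long weightLight = total - weightDark;
            if (weightLight == 0) break;         // everything is in the dark class
            double meanDark = (double) sumDark / weightDark;
            double meanLight = (double) (sumAll - sumDark) / weightLight;
            double diff = meanDark - meanLight;
            double variance = (double) weightDark * weightLight * diff * diff;
            if (variance > bestVariance) {
                bestVariance = variance;
                best = t;
            }
        }
        return best;
    }

    // Marks every pixel at or below the threshold as ink (true).
    static boolean[][] binarize(int[][] gray, int threshold) {
        boolean[][] ink = new boolean[gray.length][gray[0].length];
        for (int y = 0; y < gray.length; y++)
            for (int x = 0; x < gray[0].length; x++)
                ink[y][x] = gray[y][x] <= threshold;
        return ink;
    }
}
```

The resulting boolean grid is what an OCR engine, or your own segmentation code, would then work on.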
For running Tesseract on Android, look here:
Using tesseract on android
Commercial alternative to Tesseract for OCR on mobile phones:
http://www.abbyy.com/mobileocr/
However, if you are looking not just to capture text but also to extract data, then you may need additional technology to parse the text output. That means writing even more code. Alternatively, you can license the existing commercial Data Capture API from ABBYY. That was already discussed here:
Recognise text in certain position using the Iphone camera
Disclaimer: I work for ABBYY
You can use Google Docs (now called Google Drive) to OCR an image by uploading the image to Google Drive. Later you can pull this Google Document back as a text/rtf/doc/html file and use the data however you like in your app. This can be done directly, without user intervention, using the Google Drive APIs. Here are some Google Apps API references:
To upload with OCR: see
Fail to upload a image file into Google Doc via java api with ?convert=true
To download a file from Google Drive:
https://developers.google.com/google-apps/documents-list/#downloading_documents_and_files
https://docs.google.com/feeds/download/documents/Export?docID=__INSERT-ID__&exportFormat=txt&format=txt
Beware that a quota probably applies to use of the OCR service.