How to extract particular text from Image - android

From the following image, I want to extract the number below the text Arzt-Nr (654321161).
I've used an OCR reader, but it extracts the text in random order rather than in sequence, which makes it difficult to add logic to extract the number below "Arzt-Nr".
I've used the following code, but the text is not in sequence.
Is there any way to achieve this?
String text = "";
for (int i = 0; i < detectedItems.size(); i++) {
    TextBlock item = detectedItems.valueAt(i);
    String detectedText = item.getValue();
    List<Line> lines = (List<Line>) item.getComponents();
    for (Line line : lines) {
        List<Element> elements = (List<Element>) line.getComponents();
        for (Element element : elements) {
            String word = element.getValue();
            text = text + " " + word;
        }
        text += "\n";
    }
}

Try checking a fixed length for the word after the "Arzt-Nr" position, and also check the pattern of the word found, for example whether it contains only digits.
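A minimal sketch of the pattern check suggested above, in plain Java. It assumes the number is a standalone run of exactly 9 digits (taken from the example 654321161) and that it is the first such run after the keyword in the recognized text. The class name ArztNrRegex is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ArztNrRegex {
    // Assumption: the wanted value is the first standalone 9-digit run
    // that appears anywhere after the keyword "Arzt-Nr".
    private static final Pattern AFTER_KEYWORD =
            Pattern.compile("(?s)Arzt-Nr.*?\\b(\\d{9})\\b");

    /** Returns the 9-digit number following the keyword, or null if none is found. */
    static String extract(String ocrText) {
        Matcher m = AFTER_KEYWORD.matcher(ocrText);
        return m.find() ? m.group(1) : null;
    }
}
```

This only works when no other 9-digit number sits between the keyword and the wanted value in the flattened text; if the form has several numeric columns, a position-based lookup is more reliable.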

Extract the TSV output of the image using Tesseract and find the nearest text below the location of the keyword. Also have a look at Tesseract's page segmentation modes.
Link to Generating tsv
Link to use page segmentation
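The "nearest text below the keyword" idea can be sketched in plain Java as follows. OcrWord and BelowKeywordFinder are hypothetical helper classes, not part of Tesseract or Mobile Vision; the coordinates would have to be filled in from whatever the OCR engine reports (for example the left/top/width/height columns of Tesseract's TSV output, or the bounding box of each Element in Mobile Vision). The sketch assumes the wanted value overlaps the keyword horizontally.

```java
import java.util.List;

/** Hypothetical container for one recognized word and its bounding box in pixels. */
final class OcrWord {
    final String text;
    final int left, top, right, bottom;

    OcrWord(String text, int left, int top, int right, int bottom) {
        this.text = text;
        this.left = left;
        this.top = top;
        this.right = right;
        this.bottom = bottom;
    }
}

final class BelowKeywordFinder {
    /** Returns the text of the closest word directly below the keyword, or null. */
    static String findBelow(List<OcrWord> words, String keyword) {
        OcrWord anchor = null;
        for (OcrWord w : words) {
            if (w.text.equalsIgnoreCase(keyword)) {
                anchor = w;
                break;
            }
        }
        if (anchor == null) {
            return null; // keyword was not recognized
        }
        OcrWord best = null;
        for (OcrWord w : words) {
            boolean below = w.top >= anchor.bottom;
            boolean overlapsHorizontally = w.left < anchor.right && w.right > anchor.left;
            // Keep the candidate with the smallest vertical distance to the keyword.
            if (below && overlapsHorizontally && (best == null || w.top < best.top)) {
                best = w;
            }
        }
        return best == null ? null : best.text;
    }
}
```

Because the lookup uses positions instead of reading order, it does not matter in which sequence the OCR engine returns the blocks.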

Related

How to extract a single string from a SparseArray?

I've been working with the Android Mobile Vision OCR API for a while. Everything worked perfectly until I found that I need to extract just single words from the whole SparseArray (by default the Mobile Vision API returns TextBlocks in a SparseArray).
SparseArray<TextBlock> textBlocks = textRecognizer.detect(imageFrame);
for (int i = 0; i < textBlocks.size(); i++) {
    TextBlock textBlock = textBlocks.get(textBlocks.keyAt(i));
    List<Line> lines = (List<Line>) textBlock.getComponents();
    for (Line line : lines) {
        List<Element> elements = (List<Element>) line.getComponents();
        for (Element element : elements) {
            word = element.getValue();
            Log.d(TAG, "word Read : " + word);
        }
    }
}
When I check
Log.d(TAG, "word Read : " + word);
it repeatedly prints every element in the SparseArray.
It seems that I'm asking a not-so-obvious question, but can I extract just a single word or a couple of words from those printed above? For example, I want to extract the word that has more than 12 characters and has a number in it.
Any help or hints will be much appreciated.
You could add a condition to filter the results, like below:
word = element.getValue();
// keep words longer than 12 characters that contain at least one digit
if (word.length() > 12 && word.matches(".*\\d.*")) {
    Log.d(TAG, "word Read : " + word);
}
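The same filter can be tried outside Android with a plain list of strings. This is only a sketch: WordFilter is a hypothetical helper, and "has a number in it" is interpreted as "contains at least one digit".

```java
import java.util.ArrayList;
import java.util.List;

final class WordFilter {
    /** Returns the words that are longer than minLength and contain at least one digit. */
    static List<String> longWordsWithDigit(List<String> words, int minLength) {
        List<String> matches = new ArrayList<>();
        for (String word : words) {
            // matches() must cover the whole string, hence the leading and trailing ".*"
            if (word.length() > minLength && word.matches(".*\\d.*")) {
                matches.add(word);
            }
        }
        return matches;
    }
}
```

In the OCR loop, each element.getValue() result would be added to the list first, or the same condition could be applied directly inside the loop.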
You are logging word inside a loop, which is why it prints every value. Apply the check from navylover's answer and only the matching string is logged; if you need just one result, break out of the loop after the first match.

Splitting a Text on multiple conditions

I am writing an app with Android Studio and I want to split a text into different values.
I have following text in result
*"Name: Peter;Age: 25; City: Chicago"*
I want to get:
*Name = Peter;
Age = 25;
City = Chicago;*
I used the search function and found these solutions: Android Split string, but they seem too complicated for my problem.
The easiest way is to use split() method.
String s1 = "Name: Peter;Age: 25; City: Chicago";
String[] words = s1.split(";");
// using a Java for-each loop to print the elements of the string array
for (String w : words) {
    Log.i("Words: ", w);
}
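To get the separate values rather than only the three "key: value" pieces, each piece can be split again on the colon. A minimal sketch in plain Java; KeyValueParser is a hypothetical helper, and it assumes that keys never contain ':' and that neither keys nor values contain ';'.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class KeyValueParser {
    /** Parses "Name: Peter;Age: 25; City: Chicago" into an ordered key/value map. */
    static Map<String, String> parse(String input) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : input.split(";")) {
            // Limit 2 keeps any further ':' characters inside the value.
            String[] keyValue = pair.split(":", 2);
            if (keyValue.length == 2) {
                result.put(keyValue[0].trim(), keyValue[1].trim());
            }
        }
        return result;
    }
}
```

With the input from the question, parse(...).get("City") yields "Chicago"; trim() removes the stray spaces after ';' and ':'.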

Android edit/insert into string

I was wondering how I could programmatically edit strings in Android. I am displaying strings from my device on my website, and the apostrophes ruin the PHP output. To fix this, I need to add an escape character, i.e. the backslash '\'.
For example, if I have this string: I love filiberto's!
I need Android to edit it to: I love filiberto\'s!
However, each string is going to be different, and there will also be other characters that I have to escape. How can I do this?
This is what I have so far, thanks to ANJ for the base code:
if(title.contains("'")) {
    int i;
    int len = title.length();
    char[] temp = new char[len + 1]; //plus one because gotta add new
    int k = title.indexOf("'"); //location of apostrophe
    for (i = 0; i < k; i++) { //all the letters before the apostrophe
        temp[i] = title.charAt(i); //assign letters to array based on index
    }
    temp[k] = 'L'; // the L is for testing purposes
    for (i = k+1; i == len; i++) { //all the letters after apostrophe, to end
        temp[i] = title.charAt(i); //finish the original string, same array
    }
    title = temp.toString(); //output array to string (?)
    Log.d("this is", title); //outputs gibberish
}
This outputs random characters, not even similar to my starting string. Does anyone know what could be causing this? For example, the string "Lol'ok" turns into "%5BC%4042ed0380".
I am assuming you are storing the string somewhere. Let's say the string is str.
You can use a temporary array to add the '\'. For a single string that contains one apostrophe:
int len = str.length();
char[] temp = new char[len + 1];   // temporary array with room for one backslash
int k = str.indexOf("'");          // index of the apostrophe (assumed to exist)
int i;
for (i = 0; i < k; i++) {
    temp[i] = str.charAt(i);       // copy the text before the apostrophe
}
temp[k] = '\\';                    // place the backslash before the apostrophe
for (i = k; i < len; i++) {
    temp[i + 1] = str.charAt(i);   // copy the rest of the string, shifted by one
}
String newstr = new String(temp);  // temp.toString() would not return the characters
You can use the same approach for multiple strings. Just wrap it in a function and call it whenever you want.
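Wrapped in a function, and extended to several apostrophes and to other characters, the idea could look like the sketch below. It swaps the fixed-size char array for a StringBuilder so the result can grow as needed; the Escaper class and the charsToEscape parameter are hypothetical names chosen for illustration.

```java
final class Escaper {
    /**
     * Returns a copy of input in which every character that occurs in
     * charsToEscape is preceded by a backslash.
     */
    static String escape(String input, String charsToEscape) {
        StringBuilder sb = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (charsToEscape.indexOf(c) >= 0) {
                sb.append('\\'); // insert the backslash before the special character
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

For example, Escaper.escape(title, "'") handles every apostrophe in title, not only the first one.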
The String API has a number of methods that could help, for example String.replaceAll. But...
apostrophes ruin the PHP output
Then fix the PHP code rather than requiring "clean" input. The best option would be to choose a well-supported transport format (say JSON or XML) and let the JSON or XML API on each end handle the escaping.

How to display Html table in textview?

I want to display text inside textview using this code:
Html.fromHtml("<html><body><table style=width:100%><tr><td><B>No</td><td><B>Product Name</td><td><B>Qty</td><td><B>Amount</td></tr></body></html>");
But the result is not formatted correctly. It looks like this:
NoPRoductNameQtyAmount
Please suggest what I am doing wrong in this code.
fromHtml() does not support <table> and related tags. Your choices are:
Reformat your text to avoid tables
Use WebView to render your HTML table
Use native widgets and containers (e.g., TableLayout) for your table
Instead of using an HTML table inside a TextView, I solved this by formatting plain text into a table-like layout, padding it with spaces to get a tidy structure.
This is the code:
String lines[] = getItem(position).toString().split("\n");
String print = "";
String parts[] = null;
String tmp = "";
int weight = 0;
for (int i=0; i < lines.length; i++) {
    if (!lines[i].equalsIgnoreCase("")) {
        parts = lines[i].split(":", 2);
        tmp = "";
        Paint textPaint = text.getPaint();
        float width = textPaint.measureText(parts[0]);
        float wslength = textPaint.measureText(" ");
        for (int j = 0; j < (220 - Math.round(width))/Math.round(wslength); j++) {
            tmp = tmp + " ";
        }
        if (print.equalsIgnoreCase("")) {
            print = print + parts[0] + tmp + Html.fromHtml(""+parts[1]+"");
        } else {
            print = print + "\n" + parts[0] + tmp + parts[1];
        }
    }
}
You can find more explanations here:
http://blog.blupixelit.eu/convert-text-to-table-in-android-sdk/
Hope this helps.
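A simpler variant of the same padding idea, as a sketch only: if the TextView uses a monospace typeface, every character has the same width, so the padding can be computed from character counts with String.format instead of measuring pixels with Paint. This is a different technique from the code above, and TextTable and labelWidth are hypothetical names.

```java
final class TextTable {
    /**
     * Turns lines of the form "label: value" into two aligned columns.
     * Assumes a monospace font, so labelWidth is counted in characters.
     */
    static String format(String input, int labelWidth) {
        StringBuilder sb = new StringBuilder();
        for (String line : input.split("\n")) {
            if (line.isEmpty()) {
                continue; // skip blank lines, as the original code does
            }
            String[] parts = line.split(":", 2);
            String value = parts.length == 2 ? parts[1].trim() : "";
            if (sb.length() > 0) {
                sb.append('\n');
            }
            // "%-14s" left-aligns the label and pads it with spaces to 14 characters
            sb.append(String.format("%-" + labelWidth + "s%s", parts[0].trim(), value));
        }
        return sb.toString();
    }
}
```

With a proportional font the columns would drift, which is why the answer above measures the rendered width of each label instead.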
How about rendering only the table in a WebView and the other tags in a TextView? I recently finished a demo that does this. First we need to separate <table> from the other supported tags, which changes "<p>**<p> blabala <table>balabla </table> blabla <p>**<p>" into three separate strings:
<p>**<p> blabala
<table>balabla </table>
blabla <p>**<p>
Then only the <table> segments are rendered by a WebView, and the others by TextViews.
The resulting layout in Android looks like this:
<ScrollView>
    <LinearLayout>
        <TextView>
        <WebView> --- let the WebView be measured AT_MOST
        <TextView>
So far everything works as I expected. Check this commit for details.
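The splitting step could be sketched with a regular expression as below. HtmlSplitter is a hypothetical helper, not the code from the commit mentioned above, and it assumes that tables are not nested inside each other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HtmlSplitter {
    // Matches one <table ...> ... </table> block, case-insensitively and across lines.
    private static final Pattern TABLE = Pattern.compile(
            "<table\\b.*?</table>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Splits html into alternating non-table and table segments, in document order. */
    static List<String> split(String html) {
        List<String> parts = new ArrayList<>();
        Matcher m = TABLE.matcher(html);
        int last = 0;
        while (m.find()) {
            if (m.start() > last) {
                parts.add(html.substring(last, m.start())); // text before the table
            }
            parts.add(m.group());                           // the table itself
            last = m.end();
        }
        if (last < html.length()) {
            parts.add(html.substring(last));                // text after the last table
        }
        return parts;
    }
}
```

Each segment that starts with "<table" would then go to a WebView and every other segment to a TextView.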
HtmlTextView recently added basic support for HTML tables. It's limited, but will do the trick if all you have to worry about is <table>, <td>, <tr>, and <th>.

Android BreakIterator hyphenated words?

I am using BreakIterator to get each word from a sentence, and there is a problem with a sentence like "my mother-in-law is coming for a visit", where I am not able to get mother-in-law as a single word.
BreakIterator iterator = BreakIterator.getWordInstance(Locale.ENGLISH);
iterator.setText(sentence);   // the iterator needs the text it should scan
int start = iterator.first(); // first boundary, normally 0
for (int end = iterator.next(); end != BreakIterator.DONE; start = end, end = iterator.next())
{
    String possibleWord = sentence.substring(start, end);
    if (Character.isLetterOrDigit(possibleWord.charAt(0)))
    {
        // grab the word
    }
}
From your code, it looks like you are checking whether the first character of each word is a letter or a digit. BreakIterator.getWordInstance() always splits words according to the boundary rules of the Locale, and as far as I know it is hard to get what you want with this class, so my advice is to split on spaces instead:
String text = "my mother-in-law is coming for a visit";
String[] words = text.split(" ");
for (String word : words) {
    if (Character.isLetterOrDigit(word.charAt(0))) {
        // grab the word
    }
}
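Splitting on spaces leaves punctuation attached to the words ("visit." keeps its full stop). As an alternative sketch, a regular expression can pick out runs of letters or digits that may be joined by single hyphens. WordExtractor is a hypothetical helper, and the pattern is an assumption about what should count as a word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WordExtractor {
    // One or more letters/digits, optionally followed by "-" plus more letters/digits.
    private static final Pattern WORD =
            Pattern.compile("[\\p{L}\\p{N}]+(?:-[\\p{L}\\p{N}]+)*");

    /** Returns the words of the sentence, keeping hyphenated words in one piece. */
    static List<String> words(String sentence) {
        List<String> result = new ArrayList<>();
        Matcher m = WORD.matcher(sentence);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }
}
```

For "my mother-in-law is coming for a visit." this returns mother-in-law as a single entry and drops the trailing full stop.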
