Android: counting occurrences of a word in a file on the SD card

This should be straightforward, but when I count the words in a file after downloading it to my SD card, the number is off. The more occurrences there are, the further off my result is. I use Microsoft Word to verify the number of occurrences (with "ignore case" and "whole words only" enabled). To test the count, I use the "the_counter" variable below. I have also verified that nothing is wrong with the download and that the FULL file is on my SD card. This is driving me nuts. I assume Word is not wrong here, so what could be wrong with my code below?
Could whitespace or special characters in the file be causing the problem? Is there a way to clean the file to verify this?
// Find the directory for the SD card using the API
File sdcard = Environment.getExternalStorageDirectory();
// Get the text file
File file = new File(sdcard, TEMP_FILE);
// Read text from the file
//StringBuilder text = new StringBuilder();
m_tree = new Tree();
int i = 0;
BufferedReader br = null;
long the_counter = 0;
try {
    br = new BufferedReader(new FileReader(file));
    String line;
    String[] arLine;
    while ((line = br.readLine()) != null) {
        // Get each word in the line
        if (line.length() == 0)
            continue;
        arLine = line.split("\\s+");
        // Now add each word to the search tree
        for (i = 0; i < arLine.length; ++i) {
            m_tree.insert(arLine[i]);
            if (arLine[i].equalsIgnoreCase("a"))
                ++the_counter;
        }
    }
    m_sTest = Long.toString(the_counter);
    br.close();
I edited my code to read each line character by character and build the words manually, and I STILL GET THE SAME RESULT.
br = new BufferedReader(new FileReader(file));
String line;
String[] arLine;
StringBuilder word = new StringBuilder();
while ((line = br.readLine()) != null) {
    // Check for a word at the end of the last line
    if (word.length() > 0) {
        m_tree.insert(word.toString());
        word.setLength(0);
    }
    char[] lineChars = new char[line.length()];
    line.getChars(0, line.length(), lineChars, 0);
    for (char c : lineChars) {
        if (c == ' ') {
            // If we have a word, store it, clear the buffer, and move on
            if (word.length() > 0) {
                m_tree.insert(word.toString());
                word.setLength(0);
            }
        }
        else {
            word.append(c);
        }
    }

The issue was that I was not accounting for special characters between words, e.g. this-is-four-words, which is four words and not one. I'm not even sure that is proper grammar or writing, but it was in this file and it certainly threw off my count.
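
To illustrate the fix described above, here is a minimal sketch that splits each line on any run of characters that are not letters or digits, so hyphenated runs are counted as separate words. The class name WordCounter, the method countWord, and the choice of what counts as a word boundary are assumptions for illustration; they are not taken from the original code, and Word's own whole-word rules may differ in edge cases.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper, not part of the original code.
class WordCounter {

    // Counts case-insensitive whole-word occurrences of target in source.
    static long countWord(Reader source, String target) throws IOException {
        long count = 0;
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                // Assumption: anything that is not a letter or digit separates words,
                // so "this-is-four-words" yields four tokens instead of one.
                for (String token : line.split("[^\\p{L}\\p{N}]+")) {
                    if (token.equalsIgnoreCase(target)) {
                        count++;
                    }
                }
            }
        }
        return count;
    }
}
```

On Android this could be called as WordCounter.countWord(new FileReader(file), "a"). A leading separator produces one empty token, which never matches a non-empty target, so it does not affect the count.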

Related

Reading big string from file Android

I was trying to read a text file of approx. 40 KB using a BufferedReader in Android. One of the lines (usually the last line) in the file exceeds 9000 characters, which seemed too long to store in a String and to log.
I tried the approach below, but once the line gets too long, the remaining part of the line appears to be discarded.
try {
    File root = android.os.Environment.getExternalStorageDirectory();
    File file = new File(root.getAbsolutePath() + "/" + "new.txt");
    BufferedReader r = null;
    r = new BufferedReader(new FileReader(file));
    StringBuilder total = new StringBuilder();
    String line;
    while ((line = r.readLine()) != null) {
        Log.e("Line", line);
        total.append(line);
    }
    r.close();
} catch (Exception e) {
    e.printStackTrace();
}
I thought that changing String line to String line = new String(new byte[1024*1024]) could solve my problem, but Android Studio highlights this as redundant code. I need to apply some regex processing to each line in the while loop.
Is there any workaround I can use? By the way, here is a link to my 40 KB file: https://www.dropbox.com/s/hp7vn6vt86adv6g/new.txt?dl=0
Edit: The file I am trying to parse is an HTML file.
Update:
I was wrong: the string in line does not omit the rest of the line, as skandigraun pointed out in the comments. The logger was not printing the whole string because it exceeded its 4000-character limit, while my string was 8093 characters long.
In short, the code above works just fine!
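
If the whole line still needs to show up in the log, one option is to split it into pieces before logging. The following is only a sketch: the class name LogChunker and the method chunk are hypothetical, and the chunk size of 4000 is taken from the limit mentioned in the update rather than from any documented constant.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for logging long strings piece by piece.
class LogChunker {

    // Splits message into consecutive pieces of at most maxLen characters.
    static List<String> chunk(String message, int maxLen) {
        if (maxLen <= 0) {
            throw new IllegalArgumentException("maxLen must be positive");
        }
        List<String> parts = new ArrayList<>();
        for (int start = 0; start < message.length(); start += maxLen) {
            int end = Math.min(message.length(), start + maxLen);
            parts.add(message.substring(start, end));
        }
        return parts;
    }
}
```

Inside the read loop, Log.e("Line", line) could then be replaced by a loop over LogChunker.chunk(line, 4000) that logs each part separately. The String held in line is unaffected either way; only the log output is split.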

Android espresso compare strings on the screen with string from assets

I would like to compare the content of a .txt file in my assets folder with some text on the screen.
Usually, when I assert text on the screen, I use:
onView(withId(R.id.someId)).check(matches(withText("String")));
Is there an easy way to assert against the text from the file?
Also, if you want to shorten your assertions and actions when using Espresso, check out this library: https://github.com/SchibstedSpain/Barista (disclaimer: I'm a contributor).
It contains a set of quick actions and assertions that make the tests much more readable.
Here is code to read text from a file in assets:
StringBuilder buf = new StringBuilder();
InputStream json = getAssets().open("book/contents.json");
BufferedReader in =
    new BufferedReader(new InputStreamReader(json, "UTF-8"));
String str;
while ((str = in.readLine()) != null) {
    buf.append(str);
}
in.close();
Now compare the string from assets with your on-screen text:
buf.toString().equals("your text here");
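
One caveat with the loop above: readLine() strips line terminators, so a multi-line file is concatenated into a single line and may no longer equal the text on screen. A minimal sketch that keeps the line breaks follows; the class name StreamText and the method readAll are hypothetical, and the sketch assumes the on-screen text uses '\n' between lines and has no trailing newline.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for reading a whole stream as text.
class StreamText {

    // Reads the stream as UTF-8 and joins the lines with '\n' (no trailing newline).
    static String readAll(InputStream in) throws IOException {
        StringBuilder buf = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) {
                if (!first) {
                    buf.append('\n'); // restore the break that readLine() removed
                }
                buf.append(line);
                first = false;
            }
        }
        return buf.toString();
    }
}
```

The result could be passed to the assertion from the question, for example withText(StreamText.readAll(getAssets().open(...))), where the asset path is whatever file holds the expected text.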

android separating a text file

I've been researching how Diablo 2 dynamically generates loot, and I thought it'd be fun to create an app that randomly generates items using this system.
I currently have code which I believe reads the entire txt file, but the content is not parsed.
It looks like:
private void itemGenerator() {
    int ch;
    StringBuffer strContent = new StringBuffer("");
    InputStream fs = getResources().openRawResource(R.raw.treasureclass);
    // read file until end and put into strContent
    try {
        while ((ch = fs.read()) != -1) {
            strContent.append((char) ch);
        }
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
An example in the text file would look something like:
Treasure Class Item1 Item2 Item3
tc:armo3 Quilted_Armor Buckler Leather_Armor
tc:armo60a Embossed_Plate Sun_Spirit Fury_Visor
tc:armo60b Sacred_Rondache Mage_Plate Diadem
So what I'm thinking right now is splitting the file into rows with a StringTokenizer delimited by \n, then splitting each row again on tabs and putting the items into a 2D array.
I haven't coded it yet because I think there's a better way to implement this that I haven't been able to find, and I was hoping for some helpful input on the matter.
For anyone interested in knowing how the item generation works, their wiki page, http://diablo2.diablowiki.net/Item_Generation_Tutorial, goes very in-depth!
I think the problem you are facing is distinguishing the individual lines read from the file. To read the file line by line, change your code as below:
InputStream fs = getResources().openRawResource(R.raw.treasureclass);
BufferedReader br = new BufferedReader(new InputStreamReader(fs));
String line = null;
while ((line = br.readLine()) != null) {
    Log.i("line", line);
    // split the content of 'line' and save them in your desired way
}
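
To make the "split the content of 'line'" step concrete, here is a minimal sketch that turns the file into a list of rows, each row being an array of columns. The class name TreasureTable and the method parse are hypothetical, and the sketch assumes the columns are separated by single tab characters and that the first line is a header row, as the question describes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for the tab-delimited treasure class file.
class TreasureTable {

    // Returns one String[] per data row; the header row and blank lines are skipped.
    static List<String[]> parse(Reader source) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line = br.readLine(); // assumption: first line is the header
            while ((line = br.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue;
                }
                // limit -1 keeps trailing empty columns instead of dropping them
                rows.add(line.split("\t", -1));
            }
        }
        return rows;
    }
}
```

With the raw resource from the question, the reader could be new InputStreamReader(fs). A List of String[] plays the role of the 2D array without needing to know the number of rows in advance.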

Android memory error

I cannot understand why this keeps crashing with a memory error:
server = new URL("http://-link cannot be supplied-");
BufferedReader reader2 = read(server);
line = reader2.readLine();
StringBuilder bigString = new StringBuilder("");
while (line != null) {
    bigString.append(line);
    reader2.readLine();
}
The file is not that big: 7000-odd lines at 240,031 bytes on disk.
Basically, what I need to do is tell whether the file contains a small string (a postcode); the file is basically a list of postcodes.
What is the best way to read this in? Obviously, what I am doing is not working at all :D
Your while loop never ends! The loop calls reader2.readLine() but never assigns the result to line, so line never becomes null and bigString keeps growing until memory runs out.
while (line != null) {
    bigString.append(line);
    line = reader2.readLine();
}
should work.
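
Since the goal is only to find out whether a postcode appears in the file, the whole content does not need to be accumulated at all. The following is a sketch under that assumption; the class name PostcodeSearch and the method containsPostcode are hypothetical, and it checks each line with a plain substring match.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper that scans the stream line by line.
class PostcodeSearch {

    // Returns true as soon as any line contains postcode; keeps only one line in memory.
    static boolean containsPostcode(Reader source, String postcode) throws IOException {
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(postcode)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

The reader could be built from the URL in the question, for example new InputStreamReader(server.openStream()). Stopping at the first match also avoids downloading the rest of the file when the postcode appears early.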

Android, strings loaded from .txt file are not being escaped

I am loading data from a resource within my own application, and the escape characters I place are not being processed the way I expect them to be. For example, a line in my resource would look like this:
Ellington Human Sciences Building<>EHS<>Human Performance Sciences Building\nNeighbor to Ellington Human Sciences Annex (EHSA)<>292<>482<>73<>25<>Human Sciences
Ellington Human Sciences Annex<>EHSA<>Human Performance Sciences Building\nNeighbor to Ellington Human Sciences Building (EHS)<>340<>464<>28<>20<>Human Sciences
My file reader looks like this:
private synchronized void loadPOIs(Resources resource) throws IOException {
    if (mLoaded) return;
    InputStream inputStream = resource.openRawResource(R.raw.pois);
    BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
    try {
        String line;
        while ((line = reader.readLine()) != null) {
            String[] strings = TextUtils.split(line, "<>");
            if (strings.length < 7) continue;
            POI poi = addPOI(strings[0], strings[1], strings[2], strings[3], strings[4], strings[5], strings[6]);
            if (strings.length == 8) {
                final int len = strings[7].length();
                for (int i = 0; i < len; i++) {
                    final String prefix = strings[7].substring(0, len - i);
                    addMatch(prefix, poi);
                }
            }
        }
    } finally {
        reader.close();
    }
    mLoaded = true;
}
strings[2] holds the information about the point of interest, and it contains the "\n" sequence. When I call poi.getInfo() (the getter that returns the info as a String), the "\n" persists in the output.
Any ideas?
You are reading text from a file, and '\n' there is just two ordinary characters (a backslash and an n) with no special meaning within a text file. Escape sequences are only interpreted by the Java compiler in string literals, not in text read at runtime. If you want a newline instead, then write a newline in your txt file. That is easier than performing unescaping on the text, and you control the source of the data, so there should be no trouble modifying it.
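
If the file has to stay one record per line (the loader above splits records with readLine(), so a real newline inside a field would break a record in two), an alternative to the advice above is to convert the two-character sequence after reading. This is only a sketch of that alternative, not what the answer proposes; the class name EscapeFix and the method unescapeNewlines are hypothetical.

```java
// Hypothetical helper that turns a literal backslash-n into a real line break.
class EscapeFix {

    static String unescapeNewlines(String raw) {
        // "\\n" in Java source is the two characters backslash and n;
        // "\n" is the single newline character.
        return raw.replace("\\n", "\n");
    }
}
```

In loadPOIs, this could be applied to strings[2] before it is passed to addPOI. It handles only the newline sequence; other escapes such as \t would need their own replacement.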
