Android: store a large amount of text (HTML) and search through it

I am making a framework in order to easily "appify" books.
This framework will need to automatically detect chapters and headings to build a table of contents. The idea is also to be able to easily search through the text and find what you are looking for.
Now what I still need to figure out is:
how to store the data in such a way that I can easily detect the chapters and headings
and still be able to search through the text.
The text that is stored needs to be formatted, so I thought I would store it as HTML or Markdown (which will be translated to HTML). I don't think the text would be very searchable if it is stored as HTML.
P.S. it does not have to be HTML if there are other more efficient ways to format the text.

Do you really want to do this on the device itself?
I suggest using a separate SQLite database for every book, with separate tables for the table of contents, the chapters, summarized keywords per chapter (for faster search), and other service info.
Also, here you can find a full-text search example.
I also recommend bundling your own SQLite build with your app.
Now let's talk about your main problem: the book scraping.
I have no expertise here, but I believe this problem is the same as web site scraping.
Update:
Please do not store the book contents as HTML. You can store them as Markdown, for example: it takes less storage, is easier to sanitize, and you can always apply your styles later.
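To illustrate why Markdown helps with the table of contents, here is a minimal sketch in plain Java. It assumes chapters and headings are written as ATX-style headings ("# Chapter", "## Section"); the class name MarkdownToc and its Entry type are made up for this example and are not part of any framework. It does not skip "#" lines inside fenced code blocks, so a real Markdown parser would be more robust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: builds a table of contents from Markdown text
// by scanning for ATX headings (one to six '#' followed by whitespace).
class MarkdownToc {
    // One TOC entry: heading level (1-6), title text, 0-based line index.
    static final class Entry {
        final int level;
        final String title;
        final int line;

        Entry(int level, String title, int line) {
            this.level = level;
            this.title = title;
            this.line = line;
        }
    }

    private static final Pattern HEADING =
            Pattern.compile("^(#{1,6})\\s+(.+?)\\s*$");

    static List<Entry> build(String markdown) {
        List<Entry> toc = new ArrayList<>();
        String[] lines = markdown.split("\\r?\\n");
        for (int i = 0; i < lines.length; i++) {
            Matcher m = HEADING.matcher(lines[i]);
            if (m.matches()) {
                // The number of '#' characters gives the nesting level.
                toc.add(new Entry(m.group(1).length(), m.group(2), i));
            }
        }
        return toc;
    }
}
```

Level 1 entries could be treated as chapters and deeper levels as headings within a chapter; the stored line index lets the reader jump to the right place in the text.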

Related

How can I search content in HTML and not the tags

I have a database of content of which the majority are HTML pages which are then used for display purposes in an app.
We are looking to build out a search feature but I have some concerns over false positives appearing due to the results including HTML code.
E.g. searching for "title" will return any content page that has a title HTML tag.
We are currently using NSPredicates to perform the query on a Core Data database.
Are there any easy/efficient ways to prevent these results being returned?
I have the same problem on Windows and Android as well!
One idea for iOS is to store a separate plain-text version alongside the HTML version. You could then use very simple (even if not very efficient) predicates like
[NSPredicate predicateWithFormat:@"text CONTAINS[cd] %@", searchText];
A more performant way would be to strip out the words and store them in lowercase in an indexed attribute of another entity.
In both cases, the parsing should be done beforehand via one of the available libraries (see e.g. link in the comment).
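Since the same problem was mentioned for Android, here is a rough Java sketch of producing the plain-text version that gets stored next to the HTML. It is a naive regex-based approach and an assumption on my part, not the library the answer links to: it only handles a handful of entities and will misbehave on malformed markup, so a proper HTML parser is the safer choice for real content. The class name HtmlText is hypothetical.

```java
import java.util.Locale;

// Hypothetical helper: derives a lowercase, tag-free search string
// from an HTML page. Naive by design; a real HTML parser is more robust.
class HtmlText {
    static String toSearchText(String html) {
        String text = html
                // Drop script and style elements including their content.
                .replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ")
                // Replace every remaining tag with a space.
                .replaceAll("(?s)<[^>]*>", " ")
                // Decode a few common entities; "&amp;" goes last so that
                // "&amp;lt;" does not get decoded twice.
                .replace("&nbsp;", " ")
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&amp;", "&");
        // Collapse whitespace and normalize case for matching.
        return text.replaceAll("\\s+", " ").trim().toLowerCase(Locale.ROOT);
    }
}
```

With the text stored this way, a search for "title" only matches pages whose visible content contains that word, not pages that merely have a title tag.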

Android Application - Multiple Pages with Static Content

I am looking to create an app that has about 50 pages of static content. I can give an example of what it would look like, so that it will be easy to understand the questions.
Imagine a Jokes app, with tens or hundreds of pages
The user can see a full list of jokes, which shows the headings in a list view
Selecting a joke subject will take them to the joke page
From there they can go 'Next' or 'Previous'
They should also be able to favorite a joke
Going to the Favorites page will list their favorites
The joke pages are static. I could add more jokes with an app update, but there is no dynamic content, so I am not planning to have any server-side code that the app can call.
Now the questions:
In Android, can I achieve this with a single activity (for the joke display) and switch the content based on selection?
There are several ways to store the jokes: SQLite, separate HTML pages, or just strings.xml. Which is better for these use cases?
If there are multiple headings within a single joke (i.e. formatting as bold for them to stand out), I need to store the formatting along with the content. So HTML looks like the option?
This may be out of scope, but I want to capture the content in a standard way so that if I build an iOS app for this, I can just worry about the UI part and use the same content. Again HTML is the option?
Thanks for looking.
Yes you can achieve this with a single activity.
This is really up to you but Android provides support for SQL Databases. You may also consider looking into content providers.
Note: I would not use strings.xml because you can't load new jokes into strings.xml. If you are getting your jokes dynamically from a website, then you really should either load your content into a database and have the app display from the database, or else just load each html page individually. The html page will be easier as you will basically just be making a browser app, but the database will certainly be faster and cleaner.
HTML is certainly an option, though this question seems a little vague. It really depends on how you want to get your jokes. If you want to grab them as HTML pages and just display them, then the work is done for you. If you want to parse them and display them in an Android-specific app, then it will be more work, but you have more control on the app side.
Yes, if you want your app to work cross-platform, you can use HTML to standardize your view across multiple devices.

how to insert images, links, carriage return into Searchable Dictionary for Android

I am developing a glossary using the sample code Searchable Dictionary. Thanks to searching here, I have figured out how to update the database, which is a .txt file, and then get it to load by changing the version number in Dictionary.java.
My question is, how to do the following:
I would like to be able to insert illustrative images into the definitions.
I would also like to insert links to other entries in the dictionary (e.g. 'inventory' should have a link to 'product flow' and other related terms).
I would also like to know how to insert a carriage return.
My original glossary in spreadsheet format has several fields: 'term' 'definition' 'example' 'related terms'. I want to be able to put in links and images inside these fields and have a couple of carriage returns in between each field to differentiate them.
The dictionary code seems to take in everything as a string, so even if I try to put 'image.jpg', or '\n' for a new line, it simply prints that as part of the string. Is there a way around this?
Searching Stack Overflow gave a few links to using SQLite. I am honestly a newbie at all this; the last time I programmed anything significant was ten years ago. Rewriting the code to directly access a SQLite database would be nontrivial for me. So I would like to know if that is really the route I should be taking. If it is, then could you point me to the simplest tutorials for constructing a dictionary that way? I downloaded SQLite data browser, but haven't figured out how to use it to construct a new database. I know it should not be so hard; I just don't know what I am doing. :(
If there is an easy way to just do it inline, still using the Searchable Dictionary sample code as a base, that would really make my day. Otherwise, any specific suggestions/directions would be really appreciated.
Thank you!!
Update:
For clarification, below is an example of one entry in my glossary, as desired. There are carriage returns between sections, and links and images are inline with text:
Heijunka, or Load Leveling - An approach to smooth production flow when a mix of products is to be produced, by identifying for a selected time period, the smallest batch size at which to produce each specific product in the mix, before switching over to make another product in the mix.
Example:
Keeping a steady work flow, even if much slower than the original max, reduces waste (<-this is a link to the entry 'waste' in the glossary):
[image of line of balance graph with load leveling, and without]
Related Terms: work structure, demand leveling (<-These are links to respective entries)
Not sure if you saw this already, but Android has some developer lessons for saving key-value sets for simple data, and saving to SQLite for more complex structures.
It sounds like your app needs a database called "Inventory" with the following fields: "ProductImage", "ProductTitle", "ProductLink". And you want to store the image as a BLOB. There's a good SO post on how to take an image from a URL and convert it to a byte array for storage: how to store Image as blob in Sqlite & how to retrieve it?
For the carriage return, I'm assuming you're using "\n"? If that's not working, have you tried unescaping your string for the TextView:
String s = unescape(stringFromDatabase);
Or for SQLite:
DatabaseUtils.sqlEscapeString()
Key-value data: http://developer.android.com/training/basics/data-storage/shared-preferences.html
SQLite data: http://developer.android.com/training/basics/data-storage/databases.html
Additional SQLite resources:
http://www.youtube.com/watch?v=G8ZRXdztESU
http://www.vogella.com/articles/AndroidSQLite/article.html

Using a humongous database in an android APP

I am trying to use a database for my application, which needs a list of all the words in the Arabic language. Unfortunately this database is very large, more than 200 MB. The only solutions I have seen for such a problem are using a web service, or hosting the database online and downloading it on first use. Neither is practical in my case: this is a game that the user can play while disconnected, the download would be large, and it would use a lot of space on the phone. I couldn't find a way to make the size of my DB reasonable.
My question is whether there is a way to shrink the database, knowing that all the data stored in it is text.
I've noticed that the keyboard on my phone has an auto-complete feature. Where does it get its list of valid words from? Can I use it for my application?
You'll want to store your words in a prefix tree (or trie). It is a space-efficient structure for this kind of data.
For more info, see: https://gamedev.stackexchange.com/questions/4142/best-way-to-store-a-word-list-in-java-android
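As a rough idea of what such a structure looks like, here is a minimal prefix tree in plain Java. The class and method names are made up for this sketch. It shows the shared-prefix principle only: a HashMap per node is not compact in memory, so for a word list of this size a packed array layout or a DAWG would likely be needed in practice. It iterates over Java chars, which is fine for Arabic letters since they sit in the Basic Multilingual Plane.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative trie: words that share a prefix share the nodes for it.
class Trie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord; // true if a stored word ends at this node
    }

    private final Node root = new Node();

    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.isWord = true;
    }

    // True only if exactly this word was inserted.
    boolean contains(String word) {
        Node node = find(word);
        return node != null && node.isWord;
    }

    // True if at least one inserted word starts with the prefix.
    boolean hasPrefix(String prefix) {
        return find(prefix) != null;
    }

    // Walks the tree along s; returns null if the path does not exist.
    private Node find(String s) {
        Node node = root;
        for (int i = 0; i < s.length(); i++) {
            node = node.children.get(s.charAt(i));
            if (node == null) {
                return null;
            }
        }
        return node;
    }
}
```

For a word game, contains() answers "is this a valid word" and hasPrefix() lets you reject a letter sequence early, both in time proportional to the word length.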
Your database might include a lot of extra information, for example grammar, inflections, comments, etc. If that is the case, re-create the database with only the data/columns you actually need on the phone.

How should I store data to make it easy and efficient to search in android

What I have is an app that displays some documents. In the string resources I have the documents divided into smaller pieces in anticipation of making them searchable. Think of them like newspapers with a number of articles, where each article is a separate string resource. There will not be any storing of user input (unless I decide to store recent searches). The search part of the Android developer docs mentions this, but says it is not going to go into details of how to store and search data, just how to use the search dialog and widget.
What kind of storage should I be using for my data? Are simple string resources good enough? Should I look into a real database? Which of these is the most efficient and quickest to search? I'm new to Android, so any help would be appreciated.
Answer: use Android's built-in SQLite database system and FTS3 tables.
I would definitely use a database for this.
Read all the documents and link each word to each document in a database.
A word search would then produce a list of documents containing this word quickly.
Make sure you reindex each time you add and remove a document.
By the way, you should see to improving your accept rate.
Also, this is probably not an Android question.
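The word-to-document linking described in this answer is an inverted index. Below is a minimal in-memory sketch in plain Java to show the idea; the class name, the integer document ids, and the tokenization rule (split on anything that is not a letter or digit, lowercase everything) are assumptions made for this example. In a real app the same mapping would live in database tables, or be handled by SQLite's FTS tables as the other answer suggests.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative inverted index: maps each word to the ids of the
// documents that contain it.
class InvertedIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final Map<Integer, Set<String>> wordsByDoc = new HashMap<>();

    // Adds a document, or re-indexes it if the id is already known.
    void add(int docId, String text) {
        remove(docId);
        Set<String> words = new HashSet<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (w.isEmpty()) {
                continue;
            }
            words.add(w);
            postings.computeIfAbsent(w, k -> new TreeSet<>()).add(docId);
        }
        wordsByDoc.put(docId, words);
    }

    // Removes a document and every posting that points to it.
    void remove(int docId) {
        Set<String> words = wordsByDoc.remove(docId);
        if (words == null) {
            return;
        }
        for (String w : words) {
            Set<Integer> docs = postings.get(w);
            docs.remove(docId);
            if (docs.isEmpty()) {
                postings.remove(w);
            }
        }
    }

    // Returns the ids of all documents containing the word (case-insensitive).
    Set<Integer> search(String word) {
        Set<Integer> docs = postings.get(word.toLowerCase(Locale.ROOT));
        return docs == null ? Collections.emptySet() : Collections.unmodifiableSet(docs);
    }
}
```

Keeping the per-document word set is what makes the "reindex when you add or remove a document" step cheap: removal only touches the words that document actually contained.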
