I am trying to develop an app to get the RSS feeds from http://xxx.xxx.com/xxxxxblog .
Can someone help me with the HTML parsing to get the feeds?
You can try JSoup to parse the HTML.
It is very simple to use and well documented, so you should not have too much trouble parsing your page.
You can find out how to do that on this page:
http://jsoup.org/cookbook/extracting-data/selector-syntax
It selects elements by tag name (and other CSS-style selectors), so you can pull out the data between specific tags.
The feeds on this web page seem to be clearly delimited by the <dc:subject> tag.
As you only need to get the feeds, the shortest way may be to find the feed boundaries with a regular expression that also captures the header (something like <dc:subject>(.*?)</dc:subject>). Read line by line; once you detect the expression, this is the start of a feed. It may not be the most principled approach, and arguably we should parse the whole HTML instead, but why run unnecessary code?
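A minimal sketch of that regular-expression approach in plain Java. The class and method names (SubjectExtractor, extractSubjects) are made up for illustration, and it scans the whole page as one string rather than line by line, so that a value wrapped across lines is still found:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class SubjectExtractor {
    // Non-greedy capture of everything between the opening and closing tag.
    // DOTALL lets '.' match line breaks in case a value spans several lines.
    private static final Pattern SUBJECT =
            Pattern.compile("<dc:subject>(.*?)</dc:subject>", Pattern.DOTALL);

    // Returns the trimmed text of every <dc:subject> element found in the page.
    static List<String> extractSubjects(String page) {
        List<String> subjects = new ArrayList<>();
        Matcher matcher = SUBJECT.matcher(page);
        while (matcher.find()) {
            subjects.add(matcher.group(1).trim());
        }
        return subjects;
    }
}
```

Calling SubjectExtractor.extractSubjects(pageSource) on the downloaded page text gives one list entry per match. This will miss tags that carry attributes or unusual whitespace, which is the usual price of skipping a real parser.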
There is no shortage of parsers for Java either, from Java's built-in HTML parser to various alternative libraries that may fit better in some cases. Some also suggest using an XML parser (XPath). Various solutions are discussed here.
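For the XML parser (XPath) route, here is a rough sketch using the standard javax.xml.xpath API. It only works when the input is well-formed XML (an actual RSS document, not arbitrary HTML). The class name FeedXPath and the assumption that the feed follows the usual rss/channel/item layout are mine:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
class FeedXPath {
    // Returns the title of every <item> in an RSS 2.0 style document.
    static List<String> itemTitles(String rssXml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    "/rss/channel/item/title",          // assumed feed layout
                    new InputSource(new StringReader(rssXml)),
                    XPathConstants.NODESET);
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                titles.add(nodes.item(i).getTextContent().trim());
            }
            return titles;
        } catch (Exception e) {
            // Malformed XML or a bad expression ends up here.
            throw new IllegalArgumentException("Could not evaluate XPath on feed", e);
        }
    }
}
```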
Please try this example code to create an RSS reader that can actually handle namespace extensions:
https://github.com/dodyg/AndroidRivers/blob/master/src/com/silverkeytech/android_rivers/xml/RssParser.kt
The library underlying this code is https://github.com/thebuzzmedia/simple-java-xml-parser
It also works very well on Android.
Related
Is it possible in Android to follow an arbitrary link and get the main content of that page (e.g. the text), or whatever else I want to get? If yes, how can I do that?
There are a couple of ways to get data from websites.
The first, and maybe the most popular, way is parsing an RSS feed from the website. Java and Android provide a couple of parsers and ways to parse XML, or in this case an RSS feed. You can take a look at these examples:
https://developer.android.com/samples/BasicSyncAdapter/src/com.example.android.basicsyncadapter/net/FeedParser.html
https://www.tutorialspoint.com/android/android_rss_reader.htm
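As a rough illustration of this first approach, the sketch below reads the title and link of each feed item with the standard DOM parser (javax.xml.parsers). The names SimpleRssReader and Entry are made up for this example, and it assumes the feed uses plain item, title and link elements; the linked samples use other parsers and are more complete:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
class SimpleRssReader {
    // One feed item; fields are empty strings when the element is missing.
    static final class Entry {
        final String title;
        final String link;
        Entry(String title, String link) {
            this.title = title;
            this.link = link;
        }
    }

    static List<Entry> parse(String rssXml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(rssXml)));
            NodeList items = doc.getElementsByTagName("item");
            List<Entry> entries = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                entries.add(new Entry(childText(item, "title"), childText(item, "link")));
            }
            return entries;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse feed", e);
        }
    }

    // Text of the first descendant element with the given tag, or "" if absent.
    private static String childText(Element parent, String tag) {
        NodeList matches = parent.getElementsByTagName(tag);
        return matches.getLength() == 0 ? "" : matches.item(0).getTextContent().trim();
    }
}
```

On Android the download and the parse should both run off the main thread; the resulting list can then back a ListView adapter.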
The second way is getting the needed information from an API, if the website provides one; often that API will return JSON. For example, https://openweathermap.org/ returns JSON filled with weather information, which you can parse in your app. Android and Java also provide a couple of ways to read JSON. You can take a look at this one:
http://www.androidhive.info/2012/01/android-json-parsing-tutorial/
Third, you can use a library called Jsoup for parsing HTML from particular websites. You can find examples of how to parse HTML on their official website: https://jsoup.org/
There may be more ways; you should certainly look for them.
I'm developing a cross-platform mobile app with Qt 5.3.1. I need to load various HTML pages and parse DOM element values from them. At the moment I have successfully loaded a page with QNetworkAccessManager and stored it in a QByteArray, but I hit a wall trying to parse the valuable data out of it.
A couple of points:
I can't use QWebkit since it's not supported on Android in Qt 5
The HTML can't be assumed to be strict markup, so Qt's XML readers or DOM parsers won't work on their own
I'm only parsing text from pages. The information is all I need, not the visual style
What options do I have? It sounds a little bit stupid that WebKit would be the only way of doing this, since I don't need to display any graphical data from web pages. Is writing my own DOM parser for HTML the way to go?
http://qt-project.org/wiki/Handling_HTML
has a pretty good list of HTML parsers that are available.
Sometimes a good regular expression can catch what you need, but it isn't as robust as a good HTML parser.
The first link on the page looks pretty promising:
http://tidy.sourceforge.net/libintro.html
I don't know how difficult it would be to build the libraries for Qt Android, but it looks do-able, and works with standard tools.
Hope that helps.
I would like to know which is more efficient for getting data from the server: XML or JSON.
Another question:
Is XmlPullParser meant for parsing XML data that comes from a web service? So if I am using JSON, I don't need XmlPullParser? Or does it have other uses?
Thank you very much
What I've found extremely useful for parsing JSON is Google's gson library. For xml, you can use gson underneath to do the same thing with gson-xml. With a single line of code you can map your JSON/XML to your objects without having to write a single line of parsing code.
If you find performance to be an issue (I'm making this suggestion because these libs make you super productive), there are mechanisms in both to allow you finer grained control. I doubt you'll have problems with performance though.
For a very thoroughly researched answer to the headline question (though focussed on browsers, not android apps), see David Lee's Balisage 2013 paper:
http://www.balisage.net/Proceedings/vol10/html/Lee01/BalisageVol10-Lee01.html
His conclusion, in one line, is that the choice between XML and JSON makes very little difference in itself - though the details of how you do XML or how you do JSON can make a big difference.
I want to write a program that gets the match dates from this link: http://www.goal.com/en/teams/germany/148/fc-bayern-munich-news
and uses them in my program. I just want the dates and the matches. How can I do this on Android?
I'd write an Activity to display the data, which calls an AsyncTask to connect to the site and download the HTML. I'd then use some kind of parser to grab the data I want and save it to a database.
Have you written Java before? If not I'd start out by learning the language. Download Eclipse and write a simple program that can connect to the site and grab the HTML. Then add the parser.
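A plain-Java starting point for that first step might look like the sketch below. PageFetcher, readAll and fetch are made-up names, the ten-second timeouts are arbitrary, and the page is assumed to be UTF-8 encoded:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
class PageFetcher {
    // Reads the whole stream into a String, assuming UTF-8 content.
    static String readAll(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Downloads the page at the given address and returns its HTML.
    static String fetch(String address) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(address).openConnection();
            conn.setConnectTimeout(10_000); // arbitrary timeouts for this sketch
            conn.setReadTimeout(10_000);
            try {
                return readAll(conn.getInputStream());
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

PageFetcher.fetch(url) returns the raw HTML as a string, which can then be handed to whichever parser you pick. When this moves into an Android app, the call has to run on a background thread (for example inside the AsyncTask mentioned above).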
Once you are that far, do the Hello World tutorial, then work your way through the other tutorials. Also learn about the Android Application Lifecycle. At that point you can start thinking about moving your code over to the Android framework.
EDIT
Here are some links to information about potential parsers & parsing approaches.
Tag Soup
What HTML parsing libraries do you recommend in Java
Two HTML parsing links
You could also consider using (hushed voice) regex/pattern matching.
I have read the RSS parsing example from the IBM site (http://www.ibm.com/developerworks/opensource/library/x-android/).
In this example, the RSS items are shown in a ListView, and if you press one announcement you can see it in the device's web browser. How could I see them in the app, without using the device browser?
Thanks a lot
Create a layout with a WebView then load the URL from each "announcement" using WebView.loadUrl.
I'm a little confused but you seem to have answered your own question.
You say you don't want to use the web browser on the device but the example in your question doesn't use the browser. It does exactly what you're asking for.
The idea is that you download the HTML from the website, use the parser to break it up into separate "announcements", and store them in list view items in your program.
I have done a bit of this type of thing myself in Android. I used the jsoup Java library, which makes breaking the HTML into the bits you want to display really easy.
If you want some more help, I can give you an example: an app I made that pulls movie times from google.com/movies. Here are links to the classes where I did the HTML download and parse:
ScreenScraper.java
HtmlParser.java