I want to fetch bus times, bus numbers, etc. in JSON or XML format in Android. I am able to fetch mode=driving, mode=walking, mode=bicycling, etc. (see this link for reference), but I want mode=transit. You can check this link: in the left corner you can see the bus number, the bus arrival minutes, and so on. I want all of these details in XML or JSON format. Can you suggest a link, or tell me how I can do this?
Thank you.
I got an answer from https://stackoverflow.com/users/1171619/mike:
1) There's no official Google Transit API at the moment. Transit data is provided by agencies, and most of it is not public, so Google is not allowed to open it up as an API.
2) You may try to consume the "unofficial" data using your link + "&output=json" (see the sketch after this answer).
3) However, the result won't be valid JSON. Instead, it's something that can be easily converted to a JavaScript object. (The differences are: there are no quotes around property names, the strings are not properly encoded, etc.)
4) Imagine you got this JavaScript object. Even then, it won't let you easily get the structured route details. The object's properties contain the route points' coordinates, but no descriptions. The only place where the descriptions may be found is the 'panel' property, which contains a chunk of HTML text (you may find a link to a sample of this HTML in my blog post).
5) So, you'll have to convert this HTML into XML (XHTML) and then build a parser for that XML to get the essential data of a trip.
6) Seems like a bit of overkill to me, bearing in mind that the "unofficial" API may change in the future, including slight changes to the 'panel' HTML structure that would break your parser.
Posted by Mike
Thanks Buddy
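For illustration, here is a minimal Java sketch of step 2 above: appending &output=json to a directions link and reading back the raw payload. The URL is a placeholder for whatever directions link you already use, and, as Mike notes, the response is not valid JSON, so a strict parser will reject it as-is. On Android, run this off the main thread.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class TransitFetcher {
    public static void main(String[] args) throws Exception {
        // Placeholder: substitute your own directions link; dirflg=r requests
        // transit, and &output=json asks for the "unofficial" JSON-like payload.
        String url = "https://maps.google.com/maps?saddr=...&daddr=...&dirflg=r&output=json";

        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");

        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        }

        // NOT valid JSON: property names are unquoted and strings are oddly
        // encoded, so it must be repaired before org.json & co. can read it.
        System.out.println(body);
    }
}
```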
Related
Is it possible in Android to go through any link and get the main content from that page (e.g. text) or whatever else I want to get? If yes, how can I realize that?
There are a couple of ways to get data from websites.
First, and maybe the most popular, is parsing the website's RSS feed. Java and Android provide several parsers and ways to parse XML, or in this case an RSS feed. You can take a look at these examples (a sketch follows the links below):
https://developer.android.com/samples/BasicSyncAdapter/src/com.example.android.basicsyncadapter/net/FeedParser.html
https://www.tutorialspoint.com/android/android_rss_reader.htm
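As a rough sketch of the pull-parser approach from the examples above, assuming the feed keeps its item titles in <title> elements, which is standard RSS:

```java
import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserFactory;

import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class RssTitles {
    // Collects the text of every <title> element in an RSS document.
    public static List<String> parseTitles(String rssXml) throws Exception {
        List<String> titles = new ArrayList<>();
        XmlPullParser parser = XmlPullParserFactory.newInstance().newPullParser();
        parser.setInput(new StringReader(rssXml));

        int event = parser.getEventType();
        while (event != XmlPullParser.END_DOCUMENT) {
            if (event == XmlPullParser.START_TAG && "title".equals(parser.getName())) {
                titles.add(parser.nextText()); // text content of <title>...</title>
            }
            event = parser.next();
        }
        return titles;
    }
}
```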
The second way is getting the needed information from an API, if the website provides one, and often that API will return JSON. For example, https://openweathermap.org/ will return a JSON file filled with weather information which you can parse into your app. Android and Java also provide several ways to read JSON. You can take a look at this one (a sketch follows the link):
http://www.androidhive.info/2012/01/android-json-parsing-tutorial/
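A minimal sketch using the org.json classes bundled with Android; the field names below follow OpenWeatherMap's current-weather response (main.temp, weather[0].description) but should be checked against the API docs for your endpoint:

```java
import org.json.JSONObject;

public class WeatherJson {
    // Pulls two fields out of an OpenWeatherMap-style "current weather" response.
    public static String describe(String json) throws Exception {
        JSONObject root = new JSONObject(json);
        double temp = root.getJSONObject("main").getDouble("temp"); // Kelvin by default
        String description = root.getJSONArray("weather")
                                 .getJSONObject(0)
                                 .getString("description");
        return description + ", " + temp + " K";
    }
}
```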
Third, you can use a library called Jsoup for parsing HTML from a particular website. You can find examples of how to parse HTML on their official website: https://jsoup.org/
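A small sketch of the basic Jsoup flow (example.com stands in for whatever page you are scraping):

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class PageScraper {
    public static void main(String[] args) throws Exception {
        // Fetch and parse the page in one step
        Document doc = Jsoup.connect("https://example.com/").get();

        System.out.println("Title: " + doc.title());

        // CSS-style selector: every <p> element on the page
        for (Element p : doc.select("p")) {
            System.out.println(p.text());
        }
    }
}
```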
Maybe there are more ways; certainly you should look for them.
Currently I am creating an Android application which extracts the main content and picture from a website. Right now I am using the Jsoup API to extract all p tags from the HTML. However, it is not a good solution. Any suggestion or better solution that enables me to extract the main content and picture from a website in Android?
I didn't find anything that works for me, so I published Goose for Android, here: https://github.com/milosmns/goose
Some description follows...
Document cleaning
When you pass a URL to Goose, the first thing it starts to do is clean up the document to make it easier to parse. It will go through the whole document and remove comments, common social network sharing elements, convert em and other tags to plain text nodes, try to convert divs used as text nodes to paragraphs, as well as do a general document cleanup (spaces, new lines, quotes, encoding, etc.).
Content / Images Extraction
When dealing with random article links you're bound to come across the craziest of HTML files. Some sites even like to include 2 or more HTML files per site. Goose uses a scoring system based on clustering of English stop words and other factors that you can find in the code. Goose also applies descending scoring, so the further down the nodes are, the lower their scores become. The goal is to find the strongest grouping of text nodes inside a parent container and assume that's the relevant group of content, as long as it's high enough (up) on the page.
Image extraction is the part that takes the longest. Trying to find the most important image on a page proved to be challenging and required downloading all the images and manually inspecting them using external tools (not all images are considered; Goose checks MIME types, dimensions, byte sizes, compression quality, etc.). Java's Image functions were just too unreliable and inaccurate. On Android, Goose uses the BitmapFactory class, which is well documented and tested, and is fast and accurate. Images are analyzed from the top node in which Goose finds the content, followed by a recursive run outwards trying to find good images; Goose also checks whether those images are ads, banners, or author logos, and ignores them if so.
Output Formatting
Once Goose has the top node where we think the content is, Goose will try to format the content of that node for the output. For example, for NLP-type applications, Goose's output formatter will just suck all the text and ignore everything else, and other (custom) extractors can be built to offer a more Flipboardy-type experience.
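For completeness, a hypothetical usage sketch: the class and method names below (Goose, Configuration, Article, extractContent, and the getters) follow the original Goose project and are assumptions here, so check the README in the repo above for the Android port's actual entry points.

```java
// Hypothetical sketch -- names are assumptions based on the original Goose
// project; verify against github.com/milosmns/goose before using.
Goose goose = new Goose(new Configuration());
Article article = goose.extractContent("https://example.com/some-article");

String mainText = article.getCleanedArticleText();      // cleaned main content
String topImage = article.getTopImage().getImageSrc();  // best candidate image URL
```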
Why do you think it's not a good solution to use Jsoup?
I've written many web scrapers for different webpages, and in my experience Jsoup is the way to go for that task. You should study the Jsoup selector syntax; it is very powerful, and with the right selectors you can extract most information from HTML documents very easily (a few examples below). Generally, it becomes harder to extract information when the document has no id or class attributes or other unique features.
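A few selector examples in that spirit; the URL, ids, and class names are made up for illustration:

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class SelectorDemo {
    public static void main(String[] args) throws Exception {
        Document doc = Jsoup.connect("https://example.com/article").get();

        // By id and by class -- the "unique features" that make scraping easy
        String headline = doc.select("#headline").text();     // id="headline"
        String author   = doc.select(".author-name").text();  // class="author-name"

        // Attribute selector plus abs: to resolve a relative URL
        String firstImg = doc.select("img[src]").attr("abs:src");

        // Nested selector: paragraphs directly inside the content div
        String body = doc.select("div.content > p").text();

        System.out.println(headline + " / " + author + " / " + firstImg);
        System.out.println(body);
    }
}
```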
Other HTML parsers that might be interesting to you are JTidy and TagSoup.
You could try the textracto API; it automatically identifies the main content of HTML documents. It can also parse OpenGraph metadata, so you would be able to extract a picture as well (og:image).
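If all you need is the og:image, you can also pull the OpenGraph meta tag yourself with Jsoup; a minimal sketch (the URL is a placeholder, and the property name follows the OpenGraph convention):

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class OgImage {
    public static void main(String[] args) throws Exception {
        Document doc = Jsoup.connect("https://example.com/article").get();

        // OpenGraph pages carry <meta property="og:image" content="...">
        String imageUrl = doc.select("meta[property=og:image]").attr("content");
        System.out.println(imageUrl);
    }
}
```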
Say I have this site, http://www.motortrend.com/gas_prices/34/98006/, which displays the gas prices for my area. Is there any way to convert this information to XML so I can actually parse it and get specific pieces of information from it?
First of all, you can write to the authors of the site; maybe they have an open API for you.
Alternatively, you can use Jsoup and parse the data page by page.
I'm developing an Android app that might use the Wikipedia API to retrieve the content of a given page (from its title). I searched a lot on the web but couldn't find useful information about the implementation. I read the MediaWiki documentation and tried to format my requests in JSON format (example: a request for the "mountain" page content), but the returned text isn't clean and I don't know how to manage the request from my Android application.
So, my question is: how can I get (clean) Wikipedia page content by passing the page title from my application? And how can I save the well-formatted content in a String (which will later be shown in a TextView)?
Does anyone know a good tutorial, or can anyone help me with some snippets?
Thank you very much indeed guys! :)
Use action=parse, action=mobileview, or action=query&prop=extracts, depending on what exactly you need. Use the API sandbox to interactively experiment with various requests; it has usage examples and shows how to build requests properly.
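As a sketch of the action=query&prop=extracts route: the explaintext and exintro parameters (plain text, lead section only) are documented MediaWiki options, but verify them in the sandbox. On Android, run the request off the main thread, then set the returned String on your TextView.

```java
import org.json.JSONObject;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class WikiExtract {
    // Fetches the plain-text intro of a Wikipedia page by title.
    public static String fetchExtract(String title) throws Exception {
        String url = "https://en.wikipedia.org/w/api.php?action=query&prop=extracts"
                + "&explaintext=1&exintro=1&format=json&titles="
                + URLEncoder.encode(title, "UTF-8");

        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = in.readLine()) != null) body.append(line);
        }

        // The page sits under query.pages.<pageId>, and the page id is not
        // known in advance, so take the first (and only) key.
        JSONObject pages = new JSONObject(body.toString())
                .getJSONObject("query").getJSONObject("pages");
        String pageId = pages.keys().next();
        return pages.getJSONObject(pageId).getString("extract");
    }
}
```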
I've downloaded the Google Drive SDK for Android.
The API is not well documented, so I couldn't work out whether what I want to do is possible.
I want to capture an image with the camera, convert it to a black-and-white PDF, and then perform OCR on it to get the fields I need as a String.
Do I need to send a server request for this, or can I do it on the client side using only the Drive API?
Sample code would be helpful.
Google's docs don't specify what happens to an uploaded file when you request OCR; specifically, they don't tell you whether there is a response string.
However, a little experimenting shows that the only way to get the OCR data is to look up the document after OCR is complete and grab the text.
You'll find the data structure for 'Files' here: https://developers.google.com/drive/v2/reference/files#resource - what you are after will be in "indexableText" as a string.
Unfortunately, it won't parse out any sort of 'fields'. That would require an understanding of the content... Also, it doesn't seem to capture any email addresses, which is an issue if you are trying to do business cards.
By the way, you will have to wait some time, up to 2 minutes, before the data is available. I'm not entirely sure, but it could also be that the object id will not be available for that amount of time, so you might have to run a background process or do something else in the meantime.
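A sketch of that lookup against the Drive v2 Java client. The files().get() call is standard, and indexableText matches the v2 'Files' resource linked above, but treat the exact getter names as assumptions to verify against your client library version:

```java
import com.google.api.services.drive.Drive;
import com.google.api.services.drive.model.File;

public class OcrTextReader {
    // Fetches a file's metadata and reads the OCR'd text out of indexableText.
    public static String readOcrText(Drive drive, String fileId) throws Exception {
        File file = drive.files().get(fileId).execute();

        // indexableText is only populated once Google's OCR pass finishes,
        // which can take a couple of minutes after upload.
        if (file.getIndexableText() == null) {
            return null; // not ready yet; poll again later from a background thread
        }
        return file.getIndexableText().getText();
    }
}
```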
Sorry that you didn't find the documentation; it is plentiful and available here: https://developers.google.com/drive/
The entire Drive API functions by making server calls. Please check https://developers.google.com/drive/v2/reference/files/insert for how to perform OCR when uploading files to Drive. Look at the cunningly named "ocr" parameter.
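A sketch of such an upload with the Drive v2 Java client; setOcr() mirrors the "ocr" query parameter from the reference page above, and setOcrLanguage() is an optional hint, so verify both against your client library version:

```java
import com.google.api.client.http.FileContent;
import com.google.api.services.drive.Drive;
import com.google.api.services.drive.model.File;

public class OcrUploader {
    // Uploads a PDF and asks Drive to run OCR on it during the insert.
    public static String uploadForOcr(Drive drive, java.io.File pdf) throws Exception {
        File metadata = new File();
        metadata.setTitle(pdf.getName());

        FileContent content = new FileContent("application/pdf", pdf);

        File uploaded = drive.files()
                .insert(metadata, content)
                .setOcr(true)          // the "ocr" query parameter
                .setOcrLanguage("en")  // optional language hint; check the docs
                .execute();

        return uploaded.getId(); // poll this id later for indexableText
    }
}
```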