I have an Android application that uses a WebView in which the user has to log in with a username and password before being redirected to the page I would like to scrape data from with jsoup. Since the jsoup request runs in a different session, the user would have to log in again.
Now I would like to take the cookie received by the WebView and send it with the jsoup request so that I can scrape my data.
The cookie is synced with CookieSyncManager using the following code. This is basically where I am stuck, because I don't know how to read the cookie or how to attach it to the jsoup request. Please help? :)
public void onPageFinished(WebView view, String url) {
    CookieSyncManager.getInstance().sync();
}
After the user has logged in, I run the jsoup scrape with something like this:
doc = Jsoup.connect("https://need.authentication.com").get();
Elements elements = doc.select("span.tabCount");
Element count = elements.first();
Log.d(TAG, "test"+(count));
I'm not an android developer but maybe you can try something like this:
final String url = "https://need.authentication.com";
// -- Android Cookie part here --
CookieSyncManager.getInstance().sync();
CookieManager cm = CookieManager.getInstance();
String cookie = cm.getCookie(url); // returns cookie for url
// ...
// -- JSoup part here --
// Jsoup uses cookies as "name/value pairs"
doc = Jsoup.connect("https://need.authentication.com").header("Cookie", cookie).get();
// ...
I hope this helps a bit, but as I said before: I'm no Android developer (and the code isn't tested!)
Here's some documentation:
CookieManager
CookieSyncManager
Jsoup Connection
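If the cookies are wanted as individual name/value pairs rather than as one raw header, the string returned by getCookie(url) can be split first. The following is only a minimal sketch in plain Java, assuming the string has the usual "name=value; name2=value2" shape; the class name CookieHeaderParser and the sample cookie values are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CookieHeaderParser {

    /** Splits a "name=value; name2=value2" string into an ordered name/value map. */
    public static Map<String, String> parse(String cookieHeader) {
        Map<String, String> cookies = new LinkedHashMap<>();
        if (cookieHeader == null || cookieHeader.trim().isEmpty()) {
            return cookies; // nothing stored for this URL
        }
        for (String pair : cookieHeader.split(";")) {
            String trimmed = pair.trim();
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                continue; // skip fragments that have no cookie name
            }
            // Split on the first '=' only, because values may contain '=' themselves.
            cookies.put(trimmed.substring(0, eq), trimmed.substring(eq + 1));
        }
        return cookies;
    }

    public static void main(String[] args) {
        // Hypothetical cookie string, shaped like what getCookie(url) returns.
        String header = "INSESSION=abc123; WASLOGGEDIN=1; __gads=ID=0058:T=1458";
        System.out.println(parse(header));
    }
}
```

The resulting map could then be handed to Jsoup's Connection.cookies(Map<String, String>) instead of setting the Cookie header by hand.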
I am using an Android WebView in my Xamarin project to perform third-party authentication. Once the login is successful, I need to extract the authentication cookies. I store these cookies in persistent storage and then pass them along with subsequent requests.
For example:
Android App >(opens) webview > Loads (idp provider) url > User provides credentials and saml request is sent to my backend server > backend server validates saml and returns authentication cookies.
It returns two cookies.
Everything works fine up to this point. In the OnPageFinished method of the WebView's client, I try to extract the cookies using the following method.
public override void OnPageFinished(WebView view, string url)
{
    base.OnPageFinished(view, url);
    var handler = OnPageCompleted;
    var uri = new Uri(url);
    AllowCookies(view);
    var cookies = CookieManager.Instance.GetCookie(url);
    var onPageCompletedEventArgs = new OnPageCompletedEventArgs { Cookies = cookies, Url = uri.AbsolutePath, RelativeUrl = uri.PathAndQuery, Host = uri.Host };
    handler?.Invoke(this, onPageCompletedEventArgs);
}

private void AllowCookies(WebView view)
{
    CookieManager.Instance.Flush();
    CookieManager.AllowFileSchemeCookies();
    CookieManager.SetAcceptFileSchemeCookies(true);
    CookieManager.Instance.AcceptCookie();
    CookieManager.Instance.AcceptThirdPartyCookies(view);
    CookieManager.Instance.SetAcceptCookie(true);
    CookieManager.Instance.SetAcceptThirdPartyCookies(view, true);
}
The problem is, I am able to get just one cookie (wc_cookie_ps_ck); I am unable to see the other authentication cookie (.AspNetCore.Cookies).
Here's how the cookies appear in browser.
Please note that in postman and in chrome browser both the cookies appear.
But in android webview only cookie with name ".AspNetCore.Cookies" is not appearing at all.
According to the Java documentation, "When retrieving cookies from the cookie store, CookieManager also enforces the path-match rule from section 3.3.4 of RFC 2965 . So, a cookie must also have its “path” attribute set so that the path-match rule can be applied before the cookie is retrieved from the cookie store."
Since my two cookies have different paths, is that the reason the one with its path set to "/project" is not appearing?
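To make the path-match rule from the quoted documentation concrete, here is a simplified sketch of the prefix check, written in the style of RFC 6265 section 5.1.4 (the successor to the RFC 2965 rule). It is plain Java with a hypothetical class name, and whether the Android WebView cookie store applies exactly this rule is an assumption, not something the quote confirms.

```java
public class CookiePathMatch {

    /** Returns true if a cookie with the given path would be sent for the request path. */
    public static boolean pathMatches(String cookiePath, String requestPath) {
        // Treat a missing path as the root path.
        if (cookiePath == null || cookiePath.isEmpty()) {
            cookiePath = "/";
        }
        if (requestPath == null || requestPath.isEmpty()) {
            requestPath = "/";
        }
        if (requestPath.equals(cookiePath)) {
            return true;
        }
        if (!requestPath.startsWith(cookiePath)) {
            return false;
        }
        // A prefix only counts when it ends at a path-segment boundary.
        return cookiePath.endsWith("/")
                || requestPath.charAt(cookiePath.length()) == '/';
    }

    public static void main(String[] args) {
        System.out.println(pathMatches("/", "/login"));               // true
        System.out.println(pathMatches("/project", "/project/home")); // true
        System.out.println(pathMatches("/project", "/login"));        // false
        System.out.println(pathMatches("/project", "/projects"));     // false
    }
}
```

Under this rule, a cookie scoped to "/project" would only be returned for URLs whose path lies under /project, so the URL passed to GetCookie matters when cookies have different paths.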
After days and days of searching for an answer to this question, I have finally found one.
I did remote debugging of the WebView with desktop Chrome and found that all the cookies I needed were present in the WebView.
However, the method
var cookies = CookieManager.Instance.GetCookie(url);
doesn't return the cookie that has the SameSite attribute set.
This looks like a bug in Xamarin.Android. I have already raised an issue on the Xamarin.Android GitHub repository.
In that GitHub issue I have described the steps to reproduce.
For me, the workaround was to turn off the SameSite cookie attribute in my ASP.NET Core back-end project.
As follows:
In order to configure the application cookie when using Identity, you can use the ConfigureApplicationCookie method inside your Startup’s ConfigureServices:
// add identity
services.AddIdentity<ApplicationUser, IdentityRole>();
// configure the application cookie
services.ConfigureApplicationCookie(options =>
{
options.Cookie.SameSite = SameSiteMode.None;
});
Link for the above solution mentioned. Here.
I have cookies and want to send them to a WebView. How can I do that?
I use jsoup to get the cookies from the login page and tried the following code, but it is not working:
webview.LoadUrl(URL, Cookies)
I had to do something like this a couple of weeks ago. I ditched the webview though and used a recyclerview for the info because of data and ram consumption.
res = Jsoup.connect(url)
.data("user", user, "passwrd", pass)
.method(Connection.Method.POST)
.execute();
String sessionId = res.cookie("PHPSESSID");
The code above is actually from jsoup's creator himself, showing how to store cookies. Then it's a matter of setting these cookies on the CookieManager.
CookieManager cookieManager = CookieManager.getInstance();
cookieManager.setAcceptCookie(true);
cookieManager.setCookie(url,String.format("%s=%s",
"PHPSESSID", sessionId));
Then I just loaded the WebView. I think a really important detail here is formatting the string in the second parameter of setCookie() as a name=value pair, matching how the cookie is really a map entry whose key is PHPSESSID and whose value is the cookie's characters. PHPSESSID is the cookie name that shows up for me.
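When the login response carries more than one cookie, the same name=value formatting can be applied to each entry. Below is a minimal plain-Java sketch, assuming the cookies are available as a map (Jsoup's Response.cookies() returns a Map<String, String>); the class name CookieStrings and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CookieStrings {

    /** Turns each map entry into a "name=value" string, one string per cookie. */
    public static List<String> toCookieStrings(Map<String, String> cookies) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : cookies.entrySet()) {
            result.add(String.format("%s=%s", entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical cookies, standing in for what a login response returned.
        Map<String, String> cookies = new LinkedHashMap<>();
        cookies.put("PHPSESSID", "abc123");
        cookies.put("lang", "en");

        for (String cookieString : toCookieStrings(cookies)) {
            // On Android, each string would go to cookieManager.setCookie(url, cookieString).
            System.out.println(cookieString);
        }
    }
}
```

Each string is passed to setCookie() in its own call, since that method sets a single cookie at a time.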
I have several apps in which I receive a cookie from the WebView on logged-in pages and reuse it directly with jsoup to scrape content, as follows:
final String url = "https://need.authentication.com";
// -- Android Cookie part here --
CookieSyncManager.getInstance().sync();
CookieManager cm = CookieManager.getInstance();
String cookie = cm.getCookie(url); // returns cookie for url
// ...
// -- JSoup part here --
// Jsoup uses cookies as "name/value pairs"
doc = Jsoup.connect("https://need.authentication.com").cookie(url, cookie).get();
This doesn't work for all URLs. Receiving the cookie is never a problem, but jsoup sometimes cannot use the cookie.
All I would like to do now is add this existing cookie to an HttpClient or another non-deprecated option to download the page and then hand it to jsoup for further scraping, as I have the feeling jsoup isn't handling cookies correctly.
The jsoup debug output only shows:
03-19 03:06:16.394 1317-3369/mysource.internationsexpress W/System.err: at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:512)
03-19 03:06:16.394 1317-3369/mysource.internationsexpress W/System.err: at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:493)
03-19 03:06:16.394 1317-3369/mysource.internationsexpress W/System.err: at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:205)
03-19 03:06:16.394 1317-3369/mysource.internationsexpress W/System.err: at org.jsoup.helper.HttpConnection.get(HttpConnection.java:194)
and for more info the cookie looks like this:
__indbg=481084b1-3d71-461a-b6e1-93d;
__gads=ID=0058c3ccb75f72f2:T=1458162316:S=ALN;
INSESSION=ct8njokkc4uadlmjjg8a3gvp1ng4m0acvvveea66bkpmn32fvc;
INEP=%5B%22nw01_101_B_0%22%2C%22mp04_103_B_0%22%2C%22in01_244;
WASLOGGEDIN=1;
INREMEMBERME=cHlMQlRVbzVOUkhJTU5kU25tMlplZ2RvNWxvbkN4TmdsR0RBVWp6Qkp6dkpONW1Tb2o3MH;
INBP=mobile;
__utmt=1;
__utma=68558281.1607821733.1458162272.1458240416.1;
__utmb=68558281.1.10.1458327475;
__utmc=68558281;
__utmz=68558281.1458162272.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);
__utmv=68558281.|2=community=sanj=1^3=loggedIn=1=1^5=experiment=%7Cst01_267_B_2%7Cmt01
cookie(name, value) expects the name of the cookie, not its related URL.
Try this instead:
doc = Jsoup //
.connect("https://need.authentication.com") //
.header("Cookie", cookie) //
.get();
My app shows a two-page login process in a WebView.
The first page asks only which domain you plan to connect to.
The second page asks for the credentials.
I'm trying to perform the login in the webView and then execute requests from my native code.
I realize I need to get the stored cookie from the webView (but from which url? from the first page or the second one?), and then use the cookie for the native code requests.
Can someone please tell me how to go about it? The login process is easy: the user logs in through the WebView, fine. Now, I know how to use the cookie manager, but I don't know which cookie I am supposed to look for. Is it the one for the URL of the first login page? Is it the second one? Does it matter?
Next, how do I use the cookie to send back to server with a GET request so the server will know I'm authenticated?
I appreciate the answers I'm clueless and begging for help :)
Since the accepted answer does not really describe how it is done:
Put these lines somewhere where your app starts:
CookieHandler.setDefault(new java.net.CookieManager()); // java.net class, not the WebView one. Apparently for some folks this line works already, for me on Android 17 it does not.
CookieSyncManager.createInstance(yourContext); // or app will crash when requesting cookie
And then in your connection:
String cookies = android.webkit.CookieManager.getInstance().getCookie(urlString); // the WebView's cookie store
URL url = new URL(urlString);
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setReadTimeout(10000 /* milliseconds */);
conn.setConnectTimeout(15000 /* milliseconds */);
// conn.setRequestMethod("GET");
// conn.setDoInput(true);
if (cookies != null)
conn.setRequestProperty("Cookie", cookies);
// Starts the query
conn.connect();
I have done the opposite of you: I log in with loopj Android Asynchronous Http Client, and want the session cookies to apply to a webview, for the same website. I don't know if it will help you, but, I am going to post my code for copying over the cookies. Maybe seeing the process will help you to look for the items you need... to copy cookies from webview, to HTTP. I can't offer further help, since I'm fairly new at Android. (And, of course, I adapted my code from other people's posts.)
Class variable declarations:
private AsyncHttpClient loopjClient = new AsyncHttpClient();
private PersistentCookieStore myCookieStore;
onCreate() initialization:
myCookieStore = new PersistentCookieStore(this);
loopjClient.setCookieStore(myCookieStore);
After HTTP login:
// get cookies from the generic http session, and copy them to the webview
CookieSyncManager.createInstance(getActivity().getApplicationContext());
CookieManager.getInstance().removeAllCookie();
CookieManager cookieManager = CookieManager.getInstance();
List<Cookie> cookies = myCookieStore.getCookies();
for (Cookie eachCookie : cookies) {
String cookieString = eachCookie.getName() + "=" + eachCookie.getValue();
cookieManager.setCookie("http://www.example.com", cookieString);
//System.err.println(">>>>> " + "cookie: " + cookieString);
}
CookieSyncManager.getInstance().sync();
// holy ****, it worked; I am automatically logged in for the webview session
Note that loopj is like the webview, in that all cookie management and sending are automatic. I just copy all cookies for the domain. I think you'd be fine, doing the same... thus, no worry about whether from the first or second page.
In the end I found my way, and it was pretty simple.
Once the user logs in through the WebView, a cookie is set on the device.
Later on, when I want to make native API calls to the service, I ask the cookie manager for the cookie that was set for that URL.
I then take the important header that is used to authenticate on the server and send it along with my API calls.
I have a SharePoint site that uses NTLM authentication. In order to load the page, I authenticate to the site using this:
public String LoadUrlWithNTLM(String url){
CkHttp http = new CkHttp();
http.put_Login("username");
http.put_Password("password");
http.put_NtlmAuth(true);
http.put_SessionLogFilename("ntlmAuthLog.txt");
String source = http.quickGetStr(url);
return source;
}
and load the WebView with this:
public void LoadWebView(String url, String source){
webView = (WebView) findViewById(R.id.webView1);
webView.getSettings().setJavaScriptEnabled(true);
webView.setWebViewClient(new WebViewClient());
webView.loadDataWithBaseURL(url, source, "text/html", "", "");
}
I call this in onCreate():
source= LoadUrlWithNTLM(url);
LoadWebView(url,source);
Then I check for a URL click event with this:
webView.setWebViewClient(new WebViewClient() {
public boolean shouldOverrideUrlLoading(WebView view, String url){
String toWebView = LoadUrlWithNTLM(url);
LoadWebView(url,source);
return false;
}
});
At some point I do manage to get to the SharePoint site with NTLM authentication, but when I click a link, it just displays "401 UNAUTHORIZED" and does not invoke the shouldOverrideUrlLoading() method at my breakpoint.
After authorizing, each subsequent HTTP request should include an Authorization header that contains the result of the prior authorization. If the subsequent request were sent using Chilkat HTTP, then the object would automatically send this Authorization header. However, the WebView has no knowledge of it, so it is not including any Authorization header with its request, and therefore you get the "401 Unauthorized" error.
One solution is to see if you can do NTLM authorization with WebView. I'm assuming you're using Chilkat only because this is not possible.
Another solution is to use Chilkat as you are doing, but then get the value of the Authorization header (from Chilkat) and explicitly set this header field with WebView. I don't know enough about WebView to know whether this is possible. To get the value of the Authorization header from Chilkat may require a new Chilkat feature (and I think this may be easy to do). (or it's already possible, but in a convoluted way)