How to escape characters in regex of Androguard?


I am using Androguard to do a static analysis of an apk file. I am using
a, d, dx = AnalyzeAPK("/app-debug.apk", decompiler="dad")
to decompile the apk. Then I am doing
d.get_regex_strings(".*PackageManager.NameNotFoundException.*").
I know that the string PackageManager.NameNotFoundException must be in there because it is my own application for which I have the source code. However, Androguard tells me that it could not find the string.
I also tried variations, such as
d.get_regex_strings(".*PackageManager\.NameNotFoundException.*")
or
d.get_regex_strings(".*PackageManager.*NameNotFoundException.*").
The problem seems to be the dot in the middle between PackageManager and NameNotFoundException. So, how do I escape characters in Androguard?
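One way to rule out escaping problems is to build the pattern with Python's re.escape and test it against plain strings before involving Androguard. The sketch below assumes that get_regex_strings matches the pattern against each entry of the DEX string table using Python's re module, which is not confirmed in this thread; find_strings and sample_strings are hypothetical stand-ins, not Androguard APIs. It also illustrates a possible cause unrelated to escaping: DEX files store class references as type descriptors such as Landroid/content/pm/PackageManager$NameNotFoundException;, so the dotted source-level name may simply not occur in the string table.

```python
import re


def find_strings(strings, literal):
    """Return the strings that contain `literal` verbatim."""
    # re.escape neutralises every regex metacharacter, including "." and "$".
    pattern = re.compile(".*" + re.escape(literal) + ".*", re.DOTALL)
    return [s for s in strings if pattern.match(s)]


# Hypothetical stand-in for the strings an APK might contain.
sample_strings = [
    "Landroid/content/pm/PackageManager$NameNotFoundException;",
    "PackageManagerXNameNotFoundException",
    "PackageManager.NameNotFoundException",
]

# Source-level (dotted) spelling: matches only the literal dotted string.
dotted = find_strings(sample_strings, "PackageManager.NameNotFoundException")

# Descriptor-style spelling with "$" for the nested class.
descriptor = find_strings(sample_strings, "PackageManager$NameNotFoundException")

print(dotted)
print(descriptor)
```

If the descriptor-style search finds the class while the dotted one does not, the dot was never the problem; the name is just stored in a different form.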

Related

Unable to download file with special character from Amazon S3

I have been trying to download a file from Amazon S3 that ends with special character.
The file name ends with an "=" as a result of Base64 encoding. Now I am trying to download this file and I receive an error,
The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey;
I tried URL Encoding the string. So now the "=" becomes "%3D" and still I receive the same error.
But if I remove the "=" from the file name, I am able to download the file without issues. However, this is a shared file, and it has to be accessed from iOS as well.
NOTE: The iOS Amazon SDK works even when the file name has "=" in it.
The issue is faced only in Android SDK.

According to AWS documentation
Safe Characters
The following character sets are generally safe for use in key names:
Alphanumeric characters [0-9a-zA-Z]
Special characters !, -, _, ., *, ', (, and )
and
Characters That Might Require Special Handling
The following characters in a key name may require additional code handling and will likely need to be URL encoded or referenced as HEX. Some of these are non-printable characters and your browser may not handle them, which will also require special handling:
Ampersand ("&")
Dollar ("$")
ASCII character ranges 00–1F hex (0–31 decimal) and 7F (127 decimal.)
'At' symbol ("@")
Equals ("=")
Semicolon (";")
Colon (":")
Plus ("+")
Space – Significant sequences of spaces may be lost in some uses (especially multiple spaces)
Comma (",")
Question mark ("?")
So this confirms that "=" requires special handling.
It would be better to replace the trailing "=" with a safe character to avoid the issue.
Please try changing the "=" to "&#61".
Since there is no issue on iOS, I expect it is related to the Android environment.
Note that some characters may also be forbidden by the shell (sh, bash, or the Android shell) in which commands are executed.
Please also take into consideration that the disk format (FAT32 on a typical Android external memory card) may forbid some characters in file names.
If you take a look here, and especially at @kreker's answer:
According to wiki and assuming that you are using external data storage which has FAT32.
Allowable characters in directory entries
are
Any byte except for values 0-31, 127 (DEL) and: " * / : < > ? \ | + , . ; = [] (lowcase a-z are stored as A-Z). With VFAT LFN any Unicode except NUL
You will note that = is not an allowed character on an Android FAT32 partition.
Since I expect that Android treats = as a restricted character, you may try escaping it as \= or quoting the file name in your code.
An example with a copy:
cp filename=.png mynewfile=.png #before
cp "filename=.png" "mynewfile=.png" #after
"VCS...=.png"
If none of these tricks work, you will have to change the file names to remove the "=" when you create those files.
Regards

The following characters in a key name may require additional code handling and will likely need to be URL encoded or referenced as HEX. Some of these are non-printable characters and your browser may not handle them, which will also require special handling:
The best practices to ensure compatibility between applications defining Key names are using:
- Alphanumeric characters [0-9a-zA-Z]
- Special characters !, -, _, ., *, ', (, and )
On Android you need to encode the file name, replacing the character = (commonly used as an operator) with %3D.
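The encoding step described above can be sketched in Python to show what the encoded key looks like. This is only an illustration: encode_key and the sample payload are hypothetical, and the thread does not settle whether the Android SDK expects the raw or the encoded key. The second half shows an alternative in line with the advice to avoid "=" altogether: URL-safe Base64 with the padding stripped.

```python
import base64
from urllib.parse import quote


def encode_key(key):
    """Percent-encode every reserved character in an object key."""
    # safe="" also encodes "/", which quote() leaves untouched by default.
    return quote(key, safe="")


# A Base64 name whose padding ends in "=" (hypothetical payload).
raw_key = base64.b64encode(b"test").decode("ascii")
encoded_key = encode_key(raw_key)
print(raw_key, "->", encoded_key)

# Alternative: URL-safe Base64 with padding removed contains no "=", "+" or "/".
padless_key = base64.urlsafe_b64encode(b"test").decode("ascii").rstrip("=")
print(padless_key)
```

Stripping the padding only works if every client that decodes the name adds it back, so it has to be agreed on by both the Android and iOS sides.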

First of all, I think you are either using the CopyObjects method of S3, or you received the file name from an S3 event or some other source and are now trying to download it. The issue is that AWS handles special characters differently when it stores key names. If you go to the S3 console and click on the file name, you'll see a URI in which special characters have different values; for example, a space is replaced by +. So you need to handle the special characters accordingly. Misleading examples won't help you: AWS has constraints on file names, and if you save a name that violates them, it will replace the offending characters with acceptable ones. Your actual file name will then differ from the one you uploaded, which is why you get "file not found".

Google Sheets App Script to remove Bracketed word from String

Hmm, I can't find the man page for 'replace' in Googles App scripts, I only see 'replaceText'. Anyway, from what I gather from the SO posts, the below should work, hopefully someone can spot it easily.
The String in the Cell is "[pro] all, everybody" and I want to remove the bracketed word '[pro]' so the result is 'all, everybody'.
It does work just fine with:
Cell = Cell.toString().replace("\[pro\]","");
but when I try to make it generic, it fails with all these (not sure what the pattern matching rules are, thus the question for the man page):
Cell = Cell.toString().replace("\[pr.\]","");
Cell = Cell.toString().replace("\[pr.*\]","");
Cell = Cell.toString().replace("\[.*\]","");
they should work, no ? What am I missing ?
Also, how would I use 'replaceText', I can't seem to apply it directly to the 'Cell' object.

The String#replace is a JavaScript function where you need to use a regex with a regex literal notation or with new RegExp("pattern", "modifiers") constructor notation:
Cell = Cell.toString().replace(/\[pr[^\]]*]/,"");
When using a regex literal, backslashes are treated as literal backslashes, and /\d/ matches a digit. The constructor notation equivalent is new RegExp("\\d").
The /\[pr[^\]]*]/ regex matches the first instance of:
\[pr - literal substring [pr
[^\]]* - 0+ chars other than ]
] - a literal ] symbol.
And replaces with an empty string.
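For comparison, the same pattern can be tried out in Python's re module, which treats these constructs the same way. This is a side sketch rather than Apps Script code; strip_bracket_tag is a hypothetical helper name. Note that removing the tag leaves the space that followed it, so the result is trimmed to get exactly 'all, everybody'.

```python
import re


def strip_bracket_tag(text):
    """Remove the first bracketed word starting with "pr" and trim the result."""
    # \[pr      literal "[pr"
    # [^\]]*    zero or more characters other than "]"
    # \]        literal "]"
    return re.sub(r"\[pr[^\]]*\]", "", text, count=1).strip()


result = strip_bracket_tag("[pro] all, everybody")
print(result)
```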

Concatenate formatted string in macro

I am doing some algorithm development on the Android platform. I want to modify a previous developer's code and add a keyword to it, since he put a lot of useful log info in the code. I want to grep for the new keyword in logcat to see all the log lines I care about.
1. The idea is to use: adb logcat | grep 'keyword' to see the log. For example, the keyword can be a person's name, such as James.
2. The previous developer redefined ALOGE in the header file like this, and he added many LOG_ACD calls in the .c file.
#define LOG_ACD(fmt, args...) if (acd->stats_debug_mask & STATS_DEBUG_MASK_ACD_LOG) ALOGE(fmt, ##args)
An example in the .c file is LOG_ACD("%s: acd_enable %d, monitor %d, freq %d, afd_state %d, acd_atb %d",
__func__, output->acd_enable, output->acd_monitor,
output->freq, output->acd_state, output->acd_atb);
3. How can I add the keyword to the macro above so that every LOG_ACD line in the .c file carries my new keyword? The interesting part for me is that the argument to ALOGE is not a fixed string; the format string is generated in the .c file.
I hope I have described the problem clearly. Thank you, guys.

You say that the format string will be generated in the C file. I think you don't mean what you say.
For printf-like functions, it is common to specify a literal format string. (The format string is the string with all the format specifiers like %d. A literal string is a zero-terminated string constant between double quotes.) If that is the case (and your example backs this assumption), you can use string-literal concatenation:
#define LOG_ACD(fmt, args...) ALOGE("ACD: " fmt, ##args)
Two adjacent string literals are compiled as one, e.g. "A" "B" is essentially the same as "AB". The macro will generate a compile-time error when the format string is not a literal, but, as said above, that's unusual.

pathPattern in intent-filter matching literal period

I have read quite a few other SOs on this - in particular the highly-voted answer to this question: Android intent filter for a particular file extension?
My scenario is somewhat simpler - I simply want to match a particular filename on our website - e.g. http://our_domain/filename.extn - but taking into account some minor variance in case (I call this out further down).
I've written my intent-filter as follows:
<data
android:scheme="http"
android:host="our_domain"
android:pathPattern="/filename\\.extn" />
Double-escaping the \ so that it will be read out of the XML as \., thus escaping the period so that the pattern matcher sees a literal . instead of the 'any' character.
For my tests I've written a small app that takes a string from a text box, creates an ACTION_VIEW intent with the given URI, and starts it - then checking whether the browser launches or whether I see a chooser with my app listed.
The app is correctly identified for the exact path - e.g. http://our_domain/filename.extn, but it is also being identified if I replace the . with any other character that's valid in a URI path - e.g, all of the following also trigger a match:
http://our_domain/filename'extn
http://our_domain/filename~extn
http://our_domain/filenameaextn
The last of which is the most worrying!
How can I set the path pattern to ensure that only a literal period matches?
Please note, I am aware that simply using path instead of pathPattern might work - however, the pattern also incorporates some minor case-insensitivity - e.g. F*f*ileN*n*ame - I have removed this stuff for the question as it makes no difference to the behaviour of this period-matching.
Is it possible that matching only literal . characters is actually not supported by the intent-filter system (not by design but by bug), and that they'll always be treated as 'any'?

Is it possible that matching only literal . characters is actually not supported by the intent-filter system (not by design but by bug), and that they'll always be treated as 'any'?
Yes, this looks like an android bug. I've just gone through the source code of android's PatternMatcher and this behaviour (bug?) is present to this day.
I.e. it looks like matching the . literal only works in one case: when it's preceded by a * expression. Only then is it properly escaped in code (\\ is taken into consideration). That's why people who are just trying to match a file extension are able to use a pattern like this:
<data android:pathPattern=".*\\.ext" />
As soon as the escape sequence (\\) is preceded by something else than *, the escaping is not taken into account and the dot (.) is treated as a wildcard rather than as a literal and matches any character.
I've been thinking whether I should report this bug, but it might not be worth it, considering how few people have run into it. I looked for similar SO questions, but haven't found any. Also, the . wildcard is not even mentioned as a valid wildcard in the documentation. The only valid wildcards according to the documentation are .* and *.

Just guessing here, you might try (hack) wrapping it in [], like this: pattern="filename[.]extn", so you're only accepting characters from the following list: "." - give it a shot?
There are plenty of other regex games you could probably play, but that's the first one that comes to mind.

How to generate ODIN-1 in Python

I need to generate the ODIN-1 of a string in Python. The official documentation specifies applying SHA-1 to the input string/identifier, but I'm not sure if I need to perform other operations to it beforehand? Also, is the final output the hex digest of the SHA-1 or something else?
E.g. How can I convert this MAC to ODIN-1 in Python? "74e2f543d2ce"
Thanks in advance!

from hashlib import sha1
def odin1(mac_addr):
    """SHA1 hexdigest of hex representation of MAC address (Python 2)"""
    # str.decode('hex') turns each two-digit hex pair into one raw byte
    to_hash = ''.join([i.decode('hex') for i in mac_addr.split(':')])
    return sha1(to_hash).hexdigest()
>>> odin1('1a:2b:3c:4d:5e:6f')
'82a53f1222f8781a5063a773231d4a7ee41bdd6f'
Let's break this down, line by line between the documentation you linked to, and my answer:
// NOTE: iOS returns MAC Address NOT a string, but a 6-byte array.
// A human readable MAC Address may be represented as the following:
@"1a:2b:3c:4d:5e:6f";
@"1A2B3C4D5E6F";
In python:
>>> '1A'.decode('hex') == '1a'.decode('hex')
True
So we can convert the string given to us into a more agreeable format (that reduces "any ambiguity around punctuation and capitalization"):
>>> mac = "1a:2b:3c:4d:5e:6f".split(':')
>>> hex_mac = [m.decode('hex') for m in mac]
>>> hex_mac
['\x1a', '+', '<', 'M', '^', 'o']
We can treat this list as a string (just the same as if we used a byte array) to get the same result from the SHA1 hash function.
Of course, we can receive MAC addresses this way:
>>> mac = '1A2B3C4D5E6F'
>>> hex_chunks = lambda s: [s[i: i+2] for i in range(0, len(s), 2)]
>>> [m.decode('hex') for m in hex_chunks(mac)]
['\x1a', '+', '<', 'M', '^', 'o']
So it would be up to us to properly unify the input for the single function to operate across all possible forms. Regardless, our function could take either form, the end result is what matters:
>>> sha1(''.join(['\x1a', '+', '<', 'M', '^', 'o'])).hexdigest()
Will produce the correct hash (according to the link you posted).
Hope this helps make my answer clearer.
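The snippets above rely on str.decode('hex'), which exists only in Python 2. A minimal Python 3 sketch of the same idea follows, assuming the seed is the raw 6-byte MAC address; odin1_from_mac is a hypothetical helper name, and accepting "-" as a separator is an extra convenience not taken from the spec.

```python
import hashlib


def odin1_from_mac(mac):
    """Return the SHA-1 hex digest of the raw 6 bytes of a MAC address."""
    # Drop separators so "1a:2b:..." and "1A2B..." are handled alike.
    hex_digits = mac.replace(":", "").replace("-", "")
    raw = bytes.fromhex(hex_digits)  # case-insensitive hex parsing
    if len(raw) != 6:
        raise ValueError("expected a 6-byte MAC address")
    return hashlib.sha1(raw).hexdigest()


colon_form = odin1_from_mac("1a:2b:3c:4d:5e:6f")
plain_form = odin1_from_mac("1A2B3C4D5E6F")
print(colon_form == plain_form)
```

Both spellings reduce to the same six bytes, so they hash to the same 40-character lowercase digest.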

I need to generate the ODIN-1 of a string in Python.
No you don't, not according to the docs.
You generate an ODIN-1 of an 802.11 MAC address, ANDROID_ID, or DeviceUniqueID. Some relevant quotes:
The seed should be left unaltered from the format returned by the operating system.
"NOTE: iOS returns MAC Address NOT a string, but a 6-byte array" right underneath the chart.
… representing it as a raw byte array prevents any ambiguity around punctuation and capitalization:
And IIRC, ANDROID_ID is a 64-bit integer, neither a MAC nor a string. (I don't know about DeviceUniqueId on Windows Phone.)
So, you probably need to generate the ODIN-1 of a 6-byte array [0x74, 0xe2, 0xf5, 0x43, 0xd2, 0xce], not a 12-character string "74e2f543d2ce". The sample shows how to do that in Objective-C; in Python, it's:
mac = bytes([0x74, 0xe2, 0xf5, 0x43, 0xd2, 0xce])
Or, since your question specifies Android, presumably you don't want the MAC address at all, in any format… but I'll assume that was just a mistaken tag, and you're using iOS, and do want the MAC address.
How do you do that?
Hash Step: Pass the Identifier Seed through the SHA-1 hash function.
In Python, that's:
hash = hashlib.sha1(mac)
The resulting message digest is ODIN-1.
In Python, that's:
digest = hash.hexdigest()
Putting it together:
hashlib.sha1(bytes([0x74, 0xe2, 0xf5, 0x43, 0xd2, 0xce])).hexdigest()
The result is a "40 lowercase character string", just as the docs say it should be:
'10f4ab0775380aceaca5a2733604efa6d6364b08'
Also, if you're looking for clarification on a preliminary spec posted on a wiki page, why would you ask about it at SO instead of posting a comment on that page?
To answer your first specific question:
I'm not sure if I need to perform other operations to it beforehand?
The spec says:
The seed should be left unaltered from the format returned by the operating system.
To answer your second:
Also, is the final output the hex digest of the SHA-1 or something else?
The spec says:
The resulting message digest is ODIN-1.
// The format of this hash should be a 40 lowercase character string:
Meanwhile, there's sample code attached to the project (as you'd expect, given that it's at googlecode)… but it's not that helpful.
The iOS sample is completely missing the relevant code. It's a generic GUI app generated by the wizard, with an added #import "ODIN.h" and textView.text = [ODIN1() lowercaseString]; in the viewDidLoad. But that ODIN.h file, and the corresponding ODIN.m or libODIN.a or whatever doesn't appear to be anywhere. (From a brief glance at the project.pbxproj, there's clearly supposed to be more files, which they apparently just didn't check in.)
The Android sample does have the relevant code, but it clearly violates the spec. It gets the ANDROID_ID as a Unicode string, then encodes it to iso-8859-1, calls SHA-1 on the resulting bytes, and generates a hex digest out of it. The docs explicitly say to use the OS value exactly as returned by the OS; the code Latin-1 encodes it instead.
The Windows sample, on the other hand, does seem to do what the docs say—it gets the DeviceUniqueId as a byte[], and uses it as-is. (However, the code won't actually work, because it's using an obsoleted API call, which throws an exception rather than return a byte[]…)
At this point, I have to ask why you're following this spec in the first place. If you're trying to interoperate with someone else's code, you probably care which of the contradictory ways of interpreting this spec is being used by that code, rather than trying to guess which one the designers intended.
Not to mention that Apple has explicitly told people not to use anything based on the MAC to replace the UDID, and ODIN is something trivially based on the MAC to replace the UDID…
