Automatic transformation of Android's dex code - android

I want to transform/instrument Dex files. The goals of transformation include measuring code coverage. Note that the source files are not available. So instrumenting Dex is the only option.
I am wondering if there is any existing code base that I could look at as an example for writing a tool to achieve my goal.
I know about the Smali project and a host of other projects that build on Smali. However, none of these projects are good examples for my purpose.
I am looking for code that automatically transforms smali code or the dexlib representation from which smali is generated. The latter option is preferred for my purpose because the overhead of generating smali can be avoided.

It's a lot of code, but dx's DexMerger is an example program that transforms dex files. It's made quite complicated by the fact that it needs to guess the size of the output in order to make forward references work.
You'd also need to create infrastructure to rewrite dalvik instructions. DexMerger's InstructionTransformer does a shallow rewrite: it adjusts offsets from one mapping to another. To measure code coverage your instruction rewriting would probably need to be much more sophisticated.

Another option that has become available recently is Dexpler. It is an extension of Soot, which is a framework for analysis and instrumentation of Java programs. Dexpler reads in .apk files and converts them to the Jimple intermediate format. Jimple code can then be arbitrarily instrumented and eventually dumped into a new apk.

(For the record, I am answering my own question here)
Eventually I did not find any tool that fit my requirements. So I ended up building my own tool, called Ella, based on DexLib. Out of the box, it does a few things such as measuring code coverage, recording method traces, etc. But it can be easily extended to do other types of transformations.

In some cases smali itself does a small amount of instruction rewriting while re-assembling a dex file. Things like replacing a const-string with a const-string/jumbo, or a goto instruction with a "larger" one, if the target is out of range. This involves replacing instructions in the instruction list with potentially larger ones, and the corresponding fixing up of offsets.
CodeItem.fixInstructions is the method responsible for this.
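To make that fix-up concrete, here is a toy model in Python. It is not smali's actual code: the `Insn` record and the `fix_gotos` helper are made-up names for illustration. The only Dalvik facts it relies on are that goto, goto/16 and goto/32 occupy 1, 2 and 3 code units and carry 8-, 16- and 32-bit signed offsets. Widening one goto moves everything after it, so the pass repeats until nothing changes.

```python
from dataclasses import dataclass
from typing import List, Optional

# (opcode name, size in 16-bit code units, width of the signed offset in bits)
GOTO_FORMS = [("goto", 1, 8), ("goto/16", 2, 16), ("goto/32", 3, 32)]


@dataclass
class Insn:
    """Hypothetical instruction record, for illustration only."""
    name: str
    size: int                     # size in 16-bit code units
    target: Optional[int] = None  # index of the branch target (gotos only)


def addresses(insns: List[Insn]) -> List[int]:
    """Code-unit address of every instruction, in list order."""
    addrs, pc = [], 0
    for insn in insns:
        addrs.append(pc)
        pc += insn.size
    return addrs


def fits(offset: int, bits: int) -> bool:
    """True if offset is representable as a signed integer of that width."""
    return -(1 << (bits - 1)) <= offset < (1 << (bits - 1))


def fix_gotos(insns: List[Insn]) -> List[int]:
    """Widen gotos whose target is out of range; return the final addresses."""
    changed = True
    while changed:
        changed = False
        addrs = addresses(insns)
        for i, insn in enumerate(insns):
            if insn.target is None:
                continue
            offset = addrs[insn.target] - addrs[i]
            # Pick the smallest form whose offset field can hold the distance.
            for name, size, bits in GOTO_FORMS:
                if fits(offset, bits):
                    break
            # Only ever grow an instruction, so the loop is guaranteed to end.
            if size > insn.size:
                insn.name, insn.size = name, size
                changed = True
    return addresses(insns)
```

For instance, a goto that jumps over 200 one-unit instructions is rewritten to goto/16, and its target moves from address 201 to 202 because the goto itself grew by one code unit.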
Additionally, there is the asmdex library. I'm not all that familiar with it, but it sounds like it might be relevant to what you're wanting to do.

I know it's a bit late but just in case you're still interested or perhaps for some other readers. ASMDEX has been mentioned already. And I think that's your best bet for the moment for what you're trying to achieve.
As for adding new registers, take a look at the org.ow2.asmdex.util.RegisterShiftMethodAdapter class. It's not perfect! As a matter of fact, as it stands it handles existing 4-bit instructions horribly: adding a register can push some register number past 0xF, where it no longer fits in 4 bits.
But it should be a good start.
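A small sketch of the 4-bit problem, using the move family as the example. The helper names `pick_move` and `shift_move` are hypothetical, and the assumption that every register is simply shifted up by the number of added registers is mine, not something checked against ASMDEX. The field widths are the ones from the Dalvik instruction formats.

```python
def pick_move(dst: int, src: int) -> str:
    """Return the narrowest Dalvik move variant that can encode both registers."""
    if dst <= 0xF and src <= 0xF:
        return "move"          # format 12x: two 4-bit register fields
    if dst <= 0xFF and src <= 0xFFFF:
        return "move/from16"   # format 22x: 8-bit destination, 16-bit source
    if dst <= 0xFFFF and src <= 0xFFFF:
        return "move/16"       # format 32x: two 16-bit register fields
    raise ValueError("register number out of range")


def shift_move(dst: int, src: int, added: int):
    """Shift both registers up by `added` (assumed policy) and re-pick the opcode."""
    new_dst, new_src = dst + added, src + added
    return pick_move(new_dst, new_src), new_dst, new_src
```

Shifting `move v14, v2` by two registers gives v16 and v4, which no longer fits the 4-bit form and needs move/from16. Many other 4-bit instructions, iget for example, have no wider variant at all, so a real tool would have to copy values through low registers with extra moves.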

Related

Data compression on Android (other than java.util.zip ?)

I have a lot of data (text format) to send from a device. It obviously means that I should compress it. But my question is whether there are any ways of doing it other than by the zip algorithm (like this). The reason I am asking is the comparison over here: for a text file, for example, 7-zip is twice (!) as good as zip, which is a significant gain. And maybe there are even better algorithms.
So are there any effective ways of data compression (better than zip) available for Android?
You would need to compile another library into your code, since I doubt that compression algorithms other than zlib are available as part of the standard libraries on Android.
The 7-zip algorithm you refer to is actually called LZMA, which you can get in library form in the LZMA SDK. The source code is available in Java as well as C. If you can link C code into your application, that would be preferable for speed.
Since there's no such thing as a free lunch, the speed is important. LZMA will require much more memory and much more execution time to achieve the improved compression. You should experiment with LZMA and zlib on your data to see where you would like the tradeoff to fall between execution time and compression, both to choose a package and to pick compression levels within a package.
If you find that you'd like to go the other way, to less compression and even higher speed than zlib, you can look at lz4.
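One way to run the LZMA versus zlib experiment before touching the app is a quick script on a desktop machine. The sketch below uses Python's standard zlib and lzma modules. The `compare` helper and the sample log line are made up for illustration, and timings on a phone will differ, so treat the timing column as a rough relative guide only.

```python
import lzma
import time
import zlib


def compare(data: bytes):
    """Return {name: (compressed_size, seconds)} for a few codec settings."""
    codecs = {
        "zlib level 1": (lambda d: zlib.compress(d, 1), zlib.decompress),
        "zlib level 9": (lambda d: zlib.compress(d, 9), zlib.decompress),
        "lzma preset 6": (lambda d: lzma.compress(d, preset=6), lzma.decompress),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        assert decompress(packed) == data  # sanity check: lossless round trip
        results[name] = (len(packed), elapsed)
    return results


if __name__ == "__main__":
    # Replace this made-up sample with a capture of your real data.
    sample = b"timestamp=1700000000 level=INFO msg=sensor reading ok\n" * 5000
    for name, (size, seconds) in compare(sample).items():
        print(f"{name}: {size} bytes in {seconds * 1000:.1f} ms")
```

Feed it a realistic capture of what the device will send; highly repetitive test data flatters every codec.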
Your question is too general.
You can use any library, as long as it is in Java or C/C++ (via the NDK). If you don't want to use external libraries, you have to stick to what's in the SDK. Depending on how you are sending the data, there might be standard ways to do this. For example, HTTP uses gzip and has the necessary headers already defined.
In short, test different things with your expected data format and size, find the best one and integrate it in your app.
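As a sketch of the HTTP route: the body is gzip-compressed and labelled with the standard Content-Encoding header. The `gzip_body` helper is a made-up name, and whether your server accepts gzip-encoded request bodies is an assumption you would need to check on the server side.

```python
import gzip


def gzip_body(text: str):
    """Compress a text payload and build the matching HTTP headers."""
    body = gzip.compress(text.encode("utf-8"))
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        "Content-Encoding": "gzip",  # tells the receiver how to decode the body
        "Content-Length": str(len(body)),
    }
    return body, headers
```

On the device itself the same idea would go through java.util.zip.GZIPOutputStream, which is part of the SDK.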

Combining C-code files into one C-code file

I'm converting libx264 to renderscript as an exercise in how much work it is to port a somewhat larger project to renderscript. One of the pains with renderscript is that everything needs to be declared static so that it does not automatically get a Java interface. Also, this automatic Java interface can't handle pointers, multi-dimensional arrays, etc. Hence I need to declare all functions and global variables in libx264 as static, apart from a few invocation functions to control it.
My problem then is that since everything is declared static, I need to have all the code in one file scope. I started by just including all the C-code files into one and compiling that. This would have worked quite easily if libx264 itself had not also included C files with different pre-processing macro definitions, so some functions exist twice with different content and some are redeclared identically. I could of course handle this manually, but it would be easier with a tool.
I'm asking if anyone knows of a tool that can take a C project and pre-process/merge that into one C-file, managing re-declarations, conflicting declarations, etc.
And I thought the heap allocations would be the difficult problem...
I have found a tool that does this, CIL.
http://sourceforge.net/projects/cil
http://kerneis.github.com/cil/doc/html/cil/merger.html
/Harald

memory usage and codes in android

Sorry if it's a silly question. Do comments in the Java or XML file affect the memory usage of the Android application? Has anyone tried to monitor the memory usage of his/her application with and without the comments?
No, comments do not use any memory.
It's important to understand that, in programming in C, Java, etc. what you're writing is source code which, before being run on the computer (or, specifically, your Android device) is compiled into a machine code format. The processor does not run your source code as you see it. The source code you write typically contains lots of stuff like comments (which do NOT have any effect on the actual code) or perhaps things like compiler directives (which may control how the compiler compiles sections of your code).
(I realise it's more correct to use the term byte code in the case of Java, but trying to keep the answer simple here.)
An exception to this however would be if you're talking about the case where you insert a file (e.g. XML file) as a raw resource within your Android application. But, I think this topic is an advanced one for you to learn about later.
Comments in your code are compiled out and have no effect whatsoever on memory usage in an application.

determining if resources such as those in strings.xml are no longer used

I have done some significant re-coding on one of my Android programs and now I am unsure if certain xml strings are used anymore. In addition I have a few translations which makes the task even more difficult. Is there a tool to test this? This would be useful for drawables also.
I am using the eclipse plugin.
This question has been discussed in the IRC channel before. There is no tool to test it, but I agree it would be useful. Note that resources can be referenced in xml, but they can also be referenced from code. Furthermore, resources can also be looked up by their identifier, and the identifier used for such a lookup could be determined at runtime.
So actually you cannot determine 100% whether a resource is used or not anymore, but you can probably determine which resources are referenced in a static way (in xml or code). Depending on your code/app which you know best yourself, such approach might be sufficient in many cases.
The approach would be to write a tool that parses xml and java source files and also takes the import statements into consideration. With that information you should be able to determine which resources you can get rid of.
The easiest way is to remove them all, attempt to compile, and re-add those the compiler says are lacking. It's a little tiresome, but it's certainly tractable.
Note, as Mathias already pointed out, that it's technically possible to access resources by name with a string at runtime, and the way I suggest here would remove such resources though they are, in fact, needed. However, this pattern should be really rarely seen in any application, and if you are the one who wrote it, you already know if/where you do such treatment.
Use grep to extract a list of resources to a file by way of sort
Use recursive grep through sort and uniq to create a list of those mentioned in any source file (make a copy of the project without unused files, or dispatch grep on a list of used ones; commented-out code will of course be an issue)
Use diff on the two lists
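The same three steps can be sketched in Python for string resources. All function names here are made up, and the sketch shares the limits already mentioned: it cannot see lookups by name at runtime, and a reference inside commented-out code still counts as a use.

```python
import re
import xml.etree.ElementTree as ET

# Matches R.string.foo in Java code and @string/foo in XML files.
REFERENCE = re.compile(r"(?:R\.string\.|@string/)(\w+)")


def declared_strings(strings_xml: str) -> set:
    """Step 1: names of all <string> entries declared in a strings.xml text."""
    root = ET.fromstring(strings_xml)
    return {el.get("name") for el in root.iter("string") if el.get("name")}


def referenced_strings(sources) -> set:
    """Step 2: names statically referenced in any of the given source texts."""
    found = set()
    for text in sources:
        found.update(REFERENCE.findall(text))
    return found


def unused_strings(strings_xml: str, sources) -> set:
    """Step 3: the 'diff', i.e. declared names that are never referenced."""
    return declared_strings(strings_xml) - referenced_strings(sources)
```

Reading the files from disk is left out; pass in the text of strings.xml and the contents of every .java and layout .xml file. Drawables would need the same treatment with a different pattern and a directory listing instead of an XML parse.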

Android and Protocol Buffers

I am writing an Android application that would both store data and communicate with a server using protocol buffers. However, the stock implementation of protocol buffers compiled with the LITE flag (in both the JAR library and the generated .java files) has an overhead of ~30 KB, where the program itself is only ~30 KB. In other words, protocol buffers doubled the program size.
Searching online, I found a reference to an Android specific implementation. Unfortunately, there seems to be no documentation for it, and the code generated from the standard .proto file is incompatible with it. Has anyone used it? How do I generate code from a .proto file for this implementation? Are there any other lightweight alternatives?
I know it's not a direct answer to your question, but an extra 30kb doesn't sound that bad to me. Even on EDGE that'll only take an extra 1 to 2 seconds to download. And memory is tight on android, but not THAT tight -- 30 kb is only about 1/10th of one percent of the available application memory space.
Are there any other lightweight alternatives?
I'm taking this to mean "to using protocol buffers", rather than "for using protocol buffers with an Android application". I apologise if you are already committed to protocol buffers.
This site is about "comparing serialization performance and other aspects of serialization libraries on the JVM". You'll find many alternatives listed there.
While there is no mention of the memory footprint of the different implementations at the moment I am sure it is a metric which the people on the wiki would be interested in.
Just to revive this archaic thread for anyone seeing it, the answer is to use Square's Wire library (https://github.com/square/wire)
As they mention themselves:
Wire messages declare public final fields instead of the usual getter methods. This cuts down on both code generated and code executed. Less code is particularly beneficial for Android programs.
They also internally build using the Lite runtime I believe.
And of course Proguard, the new Android 2.0 minify tools, [other generic answers], etc etc.
Use ProGuard[1] on your project. It will reduce the size of jars included in APK file.
[1] http://developer.android.com/guide/developing/tools/proguard.html
