Data compression on Android (other than java.util.zip ?)

Data compression on Android (other than java.util.zip ?) - android

I have a lot of data (text format) to send from a device. It obviously means that I should compress it. But my question is whether there are any ways of doing it other than by zip algorithm (like this). The reason I am asking this question is over here - for a text file i.e. 7-zip is twice (!) better than zip. Which is a significant gain. And maybe there are even better algorithms.
So are there any effective ways of data compression (better than zip) available for Android?

You would need to compile another library into your code, since I doubt that compression algorithms other than zlib are available as part of the standard libraries on the Android.
The 7-zip algorithm you refer to is actually called LZMA, which you can get in library form in the LZMA SDK. The source code is available in Java as well as C. If you can link C code into your application, that would be preferable for speed.
Since there's no such thing as a free lunch, the speed is important. LZMA will require much more memory and much more execution time to achieve the improved compression. You should experiment with LZMA and zlib on your data to see where you would like the tradeoff to fall between execution time and compression, both to choose a package and to pick compression levels within a package.
If you find that you'd like to go the other way, to less compression and even higher speed than zlib, you can look at lz4.

Your question is too general.
You can use any library, as long as it is in Java or C/C++ (via the NDK). If you don't want to use external libraries, you have to stick to what's in the SDK. Depending on how you are sending the data, there might be standard ways to do this. For example, HTTP uses gzip and has the necessary headers already defined.
In short, test different things with your expected data format and size, find the best one and integrate it in your app.

Related

How does one convert a binary blob to source and why don't people do it?

Richard Stallman says that a risk to users using Android, is that the source code for the binary blobs is unavailable, and it is unclear whether these blobs could have a hidden back door or undesired behaviour from the users perspective.
But since the blobs for Nexus / Pixel devices can be downloaded freely from the android website, what is to stop people from converting the binary to C++ source, and examining it ? The files aren't very large...
EDIT: Since the files aren't very large, if a number of people work on it, why would it be hard to examine every instruction ? I can understand that when the binary is converted to C++ source, each line of code would have to be examined, but my question is, since the files aren't too large there shouldn't be too many lines...

A large number of people would find it very difficult to openly collaborate on an endeavour as you are suggesting because they would be afraid of legal intimidation and parasitic and restrictive IP-NDAs of various financially powerful entities.
As to your question about reverse engineering, there is an entire stackexchange site dedicated to it. You could get started there.

Is it possible to edit/modify .so file?

I'm developing an Android application which contains native code.
The native code is compiled in a .so file that has important algorithms inside.
I'm really worrying about the possibility that my .so file can be edited or modified and then re-build (re-pack). Like apks they can be modified and repacked to create a new one
I have several questions here:
1) Is there any way to edit/modify .so files and re-build?
2) If there are, how do people do that?
3) How to prevent .so files from being edited then re-built?

The short answer is that anything that a computer can read and understand, it can also modify. There is no bullet-proof signature mechanism in Android for Java or native code. Still, the so files are generally considered much less vulnerable than the Java code, even with obfuscation turned on.
Reverse engineering a shared library is hard but possible. Disassembly, change, and assembly back is not hard if one knows what to change.
There are many ways to strengthen protection of your C++ code against reverse engineering, but none will hold against a determined and well-funded attack. So, if the stakes are very high, consider running the important part of your algorithm on your server, and prey for its security.

image decoding and manipulation using JNI on android

background
On some apps, it is important to handle large images without OOM and also quickly.
For this, JNI (or renderscript, which sadly lacks on documentation) can be a nice solution.
In the past, i've succeeded using JNI for rotating huge bitmaps while avoiding OOM (link here , here and here). it was a nice (yet annoyingly hard) experience, but in the end it worked.
the problem
the android framework has plenty of functions to handle bitmaps, but i have no idea what is the situation on the JNI side.
I already know how to pass a bitmap from android's "java world" to the "JNI world" and back.
What i don't know is which functions I can use on the JNI side to help me with bitmaps.
I wish to be able to do all image operations (including decoding) on JNI, so that I won't need to worry about OOM when presented with large images, and in the end of the process, I could convert the data to Java-bitmap (to show the user) and/or write it to a file.
again, i don't want to convert the data on the JNI side to a java bitmap just to be able to run those operations.
As it turns out, there are some libraries that offer many functions (like JavaCV), but they are quite large and I'm not quite sure about their features and if they really do the decoding on the JNI-side, so I would prefer to be able to know what is possible via the built-in JNI function of Android instead.
the question
which functions are available for image manipulation on the JNI side on android?
for example, how could i run face detection on bitmaps, apply matrices, downsample bitmaps, scale bitmaps, and so on... ?
for some of the operations, i can already think of a way to implement them (scaling images is quite easy, and wikipedia can help a lot), but some are very complex.
even if i do implement the operations by myself, maybe others have made it much more efficiently, thinking of the so many optimizations that C/C++ can have.
am i really on my own when going to the JNI side of android, where i need to implement everythign from scratch?
just to make it clear, what i'm interested in is:
input bitmap on java -> image manipulation purely in JNI and C/C++ (no convertion to java objects whatsoever) ->output bitmap on java.

"built-in JNI function of Android" is kind of oxymoron. It's technically correct that many Android Framework Java classes use JNI somewhere down the chain to invoke native libraries.
But there are three reservations regarding this statement.
These are "implementation details", and are subject to change without notice in any next release of Android, or any fork (e.g. Kindle), or even OEM version which is not regarded a "fork" (e.g. built by Samsung, or for Quallcom SOC).
The way native methods are implemented in core Java classes is different from the "classical" JNI. These methods are preloaded and cached by the JVM and are therefore do not suffer from most of the overhead typical for JNI calls.
There is nothing your Java or native code can do to interact directly with the JNI methods of other classes, especially classes that constitute the system framework.
All this said, you are free to study the source code of Android, to find the native libraries that back specific classes and methods (e.g. face detection), and use these libraries in your native code, or build a JNI layer of your own to use these libraries from your Java code.
To give a specific example, face detection in Android is implemented through the android.media.FaceDetector class, which loads libFFTEm.so. You can look at the native code, and use it as you wish. You should not assume that libFFTEm.so will be present on the device, or that the library on device will have same API.
But in this specific case, it's not a problem, because all work of neven is entirely software based. Therefore you can copy this code in its entirety, or only relevant parts of it, and make it part of your native library. Note that for many devices you can simply load and use /system/lib/libFFTEm.so and never feel discomfort, until you encounter a system that will misbehave.
One noteworthy conclusion you can make from reading the native code, is that the underlying algorithms ignore the color information. Therefore, if the image for which you want to find face coordinates comes from YUV source, you can avoid a lot of overhead if you call
// run detection
btk_DCR_assignGrayByteImage(hdcr, bwbuffer, width, height);
int numberOfFaces = 0;
if (btk_FaceFinder_putDCR(hfd, hdcr) == btk_STATUS_OK) {
numberOfFaces = btk_FaceFinder_faces(hfd);
} else {
ALOGE("ERROR: Return 0 faces because error exists in btk_FaceFinder_putDCR.\n");
}
directly with your YUV (or Y) byte array, instead of converting it to RGB and back to YUV in android.media.FaceDetector.findFaces(). If your YUV buffer comes from Java, you can build your own class YuvFaceDetector which will be a copy of android.media.FaceDetector with the only difference that YuvFaceDetector.findFaces() will take Y (luminance) values only instead of a Bitmap, and avoid the RGB to Y conversion.
Some other situations are not as easy as this. For example, the video codecs are tightly coupled to the hardware platform, and you cannot simply copy the code from libstagefright.so to your project. Jpeg codec is a special beast. In modern systems (IIRC, since 2.2), you can expect /system/lib/libjpeg.so to be present. But many platforms also have much more efficient HW implementations of Jpeg codecs through libstagefright.so or OpenMAX, and often these are used in android.graphics.Bitmap.compress() and android.graphics.BitmapFactory.decode***() methods.
And there also is an optimized libjpeg-turbo, which has its own advantages over /system/lib/libjpeg.so.

It seems that your question is more about C/C++ image processing libraries than it is about Android per se. To that end, here are some other StackOverflow questions that might have information you'd find useful:
Fast Cross-Platform C/C++ Image Processing Libraries
C++ Image Processing Libraries

Automatic transformation of Android's dex code

I want to transform/instrument Dex files. The goals of transformation include measuring code coverage. Note that the source files are not available. So instrumenting Dex is the only option.
I am wondering if there are any existing code base that I could look at as examples to write a tool to achieve my goal.
I know about the Smali project and a host of other projects that build on Smali. However, none of these projects are good examples for my purpose.
I am looking for code that automatically transforms smali code or the dexlib representation, from which smali is generated. The later option is preferred for my purpose because the overhead of generating smali can be avoided.

It's a lot of code, but dx's DexMerger is an example program that transforms dex files. It's made quite complicated by the fact that it needs to guess the size of the output in order make forward-references work.
You'd also need to create infrastructure to rewrite dalvik instructions. DexMerger's InstructionTransformer does a shallow rewrite: it adjusts offsets from one mapping to another. To measure code coverage your instruction rewriting would probably need to be much more sophisticated.

Another option that have become available recently is Dexpler. It is an extension of Soot, which is a framework for analysis and instrumentation of Java programs. Dexpler reads in .apk files and converts to Jimple intermediate format. Jimple code can then be arbitrarily instrumented, and eventually dumped into a new apk.

(For the record, I am answering my own question here)
Eventually I did not find any tool that fit my requirements. So I ended up building my own tool, called Ella, based on DexLib. Out of the box, it does a few things such as measuring code coverage, recording method traces, etc. But it can be easily extended to do other types of transformations.

In some cases smali itself does a small amount of instruction rewriting while re-assembling a dex file. Things like replacing a const-string with a const-string/jumbo, or a goto instruction with a "larger" one, if the target is out of range. This involves replacing instructions in the instruction list with potentially larger ones, and the corresponding fixing up of offsets.
CodeItem.fixInstructions is the method responsible for this.
Additionally, there is the asmdex library. I'm not all that familiar with it, but it sounds like it might be relevant to what you're wanting to do.

I know it's a bit late but just in case you're still interested or perhaps for some other readers. ASMDEX has been mentioned already. And I think that's your best bet for the moment for what you're trying to achieve.
As for adding new registers take a look at org.ow2.asmdex.util.RegisterShiftMethodAdapter class. It's not perfect! As a matter of fact as it is it's horrible changing existing 4bit instructions when adding a register would mean some register would end up being 0xF and won't fit in 4 bits.
But it should be a good start.

Android and Protocol Buffers

I am writing an Android application that would both store data and communicate with a server using protocol buffers. However, the stock implementation of protocol buffers compiled with the LITE flag (in both the JAR library and the generated .java files) has an overhead of ~30 KB, where the program itself is only ~30 KB. In other words, protocol buffers doubled the program size.
Searching online, I found a reference to an Android specific implementation. Unfortunately, there seems to be no documentation for it, and the code generated from the standard .proto file is incompatible with it. Has anyone used it? How do I generate code from a .proto file for this implementation? Are there any other lightweight alternatives?

I know it's not a direct answer to your question, but an extra 30kb doesn't sound that bad to me. Even on EDGE that'll only take an extra 1 to 2 seconds to download. And memory is tight on android, but not THAT tight -- 30 kb is only about 1/10th of one percent of the available application memory space.

Are there any other lightweight alternatives?
I'm taking this to mean "to using protocol buffers", rather than "for using protocol buffers with an Android application". I apologise if you are already commited to protocol buffers.
This site is about "comparing serialization performance and other aspects of serialization libraries on the JVM". You'll find many alternatives listed there.
While there is no mention of the memory footprint of the different implementations at the moment I am sure it is a metric which the people on the wiki would be interested in.

Just to revive this archaic thread for anyone seeing it, the answer is to use Square's Wire library (https://github.com/square/wire)
As they mention themselves:
Wire messages declare public final fields instead of the usual getter methods. This cuts down on both code generated and code executed. Less code is particularly beneficial for Android programs.
They also internally build using the Lite runtime I believe.
And of course Proguard, the new Android 2.0 minify tools, [other generic answers], etc etc.

Use ProGuard[1] on your project. It will reduce the size of jars included in APK file.
[1] http://developer.android.com/guide/developing/tools/proguard.html

Develop Reference

The Android operating system is a mobile operating system that was developed by Google (GOOGL?) to be primarily used for touchscreen devices, cell phones, and tablets.