I have 10,000 to 12,000 image files, totalling up to 800 MB, in external storage.
I am using a loop that takes each file path and generates its MD5, but because of the huge number of files being read, this takes a lot of time.
This is the algorithm for generating the MD5 of a file:
public static String getMd5OfFile(String filePath) {
    StringBuilder result = new StringBuilder();
    InputStream input = null;
    try {
        input = new FileInputStream(filePath);
        // byte[] buffer = new byte[1024];
        byte[] buffer = new byte[2048];
        MessageDigest md5Hash = MessageDigest.getInstance("MD5");
        int numRead;
        // Feed the file through the digest in buffer-sized chunks.
        while ((numRead = input.read(buffer)) != -1) {
            md5Hash.update(buffer, 0, numRead);
        }
        byte[] md5Bytes = md5Hash.digest();
        // Convert each digest byte to two hex characters.
        for (byte b : md5Bytes) {
            result.append(Integer.toString((b & 0xff) + 0x100, 16).substring(1));
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (input != null) {
            try { input.close(); } catch (IOException ignored) {}
        }
    }
    return result.toString().toUpperCase();
}
So the question is: can I increase the buffer size to make the operation faster, and by how much can I safely increase it without breaking the read loop or the MD5 generation?
And would wrapping the stream in a BufferedInputStream make it faster?
As with any optimisation problem, you should measure your performance to learn whether any of the changes you make actually has an impact.
2 KB is certainly a small buffer size and a larger one could do better. But I/O stacks have buffers all the way down, so it might have negligible impact. Try it and measure for yourself.
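For instance, a tiny harness (assuming the getMd5OfFile method from the question and a representative List<String> samplePaths) lets you compare buffer sizes directly:

// Time a batch of hashes; run once per candidate buffer size and compare.
static void benchmarkMd5(java.util.List<String> samplePaths) {
    long start = System.nanoTime();
    for (String path : samplePaths) {
        getMd5OfFile(path); // the method from the question
    }
    long elapsedMs = (System.nanoTime() - start) / 1_000_000;
    android.util.Log.d("Md5Bench", samplePaths.size() + " files in " + elapsedMs + " ms");
}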
Another optimisation worth trying is to notice that reading a file is an I/O-bound operation while computing MD5 is CPU-bound. Have one thread read file content and another thread update the MD5 state. Depending on the number of CPU cores on your device, you could also hash multiple files in parallel for further gains, as sketched below.
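A minimal sketch of the parallel-files variant, assuming the getMd5OfFile method from the question; pool size and error handling are deliberately simplified:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

// Hash many files in parallel: one task per file, pool sized to the CPU count.
public static List<String> hashAll(List<String> paths) throws InterruptedException, ExecutionException {
    int threads = Runtime.getRuntime().availableProcessors();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    List<Future<String>> futures = new ArrayList<>();
    for (final String path : paths) {
        futures.add(pool.submit(new Callable<String>() {
            @Override
            public String call() {
                return getMd5OfFile(path); // the method from the question
            }
        }));
    }
    List<String> hashes = new ArrayList<>();
    for (Future<String> f : futures) {
        hashes.add(f.get()); // blocks until that file's digest is ready
    }
    pool.shutdown();
    return hashes;
}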
I have to upload big video files to a server, but it takes too long, so I decided to split/chunk the files and then send them to the server.
After splitting my files, I get a response like the following:
[ /storage/emulated/0/1493357699.mp4.001, /storage/emulated/0/1493357699.mp4.002, /storage/emulated/0/1493357699.mp4.003, /storage/emulated/0/1493357699.mp4.004, /storage/emulated/0/1493357699.mp4.005, /storage/emulated/0/1493357699.mp4.006, /storage/emulated/0/1493357699.mp4.007, /storage/emulated/0/1493357699.mp4.008 ]
My question is: what is the point of uploading split/chunked files to a server?
My code for splitting files:
public static List<File> splitFile(File f) {
    try {
        int partCounter = 1;
        List<File> result = new ArrayList<>();
        int chunkSize = 1024 * 1024; // 1 MB per chunk
        byte[] buffer = new byte[chunkSize];
        BufferedInputStream bis = new BufferedInputStream(new FileInputStream(f));
        String name = f.getName();
        int tmp;
        while ((tmp = bis.read(buffer)) > 0) {
            // Name the parts <inputFileName>.001, <inputFileName>.002, ...
            File newFile = new File(f.getParent(), name + "." + String.format("%03d", partCounter++));
            FileOutputStream out = new FileOutputStream(newFile);
            // tmp is the number of bytes actually read; needed for the last
            // chunk, which can be smaller than 1 MB.
            out.write(buffer, 0, tmp);
            out.close();
            result.add(newFile);
        }
        bis.close();
        return result;
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}
I have implemented this in one of my projects. I see two primary reasons:
1. To achieve multi-threaded uploading over multiple connections: you can upload several chunks at the same time (see the sketch after this list).
2. To stop/resume uploading the remaining chunks if any single chunk fails to upload (depending on the server response).
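For the first point, a rough sketch of concurrent chunk uploads, where uploadChunk(File) is a hypothetical method that performs one HTTP request and returns true on success:

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

// Upload all chunks over a small pool of parallel connections.
// uploadChunk(File) is hypothetical: one HTTP request, true on success.
public static boolean uploadAll(List<File> parts) throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(4); // 4 parallel connections; tune as needed
    List<Future<Boolean>> results = new ArrayList<>();
    for (final File part : parts) {
        results.add(pool.submit(new Callable<Boolean>() {
            @Override
            public Boolean call() {
                return uploadChunk(part); // hypothetical single-chunk upload
            }
        }));
    }
    boolean allOk = true;
    for (Future<Boolean> r : results) {
        allOk &= r.get(); // a failed chunk can be retried individually later
    }
    pool.shutdown();
    return allOk;
}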
In my app, I'm sending a file from a client using sockets. On the other side, another client receives the file using an InputStream, and then a BufferedOutputStream saves the file to the file system.
I don't know why, but the file isn't fully transmitted. I think it is because of network overload; in any case, I don't know how to solve it.
The transmitter is:
Log.d(TAG,"Reading...");
bufferedInputStream.read(byteArrayFile, 0, byteArrayFile.length);
Log.d(TAG, "Sending...");
bufferedOutputStream.write(byteArrayFile,0,byteArrayFile.length);
bufferedOutputStream.flush();
The receiver is:
bufferedOutputStream = new BufferedOutputStream(new FileOutputStream(file));
byteArray = new byte[fileSize];
int currentOffset = 0;
bytesReaded = bufferedInputStream.read(byteArray, 0, byteArray.length);
currentOffset = bytesReaded;
do {
    bytesReaded = bufferedInputStream.read(byteArray, currentOffset, byteArray.length - currentOffset);
    if (bytesReaded >= 0) {
        currentOffset += bytesReaded;
    }
} while (bytesReaded > -1 && currentOffset != fileSize);
bufferedOutputStream.write(byteArray, 0, currentOffset);
You don't state where fileSize comes from, but there are numerous problems with this code, too many to mention. Throw it all away and use DataInputStream.readFully(). Or use the following copy loop, which doesn't require a buffer the size of the file (a technique which does not scale, assumes the file size fits into an int, and adds latency):
byte[] buffer = new byte[8192];
int count;
while ((count = in.read(buffer)) > 0)
{
    out.write(buffer, 0, count);
}
Use this at both ends. If you're sending multiple files via the same connection it gets more complex, but you haven't stated that.
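For a single file, one way to frame the transfer is a length prefix followed by the copy loop above; a sketch (socket setup and error handling omitted, names are assumptions):

import java.io.*;
import java.net.Socket;

// Sender: write the file length first, then the bytes in 8 KB chunks.
static void sendFile(Socket socket, File file) throws IOException {
    DataOutputStream out = new DataOutputStream(socket.getOutputStream());
    FileInputStream in = new FileInputStream(file);
    out.writeLong(file.length()); // the receiver reads this first
    byte[] buffer = new byte[8192];
    int count;
    while ((count = in.read(buffer)) > 0) {
        out.write(buffer, 0, count);
    }
    out.flush();
    in.close();
}

// Receiver: read the length, then copy exactly that many bytes to disk.
static void receiveFile(Socket socket, File file) throws IOException {
    DataInputStream in = new DataInputStream(socket.getInputStream());
    FileOutputStream out = new FileOutputStream(file);
    byte[] buffer = new byte[8192];
    long remaining = in.readLong();
    while (remaining > 0) {
        int n = in.read(buffer, 0, (int) Math.min(buffer.length, remaining));
        if (n < 0) throw new EOFException("stream ended early");
        out.write(buffer, 0, n);
        remaining -= n;
    }
    out.close();
}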
I've implemented a custom BackupAgent, and part of my data consists of images that are about 1 MB each. When creating the backup, every image is written as a separate entity. On restore, I wanted to read the data in 4K (BUFFER_SIZE) chunks and write it to a file like this:
FileOutputStream out = new FileOutputStream(file);
byte[] buffer = new byte[BUFFER_SIZE];
int offset = 0;
int n;
// readEntityData returns 0 when all data of the entity has been read
while (0 != (n = data.readEntityData(buffer, offset, BUFFER_SIZE))) {
    out.write(buffer, 0, n);
    offset += n;
}
However, this only reads the first 4K chunk correctly; on the second call of readEntityData, an IOException with error code 0xffffffff is thrown.
When I make the buffer as large as the entity's data size and read all the data at once, it works perfectly, but I think it would be safer to use a smaller buffer.
Has anybody experienced something like this? All the examples I found read the data at once rather than in multiple chunks.
I have encryption and decryption code that I use to encrypt and decrypt video files (mp4). I'm trying to speed up the decryption process, as the encryption side is not that relevant for my case. This is my code for the decryption process:
private static void decryptFile() throws IOException, ShortBufferException, IllegalBlockSizeException, BadPaddingException
{
    int blockSize = cipher.getBlockSize();
    int outputSize = cipher.getOutputSize(blockSize);
    System.out.println("outputsize: " + outputSize);
    byte[] inBytes = new byte[blockSize];
    byte[] outBytes = new byte[outputSize];
    in = new FileInputStream(inputFile);
    out = new FileOutputStream(outputFile);
    BufferedInputStream inStream = new BufferedInputStream(in);
    int inLength = 0;
    boolean more = true;
    while (more)
    {
        inLength = inStream.read(inBytes);
        if (inLength == blockSize)
        {
            int outLength = cipher.update(inBytes, 0, blockSize, outBytes);
            out.write(outBytes, 0, outLength);
        }
        else more = false;
    }
    if (inLength > 0)
        outBytes = cipher.doFinal(inBytes, 0, inLength);
    else
        outBytes = cipher.doFinal();
    out.write(outBytes);
}
My question is how to speed up the decryption in this code. I've tried decrypting a 10 MB mp4 file and it takes 6-7 seconds; I'm aiming for under 1 second. I would also like to know whether writing to the FileOutputStream out is actually what slows the process down, rather than the decryption itself. Any suggestions on how to speed things up here?
I'm using AES for encryption/decryption.
Until I find a solution, I will be using a ProgressDialog telling the user to wait until the video has been decrypted (obviously, I'm not going to use the word "decrypted").
Why are you decrypting data only in blockSize increments? You do not show what type of object cipher is, but I am guessing it is a javax.crypto.Cipher instance. It can handle update() calls over arrays of arbitrary length, and you will have much less overhead if you use longer arrays. You should process data in blocks of, say, 8192 bytes (that's the traditional length for a buffer; it interacts reasonably well with CPU caches).
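As a sketch of that suggestion (assuming cipher is an initialized javax.crypto.Cipher, with the file names as placeholders):

import java.io.*;
import javax.crypto.Cipher;

// Decrypt with a large read buffer instead of one cipher block at a time.
static void decryptBuffered(Cipher cipher, File inputFile, File outputFile) throws Exception {
    InputStream in = new BufferedInputStream(new FileInputStream(inputFile));
    OutputStream out = new BufferedOutputStream(new FileOutputStream(outputFile));
    try {
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) > 0) {
            byte[] chunk = cipher.update(buffer, 0, n); // update() accepts arbitrary lengths
            if (chunk != null) {
                out.write(chunk);
            }
        }
        out.write(cipher.doFinal()); // final block plus padding
    } finally {
        in.close();
        out.close();
    }
}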
bytebiscuit, your question gave me the solution I had been chasing for the past 6 days. I modified your code a little, and my 52 MB video file now decrypts in just 4 seconds. My previous decryption technique, which used different logic (not yours), took 45 seconds; that's a massive difference. Wherever I made a modification I have added a //modified comment. I am confident that if your video is a 10 MB file, it will decrypt in about 1 second. Try applying this; it should work.
private static void decryptFile() throws IOException, ShortBufferException, IllegalBlockSizeException, BadPaddingException
{
    int blockSize = cipher.getBlockSize();
    int outputSize = cipher.getOutputSize(blockSize);
    System.out.println("outputsize: " + outputSize);
    byte[] inBytes = new byte[blockSize * 1024]; //modified
    byte[] outBytes = new byte[outputSize * 1024]; //modified
    in = new FileInputStream(inputFile);
    out = new FileOutputStream(outputFile);
    BufferedInputStream inStream = new BufferedInputStream(in);
    int inLength = 0;
    boolean more = true;
    while (more)
    {
        inLength = inStream.read(inBytes);
        if (inLength / 1024 == blockSize) //modified
        {
            int outLength = cipher.update(inBytes, 0, blockSize * 1024, outBytes); //modified
            out.write(outBytes, 0, outLength);
        }
        else more = false;
    }
    if (inLength > 0)
        outBytes = cipher.doFinal(inBytes, 0, inLength);
    else
        outBytes = cipher.doFinal();
    out.write(outBytes);
}
I suggest you use the profiling tool provided in the Android SDK. It will tell you where you spend the most time (i.e., file writing or decoding).
See http://developer.android.com/guide/developing/debugging/debugging-tracing.html
This works on the emulator as well as on an actual device.
Consider using the NDK. On devices before Froyo (and even Froyo itself), it would be really slow due to the lack of a JIT (or a very simple one in Froyo). Even with the JIT, native architecture-optimized crypto code will always outrun Dalvik.
See also this question.
As an aside, if you're using AES directly, you're probably doing something wrong. If this is part of an effort to do DRM, make sure you realize that decompiling an Android app is trivial. Your key will not be secure, which by definition defeats the encryption.
Instead of spending effort improving an inadequate architecture, you should consider a streaming solution: it has the great advantage of spreading out the decryption work so that it is no longer noticeable. That is, do not produce another file from your video source; serve a stream instead, via a local HTTP server. Unfortunately there is no such component in the SDK; you have to write your own implementation or search for an existing one.
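To illustrate the idea only: a very rough, hypothetical sketch of such a local server, ignoring HTTP ranges, Content-Length, and concurrency, and assuming an already-initialized javax.crypto.Cipher:

import java.io.*;
import java.net.ServerSocket;
import java.net.Socket;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;

// Serve the decrypted bytes over a loopback HTTP connection so the decryption
// cost is spread across playback. Point the player at http://127.0.0.1:8888/
static void serveDecrypted(File encryptedFile, Cipher cipher) throws IOException {
    ServerSocket server = new ServerSocket(8888);
    Socket client = server.accept();
    BufferedReader reader = new BufferedReader(new InputStreamReader(client.getInputStream()));
    String line;
    while ((line = reader.readLine()) != null && !line.isEmpty()) {
        // skip the request headers
    }
    OutputStream out = client.getOutputStream();
    out.write("HTTP/1.1 200 OK\r\nContent-Type: video/mp4\r\n\r\n".getBytes("US-ASCII"));
    InputStream in = new CipherInputStream(new FileInputStream(encryptedFile), cipher);
    byte[] buffer = new byte[8192];
    int n;
    while ((n = in.read(buffer)) > 0) {
        out.write(buffer, 0, n);
    }
    in.close();
    out.close();
    client.close();
    server.close();
}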
I have a problem with SHA-1 performance on Android. In C# I calculate the hash in about 3 s; the same calculation on Android takes about 75 s. I think the problem is the file-reading operation, but I'm not sure how to improve the performance.
Here's my hash generation method.
private static String getSHA1FromFileContent(String filename)
{
    try
    {
        MessageDigest digest = MessageDigest.getInstance("SHA-1");
        //byte[] buffer = new byte[65536]; //created at start.
        InputStream fis = new FileInputStream(filename);
        int n = 0;
        while (n != -1)
        {
            n = fis.read(buffer);
            if (n > 0)
            {
                digest.update(buffer, 0, n);
            }
        }
        byte[] digestResult = digest.digest();
        return asHex(digestResult);
    }
    catch (Exception e)
    {
        return null;
    }
}
Any ideas how I can improve the performance?
I tested it on my SGS (i9000) and it took 0.806 s to generate the hash for a 10.1 MB file.
The only difference is that in my code I am using a BufferedInputStream in addition to the FileInputStream, together with the hex conversion library found at:
http://apachejava.blogspot.com/2011/02/hexconversions-convert-string-byte-byte.html
Also, I would suggest that you close your file input stream in a finally clause.
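Combining both suggestions, the hashing loop might look roughly like this (hex conversion left to whichever library you use):

import java.io.*;
import java.security.MessageDigest;

// SHA-1 over a buffered stream, with the close guaranteed by finally.
static byte[] sha1(String filename) throws Exception {
    MessageDigest digest = MessageDigest.getInstance("SHA-1");
    InputStream in = new BufferedInputStream(new FileInputStream(filename));
    try {
        byte[] buffer = new byte[65536];
        int n;
        while ((n = in.read(buffer)) > 0) {
            digest.update(buffer, 0, n);
        }
    } finally {
        in.close();
    }
    return digest.digest();
}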
If I were you, I would use JNI like this guy did and get the speedup that way. This is exactly what the C interface was made for.