I have a list of image URLs that I want to download with Android's DownloadManager. The basic design is an AsyncTask subclass that removes an object from the list and then invokes the DownloadManager. My problem is how to handle the parallelism of the AsyncTasks when each of them calls list.get(currentIndex). To be clear, I want each task (which runs on the Executor, so in parallel with the other tasks and not only with the main thread) to atomically perform the action that:
removes the object from the list
increments the index
This is, in general, a synchronization problem. There are many ways to tackle it. I would suggest something along these lines:
// Your index, starting at 0. AtomicInteger is a thread-safe integer wrapper.
// All set/get operations are guaranteed to be atomic.
AtomicInteger mIndex = new AtomicInteger(0);
// Your list. This is thread-safe. All mutative operations
// are atomic and properly locked. The list is copied on each write.
// This solution, though it does what it needs to do, is SLOW. Don't use it for
// large lists.
CopyOnWriteArrayList<Object> mList = ... ;
// Your async task. Maybe more of them.
AsyncTask mTask = ...;
// Anything here that should be thread safe.
ensurePrerequisites(mTask);
// Execute on thread pool to make sure they're parallel.
// CAUTION: You will still get only (CPU_CORE_COUNT + 1) max parallel tasks.
mTask.executeOnExecutor(AsyncTask.THREAD_POOL_EXECUTOR);
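public synchronized void ensurePrerequisites(AsyncTask task) {
    // I assume you want to pass the values into the task.
    // setIndex/setObject are setters you would add to your own AsyncTask subclass.
    int idx = mIndex.getAndIncrement();
    task.setIndex(idx);
    // Use get(), not remove(): removing shifts later elements down by one,
    // so combining remove(idx) with an incrementing index would skip items.
    task.setObject(mList.get(idx));
}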
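An alternative that avoids a separate index altogether is to let the workers claim items from a concurrent queue, since poll() removes and returns the head in one atomic step. The following is only a minimal plain-Java sketch of that idea, not the approach from the answer above: UrlQueueDemo and drain are hypothetical names, and the spot where a URL is added to claimed stands in for handing it to DownloadManager.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; names are illustrative only.
class UrlQueueDemo {

    /** Lets several workers drain the queue and returns every URL that was claimed. */
    static List<String> drain(List<String> urls, int workers) throws InterruptedException {
        final Queue<String> pending = new ConcurrentLinkedQueue<>(urls);
        final Queue<String> claimed = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                String url;
                // poll() atomically removes and returns the head, or null when empty,
                // so no two workers can ever receive the same URL.
                while ((url = pending.poll()) != null) {
                    claimed.add(url); // placeholder for enqueueing the download
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return new ArrayList<>(claimed);
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> urls = Arrays.asList(
                "http://example.com/a.png", "http://example.com/b.png", "http://example.com/c.png");
        System.out.println(drain(urls, 2).size()); // prints 3
    }
}
```

With this shape there is no index to keep in sync with the list, which is why no extra synchronized method is needed.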
Related
I have a few questions about using a generic thread framework (for a specific number of jobs, of course) versus using many AsyncTasks.
I would like to know whether it is better to have many AsyncTasks for small jobs and a HandlerThread when the job takes a bit longer, or to write a generic thread yourself that runs generic jobs, with a notification system built on top of the running thread (or a subclass, same story).
My idea is to create a thread that handles different jobs without knowing beforehand what the jobs are. This goes in the direction of creating a small framework for handling generic jobs.
For example, my approach goes in the direction of the code below:
// Declare the class abstract if you want to extend it and add something on top.
public class WorkerThread extends Thread {
    private static final String TAG = WorkerThread.class.getSimpleName();

    private final List<WorkerTask> syncQueue = new ArrayList<WorkerTask>();
    // volatile so that writes from the calling thread are visible inside run()
    private volatile boolean clearQueue = false;
    private volatile boolean stopWorker = false;

    public WorkerThread() {
    }

    public void stop(boolean clear) {
        clearQueue = clear;
        this.stopWorker = true;
    }

    public void addTask(WorkerTask task) {
        synchronized (syncQueue) {
            if (task != null && !getSynQueue().contains(task)) {
                getSynQueue().add(task);
            }
        }
    }

    @Override
    public void run() {
        while (!stopWorker) {
            WorkerTask task = null;
            synchronized (syncQueue) {
                if (!getSynQueue().isEmpty()) {
                    task = getSynQueue().get(0);
                }
            }
            if (task != null) {
                try {
                    task.run();
                    synchronized (syncQueue) {
                        if (!getSynQueue().isEmpty()) {
                            getSynQueue().remove(task);
                            // notify something/someone of success
                        }
                    }
                } catch (Exception e) {
                    Log.e(TAG, "Error in running the task." + e.getMessage());
                    synchronized (syncQueue) {
                        // drop the failed task so it is not retried forever
                        getSynQueue().remove(task);
                        // again you can notify someone
                    }
                } finally {
                    // runs after both success and failure
                }
            }
        }
        if (clearQueue) {
            synchronized (syncQueue) {
                getSynQueue().clear();
            }
        }
    }

    private List<WorkerTask> getSynQueue() {
        return this.syncQueue;
    }
}
Here, WorkerTask is the abstract base class that all the jobs extend.
On top of this thread (or a subclass of it) there can be an observer that is notified when something goes wrong with the jobs/tasks.
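For comparison, a worker of this kind can also be built on a BlockingQueue, so the thread sleeps while the queue is empty instead of polling it in a loop. This is only a rough plain-Java sketch under my own assumptions: BlockingWorker, addTask, shutdown and the POISON sentinel are hypothetical names, and plain Runnable stands in for the task base class.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical variant of the worker; Runnable replaces the task base class.
class BlockingWorker extends Thread {
    // Sentinel task that tells the worker to leave its loop.
    private static final Runnable POISON = () -> { };

    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();

    void addTask(Runnable task) {
        if (task != null) {
            queue.add(task);
        }
    }

    /** Tasks queued before this call still run; the worker exits afterwards. */
    void shutdown() {
        queue.add(POISON);
    }

    @Override
    public void run() {
        try {
            while (true) {
                Runnable task = queue.take(); // blocks while the queue is empty
                if (task == POISON) {
                    return;
                }
                try {
                    task.run();
                } catch (RuntimeException e) {
                    // a failing task must not kill the worker; notify an observer here
                    System.err.println("Task failed: " + e.getMessage());
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag and exit
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger ran = new AtomicInteger();
        BlockingWorker worker = new BlockingWorker();
        worker.start();
        worker.addTask(ran::incrementAndGet);
        worker.addTask(ran::incrementAndGet);
        worker.shutdown();
        worker.join();
        System.out.println(ran.get()); // prints 2
    }
}
```

The queue handles the locking itself, so the explicit synchronized blocks disappear.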
As far as I know, the pros and cons of the two approaches are as follows:
Thread:
Pros:
1. Long time operations.
2. Centralized.
3. Scalable.
4. Once you have it properly tested it will work smoothly.
Cons:
1. Complex architecture.
2. Hard to maintain.
3. Over-engineering for small jobs.
AsyncTask:
Pros:
1. Easy to use.
2. Good for short operations.
3. Easy to maintain/understand.
Cons:
1. Does not scale well; you have to stick to doInBackground and onPostExecute.
2. Not good for long-running operations.
If I missed something, please correct me.
My final question: when the architecture grows and has many requests, both short and long-running, isn't it better to build a generic framework that can handle both, rather than using AsyncTasks in some places, a HandlerThread in others, and so on?
AsyncTask should be used for short operations. From the official documentation:
AsyncTask
AsyncTask is designed to be a helper class around Thread and Handler
and does not constitute a generic threading framework. AsyncTasks
should ideally be used for short operations (a few seconds at the
most.) If you need to keep threads running for long periods of time,
it is highly recommended you use the various APIs provided by the
java.util.concurrent package such as Executor, ThreadPoolExecutor and
FutureTask.
Therefore, if the operation is expected to take longer than a few seconds, it is recommended to use the concurrent package. Examples of such operations are storing a file or a picture, performing a login, etc.
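As a rough illustration of what using the concurrent package can look like, the sketch below submits a job to an ExecutorService and collects its result through a Future. LongJobDemo and sumInBackground are hypothetical names, and the summation loop is only a stand-in for real long-running work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; the loop stands in for a slow operation.
class LongJobDemo {

    /** Sums 1..n on the given pool and waits (with a timeout) for the result. */
    static int sumInBackground(ExecutorService pool, final int n) throws Exception {
        Future<Integer> result = pool.submit(() -> {
            int sum = 0;
            for (int i = 1; i <= n; i++) {
                sum += i;
            }
            return sum;
        });
        // get() blocks the caller, so on Android it should not be called on the UI thread
        return result.get(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            System.out.println(sumInBackground(pool, 100)); // prints 5050
        } finally {
            pool.shutdown();
        }
    }
}
```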
However, AsyncTask also has a few limitations, as can be seen in the link below:
AsyncTask limitations
There is a limit on how many tasks can run simultaneously, because AsyncTask uses a thread pool executor with a maximum number of worker threads (128) and a delayed-task queue with a fixed size of 10. If you try to execute more than 138 AsyncTasks, the app will crash with java.util.concurrent.RejectedExecutionException.
Also, between Android 1.6 and 3.0 there is no way to customize the AsyncTask. It can run tasks in parallel, but no customization is possible.
Between Android 3.0 and Android 4.3.1 there is a default fixed queue size of 10, a core pool size of 5 threads, and a maximum pool size of 128 threads. However, from Android 3.0 onwards you can define your own executor.
Therefore, if you have at least 16 tasks to run, the first 5 will start, the next 10 will go into the queue, and from the 16th onwards a new worker thread will be allocated for each. From version 4.4 (KitKat), the number of parallel AsyncTasks depends on the number of processors the device has:
private static final int CPU_COUNT = Runtime.getRuntime().availableProcessors();
private static final int CORE_POOL_SIZE = CPU_COUNT + 1;
private static final int MAXIMUM_POOL_SIZE = CPU_COUNT * 2 + 1;
private static final BlockingQueue<Runnable> sPoolWorkQueue = new LinkedBlockingQueue<Runnable>(128);
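The "core threads first, then the queue, then extra threads, then rejection" behaviour described above comes from java.util.concurrent.ThreadPoolExecutor and can be reproduced outside Android. The sketch below is a scaled-down illustration with made-up sizes (2 core threads, 4 maximum, queue of 3); PoolLimitDemo and acceptedBeforeRejection are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class with deliberately small pool sizes.
class PoolLimitDemo {

    /** Submits blocking tasks until the pool rejects one; returns how many were accepted. */
    static int acceptedBeforeRejection(int core, int max, int queueSize) {
        final CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                core, max, 1, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(queueSize));
        int accepted = 0;
        try {
            while (true) {
                // Each task blocks until released, so no worker thread ever frees up.
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                accepted++;
            }
        } catch (RejectedExecutionException e) {
            // all threads are busy and the queue is full
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return accepted;
    }

    public static void main(String[] args) {
        // 2 core threads + 3 queued + 2 extra threads = 7 accepted, the 8th is rejected
        System.out.println(acceptedBeforeRejection(2, 4, 3)); // prints 7
    }
}
```

The same arithmetic (maximum pool size plus queue capacity) is where the 138 figure for a 128-thread pool with a queue of 10 comes from.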
These details make AsyncTask suitable only for very specific cases, like loading an image or a file from or to storage.
On the other hand, a generic threading framework (or library) can be easily extended and customized. One example is RxJava, which lets you specify whether a task runs on the UI thread or in the background. Also, for loading or saving pictures asynchronously to or from the SD card, you can use libraries like Picasso or Glide.
My question is very simple: what is the best approach for working with Parse's local datastore when I want to query the saved objects?
Is it better to run several queries against the local datastore directly on the main thread and avoid nesting a lot of anonymous classes, or to use a background thread?
An important thing to note is that this method is going to be called very frequently, and the pattern will be repeated in several places with different queries. I'm evaluating both efficiency and code quality in terms of readability. These methods will be called synchronously, so we can assume the data will be consistent at any time.
As the objects are saved locally, I would expect the queries to respond very quickly. Here's a rough sample of how the code would look in both cases.
Option one:
public void processBatches() {
ParseQuery<Batch> batchQuery = Batch.getQuery();
int batchCount = batchQuery.fromLocalDatastore().count();
List<Batch> batches = batchQuery.fromLocalDatastore().find();
for(Batch b : batches) {
// do whatever I need to do
}
}
Option two:
public void processBatches() {
    // final so that the anonymous callback classes can capture it
    final ParseQuery<Batch> batchQuery = Batch.getQuery();
    // countInBackground(callback) returns nothing; the count arrives in done()
    batchQuery.fromLocalDatastore().countInBackground(new CountCallback() {
        @Override
        public void done(int i, ParseException e) {
            if (e == null && i > 0) {
                batchQuery.findInBackground(new FindCallback<Batch>() {
                    @Override
                    public void done(List<Batch> list, ParseException e) {
                        for (Batch batch : list) {
                            // do whatever I need to do
                        }
                    }
                });
            }
        }
    });
}
Well since in option one you are blocking the UI thread, there could be a delay in the user's ability to interact with your application. This is not a very good option since even if it is for just a moment, users don't want to be waiting unless they know operations are happening. But, if you know that at any time there will be little to no delay, go ahead and do it.
Nevertheless, I argue that option two is the better option, because, in general, all network operations should be performed in the background. Although in your case you are performing local datastore queries, suppose a user has gone to their application task manager and cleared the data (this will happen very rarely). What happens when you then perform the find from the local datastore and process the Batch objects? Well, the app crashes. Again, this is not a good option for the usability of your application.
Choose the second option, and let a background thread run the find() and count() queries, going to the network if nothing is found in the local datastore. Also, from the Parse documentation for find:
public Task<List<T>> findInBackground()
Retrieves a list of ParseObjects that satisfy this query from the source in a background thread.
This is preferable to using ParseQuery.find(), unless your code is already running in a background thread.
Returns:
A Task that will be resolved when the find has completed.
Parse's creators prefer that users of their API run these operations on a background thread.
It really depends.
Is the user triggering the update? If so then do it on the main thread because you don't want them waiting
If not, is the data access the result of fetching data from the web? If so, you should already be on a background thread and can probably just stay there.
Also what happens in "// do whatever I need to do"? Is it an update to the UI or more background processing?
I am working on an Android application that uses greenDAO as a data persistence layer. The application downloads data from various different sources across multiple threads (determined by a thread pool), each piece of data is inserted into the database in a transaction using insertOrReplaceInTx. This is working fine.
My question is whether it is technically possible, using greenDAO, to encapsulate these different transactions (which occur on different threads) into an overall transaction, using nested transactions. I know that in theory this is possible if all the transactions take place on a single thread; however, I am unsure whether it is possible with the insertOrReplaceInTx calls occurring on different threads.
The reason I wish to encapsulate these into a single overall transaction is that they represent a synchronisation process within the app. If any single part of the import fails, I wish to abort and roll back all of the modifications within the overall transaction.
If I begin a transaction with db.beginTransaction on the main thread where I initiate the import process, this creates a deadlock when another thread tries to call insertOrReplaceInTx.
Is the correct way to counter this to ensure that all greenDAO transactions are taking place on the same thread?
AFAIK, you cannot, because each thread manages its own connection.
If you have such a dependency between these operations, you probably want to sync them anyway.
For example, what if Job A finishes well before Job B, and Job B's db connection fails? Your data will go out of sync again. You still need some logic for the other job.
Also, writers are mutually exclusive.
I would suggest creating a utility class that can run a list of runnables in a transaction. Each job, when finished, enqueues a Runnable to this utility. These runnables include the actual database commands.
When the last one arrives (this depends on your dependency logic), the utility will run all runnables in a transaction.
A sample implementation may look like this: (I used a simple counter but you may need a more complex logic)
class DbBundle {
    // db (used in submit) is assumed to be a database handle reachable from this class.
    final AtomicInteger mLatch;
    final List<Runnable> mRunnables = new ArrayList<Runnable>();

    DbBundle(int numberOfTx) {
        mLatch = new AtomicInteger(numberOfTx);
    }

    void cancel() {
        mLatch.set(-1); // so decrement can never reach 0 in submit
    }

    boolean isCanceled() {
        return mLatch.get() < 0;
    }

    void submit(Runnable runnable) {
        // ArrayList is not thread-safe, and jobs submit from different threads
        synchronized (mRunnables) {
            mRunnables.add(runnable);
        }
        if (mLatch.decrementAndGet() == 0) {
            db.beginTransaction();
            try {
                for (Runnable r : mRunnables) r.run();
                db.setTransactionSuccessful();
            } finally {
                db.endTransaction();
            }
        }
    }
}
When you create each job, you pass this shared DbBundle and the last one will execute them all.
So a job would look like:
....
try {
    if (!dbBundle.isCanceled()) { // avoid an extra request if it is already canceled
        final List<User> users = webservice.getUsers();
        dbBundle.submit(new Runnable() {
            @Override
            public void run() {
                saveUsers(users); // which calls the db; no transaction code
            }
        });
    }
} catch (Throwable t) {
    dbBundle.cancel();
}
I recall reading somewhere that Android guarantees that LruCache provides the latest info to all threads, and that one thread's operation will complete before the same thread sees an edit on the cache from another thread. I am using LruCache to store bitmaps obtained from my app's server, and using a pool of threads to obtain bitmaps from the network.
Now I cannot find the reference to this in the Android docs, or any other mention of it. Do I need to mark LruCache instances as volatile or put synchronized(lruCache) blocks around cache operations?
mibollma is not wrong in his response regarding Android LruCache Thread Safety. People often confuse thread safety with atomicity.
If a class is thread safe, it means that, when for instance two threads call an operation on it, the internals do not break. Vector is such a class with every operation being synchronized. If two different threads call Vector.add, they will both synchronize on the instance and the state is not broken. For instance something like this:
synchronized void add(final T obj) {
objects[index++] = obj;
}
This is thread-safe in the sense that no two threads will add an element at the same position. If it were not synchronized, both could read index = 0 and try to write at that position.
Now why do you still need to synchronize? Imagine you have a case like this:
if(!collection.contains(element)) {
collection.add(element);
}
In that case your operation is not atomic. You synchronize once, when you ask if the element is already present and a second time when you add that element. But there is a window in between those two calls when another thread could make progress and your assumption of the collection not containing the element is broken.
In pseudo code:
if(!coll.contains(element)) { // << you have the exclusive lock here
//Thread 2 calls coll.add(element) << you do not have the lock anymore
coll.add(element); // << doomed!
}
So this is why the answer is correct in the sense that you should synchronize around non-atomic operations like
synchronized(coll) {
if(!coll.contains(element)) { // << you have the exclusive lock here
// Thread 2 wants to call << still holding the lock
// coll.add(element) but
// cannot because you hold the lock
coll.add(element); // << unicorns!
}
}
Because synchronization is pretty expensive the concurrent collections come with atomic operations like putIfAbsent.
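As a small illustration of such a built-in atomic operation, here is a sketch using ConcurrentHashMap.putIfAbsent from java.util.concurrent. PutIfAbsentDemo and cacheOnce are hypothetical names, and strings stand in for bitmaps.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class; String values stand in for cached bitmaps.
class PutIfAbsentDemo {

    /** Stores value only if key is absent; returns whichever value ends up cached. */
    static String cacheOnce(ConcurrentMap<String, String> cache, String key, String value) {
        // putIfAbsent does the check and the insert as one atomic step and
        // returns the previous value, or null if there was none.
        String previous = cache.putIfAbsent(key, value);
        return previous != null ? previous : value;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        System.out.println(cacheOnce(cache, "avatar", "bitmap-1")); // prints bitmap-1
        System.out.println(cacheOnce(cache, "avatar", "bitmap-2")); // prints bitmap-1
    }
}
```

No external synchronized block is needed here, because the map performs the contains-then-add sequence internally as a single operation.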
Now back to your original question: should you make the LruCache volatile? In general you do not mark the LruCache itself volatile but the reference to it. If such a reference is shared across threads and you plan to update that field then yes.
If a field is not marked volatile a thread might not see the updated value. But again: this is just the reference to the LruCache itself and has nothing to do directly with its contents.
In your specific scenario I would rather have the reference final instead of volatile since no thread should set the reference to null anyways.
The question if you need to put synchronized around your cache operations depends on the case. If you want to create a single atomic operation, like putIfAbsent, then yes.
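public void putIfAbsent(final K key, final V value) {
    synchronized (lruCache) {
        // LruCache has no containsKey(); get() returns null when the key
        // is not cached (and create() is not overridden)
        if (lruCache.get(key) == null) {
            lruCache.put(key, value);
        }
    }
}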
But later in your code, when you just call lruCache.get(key), there is no need to wrap that in a synchronized block. You only need one when you want to create an atomic operation that other threads must not interleave with.
I'm looking for a design pattern or approach for the following scenario. I wish to kick off two separate background threads for data retrieval from different sources. I then want one method (on the UI thread) to be called once both background threads have completed their work. As the data from the two sources must be combined to be useful, I must wait until both have finished retrieving before manipulating the data. How can I achieve this on the Android platform?
Edit: My first version has been bothering me, and I didn't like the necessary added boolean with it, so here's another version. Call it with this from onPostExecute of each added task.
ArrayList<AsyncTask> tasks;

public void doStuffWhenDone(AsyncTask finishedTask) {
    tasks.remove(finishedTask);
    if (tasks.size() > 0) {
        return;
    }
    // ... do stuff
}
I'll keep the older one up also, since they both work, but I think the above is much cleaner. Now to go tidy up one of my earlier projects.
ArrayList<AsyncTask> tasks;
boolean hasBeenDone = false;

public void doStuffWhenDone() {
    for (int i = 0; i < tasks.size(); i++) {
        if (hasBeenDone || (tasks.get(i).getStatus() != AsyncTask.Status.FINISHED)) {
            return;
        }
    }
    hasBeenDone = true;
    // ... do stuff
}
It's easily extendable to however many tasks you have, and there's no need for a thread to handle the threads. Just call the method at the end of each task. If it's not the last one done, nothing happens.
Edit: Good point, but I don't think it needs to be atomic. Since both AsyncTasks' onPostExecute methods run on the UI thread, they'll be called one after the other.
Use a CountDownLatch, like this:
CountDownLatch barrier = new CountDownLatch(2); // init with count=2
startWorkerThread1(barrier);
startWorkerThread2(barrier);
barrier.await(); // it will wait here until the count is zero
doStuffWithTheResult();
When a worker thread finishes, call barrier.countDown() from it.
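A self-contained version of the latch approach might look like the sketch below. LatchDemo and fetchBoth are hypothetical names, and the two strings stand in for the data retrieved from the two sources. Note that await() blocks the calling thread, so on Android it should be called from a background thread that then posts the combined result to the UI thread.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; the strings stand in for the two data sources.
class LatchDemo {

    /** Starts two worker threads, waits for both, then combines their results. */
    static String fetchBoth() throws InterruptedException {
        final CountDownLatch barrier = new CountDownLatch(2); // init with count=2
        final String[] parts = new String[2];

        new Thread(() -> {
            parts[0] = "users";   // stand-in for the first retrieval
            barrier.countDown();
        }).start();
        new Thread(() -> {
            parts[1] = "orders";  // stand-in for the second retrieval
            barrier.countDown();
        }).start();

        // Blocks until the count reaches zero; writes made before countDown()
        // are visible to this thread once await() returns.
        barrier.await();
        return parts[0] + "+" + parts[1];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(fetchBoth()); // prints users+orders
    }
}
```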
You can use AsyncTask and an int to know if both jobs are finished...