In a game for Android written in Scala, I have plenty of objects that I want to pool. First I tried to keep both active (visible) and inactive instances in the same pool; this was slow because filtering the pool is slow in itself and also generates garbage for the GC.
So I moved to using two data structures: when I need a free instance, I just take the first one from the passive pool and add it to the active pool. I also need fast random access to the active pool (for when I need to hide an instance). I'm using two ArrayBuffers for this.
So my question is: which data structure would be best for this situation? And how should that (or those) data structure(s) be used for adding and removing, so as to avoid GC as much as possible and stay efficient on Android (memory and CPU constraints)?
The best data structure is an internal list, where you add
var next: MyClass = null
to every class. The non-active instances then become what's typically called a "free list", while the active ones become a singly-linked list a la List.
This way your overhead is exactly one pointer per object (you can't really get any less than that), and there is no allocation or GC at all. (Unless you want to implement your own by throwing away part or all of the free list if it gets too long.)
You do lose some collections niceness, but you can just make your class be an iterator:
def hasNext = (next != null)
is all you need given that var. (Well, and extends Iterator[MyClass].) If your pool sizes are really quite small, sequential scanning will be fast enough.
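To make this concrete, here is a minimal sketch of the idea (Sprite, SpritePool, obtain and release are placeholder names of mine, not from the question; the only essential part is the single intrusive next link shared by the free list and the active list):

final class Sprite {
  var next: Sprite = null            // intrusive link, used by both the free and the active list
  // ... game-specific fields go here ...
}

final class SpritePool(capacity: Int) {
  private var free: Sprite = null    // head of the free list (inactive instances)
  private var active: Sprite = null  // head of the active list (visible instances)

  // Pre-allocate everything once; nothing is allocated or collected after this.
  (0 until capacity).foreach { _ =>
    val s = new Sprite
    s.next = free
    free = s
  }

  // O(1): pop the head of the free list and push it onto the active list.
  def obtain(): Sprite = {
    val s = free
    if (s != null) {
      free = s.next
      s.next = active
      active = s
    }
    s
  }

  // Sequential scan to find the predecessor, then unlink and recycle.
  def release(target: Sprite): Unit = {
    if (active eq target) active = target.next
    else {
      var cur = active
      while (cur != null && (cur.next ne target)) cur = cur.next
      if (cur != null) cur.next = target.next
    }
    target.next = free
    free = target
  }

  // Walk the active list without building any intermediate collection.
  def foreachActive(f: Sprite => Unit): Unit = {
    var cur = active
    while (cur != null) {
      f(cur)
      cur = cur.next
    }
  }
}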
If your active pool is too large for sequential scanning down a linked list and elements are not often added or deleted, then you should store them in an ArrayBuffer (which knows how to remove elements when needed). Once you remove an item, throw it on the free list.
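If the active pool lives in an ArrayBuffer, a sketch along these lines could work (it reuses the placeholder Sprite from the sketch above; the swap-remove trick is my own addition, since ArrayBuffer.remove(i) shifts the tail and the order of an active pool rarely matters in a game):

import scala.collection.mutable.ArrayBuffer

final class ArrayBackedPool(capacity: Int) {
  private var free: Sprite = null              // same intrusive free list as above
  private val active = new ArrayBuffer[Sprite](capacity)

  (0 until capacity).foreach { _ =>
    val s = new Sprite
    s.next = free
    free = s
  }

  def obtain(): Sprite = {
    val s = free
    if (s != null) {
      free = s.next
      s.next = null
      active += s
    }
    s
  }

  // Swap-remove: O(1), allocation-free, does not preserve order.
  def releaseAt(index: Int): Unit = {
    val s = active(index)
    active(index) = active(active.size - 1)
    active.remove(active.size - 1)
    s.next = free
    free = s
  }
}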
If your active pool turns over rapidly (i.e. the number of adds/deletes is similar to the number of random accesses), then you need some sort of hierarchical structure. Scala provides an immutable one that works pretty well in Vector, but no mutable one (as of 2.9); Java also doesn't have something that's really suitable. If you wanted to build your own, a red-black or AVL tree with nodes that keep track of the number of left children is probably the way to go. (It's then a trivial matter to access by index.)
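For the index-access part, a sketch like the following shows why it becomes trivial once each node caches the size of its left subtree (Node and nth are hypothetical names, and balancing plus the insert/remove bookkeeping are omitted):

object OrderStatistic {
  final class Node[A](var value: A,
                      var left: Node[A] = null,
                      var right: Node[A] = null,
                      var leftSize: Int = 0)   // number of nodes in the left subtree

  // Descend the tree, using leftSize to decide which side the i-th element is on.
  @annotation.tailrec
  def nth[A](node: Node[A], i: Int): A =
    if (i < node.leftSize) nth(node.left, i)
    else if (i == node.leftSize) node.value
    else nth(node.right, i - node.leftSize - 1)
}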
I guess I'll mention my idea. The filter and map methods iterate over the entire collection anyway, so you may as well simplify that and just do a naive scan over your collection (to look for active instances). See here: https://github.com/scala/scala/blob/v2.9.2/src/library/scala/collection/TraversableLike.scala
def filter(p: A => Boolean): Repr = {
  val b = newBuilder
  for (x <- this)
    if (p(x)) b += x
  b.result
}
I ran some tests with n=31 (so I wouldn't have to keep more than a 32-bit Int bitmap), comparing a naive scan, a filter/foreach scan, a filter/map scan, and a bitmap scan, randomly assigning 33% of the set to active. I kept a running counter to double-check that I wasn't cheating by not looking at the right values or something. By the way, this is not running on Android.
The loop took more or less time depending on the number of active values.
Results:
naive scanned a million times in: 197 ms (sanity check: 9000000)
filter/foreach scanned a million times in: 441 ms (sanity check: 9000000)
map scanned a million times in: 816 ms (sanity check: 9000000)
bitmap scanned a million times in: 351 ms (sanity check: 9000000)
Code here--feel free to rip it apart or tell me if there's a better way--I'm fairly new to Scala so my feelings won't be hurt: https://github.com/wfreeman/ScalaScanPerformance/blob/master/src/main/scala/scanperformance/ScanPerformance.scala
Related
I recently had the task of performing a cross-selection operation on some collections, to find an output collection that matched my criteria. (I will omit the custom logic because it is not relevant here.)
What I did was create a class that takes Lists of elements as parameters; I then call a function inside that class that is responsible for processing those lists of data and returning a value.
The point is, I'm convinced I'm not doing the right thing, because writing a class that holds hundreds of elements, takes lists as parameters, and returns another collection looks unconventional and awkward.
Is there a specific programming construct or paradigm that allows you to process large numbers of large collections, maybe with quite heavy custom selection/mapping logic?
I'm building for Android using Kotlin
First of all, when we talk about performance there is only one right answer: write a benchmark and test it.
About memory: a list of 1,000,000 unique Strings with an average length of 30 characters will take about 120 MB (i.e. 10^6 * 30 * 4, where the last factor is the assumed size of a character, treating each one as a 4-byte Unicode character). Add another 1-3% for overhead such as object references. Therefore: if you only have hundreds of Strings, just load the whole data set into memory and use a list, because this is the fastest solution (synchronous, immutable, etc.).
If you can do streaming-like operations, you can use sequences. They are lazy, much the same as Java Streams and .NET LINQ. Please check the example below; it requires only a small amount of memory.
import java.io.File

// Counts how many lines are equal at the same positions in the two files,
// reading both lazily so that neither file is loaded into memory as a whole.
fun countOfEqualLinesOnTheSamePositions(path1: String, path2: String): Int {
    return File(path1).useLines { lines1 ->
        File(path2).useLines { lines2 ->
            lines1.zip(lines2)
                .count { (line1, line2) -> line1 == line2 }
        }
    }
}
If you cannot fit the whole data set in memory and you cannot work with a stream-like scheme, you may:
Rework the algorithm from single-pass to multiple-pass, where each pass is stream-like. For example, Huffman coding is a two-pass algorithm, so it can be used to compress 1 TB of data while using only a small amount of memory.
Store intermediate data on disk (this is too complex to cover in this short answer).
For additional optimizations:
To cover the case of merging many parallel streams, also consider Kotlin Flow. It allows you to work asynchronously and avoid blocking on IO. For example, this can be useful for merging ~100 network streams.
To keep many non-unique items in memory, consider adding caching logic. It can save memory (but please benchmark first).
Try operating on ByteBuffers instead of Strings. You can get far fewer allocations (because you control and reuse the buffers explicitly), but the code becomes much more complex.
I will start by saying that on iOS this algorithm takes, on average, under 2 seconds to complete. Given a simpler, more specific input that is identical between my iOS and Android tests, it takes 0.09 seconds and 2.5 seconds respectively, and on larger input the Android version simply quits on me, so I have no idea whether it would take significantly longer. (The test data gives the sorting algorithm a relatively simple task.)
More specifically, I have a HashMap (an NSMutableDictionary on iOS) that maps a unique key (a string of digits called the course, for example "12345") to the specific sections under that course title. The hash map knows which course a given section falls under because each section has a "Course" value. Once they are retrieved, these section objects are compared to see whether they can fit into a schedule together, based on user input and their "timeBegin", "timeEnd", and "days" values.
For example: if I ask for schedules with only the course ABC1234 (which has 50 different time slots, or "sections", under its title) and DEF5678 (50 sections), it will iterate through the HashMap to find every section that falls under those two courses. Then it will sort them into schedules of two classes each (one ABC1234 and one DEF5678). If no two sections conflict, then a total of 2500 (50*50) schedules are possible.
These "schedules" (stored in ArrayLists, since the number of user inputs varies from 1-8 and the possible number of results from 1-100,000; the group of all schedules is a nested ArrayList, i.e. an ArrayList of ArrayLists; on iOS I use NSMutableArray) are then fed into the Intent that starts the next Activity. This Activity (technically a Fragment?) will be a pager that allows the user to scroll through the different combinations.
I copied the search-and-sort method exactly as it is on iOS (which may not be the right thing to do, since the languages and data structures may be fundamentally different), and it works correctly with small output, but when the output gets too large it can't handle it.
So is multithreading the answer? Should I use something other than a HashMap? Something other than ArrayLists? I only assume multithreading because the errors indicate that too much is being done on the main thread. I've also read that there is a limit to the size of data passed using Intents, but I have no idea what that limit is.
If I was unclear on anything, feel free to ask for clarification. Also, I've been doing Android for ~2 weeks, so I may be completely off track, but hopefully not; this is already a fully functional and complete app in the iTunes Store, so I don't think I'm that far off. Thanks!
1) I think you should go with Android's AsyncTask. The way it splits work between the UI thread and a background thread (for operations like sorting) is enough to let you process the data on a background thread and then deliver the result on the UI thread.
Follow this shorthand example:
Example to Use AsyncTask
2) Example (how to proceed):
a) Set up your views in onPreExecute()
b) Do your background operation in doInBackground()
c) Get the result in onPostExecute() and pass the content to the new Activity
Hope this helps...
I think it's better for you to use a TreeMap instead of a HashMap; a TreeMap keeps its entries sorted by key at all times, so you won't have to sort your data before starting the next Activity - you just pass it along and that's all.
Also, to use it, the class of your Map's keys has to implement the Comparable interface (or you have to supply a Comparator); with String keys like yours that already works out of the box.
You can also read about the TreeMap class here:
http://docs.oracle.com/javase/7/docs/api/java/util/TreeMap.html
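For illustration, a small sketch of the sorted-iteration behaviour (sketched in Scala; java.util.TreeMap behaves the same way, and the course data below is made up and simplified to one section per course):

import scala.collection.immutable.TreeMap

val sections = TreeMap(
  "20111" -> "DEF5678, MWF 9:00-9:50",
  "12345" -> "ABC1234, TR 10:00-10:50"
)
// Iteration already follows the key ordering, so no separate sort step is needed.
sections.foreach { case (course, info) => println(s"$course -> $info") }
// prints the "12345" entry first, because keys are compared as Strings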
In the following code we rotate a complex number by some angle in a loop and then confirm that the resulting number is identical to the one we started with.
public class Complex {
    private float r, i;
    ...
    public Complex(Complex other) {
        r = other.r;
        i = other.i;
    }
}
Complex z1 = new Complex(..);
Complex z1_save = new Complex(z1);
Complex z2 = new Complex();
Complex k = new Complex();
k.set_to_first_root_of_unity(8);
int n = 64;
while (n-- != 0) {
    z1.multiply(k, z2);
    z1 = new Complex(z2); // Line Y
}
Assert.assertEquals(true, z1.equals(z1_save));
Is there a way in Java to write Line Y using the constructor public Complex(Complex other) rather than clone(), and be certain that 64 objects will not be garbage collected?
Update: It seems it is impossible to ask this question in a simplified manner without referring to the context—that of an interactive application. The best answer to the present question so far (assylias's) is that one should not worry about object creation and garbage collection 90% of the time. During redraw, it is necessary to worry about it 100% of the time. I have now restated the question here.
I am worried about the inefficiency of the GC running 64 times unnecessarily.
That is an unnecessary worry. If your objects are in the young generation (which they will be, considering their scope), GC will be free (as in zero cost).
When the GC runs on the young generation, it only goes through live objects (objects that are eligible for GC are not visited), so the GC time is a function of the live objects only.
The story is different for the old generation, but your local objects won't reach that stage.
Reference - Brian Goetz, emphasis mine:
What about deallocation?
But allocation is only half of memory management -- deallocation is the other half. It turns out that for most objects, the direct garbage collection cost is -- zero. This is because a copying collector does not need to visit or copy dead objects, only live ones. So objects that become garbage shortly after allocation contribute no workload to the collection cycle.
It turns out that the vast majority of objects in typical object-oriented programs (between 92 and 98 percent according to various studies) "die young," which means they become garbage shortly after they are allocated, often before the next garbage collection. (This property is called the generational hypothesis and has been empirically tested and found to be true for many object-oriented languages.) Therefore, not only is allocation fast, but for most objects, deallocation is free.
Executing a constructor 64 times for an object with ten (or so) fields is not a big deal, even for a device like a cell phone.
It is not clear what your task is.
If you are really concerned about calling the constructor many times and creating too many identical objects, you may try to use the Flyweight pattern.
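As a rough sketch of what a flyweight could look like here (the case class and cache below are illustrative assumptions of mine; whether this helps at all depends on how often identical values actually recur):

final case class Complex(r: Float, i: Float)

object ComplexFlyweight {
  private val cache = scala.collection.mutable.HashMap.empty[(Float, Float), Complex]

  // Hands out one shared instance per distinct (r, i) pair instead of a fresh object.
  def of(r: Float, i: Float): Complex =
    cache.getOrElseUpdate((r, i), Complex(r, i))
}

// ComplexFlyweight.of(0f, 1f) eq ComplexFlyweight.of(0f, 1f)   // true: same instance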
Your question (and comments) are a bit confused ... but that might just be a problem with your written English skills. So I'm just assuming I understand what you meant to say. I'm also assuming that your example "works" ... which it currently doesn't.
The short answer is that you can reduce object churn (i.e. creation and release of "temporary" objects) by making your Complex object mutable. Typically you do this by adding setter operations that allow you to change the state of the object. But that has the effect of making your Complex class more difficult to use correctly. For example:
public static final Complex ZERO = new Complex(0, 0);

// somewhere else
Complex counter = ZERO;
while (counter.lessThan(10)) {
    // ....
    counter.setRealPart(counter.getRealPart() + 1); // Ooops!!
}
... and lots more bugs like that.
Then there is the question of whether this will actually reduce garbage collection overheads, and by how much.
As @assylias points out, temporary objects that are created and then reclaimed in the next GC cycle have very low cost. The objects that are expensive are the ones that DON'T become garbage. And it is quite possible that for a normal program running in a normal environment, it is actually more efficient overall to create temporary objects.
Then there is the issue that the latest HotSpot JVMs can do something known as "escape analysis", which (if it works) can determine that a given temporary object will never be visible outside of its creation scope, and therefore doesn't need to be allocated in the heap at all. When that optimization can be applied, the "object churn" concern is mooted.
However, running the GC can be bad for "real time" performance; e.g. in games programming, where the user will notice if the game freezes for a fraction of a second. In cases like that, it is worth considering "tuning" your code to reduce object churn. But there are other possible approaches too ... like using a low-pause garbage collector ... if one is available for your platform.
@assylias's comment makes another important point. Beware of premature optimization. Your intuition on the usage of your Complex object ... and the resulting object churn ... could be very wrong. All things being equal, it is best to delay optimization effort until you have profiled the application and determined that:
it needs to be tuned, and
the profiling evidence points to the Complex class being a significant performance bottleneck.
There's no reason to pay attention to garbage collection at all, unless:
users (maybe you) perceive performance issues with the application; and
profiling the application demonstrates that garbage collection is the source of the perceived performance issues.
Lacking both of these conditions: ignore garbage collection.
This is true of performance tuning in Java in general. Don't do it unless you've proven there's a real reason for it.
If you want to be efficient w.r.t GC, minimize your use of new.
So, in your example, you could re-use the variable in "Line Y", and simply set the fields with the new value. Something like:
while (n-- != 0) {
    z1.multiply(k, z2);
    z1.setValue(z2); // Line Y
}
where z1.setValue(X) sets the state of the object in the same fashion that the constructor new Complex(x) does.
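For concreteness, a sketch of that shape of API (MutableComplex is a placeholder name; setValue and the r/i fields follow the code shown in the question):

final class MutableComplex(var r: Float = 0f, var i: Float = 0f) {
  // Copies state instead of allocating a new instance.
  def setValue(other: MutableComplex): Unit = {
    r = other.r
    i = other.i
  }

  // Writes this * k into `out`, so the loop above allocates nothing per iteration.
  def multiply(k: MutableComplex, out: MutableComplex): Unit = {
    val nr = r * k.r - i * k.i
    val ni = r * k.i + i * k.r
    out.r = nr
    out.i = ni
  }
}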
EDIT: Why is this getting downvoted? I stand by the statement above about reducing the cost of GC by minimizing the use of new. Yes, I agree that in most contexts GC is not a problem - but if your code does trigger GC frequently, perhaps because it spends a lot of time in a loop (say, a CPU-heavy algorithm), then you may well want to reuse objects.
Could someone tell me how to make a good mechanism for async. download of images for use in a ListView/GridView?
There are many suggestions, but each only considers a small subset of the typical requirements.
Below I've listed some reasonable factors (requirements or things to take into account) that I, and my colleagues, are unable to satisfy all at once.
I am not asking for code (though it would be welcome), just an approach that manages the Bitmaps as described.
No duplication of downloaders or Bitmaps
Canceling downloads/assigning of images that would no longer be needed, or are likely to be automatically removed (SoftReference, etc)
Note: an adapter can have multiple Views for the same ID (calls to getView(0) are very frequent)
Note: there is no guarantee that a view will not be lost instead of recycled (consider List/GridView resizing or filtering by text)
A separation of views and data/logic (as much as possible)
Not starting a separate Thread for each download (visible slowdown of UI). Use a queue/stack (BlockingQueue?) and thread pool, or somesuch.... but need to end that if the Activity is stopped.
Purging Bitmaps sufficiently distant from the current position in the list/grid, preferably only when memory is needed
Calling recycle() on every Bitmap that is to be discarded.
Note: External memory may not be available (at all or all the time), and, if used, should be cleared (of only the images downloaded here) asap (consider Activity destruction/recreation by Android)
Note: Data can be changed: entries removed (multi-selection & delete) and added (in a background Thread). Already downloaded Bitmaps should be kept, as long as the entries they're linked to still exist.
setTextFilterEnabled(true) (if based on ArrayAdapter's mechanism, will affect array indexes)
Usable in ExpandableList (affects the order the thumbnails are shown in)
(optional) when a Bitmap is downloaded, refresh ONLY the relevant ImageView (the list items may be very complex)
Please do not post answers for individual points. My problem is that the more we focus on some aspects, the fuzzier others become, Heisenberg-like.
Each adds a dimension of difficulty, especially Bitmap.recycle, which needs to be called during operation and on Activity destruction (note that onDestroy, and even onStop, might not be called).
This also precludes relying on SoftReferences.
It is necessary, or I get OutOfMemoryError even after any number of gc() calls, sleeps (of 20 s, even), yields, and huge array allocations in a try-catch (to force a controlled OutOfMemoryError) after nulling a Bitmap.
I am resampling the Bitmaps already.
Check this example. It is used by Google, and I am using the same logic to avoid the OutOfMemory error.
http://developer.android.com/resources/samples/XmlAdapters/index.html
Basically this ImageDownloader is your answer (as it covers most of your requirements); the remaining ones you can also implement on top of it.
http://developer.android.com/resources/samples/XmlAdapters/src/com/example/android/xmladapters/ImageDownloader.html
In the end, I chose to disregard the recycling bug entirely. It just adds a layer of impossible difficulty on top of a manageable process.
Without that burden (just making adapters, etc. stop showing images), I made a manager using a Map<String, SoftReference<Bitmap>> to store the downloaded Bitmaps under their URLs.
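A minimal sketch of just that in-memory layer (BitmapMemoryCache is a hypothetical name; the real manager also handles the job deque and disk cache described below):

import java.lang.ref.SoftReference
import scala.collection.mutable
import android.graphics.Bitmap

final class BitmapMemoryCache {
  private val cache = mutable.Map.empty[String, SoftReference[Bitmap]]

  // Returns the bitmap only if the VM has not cleared the soft reference yet.
  def get(url: String): Option[Bitmap] =
    cache.get(url).flatMap(ref => Option(ref.get()))

  def put(url: String, bitmap: Bitmap): Unit =
    cache(url) = new SoftReference(bitmap)
}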
Also, there are 2-4 AsyncTasks (making use of both doInBackground and onProgressUpdate; stopped by adding special jobs that throw InterruptedException) taking jobs from a LinkedBlockingDeque<WeakReference<DownloadingJob>>, supported by a WeakHashMap<Object, Set<DownloadingJob>>. The deque (the LinkedBlockingDeque code was copied for use on earlier API levels) is a queue that jobs can leave if they're no longer needed. The map has the job creators as keys, so if an Adapter demands downloads and is then removed, it is removed from the map and, as a consequence, all its jobs disappear from the queue.
A job will, if the image is already present, return synchronously. It can also contain a Bundle of data that identifies which position in an AdapterView it concerns.
Caching is also done on an SD card, if available, under URL-encoded names (cleaned partially, starting with the oldest, on app start, and/or using deleteOnExit()).
Requests include "If-Modified-Since" when we have a cached version, to check for updates.
The same thing can also be used for XML parsing, and most other data acquisition.
If I ever clean that class up, I'll post the code.
In a game I need to keep tabs on which of my pooled sprites are in use. When activating multiple sprites at once, I want to transfer them from my passivePool to activePool, both of which are immutable HashSets (OK, I'll be creating new sets each time, to be exact). So my basic idea is something along the lines of:
activePool ++= passivePool.take(5)
passivePool = passivePool.drop(5)
but reading the Scala documentation, I'm guessing that the 5 I take might be different from the 5 I then drop, which is definitely not what I want. I could also say something like:
val moved = passivePool.take(5)
activePool ++= moved
passivePool --= moved
but as this is something I need to do pretty much every frame, in real time, on a limited device (an Android phone), I guess this would be much slower, since I would have to search the passivePool one by one for each of the moved sprites.
Any clever solutions? Or am I missing something basic? Remember that efficiency is a primary concern here. And I can't use Lists instead of Sets, because I also need random-access removal of sprites from activePool when they are destroyed in the game.
There's nothing like benchmarking for getting answers to these questions. Let's take 100 sets of size 1000 and drop them 5 at a time until they're empty, and see how long it takes.
passivePool.take(5); passivePool.drop(5) // 2.5 s
passivePool.splitAt(5) // 2.4 s
val a = passivePool.take(5); passivePool --= a // 0.042 s
repeat(5){ val a = passivePool.head; passivePool -= a } // 0.020 s
What is going on?
The reason things work this way is that immutable.HashSet is built as a hash trie with optimized (effectively O(1)) add and remove operations, but many of the other methods are not re-implemented; instead, they are inherited from collections that don't support add/remove and therefore can't get the efficient methods for free. They therefore mostly rebuild the entire hash set from scratch. Unless your hash set has only a handful of elements in it, this is a bad idea. (In contrast to the 50-100x slowdown with sets of size 1000, a set of size 100 has "only" a 6-10x slowdown....)
So, bottom line: until the library is improved, do it the "inefficient" way. You'll be vastly faster.
I think there may be some mileage in using splitAt here, which will give you back both the five sprites to move and the trimmed pool in a single method invocation:
val (moved, newPassivePool) = passivePool.splitAt(5)
activePool ++= moved
passivePool = newPassivePool
Bonus points if you can assign directly back to passivePool on the first line, though I don't think it's possible in a short example where you're defining the new variable moved as well.