In the following code we rotate a complex number by some angle in a loop and then confirm that the resulting number is identical to the one we started with.
public class Complex {
private float r, i;
...
public Complex(Complex other) {
r = other.r;
i = other.i;
}
}
Complex z1 = new Complex(..);
Complex z1_save = new Complex(z1);
Complex z2 = new Complex();
Complex k = new Complex();
k.set_to_first_root_of_unity(8);
int n = 64;
while(n-- != 0) {
z1.multiply(k, z2);
z1 = new Complex(z2); // Line Y
}
Assert.assertEquals(true, z1.equals(z1_save));
Is there a way in Java to write Line Y using the constructor public Complex(Complex other) rather than clone(), and be certain that 64 objects will not be garbage collected?
Update: It seems it is impossible to ask this question in a simplified manner without referring to the context—that of an interactive application. The best answer to the present question so far (assylias's) is that one should not worry about object creation and garbage collection 90% of the time. During redraw, it is necessary to worry about it 100% of the time. I have now restated the question here.
I am worried about the inefficiency of the GC running 64 times unnecessarily.
That is an unnecessary worry. If your objects are in the young generation (which they will be, considering their scope), GC will be free (as in zero cost).
When the GC runs on the young generation, it only goes through live objects (objects that are eligible for GC are not visited), so the GC time is a function of the live objects only.
The story is different for the old generation, but your local objects won't reach that stage.
Reference - Brian Goetz, emphasis mine:
What about deallocation?
But allocation is only half of memory management -- deallocation is the other half. It turns out that for most objects, the direct garbage collection cost is -- zero. This is because a copying collector does not need to visit or copy dead objects, only live ones. So objects that become garbage shortly after allocation contribute no workload to the collection cycle.
It turns out that the vast majority of objects in typical object-oriented programs (between 92 and 98 percent according to various studies) "die young," which means they become garbage shortly after they are allocated, often before the next garbage collection. (This property is called the generational hypothesis and has been empirically tested and found to be true for many object-oriented languages.) Therefore, not only is allocation fast, but for most objects, deallocation is free.
Executing a constructor 64 times for an object with ten (or so) fields is not a big deal, even on a device like a cell phone.
It is not clear what your task is.
If you are really concerned about calling the constructor many times and creating too many identical objects, you could try the Flyweight pattern.
Your question (and comments) are a bit confused ... but that might just be a problem with your written English skills. So I'm just assuming I understand what you meant to say. I'm also assuming that your example "works" ... which it currently doesn't.
The short answer is that you can reduce object churn (i.e. creation and release of "temporary" objects) by making your Complex object mutable. Typically you do this by adding setter operations that allow you to change the state of the object. But that has the effect of making your Complex class more difficult to use correctly. For example:
public static final Complex ZERO = new Complex(0, 0);
// somewhere else
Complex counter = ZERO;
while (counter.lessThan(10)) {
// ....
counter.setRealPart(counter.getRealPart() + 1); // Ooops!!
}
... and lots more bugs like that.
Then there is the question of whether this will actually reduce garbage collection overheads, and by how much.
As @assylias points out, temporary objects that are created and then reclaimed in the next GC cycle have very low cost. The objects that are expensive are the ones that DON'T become garbage. And it is quite possible that for a normal program running in a normal environment, it is actually more efficient overall to create temporary objects.
Then there is the issue that the latest HotSpot JVMs can do something known as "escape analysis", which (if it works) can determine that a given temporary object will never be visible outside of its creation scope, and therefore doesn't need to be allocated in the heap at all. When that optimization can be applied, the "object churn" concern is mooted.
However, running the GC can be bad for "real time" performance; e.g. in games programming, where the user will notice if the game freezes for a fraction of a second. In cases like that, it is worth considering "tuning" your code to reduce object churn. But there are other possible approaches too ... like using a low-pause garbage collector ... if one is available for your platform.
@assylias's comment makes another important point: beware of premature optimization. Your intuition on the usage of your Complex object ... and the resulting object churn ... could be very wrong. All things being equal, it is best to delay optimization effort until you have profiled the application and determined that:
it needs to be tuned, and
the profiling evidence points to the Complex class being a significant performance bottleneck.
There's no reason to pay attention to garbage collection at all, unless:
users (maybe you) perceive performance issues with the application; and
profiling the application demonstrates that garbage collection is the source of the perceived performance issues.
Lacking both of these conditions: ignore garbage collection.
This is true of performance tuning in Java in general. Don't do it unless you've proven there's a real reason for it.
If you want to be efficient w.r.t GC, minimize your use of new.
So, in your example, you could re-use the variable in "Line Y", and simply set the fields with the new value. Something like:
while(n-- != 0) {
z1.multiply(k, z2);
z1.setValue(z2); // Line Y
}
where z1.setValue(X) sets the state of the object in the same fashion that the constructor new Complex(x) does.
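As a minimal sketch of that idea, the class below reuses one scratch object instead of allocating on Line Y. Only the fields r and i and the multiply(k, out) shape come from the question; setValue, firstRootOfUnity, closeTo and rotate are names assumed here for illustration.

```java
// Hypothetical mutable Complex; only r, i and multiply(k, out) mirror the question.
final class Complex {
    private float r, i;

    Complex(float r, float i) { this.r = r; this.i = i; }

    // Copies the state of other into this object without allocating.
    void setValue(Complex other) { r = other.r; i = other.i; }

    // Writes this * other into out. The product is computed into locals first,
    // so out may safely be the same object as this or other.
    void multiply(Complex other, Complex out) {
        float nr = r * other.r - i * other.i;
        float ni = r * other.i + i * other.r;
        out.r = nr;
        out.i = ni;
    }

    // First n-th root of unity: cos(2*pi/n) + i*sin(2*pi/n).
    static Complex firstRootOfUnity(int n) {
        double angle = 2.0 * Math.PI / n;
        return new Complex((float) Math.cos(angle), (float) Math.sin(angle));
    }

    // Comparison with a tolerance, because repeated float multiplication accumulates rounding error.
    boolean closeTo(Complex other, float eps) {
        return Math.abs(r - other.r) <= eps && Math.abs(i - other.i) <= eps;
    }

    // Rotates z by k the given number of times, allocating a single scratch object in total.
    static void rotate(Complex z, Complex k, int times) {
        Complex scratch = new Complex(0f, 0f);
        for (int n = times; n != 0; n--) {
            z.multiply(k, scratch);
            z.setValue(scratch); // Line Y without new
        }
    }
}
```

Calling Complex.rotate(z1, Complex.firstRootOfUnity(8), 64) performs eight full turns. The result is compared with a tolerance here because an exact equals() check after 64 float multiplications may fail on rounding alone.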
EDIT: Why is this getting down voted? I stand by the statement above about reducing the cost of GC by minimizing the use of new. Yes, I agree in most contexts GC is not a problem - but if your code does call GC frequently, perhaps because your code spends a lot of time in a loop (say a CPU heavy algorithm), then you may well want to reuse objects.
Talking in the context of a game based on an OpenGL renderer:
Let's assume there are two threads:
1. one that updates the game logic, physics, etc. for the in-game objects
2. one that makes OpenGL draw calls for each game object based on data in the game objects (which thread 1 keeps updating)
Unless you have two copies of each game object in the current state of the game, you'll have to pause thread 1 while thread 2 makes the draw calls; otherwise a game object can be updated in the middle of a draw call for that object, which is undesirable.
But stopping thread 1 to safely make draw calls from thread 2 defeats the whole purpose of multithreading/concurrency.
Is there a better approach for this, other than using hundreds or thousands of sync objects/fences, so that the multicore architecture can be exploited for performance?
I know I can still use multithreading for loading textures and compiling shaders for objects which are not yet part of the current game state, but how do I do it for the active/visible objects without causing conflicts between draw and update?
The usual approach is that the simulation thread after completing a game step commits the state into an intermediary buffer and then signals the renderer thread. Since OpenGL executes asynchronously the render thread should complete rather quickly, thereby releasing the intermediary buffer for the next state.
You shouldn't render directly from the game state anyway, since what the renderer needs to do its work and what the simulation produces are not always the same thing. So some mapping may be necessary anyway.
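One way to picture the intermediary buffer is the sketch below. It is an assumption-laden illustration, not the answer's exact scheme: the explicit "signal" is replaced by atomically publishing an immutable snapshot, and the names RenderState and StateBuffer as well as the flat position array are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

// Immutable snapshot holding only what the renderer needs (hypothetical layout).
final class RenderState {
    private final float[] positions;
    final long step;

    RenderState(float[] positions, long step) {
        // Defensive copy: the simulation keeps mutating its own array afterwards.
        this.positions = positions.clone();
        this.step = step;
    }

    float position(int index) { return positions[index]; }

    int size() { return positions.length; }
}

// Intermediary buffer between the simulation thread and the render thread.
final class StateBuffer {
    private final AtomicReference<RenderState> latest =
            new AtomicReference<>(new RenderState(new float[0], 0L));

    // Called by the simulation thread after each completed game step.
    void commit(float[] simulationPositions, long step) {
        latest.set(new RenderState(simulationPositions, step));
    }

    // Called by the render thread; always returns a complete, consistent state.
    RenderState acquire() {
        return latest.get();
    }
}
```

This variant allocates one snapshot per step. If that churn matters, two preallocated buffers that are swapped would serve the same purpose at the cost of more careful hand-over logic.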
This is quite a general question you're asking. If you ask 10 different people, you'll probably get 10 different answers. In the past I implemented something similar, and here's what I did (after a long series of optimisation cycles).
Your model-update loop which runs on a background thread should look something like this:
while(true)
{
updateAllModels()
}
As you said, this will cause an issue when the GL thread kicks in, since it may very well render a view based on a model which is halfway through being updated, which causes UI glitches at best.
The straight-forward way for dealing with this would be synchronising the update:
while (true)
{
synchronized(...)
{
updateAllModels();
}
}
Where the object you synchronize with here is the same object you'll use to synchronize the drawing method.
Now we have an improved method which won't cause glitches in the UI, but overall performance will probably take a very severe hit, since all rendering needs to wait until all model updates are finished, or vice versa: the model update needs to wait until all drawing is finished.
Now, let's think for a moment: what do we really need to be synchronizing?
In my app (a space game), when updating the models, I needed to calculate vectors, check for collisions and update all the object's positions, rotations, scale, etc.
Out of all these things, the only things the view cares about are the position, rotation, scale and a few other small details which the UI needs to know in order to correctly render the game world. The rendering process doesn't care about a game object's vector, the AI code, collision tests, etc. Considering this, I altered my update code to look something like this:
while (true)
{
synchronized(...)
{
updateVisibleChanges(); // sets all visible changes - positions, rotations, etc
}
updateInvisibleChanges(); // alters vectors, AI calculations, collision tests, etc
}
Same as before, we're synchronising the update and the draw methods, but this time, the critical section is much smaller than before. Essentially, the only things which should be set in the updateVisibleChanges method are things which pertain to the position, rotation, scale, etc of the objects which should be rendered. All other calculations (which are usually the most exhaustive ones) are performed afterwards, and do not stop the rendering from occurring.
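A compact sketch of this split follows, under the assumption that positions are the only render-visible state. The World class, its arrays and the simple velocity integration are hypothetical stand-ins for the real game model.

```java
// Hypothetical game world: only position is shared with the renderer.
final class World {
    private final Object renderLock = new Object();
    private final float[] position;      // read by the render thread
    private final float[] velocity;      // used only by the update thread
    private final float[] nextPosition;  // computed outside the lock

    World(float[] startPosition, float[] startVelocity) {
        position = startPosition.clone();
        velocity = startVelocity.clone();
        nextPosition = startPosition.clone();
    }

    // One iteration of the update loop.
    void step(float dt) {
        synchronized (renderLock) {
            // updateVisibleChanges: a cheap copy is the whole critical section.
            System.arraycopy(nextPosition, 0, position, 0, position.length);
        }
        // updateInvisibleChanges: the expensive work runs without blocking the renderer.
        for (int n = 0; n < nextPosition.length; n++) {
            nextPosition[n] = nextPosition[n] + velocity[n] * dt;
        }
    }

    // Called by the render thread; copies the visible state under the same lock.
    void copyVisibleState(float[] out) {
        synchronized (renderLock) {
            System.arraycopy(position, 0, out, 0, position.length);
        }
    }
}
```

The lock is held only for two array copies, so neither thread waits on the other's heavy work.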
An added bonus from this method - when you're performing your invisible changes, you can be sure that all objects are in the position they need to be (which is very useful for accurate collision tests). For example, in the method before the last one, object A moves, then object A tests a collision against object B which hasn't moved yet. It is possible that had object B moved before object A tested a collision, there would be a different result.
Of course, the last example I showed isn't perfect - you will still need to hang the rendering method and/or the updateVisible method to avoid clashes, but I fear that this will always be a problem, and the key is minimizing the amount of work you're doing in either thread sensitive method.
Hope this helps :)
I was watching the video Google IO 2008 - Dalvik Virtual Machine Internals to understand how the Dalvik VM works and why Android's designers preferred the Dalvik VM over the JVM. I found that Android uses separate memory for garbage collection information about objects, as opposed to the JVM, where the mark bits (bits telling whether an object is eligible for garbage collection or not) are stored together with the objects.
Can anybody explain in detail the advantages and disadvantages of keeping mark bits in separate memory versus keeping them in the objects?
I was unable to understand this difference from the video.
Some advantages of a separate bitmap:
Much denser. A typical GC needs maybe eight bits of GC metadata, but due to alignment an in-object header might round this memory up to 32 bits.
Some operations, in particular around sweeping, become faster. This is partly because the denser (see above) bitmap means less memory traffic and better cache use, but also because some operations (e.g. zeroing all mark bits) can be vectorized when in this format. (Other parts of the GC need to be designed to make use of that ability.)
If you fork() on a Unix system, a separate bitmap makes better use of copy-on-write: pages containing objects might remain shared.
Some advantages of in-object mark bits:
Depending on the scheme used to associate objects with bitmaps, getting the mark bit for an object and vice versa can be quite complicated and/or slow. An in-object header, on the other hand, is trivial to access.
Easier memory management: No need to create a separate allocation of the right size and keep it in sync.
Many fast schemes for finding bitmaps for objects and vice versa are quite restrictive in other regards. For example, if you create a bitmap for every page and store the bitmap pointer at the start of the page, you have a problem storing objects larger than a page.
Separate mark bits work by having an array of bits where each bit represents an address in the heap that can start an object. For example, suppose the heap is 65536 bytes and all objects are aligned at 16 byte boundaries, then there are 4096 addresses in the heap that can be the start of an object. This means the array needs to contain 4096 bits, which can be efficiently stored as 512 bytes or 64 64bit sized unsigned integers.
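The arithmetic above can be sketched as a small bitmap class. This is an illustration only: the class name MarkBitmap and its methods are assumptions, and a real collector would work on raw memory rather than a long[].

```java
import java.util.Arrays;

// Side bitmap with one bit per possible object start (hypothetical sketch).
final class MarkBitmap {
    static final int ALIGNMENT = 16; // bytes between possible object starts, as in the example

    private final long heapStart;
    private final long[] words;

    MarkBitmap(long heapStart, int heapSizeBytes) {
        this.heapStart = heapStart;
        int bits = heapSizeBytes / ALIGNMENT;  // 65536 / 16 = 4096 bits
        this.words = new long[(bits + 63) / 64]; // 4096 / 64 = 64 words
    }

    int wordCount() { return words.length; }

    // Converts a heap address into its bit index.
    int bitIndex(long address) {
        return (int) ((address - heapStart) / ALIGNMENT);
    }

    void mark(long address) {
        int bit = bitIndex(address);
        words[bit >>> 6] |= 1L << (bit & 63);
    }

    boolean isMarked(long address) {
        int bit = bitIndex(address);
        return (words[bit >>> 6] & (1L << (bit & 63))) != 0;
    }

    // Clearing all marks is a single bulk fill over 512 bytes.
    void clearAll() { Arrays.fill(words, 0L); }
}
```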
In-object mark bits work by having one bit in the header of each object set to 1 if the object is marked and 0 otherwise. Note that this requires each object to have a dedicated header area. Runtimes such as the JVM and .NET add headers to all objects, so you essentially get the space for the mark bit for free.
But it doesn't work for conservative collectors which don't have full control of the environment they are running in, such as the Boehm GC. They can misidentify integers as pointers, so for them modifying anything in the mutator's data heap is risky.
Mark & sweep garbage collection is divided into two phases: marking and sweeping. Marking using in-object mark bits is straight-forward (pseudo-code):
if not obj.is_marked():
    obj.mark()
    mark_stack.append(obj)
Using a separate array for storing mark bits, we have to convert the object's address and size to indices in the bit array and set the corresponding bits to 1:
obj_bits = obj.size_in_bytes() / 16
bit_idx = (obj - heap.start_address()) / 16
if not bitarr.bit_set(bit_idx):
    bitarr.set_range(bit_idx, obj_bits)
    mark_stack.append(obj)
So in our example, if an object is 128 bytes long, 8 bits will be set in the bit array. Clearly, using in-object mark bits is much simpler.
But separate mark bits gain some ground when sweeping. Sweeping involves scanning through the whole heap and finding contiguous regions of memory which are unmarked and can therefore be reclaimed. Using in-object mark bits, it would roughly look like this:
iter = heap.start_address()
while iter < heap.end_address():
    # Scan til the next unmarked object
    while iter.is_marked():
        iter.unmark()
        iter += iter.size()
        if iter == heap.end_address():
            return
    # At an unmarked block
    start = iter
    # Scan til the next marked object
    while iter < heap.end_address() and not iter.is_marked():
        iter += iter.size()
    size = iter - start
    # Reclaim the block
    heap.reclaim(start, size)
Note how the iteration jumps from object to object in the iter += iter.size() lines. This means that the sweep phase running time is proportional to the total number of live and garbage objects.
Using separate mark bits, you would do roughly the same loop except that large swathes of garbage objects would be flown over without "stopping" on each of them.
Consider the 65536 heap again. Suppose it contains 4096 objects that are all garbage. Iterating the 64 64bit integers in the mark bits array and seeing that they are all 0 is obviously very fast. Therefore the sweeping phase can potentially be much faster with separate mark bits.
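To make the "flying over" concrete, here is a hedged sketch of a sweep over a side bitmap in which every 16-byte slot covered by a live object has its bit set, as in the marking pseudo-code above. The class BitmapSweep and its methods are hypothetical; the point is that a word of 64 unmarked slots is dismissed with one comparison.

```java
import java.util.ArrayList;
import java.util.List;

// Finds runs of unmarked slots in a mark bitmap, one 64-bit word at a time (sketch).
final class BitmapSweep {
    // Index of the first set bit at or after fromBit, or the total bit count if none.
    static int nextMarked(long[] words, int fromBit) {
        int totalBits = words.length * 64;
        if (fromBit >= totalBits) return totalBits;
        int w = fromBit >>> 6;
        long word = words[w] & (-1L << (fromBit & 63)); // ignore bits below fromBit
        while (true) {
            if (word != 0) return (w << 6) + Long.numberOfTrailingZeros(word);
            if (++w == words.length) return totalBits;
            word = words[w]; // an all-zero word is skipped with a single test
        }
    }

    // Index of the first clear bit at or after fromBit, or the total bit count if none.
    static int nextUnmarked(long[] words, int fromBit) {
        int totalBits = words.length * 64;
        if (fromBit >= totalBits) return totalBits;
        int w = fromBit >>> 6;
        long word = ~words[w] & (-1L << (fromBit & 63));
        while (true) {
            if (word != 0) return (w << 6) + Long.numberOfTrailingZeros(word);
            if (++w == words.length) return totalBits;
            word = ~words[w];
        }
    }

    // Returns each reclaimable run as {firstSlot, slotCount}.
    static List<int[]> freeRuns(long[] words) {
        List<int[]> runs = new ArrayList<>();
        int totalBits = words.length * 64;
        int bit = nextUnmarked(words, 0);
        while (bit < totalBits) {
            int end = nextMarked(words, bit);
            runs.add(new int[] {bit, end - bit});
            bit = nextUnmarked(words, end);
        }
        return runs;
    }
}
```

For the all-garbage heap of the example, freeRuns(new long[64]) inspects 64 words and reports a single run covering all 4096 slots, without touching any object.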
But there is another wrinkle! In any mark and sweep collector, the running time is dominated by the mark phase and not the sweep phase, which is usually very quick. So the jury is still out. Some prefer separate mark bits, others prefer in-object ones. To the best of my knowledge, no one has yet been able to show which approach is superior to the other.
I found the documentation at this link.
It describes the difference as follows:
Weak references are useful for mappings that should have their entries removed automatically once they are not referenced any more (from outside). The difference between a SoftReference and a WeakReference is the point of time at which the decision is made to clear and enqueue the reference:
A SoftReference should be cleared and enqueued as late as possible, that is, in case the VM is in danger of running out of memory.
A WeakReference may be cleared and enqueued as soon as is known to be weakly-referenced.
But when I looked through Dalvik's source code, I found something in the dvmCollectGarbageInternal function (Heap.cpp L446, Android 4.4). It seems both kinds of references are cleared at the same time.
/*
* All strongly-reachable objects have now been marked. Process
* weakly-reachable objects discovered while tracing.
*/
dvmHeapProcessReferences(&gcHeap->softReferences,
spec->doPreserve == false,
&gcHeap->weakReferences,
&gcHeap->finalizerReferences,
&gcHeap->phantomReferences);
Do I miss something?
================================================================================
With @fadden's help, I found the preserve code:
if (!marked && ((++counter) & 1))
Dalvik preserves half of the soft references on every GC run, and I copied someone's test code to test it:
final ArrayList<SoftReference<Integer>> list = new ArrayList<SoftReference<Integer>>(
SR_COUNT);
for (int i = 0; i < SR_COUNT; ++i) {
list.add(new SoftReference<Integer>(new Integer(i)));
}
/* Test */
for (int i = 0; i < 3; ++i) {
System.gc();
try {
Thread.sleep(200);
} catch (final InterruptedException e) {
}
}
/* Check */
int dead = 0;
for (final SoftReference<Integer> ref : list) {
if (ref.get() == null) {
++dead;
}
Log.d(TAG, "dead: " + dead);
}
The logcat output is just what I expected.
FWIW, the best description of weak/soft/phantom references in Java is in chapter 17 of The Java Programming Language ("Garbage Collection and Memory").
There's no mandated policy for soft reference retention. The VM is allowed to discard all or none during a GC, or anything in between. The only requirement is that the VM is supposed to discard all softly-reachable objects before throwing OOM.
You can follow Dalvik's logic further in dvmHeapProcessReferences() in MarkSweep.cpp. Note in particular the call to preserveSomeSoftReferences(), which retains some but not others based on the reference "color". You can read more about colors in the Wikipedia GC article.
From Understanding Weak References, by Ethan Nicholas:
https://weblogs.java.net/blog/enicholas/archive/2006/05/understanding_w.html
Weak references
A weak reference, simply put, is a reference that isn't strong enough to force an object to remain in memory. Weak references allow you to leverage the garbage collector's ability to determine reachability for you, so you don't have to do it yourself. You create a weak reference like this:
WeakReference<Widget> weakWidget = new WeakReference<Widget>(widget);
and then elsewhere in the code you can use weakWidget.get() to get the actual Widget object. Of course the weak reference isn't strong enough to prevent garbage collection, so you may find (if there are no strong references to the widget) that weakWidget.get() suddenly starts returning null.
...
Soft references
A soft reference is exactly like a weak reference, except that it is less eager to throw away the object to which it refers. An object which is only weakly reachable (the strongest references to it are WeakReferences) will be discarded at the next garbage collection cycle, but an object which is softly reachable will generally stick around for a while.
SoftReferences aren't required to behave any differently than WeakReferences, but in practice softly reachable objects are generally retained as long as memory is in plentiful supply. This makes them an excellent foundation for a cache, such as the image cache described above, since you can let the garbage collector worry about both how reachable the objects are (a strongly reachable object will never be removed from the cache) and how badly it needs the memory they are consuming.
And Peter Kessler added in the comments:
The Sun JRE does treat SoftReferences differently from WeakReferences. We attempt to hold on to object referenced by a SoftReference if there isn't pressure on the available memory. One detail: the policy for the "-client" and "-server" JRE's are different: the -client JRE tries to keep your footprint small by preferring to clear SoftReferences rather than expand the heap, whereas the -server JRE tries to keep your performance high by preferring to expand the heap (if possible) rather than clear SoftReferences. One size does not fit all.
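The cache use case mentioned in the quoted article can be sketched as follows. SoftCache and its loader parameter are hypothetical names, and the sketch makes no claim about when a particular VM clears the references; it only shows the get-or-reload pattern that tolerates clearing.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Minimal soft-reference cache: values may vanish under memory pressure and are then reloaded.
final class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader;

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        // get() returns null once the collector has cleared the reference.
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```

While the caller still holds a strong reference to a returned value, the entry cannot be cleared, so a second get for the same key returns the same object. Swapping SoftReference for WeakReference would make entries eligible for clearing as soon as no strong reference remains.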
I have a static class with a method in it that I run a few hundred times. Currently, every time the method is run, it creates two different stack objects. If I were to make that class non-static so I can create the two stacks on construction and then reuse them by clearing them, would it be quicker? I guess the answer depends on creating a new stack object vs clearing an existing one (which is likely empty anyway) and if the performance gain (if any) from clearing it instead is greater than the performance loss from having a non-static method.
I've tried profiling the two and it never seems to work, but that's a different question.
It depends on how you use static variables and method in your code.
Instance variables and objects are stored on the heap.
Local variables are stored on the stack.
Static variables are stored in a permanent area of the heap. The garbage collector works by marking and sweeping objects. Static variables are not eligible for garbage collection while the class is loaded. They can be collected when the respective class loader (the one responsible for loading this class) is itself garbage collected.
If I have a value to pass to another activity, I would use intents instead of static variables.
In a custom list adapter we use a static ViewHolder. So whether to use static variables or methods depends on the situation.
You can analyze memory usage by objects using a tool called MAT Analyzer. The video below talks about memory management and how to detect and solve memory leaks:
http://www.youtube.com/watch?v=_CruQY55HOk.
MemoryInfo mi = new MemoryInfo();// current memory usage
ActivityManager activityManager = (ActivityManager) getSystemService(ACTIVITY_SERVICE);
activityManager.getMemoryInfo(mi);
long availableMegs = mi.availMem / 1048576L;
See http://developer.android.com/training/articles/perf-tips.html for performance tips, especially the section Prefer Static Over Virtual.
Memory availability is one of the criteria to consider when deciding whether to use static variables and methods, both for performance and for avoiding memory leaks.
This is really a question about trying to reuse objects. You can reuse objects in a static method too if you declare a static member. Separately: yes it's probably better to design this without static anything.
In any event, the upside to reuse is avoiding object creation. You still pay some cost of "clearing" the object's state. Or else, you risk memory leaks in the case of something like a Stack.
There is an ongoing maintenance issue: you add new state to the object, and, did you remember to update the method that clears it?
You also need to now synchronize access to this method or otherwise prevent two threads from using it at once. That could introduce a bottleneck as threads can't execute the method concurrently.
You also always pay the memory cost of this object living in memory for the entire runtime.
In the olden days, people would create object pool abstractions to avoid recreating objects. Such pools have their own complexity and runtime overhead, and are generally well out of favor, since the cost of creating an object and GCing it is so relatively small now.
Trying to reuse objects solely for performance is rarely a performance win. It would have to be in a tight loop and not suffer from several possible problems above to be worth it.
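For illustration, the two designs discussed above might look like this. BracketChecker is a made-up example task, not the asker's method; it only serves to show where the clearing cost, the leaked-state risk and the thread-safety concern come from.

```java
import java.util.ArrayDeque;

// Hypothetical example task: checks that round and square brackets are balanced.
final class BracketChecker {
    // Reused across calls; this makes the instance stateful and not thread-safe.
    private final ArrayDeque<Character> stack = new ArrayDeque<>();

    boolean isBalanced(String text) {
        stack.clear(); // forgetting this would leak state from the previous call
        for (int n = 0; n < text.length(); n++) {
            char c = text.charAt(n);
            if (c == '(' || c == '[') {
                stack.push(c);
            } else if (c == ')' || c == ']') {
                if (stack.isEmpty()) return false;
                char open = stack.pop();
                if ((c == ')' && open != '(') || (c == ']' && open != '[')) return false;
            }
        }
        return stack.isEmpty();
    }

    // Allocation-per-call variant for comparison: no shared state, safe from any thread.
    static boolean isBalancedFresh(String text) {
        return new BracketChecker().isBalanced(text);
    }
}
```

Whether the reusing variant is measurably faster for a few hundred calls is something only a profiler can answer; the static variant is the simpler one to keep correct.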
In a game for Android written in Scala, I have plenty of objects that I want to pool. First I tried to have both active (visible) and non-active instances in the same pool; this was slow because filtering both triggers GC and is slow in itself.
So I moved to using two data structures, so when I need to get a free instance, I just take the first from the passive pool and add it to the active pool. I also need fast random access to the active pool (when I need to hide an instance). I'm using two ArrayBuffers for this.
So my question is: which data structure would be best for this situation? And how should that (or those) specific data structure(s) be used to add and remove to avoid GC as much as possible and be efficient on Android (memory and cpu constraints)?
The best data structure is an internal list, where you add
var next: MyClass
to every class. The non-active instances then become what's typically called a "free list", while the active ones become a singly-linked list a la List.
This way your overhead is exactly one pointer per object (you can't really get any less than that), and there is no allocation or GC at all. (Unless you want to implement your own by throwing away part or all of the free list if it gets too long.)
You do lose some collections niceness, but you can just make your class be an iterator:
def hasNext = (next != null)
is all you need given that var. (Well, and extends Iterator[MyClass].) If your pool sizes are really quite small, sequential scanning will be fast enough.
If your active pool is too large for sequential scanning down a linked list and elements are not often added or deleted, then you should store them in an ArrayBuffer (which knows how to remove elements when needed). Once you remove an item, throw it on the free list.
If your active pool turns over rapidly (i.e. the number of adds/deletes is similar to the number of random accesses), then you need some sort of hierarchical structure. Scala provides an immutable one that works pretty well in Vector, but no mutable one (as of 2.9); Java also doesn't have something that's really suitable. If you wanted to build your own, a red-black or AVL tree with nodes that keep track of the number of left children is probably the way to go. (It's then a trivial matter to access by index.)
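The free-list idea from this answer can be sketched in Java as below (Java rather than Scala only for consistency with the other examples here). Sprite and SpritePool are hypothetical names, and release uses the sequential scan that the answer suggests for small pools.

```java
// Hypothetical pooled object; the only pool overhead is the intrusive next pointer.
final class Sprite {
    Sprite next; // link used by whichever list currently owns this sprite
    float x, y;
}

final class SpritePool {
    private Sprite free;   // head of the free list (inactive instances)
    private Sprite active; // head of the active list

    // All instances are allocated up front; nothing is allocated afterwards.
    SpritePool(int capacity) {
        for (int n = 0; n < capacity; n++) {
            Sprite s = new Sprite();
            s.next = free;
            free = s;
        }
    }

    // Moves one instance from the free list to the active list; null if exhausted.
    Sprite acquire() {
        Sprite s = free;
        if (s == null) return null;
        free = s.next;
        s.next = active;
        active = s;
        return s;
    }

    // Unlinks s from the active list (sequential scan) and returns it to the free list.
    boolean release(Sprite s) {
        Sprite prev = null;
        for (Sprite cur = active; cur != null; prev = cur, cur = cur.next) {
            if (cur == s) {
                if (prev == null) active = cur.next; else prev.next = cur.next;
                cur.next = free;
                free = cur;
                return true;
            }
        }
        return false;
    }

    int activeCount() {
        int count = 0;
        for (Sprite cur = active; cur != null; cur = cur.next) count++;
        return count;
    }
}
```

Both acquire and the relinking in release are pointer swaps, so steady-state use produces no garbage. Only the scan in release grows with the number of active instances.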
I guess I'll mention my idea. The filter and map methods iterate over the entire collection anyway, so you may as well simplify that and just do a naive scan over your collection (to look for active instances). See here: https://github.com/scala/scala/blob/v2.9.2/src/library/scala/collection/TraversableLike.scala
def filter(p: A => Boolean): Repr = {
val b = newBuilder
for (x <- this)
if (p(x)) b += x
b.result
}
I ran some tests, using a naive scan of n=31 (so I wouldn't have to keep more than a 32 bit Int bitmap), a filter/foreach scan, and a filter/map scan, and a bitmap scan, and randomly assigning 33% of the set to active. I had a running counter to double check that I wasn't cheating by not looking at the right values or something. By the way, this is not running on Android.
Depending on the number of active values, my loop took more time.
Results:
naive scanned a million times in: 197 ms (sanity check: 9000000)
filter/foreach scanned a million times in: 441 ms (sanity check: 9000000)
map scanned a million times in: 816 ms (sanity check: 9000000)
bitmap scanned a million times in: 351 ms (sanity check: 9000000)
Code here--feel free to rip it apart or tell me if there's a better way--I'm fairly new to scala so my feelings won't be hurt: https://github.com/wfreeman/ScalaScanPerformance/blob/master/src/main/scala/scanperformance/ScanPerformance.scala