I am reading the Xamarin.Android garbage collection docs about helping the GC perform better by reducing referenced instances.
The section begins by saying:
Whenever an instance of a Java.Lang.Object type or subclass is scanned during the GC, the entire object graph that the instance refers to must also be scanned. The object graph is the set of object instances that the "root instance" refers to, plus everything referenced by what the root instance refers to, recursively.
...which I understand.
It then goes on to show a custom class inheriting from the standard Activity class. This custom activity class has a field that is a list of strings, initialized in the constructor to hold 10,000 strings. This is said to be bad because all 10,000 instances will have to be scanned for reachability during every GC. That I also understand.
The part that I am not clear on, is the recommended fix: it says the List<string> field should be moved to another class that doesn't inherit from Java.Lang.Object and then an instance of that class should be referenced from the activity just like the list was being referenced before.
My question: how does pushing a field deeper into the object graph help the GC when the total number of instances is still 10,000 and the opening paragraph says they will be scanned eventually because the process is recursive?
As a side note, I am also reading up (here) on the SGen GC used by Mono on Android and the object graph traversal process is described as being breadth-first starting with the GC roots. This explains how a 10,000 item list will cause a longer GC pause as each item is checked, but still doesn't explain how moving that list deeper into the graph will help because the GC will eventually scan it as it goes deeper into the graph.
I'll try to explain this the best I can, and I'm nowhere near an expert here so anyone who wants to chime in, please do so.
When we are referring to doing a peer walk, we are locating any roots and traversing the live reference graph to see what is reachable and what is not:
Root Objects:
Objects pointed at by static fields / properties
Objects on the stack of each managed thread
Objects that have been passed into native APIs
Basically you then have to deal with two managed GCs. We'll call them the Xamarin GC and the Android GC for reference.
Xamarin.Android has peer objects which are used to reference the native Java objects known in the Android JVM. They implement a core interface:
namespace Android.Runtime
{
    public interface IJavaObject : IDisposable
    {
        // JNI reference to the Java object being wrapped;
        // also known as a pointer to the JVM object.
        IntPtr Handle { get; }
        ...
    }
}
Whenever an object implements IJavaObject, it keeps a strong reference via that JNI handle to ensure the Java object is kept alive as long as the managed object is alive.
Think of it this way:
IJavaObject -> IntPtr Handle -> Java Object
In GC terms, it would be represented as the following:
Allocated and collected by Xamarin GC -> GC Root -> Allocated and collected by Android GC
We then have a GC process in Xamarin.Android:
When the GC runs, it replaces each peer's strong JNI handle with a weak reference and then invokes the Android GC, which may collect our Java object. Because of this, the peers are first scanned for any relationships, so that those relationships can be mirrored in the JVM; this keeps reachable objects from being collected prematurely.
Once this happens, we run the Android GC and when it's finished it will walk through the peer objects and check the weak references.
If an object is gone, we collect it on the C# side
If an object still exists, then we change the weak reference back to a strong JNI handle
Thus this graph needs to be checked and updated each time a GC runs over peer objects. That's why GC is much slower for these wrapper-type objects: the entire object graph has to be scanned starting at each peer object.
So when a peer object holds a significant object graph, we can help the GC by moving the storage of those references outside the peer class. This is usually done by rooting the reference independently of the peer. And since the graph is no longer stored as a field of the peer, the GC will not try to do a relationship walk on it.
As noted earlier, this isn't a huge issue to worry about until you notice long GCs. You can then use this as a solution.
Image Credit: Xamarin University (https://www.xamarin.com/university)
Example:
Let's say, ideally, that the object is eligible for garbage collection (the activity changed orientation and the strong reference to the object was lost) but it has not yet been collected. So line 2 will return true. Is there any way the object gets collected while execution is on line 3? Or does the GC wait until the method finishes?
WeakReference<MyObject> item = new WeakReference<>(object);

new Thread(() -> {
    if (item.get() != null)       // line 2
        item.get().getName();     // line 3
}).start();
If you have a strong reference to an object, that object is not eligible for GC.
There is no way a strongly referenced object will be collected between the null check and the next line, or any other line, as long as you can still access that object reference. Only if you set the reference to null, or assign another object to it, can the previous object be garbage collected, and then only if no other references point to it.
On the other hand, when you are dealing with weak references (of any kind), you first have to take a strong reference out of the weak reference wrapper; then you can safely use that strong reference further on (after you check it is not null, of course). If you don't take a strong reference, the object inside the weak wrapper can vanish at any time.
Wrong usage - the object can be collected between the null check and the getName call:
if (item.get() != null)
    item.get().getName();
Correct usage - taking a strong reference for further processing:
MyObject object = item.get();
if (object != null)
    object.getName();
First of all, the garbage collector does not run on what you think of as your process's main thread.
From the operating system's perspective, the GC may run either on the main thread of the virtual machine that runs your application, or on a separate thread.
But from the Java perspective, the GC does not run on any of your application's threads. The thread that runs the GC is neither your Java main thread nor a Java thread accessible to you.
From the perspective of your Java code, the main thread and all other threads are stopped (removed from the scheduler) while the GC runs. This is not always true, as it is up to the VM implementation, but you must always assume that all your Java threads, including the main thread, are stopped while the GC runs.
So, to precisely answer your question: **yes, your weak reference can be cleared right after line 2 executes**, and your code can get a NullPointerException in line 3.
Lines 2 and 3 are two separate, non-atomic operations. It is possible for the GC to kick in after line 2 executes, stop execution of all your threads, do a garbage collection, and then resume all your threads, causing a NullPointerException to occur at line 3.
I'm referring to RefBase.h, RefBase.cpp and StrongPointer.h
In the Android implementation of the strong pointer, any object managed by a strong pointer must inherit from RefBase, i.e.
sp<TheClass> theObj // TheClass must inherit from class RefBase
This requirement can be seen in the code for one of sp's methods:
template<typename T>
sp<T>& sp<T>::operator=(T* other) {
    if (other != NULL) {
        other->incStrong(this);
    }
    if (mPtr != NULL) {
        mPtr->decStrong(this);
    }
    mPtr = other;
    return *this;
}
For the calls to incStrong and decStrong not to fail, other and mPtr must inherit from RefBase.
QUESTION
Why is sp implemented such that the object it manages is required to be a child of RefBase? There's not even a way to enforce this requirement at compile time or even at runtime. (Well, maybe with if (type()...)
The std library's shared_ptr doesn't have such a requirement.
...
Upon further thought, is the answer that this provides flexibility?
If yes, how does this provide flexibility?
It saves a memory allocation. When you write:
std::shared_ptr<Foo> pFoo{new Foo(bar)};
pFoo actually has a pointer to a shared data structure (allocated on the heap), which has the reference counters, and the pointer to the actual Foo object. By making objects be derived from RefBase, you can embed the reference counts in the object itself (saving the additional memory allocation).
Interestingly, with C++11 onwards, you can avoid the additional memory allocation by using std::make_shared<Foo> which will do a single memory allocation and construct the shared data structure and the Foo object in it.
The fact that there is no compile-time check of the derivation from RefBase is carelessness. mPtr should have been declared as RefBase* mPtr, and then operator* (etc.) should have done a static_cast to T*. In fact, I would probably have made sp<T> inherit from an sp_base which had the comparison operators public and the other functions protected.
Edit
On second thoughts, there is quite a bit of compile time checking. If T doesn't have an incStrong member, the compilation will fail, and it almost certainly won't unless it derives from RefBase. I still think converting a T* to a RefBase* would have been a better check, but the one that is there is probably good enough.
It automatically allows you to create an sp from any object implementing RefBase, while with shared_ptr you can shoot yourself in the foot when trying to wrap a raw pointer into a shared one.
So while for shared_ptr you might need this:
http://en.cppreference.com/w/cpp/memory/enable_shared_from_this
for sp you can almost safely pass a raw pointer to the sp constructor.
I'm passing a Parcelable object to a fragment by adding it to a bundle while creating the fragment. In one instance, modifying this parcelled object is reflected in the original object, and in another case it is not. I'm a little baffled by this behaviour.
Until now I have assumed that retrieving a parcelled object through a bundle always creates a new object (not sure whether it's a shallow copy or a deep copy).
Someone please clarify Parcelable behaviour.
I was struggling with a similar issue. At first glance it seems that we always obtain a new deep copy of the parcelled objects. Moreover, there are even some StackOverflow answers which suggest using the Parcelable interface to clone objects. All this just increases the confusion around the subject.
Here is what I've found after a lot of searching and googling:
Take a closer look at the official Parcel documentation. Here is the important quote:
An unusual feature of Parcel is the ability to read and write active
objects. For these objects the actual contents of the object is not
written, rather a special token referencing the object is written.
When reading the object back from the Parcel, you do not get a new
instance of the object, but rather a handle that operates on the
exact same object that was originally written.
Ok, as you can see, there are some special objects that are not copied during unparceling. But this is still a bit confusing. Does it mean we get another strong reference to the original object, which prevents its garbage collection? And what are the use cases for such objects?
To answer the aforementioned questions I decided to look through the Android source code. The methods I was looking for are readStrongBinder and writeStrongBinder which according to the docs do not cause a new object creation when the parcels are sent/received. And I think I found the desired answer in the ResultReceiver.java class. Here is the interesting line:
mReceiver = IResultReceiver.Stub.asInterface(in.readStrongBinder());
To understand what this line is actually doing, we should go to the official AIDL documentation. Here are the most important parts:
The steps a calling class must take to call a remote interface defined
with AIDL:
...
5. In your implementation of onServiceConnected(), you will receive an
IBinder instance (called service). Call
YourInterfaceName.Stub.asInterface((IBinder)service) to cast the
returned parameter to YourInterface type.
A few comments on calling an IPC service:
Objects are reference counted across processes.
So let's put all things together:
The parcelled objects can be extracted without involving a deep-copy process.
If a parcelled object is read using the readStrongBinder method, no new instance is created. We just obtain a new reference to the original object, and this reference can prevent its deallocation.
To know whether our object will be deep-copied after the parcel has been received, we should take a closer look at the concrete Parcelable interface implementation.
Android documentation can be really confusing and it may take a lot of time to understand it correctly.
Hope this info will help you.
If you want to read about a real-world example when the confusion regarding Parcelable objects can cause serious problems check out my blog post.
According to the standard, polymorphism with a missing virtual destructor leads to undefined behavior. In practice, it really leads to the derived class's destructor not being called when the object is deleted through a base-class pointer. However, does it also lead to memory leaks in any common compilers/systems? I'm particularly interested in g++ on Android/Linux.
Specifically, I'm referring to whether the deletion of memory for the derived class will somehow leak. Consider:
class Base {};

class Derived : public Base {
    int x;
};
If I delete a Base* to a Derived, will I leak 4 bytes? Or does the memory allocator already know how many bytes to free based on the allocation?
It certainly can do. Consider:
class A
{
public:
virtual void func() {}
};
class B : public A
{
public:
void func() { s = "Some Long String xxxxxx"; }
private:
std::string s;
// destructor of B will call `std::string` destructor.
};
A* func(bool b)
{
if (b)
return new B;
return new A;
}
...
A* a = func(true);
...
delete a;
Now, this will create a memory leak, as std::string s in the B object is not freed by A::~A - you need to call B::~B, which will only happen if the destructor is virtual.
Note that this applies to ALL compilers and all runtime systems that I'm aware of (which is all the common ones and some not so common ones).
Edit:
Based on the updated actual question: Memory de-allocation happens based on the allocated size, so if you can GUARANTEE that there NEVER is a single allocation happening because of the construction/use of the class, then it's safe to not have a virtual destructor. However, this leads to interesting issues if a "customer" of the base-class can make his/her own extension classes. Marking derived classes as final will protect against them being further derived, but if the base class is visible in a header-file that others can include, then you run the risk of someone deriving their own class from Base that does something that allocates.
So, in other words, in something like a PImpl, where the Impl class is hidden inside a source file that nobody else derives from, it's plausible to have this. For most other cases, probably a bad idea.
A missing virtual destructor causes undefined behavior specifically because the compiler cannot know exactly what the side effects of the skipped cleanup might be.
Think of it as the cleanup side of RAII. In that case, if you manage to not clean up despite claiming that you did, side effects might be:
Leaked memory (you allocated something... when do you deallocate it now?)
Deadlocks (you locked something... when do you unlock it now?)
Sockets remaining open (you opened it sometime... but now when do you close it?)
Files remaining open (you opened it sometime... but now when do you flush it?)
Accessing invalid pointers (for example, you updated a pointer to some member... but now when do you unset it?)
Your hard drive gets erased (technically this is a valid answer for any undefined behavior)
This causes undefined behaviour, which means it might also cause memory leaks. In 5.3.5/3 (N4296, C++14), for delete you have:
In the first alternative (delete object), if the static type of the object to be deleted is different from its
dynamic type, the static type shall be a base class of the dynamic type of the object to be deleted and the
static type shall have a virtual destructor or the behavior is undefined. In the second alternative (delete
array) if the dynamic type of the object to be deleted differs from its static type, the behavior is undefined.
I have been using java.nio.ByteBuffers on the NDK side for a while now, and I noticed this article about Android's relationship with JNI, the GC, and the future of ICS: http://android-developers.blogspot.com/2011/11/jni-local-reference-changes-in-ics.html
So... here is the concern:
Since the "pointer" that JNI provides seems to actually be a reference that is managed by the JNI internaly - it could be "moved" or deleted by GC at some point if it is not marked as NewGlobalReference() in JNI method before being passed to c++ classes?
In my JNI methods I take the Direct Buffer address and pass it on to classes that use it, without any
env->NewGlobalRef(jobject);
env->NewLocalRef(jobject);
env->DeleteGlobalRef(jobject);
management.
For now it all works - but is it correct?
Thoughts?
P.S. - I do call free() on the ByteBuffer's memory on exit/in the destructor in C++.
A local reference is only valid for the duration of the JNI method that it is passed to or created in. After that method returns to the JVM, the reference is no longer valid. If you're not breaking that rule, you're OK.
It's a bit unclear what you're asking, so let me try to clarify a few points.
Any jobject type you get in JNI, whether returned from a JNI call like FindClass or passed in as an argument (jobject, jclass, jbyteArray, etc), is a local reference. It has a very short lifespan. If you pass it to NewGlobalRef, you get a global reference in return; this will last until you delete it.
Any JNI function that takes or returns a pointer type is giving you a pointer that's good until something invalidates it. For example, if you call GetStringUTFChars, you get a const char* that's valid until you call ReleaseStringUTFChars.
References are not pointers, and pointers are not references. You can't pass a char* to NewGlobalRef, and you can't dereference a global reference (where "can't" is usually an error or a native crash).
What I assume you're doing is calling GetDirectBufferAddress on a ByteBuffer object, which returns a void* that points to the start of the direct byte buffer. This pointer is valid until the storage is freed. How that happens depends upon how you allocated it:
If you allocated the direct byte buffer with ByteBuffer.allocateDirect(), then Dalvik owns the storage. It will be freed when the ByteBuffer becomes unreachable and is garbage collected.
If you allocated the storage yourself, and associated it with a ByteBuffer with the JNI NewDirectByteBuffer call, then it's valid until you free it.
For the allocateDirect() case, it's very important that your native code stops using the pointer before your managed code discards the ByteBuffer. One way to do this would be to retain a global reference to the ByteBuffer in your native code, and invalidate your buffer pointer at the same time you delete the global reference. Depending on how your code is structured that may not be necessary.
See also the JNI Tips page.