Can I modify the data set of a PagingDataAdapter using peek()?

I am looking for a way to update specific items in my PagingDataAdapter from the Paging 3 library. The recommended way at the moment seems to be to invalidate the PagingSource but this causes the adapter to fetch the whole data set again, which is not efficient and also shows my loading spinner.
However, I noticed that I can access and modify items in the adapter using the peek() method and it seems to work quite well. Am I missing anything here? Will this fall apart in certain scenarios? I know that it's good practice to keep data classes immutable but this approach makes my life a lot easier.
Here is an example of my usage:
viewModel.chatMessageUpdateEvents.collect { messageEvent ->
    when (messageEvent) {
        is FirestoreChatMessageListener.ChatMessageUpdateEvent.MessageUpdate -> {
            val update = messageEvent.chatMessage
            val historyAdapterItems = chatMessagesHistoryAdapter.snapshot().items
            val updatedMessage =
                historyAdapterItems.find { chatMessage ->
                    chatMessage.documentId == messageEvent.chatMessage.documentId
                }
            if (updatedMessage != null) {
                val messagePosition = historyAdapterItems.indexOf(updatedMessage)
                chatMessagesHistoryAdapter.peek(messagePosition)?.unsent = update.unsent
                chatMessagesHistoryAdapter.peek(messagePosition)?.imageUrl = update.imageUrl
                chatMessagesHistoryAdapter.notifyItemChanged(messagePosition)
            }
        }
    }
}

I replied in a separate comment but wanted to post here for visibility.
This is really not recommended and a completely unsupported usage of paging.
One of the primary ways of restoring state if Pager().flow hasn't been cleaned up (say, if the ViewModel hasn't been cleared yet) is via the .cachedIn(scope) method, which will cache out-of-date data in your case. This is also the only way to multicast (make the loaded data in PagingData re-usable) for usage in Flow operations like .combine() that allow you to mix transformations with external signals.
You'll also need to handle races with in-flight loads: what happens if you get a messageEvent at the same time an append finishes? Who wins in that case, and is it possible that a new page is inserted after you take the .snapshot(), so that your notify position is no longer correct?
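The position race can be reproduced without Paging at all. The following is a minimal plain-Java sketch, not Paging code: an ordinary list stands in for the items the adapter presents, and the names StalePositionDemo, indexOfId and the message ids are hypothetical. It only illustrates that an index computed from a snapshot points at a different item once a page is inserted ahead of it.

```java
import java.util.ArrayList;
import java.util.List;

class StalePositionDemo {

    // Hypothetical helper: position of a message id in a list, or -1 if absent.
    static int indexOfId(List<String> items, String id) {
        return items.indexOf(id);
    }

    // Returns the id found at the snapshot-derived position after a page was prepended.
    static String itemAtStalePosition() {
        // Stand-in for the items currently presented by the adapter.
        List<String> presented = new ArrayList<>(List.of("m3", "m4", "m5"));

        // Take a snapshot and look up the message we want to update.
        List<String> snapshot = List.copyOf(presented);
        int position = indexOfId(snapshot, "m4");

        // A load finishes between taking the snapshot and using the position.
        presented.addAll(0, List.of("m1", "m2"));

        // The old position now refers to a different message.
        return presented.get(position);
    }

    public static void main(String[] args) {
        System.out.println(itemAtStalePosition()); // prints "m2", not "m4"
    }
}
```

In this model the update meant for "m4" would land on "m2", which is the kind of mismatch the question about notify positions is pointing at.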
In general it's much simpler to have a single source of truth and this is the recommended path, so the advice has always been to invalidate on every backing dataset update.
There is an open FR in Paging's issue tracker to support Flow<Item> or Flow<Page> style data to allow granular updates, but it's certainly a farther future thing: https://issuetracker.google.com/160232968

Related

Paging 3 RemoteMediator and SavedStateHandle

I'm using the RemoteMediator in an app to load page-keyed data. Everything works fine, except that the data is refreshed after process death.
My current implementation is:
val results = savedStateHandle.get<String>("query").flatMapLatest { query ->
    repository.getPager(
        query = query,
    )
}.cachedIn(viewModelScope)
I do know about the initialize() function of RemoteMediator, but how do I tie it in with process death?

As you found out, .cachedIn just operates in memory, so it won't survive process death. You cannot rely on Paging's internal in-memory cache of items for this; you need to cache the loaded items on disk.
I would recommend using something like Room or some dedicated persistence layer that is actually built to handle large lists of arbitrary data classes.
I would not recommend trying to serialize and stash the entire list of data into SavedState, as this could become prohibitively expensive quite quickly.
For your other point on RemoteMediator: it is just a "dumb" callback which has no influence on what Paging actually loads or displays. It's simply a way for you to write custom logic which is triggered during edge-case conditions in Paging. You probably only want this if you are already using a layered approach and trying to skip remote REFRESH. If that is your case, the RemoteMediator.initialize function is guaranteed to complete before Paging starts loading, which means you can check whether you are coming from SavedState and there is already cached data, and if so, you can skip remote REFRESH by returning InitializeAction.SKIP_INITIAL_REFRESH.

Can we use LiveData without losing any value?

I would like to use LiveData to handle a kind of notification between a custom view and its wrapping fragment, since it is already lifecycle-aware. But it seems that LiveData may lose values: it only updates to its most recent state, and it won't deliver values while its observers are inactive.
I've looked at SingleLiveEvent from the Google code samples, but that solution does not seem to be battle-tested yet, and the ticket is still open, with recent attempts to improve it.
So I am looking for a simple way to get notified about events without having to worry about lifecycles (which is why I went for LiveData as a first solution), and one that can handle multiple observers.
Is there an existing solution for that? If I try to implement it myself, I am sure I will end up with at least one anti-pattern.
One easy way (perhaps too easy) is to use callbacks, but I need this feature for several callbacks in my component, which leads to a poor architecture. I also want a subscription system, meaning that there could be more than one observer.
Another way could be to use RxJava and transform it into a LiveData with LiveDataReactiveStreams.fromPublisher(), but then the question is whether I will get all values or only the last one. That's the closest solution I could come up with.
As an interesting alternative, there could be AutoDispose or RxLifecycle. An interesting resource I've found: Blog post on LiveData
What are your thoughts and suggestions?
Also, please note that I need this communication to go from a component wrapped in a Fragment (ChessBoard) to another Fragment (ChessHistory), so both are lifecycle-aware.

It is not ideal, but this does the trick for me:
import androidx.annotation.MainThread
import androidx.lifecycle.MutableLiveData
import java.util.LinkedList
import java.util.Queue

/**
 * This LiveData will deliver values even when they are
 * posted very quickly one after another.
 */
class ValueKeeperLiveData<T> : MutableLiveData<T>() {

    private val queuedValues: Queue<T> = LinkedList<T>()

    @Synchronized
    override fun postValue(value: T) {
        // We queue the value to ensure it is delivered
        // even if several ones are posted right after.
        // Then we call the base, which will eventually
        // call setValue().
        queuedValues.offer(value)
        super.postValue(value)
    }

    @MainThread
    @Synchronized
    override fun setValue(value: T) {
        // We first try to remove the value from the queue just
        // in case this line was reached from postValue(),
        // otherwise we will have it duplicated in the queue.
        queuedValues.remove(value)
        // We queue the new value and finally deliver the
        // entire queue of values to the observers.
        queuedValues.offer(value)
        while (!queuedValues.isEmpty())
            super.setValue(queuedValues.poll())
    }
}
The main problem with this solution is that if the observers are inactive at the time the values are delivered via super.setValue(), then the values will be lost regardless. However, it solves the issue of losing values when several new ones are posted very quickly – which, in my opinion, is usually a bigger problem than losing values because your observer is inactive. After all, you can always do myLiveData.observeForever() from a non-lifecycle-aware object in order to receive all notifications.
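To make the difference concrete, here is a simplified plain-Java model of the idea. It is not LiveData itself: ConflatingHolder and QueueingHolder are hypothetical names, and drain() stands in for the moment the main thread gets around to delivering. The first holder keeps only the latest pending value, which is how postValue() is documented to behave when several values are posted before the posted task runs; the second queues every value, like the class above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class PostValueModel {

    // Keeps only the most recent value posted since the last delivery.
    static class ConflatingHolder<T> {
        private T pending;
        private boolean hasPending;

        void post(T value) {
            pending = value;
            hasPending = true;
        }

        // Simulates the delivery step: at most one value comes out.
        List<T> drain() {
            List<T> delivered = new ArrayList<>();
            if (hasPending) {
                delivered.add(pending);
                pending = null;
                hasPending = false;
            }
            return delivered;
        }
    }

    // Queues every posted value and delivers all of them in order.
    static class QueueingHolder<T> {
        private final Queue<T> pending = new ArrayDeque<>();

        void post(T value) {
            pending.offer(value);
        }

        List<T> drain() {
            List<T> delivered = new ArrayList<>();
            while (!pending.isEmpty()) {
                delivered.add(pending.poll());
            }
            return delivered;
        }
    }

    static List<Integer> conflated() {
        ConflatingHolder<Integer> holder = new ConflatingHolder<>();
        holder.post(1);
        holder.post(2);
        holder.post(3);
        return holder.drain();
    }

    static List<Integer> queued() {
        QueueingHolder<Integer> holder = new QueueingHolder<>();
        holder.post(1);
        holder.post(2);
        holder.post(3);
        return holder.drain();
    }

    public static void main(String[] args) {
        System.out.println(conflated()); // [3]
        System.out.println(queued());    // [1, 2, 3]
    }
}
```

Neither holder models inactive observers, so the limitation described above still applies to the queueing variant.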
Not sure this will be enough for you, but I hope it can help you or give you some ideas about how to implement your own approach.

Paging Library with custom DataSource not updating row on Room update

I have been implementing the new Paging Library with a RecyclerView with an app built on top of the Architecture Components.
The data to fill the list is obtained from the Room database. In fact, it is fetched from the network, stored on the local database and provided to the list.
In order to provide the necessary data to build the list, I have implemented my own custom PageKeyedDataSource. Everything works as expected except for one little detail. Once the list is displayed, if any change occurs to the data of a list's row element, it is not automatically updated. So, if for example my list is showing a list of items which have a field name, and suddenly, this field is updated in the local Room database for a certain row item, the list does not update the row UI automatically.
This behaviour only happens when using a custom DataSource unlike when the DataSource is obtained automatically from the DAO, by returning a DataSource Factory directly. However, I need to implement a custom DataSource.
I know it could be updated by calling the invalidate() method on the DataSource to rebuild the updated list. However, if the app is showing 2 lists at a time (half screen each for example), and this item appears in both lists, it would be needed to call invalidate() for both lists separately.
I have thought of a solution in which, instead of using an instance of the item's class to fill each ViewHolder, each row uses a LiveData-wrapped version of it, so that the row observes changes to its own item and updates its UI when necessary. Nevertheless, I see some downsides to this approach:
1. A LifecycleOwner (such as the Fragment containing the RecyclerView) must be passed to the PagedListAdapter and then forwarded to the ViewHolder in order to observe the LiveData-wrapped item.
2. A new observer will be registered for each new row of the list, so I do not know whether it has an excessive computational and memory cost, considering it would be done for every list in the app, which has a lot of lists in it.
3. As the LifecycleOwner observing the LiveData-wrapped item would be, for example, the Fragment containing the RecyclerView rather than the ViewHolder itself, the observer will be notified every time a change to that item occurs, even if the row containing that item is not visible at that moment because the list has been scrolled. That seems like a waste of resources that could increase the computational cost unnecessarily.
I do not know at all if, even considering those downsides, it could seem like a decent approach or, maybe, if any of you know any other cleaner and better way to manage it.
Thank you in advance.

It has been quite some time since I last checked this question, but for anyone interested, here is the cause of my issue, plus a library I made to observe LiveData properly from a ViewHolder (to avoid having to use the workaround explained in the question).
My specific issue was due to a bad use of Kotlin's data classes. When using them, it is important to note that (as explained in the docs) toString(), equals(), hashCode() and copy() only take into account the properties declared in the class's primary constructor, ignoring those declared in the class body. A simple example:
data class MyClass1(val prop: Int, val name: String) {}

data class MyClass2(val prop: Int) {
    var name: String = ""
}

fun main() {
    val a = MyClass1(1, "a")
    val b = MyClass1(1, "b")
    println(a == b) // False :) -> a.name != b.name

    val c = MyClass2(2)
    c.name = "c"
    val d = MyClass2(2)
    d.name = "d"
    println(c == d) // True!! :O -> But c.name != d.name
}
This is especially important when implementing the PagedListAdapter's DiffCallback: in a scenario like MyClass2 in the example, no matter how many times we update the name field in our Room database, the DiffCallback's areContentsTheSame() method is probably always going to return true, so the list never updates on that change.
If the reason explained above is not the reason of your issue, or you just want to be able to observe LiveData instances properly from a ViewHolder, I developed a small library which provides a Lifecycle to any ViewHolder, making it able to observe LiveData instances the proper way (instead of having to use the workaround explained in the question).
https://github.com/Sarquella/LifecycleCells

Creating Observable without using Observable.create

I am using RxJava in my Android app and I want to load data from the database.
In this way, I am creating a new Observable using Observable.create() which returns a list of EventLog
public Observable<List<EventLog>> loadEventLogs() {
    return Observable.create(new Observable.OnSubscribe<List<EventLog>>() {
        @Override
        public void call(Subscriber<? super List<EventLog>> subscriber) {
            List<DBEventLog> logs = new Select().from(DBEventLog.class).execute();
            List<EventLog> eventLogs = new ArrayList<>(logs.size());
            for (int i = 0; i < logs.size(); i++) {
                eventLogs.add(new EventLog(logs.get(i)));
            }
            subscriber.onNext(eventLogs);
        }
    });
}
Though it works correctly, I read that using Observable.create() is not actually a best practice for Rx Java (see here).
So I changed this method in this way.
public Observable<List<EventLog>> loadEventLogs() {
    return Observable.fromCallable(new Func0<List<EventLog>>() {
        @Override
        public List<EventLog> call() {
            List<DBEventLog> logs = new Select().from(DBEventLog.class).execute();
            List<EventLog> eventLogs = new ArrayList<>(logs.size());
            for (int i = 0; i < logs.size(); i++) {
                eventLogs.add(new EventLog(logs.get(i)));
            }
            return eventLogs;
        }
    });
}
Is this a better approach using RxJava? Why? What is actually the difference between the two methods?
Moreover, since the database loads a list of elements, does it make sense to emit the entire list at once? Or should I emit one item at a time?

The two methods may look and behave similarly, but fromCallable deals with the difficulties of backpressure for you, whereas the create version does not. Dealing with backpressure inside an OnSubscribe implementation ranges from simple to outright mind-melting; however, if omitted, you may get MissingBackpressureExceptions along asynchronous boundaries (such as observeOn) or even on continuation boundaries (such as concat).
RxJava tries to offer proper backpressure support for as many factories and operators as possible; however, there are quite a few factories and operators that can't support it.
The second problem with manual OnSubscribe implementation is the lack of cancellation support, especially if you generate a lot of onNext calls. Many of these can be replaced by standard factory methods (such as from) or helper classes (such as SyncOnSubscribe) that deal with all the complexity for you.
You may find a lot of introductions and examples that (still) use create for two reasons.
1. It is much easier to introduce push-based datastreams by showing how the push of events works in an imperative fashion. In my opinion, such sources spend proportionally too much time on create instead of talking about the standard factory methods and showing how certain common tasks (such as yours) can be achieved safely.
2. Many of these examples were created at a time when RxJava didn't require backpressure support or even proper synchronous cancellation support, or were just ported from the Rx.NET examples (which to date doesn't support backpressure, and synchronous cancellation works somehow, courtesy of C# I guess). Generating values by calling onNext was worry-free back then. However, such use does lead to buffer bloat and excessive memory usage; therefore, the Netflix team came up with a way of limiting the memory usage by requiring observers to state how many items they are willing to process. This became known as backpressure.
For the second question, namely if one should create a List or a sequence of values, it depends on your source. If your source supports some kind of iteration or streaming of individual data elements (such as JDBC), you can just hook onto it and emit one by one (see SyncOnSubscribe). If it doesn't support it or you need it in List form anyway, then keep it as it is. You can always convert between the two forms via toList and flatMapIterable if necessary.

As explained in the answer you linked, with Observable.create you may end up violating the advanced requirements of RxJava.
For example, you'll need to implement backpressure and handle unsubscription yourself.
In your case, you want to emit an item without having to deal with backpressure or subscription. So Observable.fromCallable is a good call. RxJava will deal with the rest.

RxJava and Cached Data

I'm still fairly new to RxJava and I'm using it in an Android application. I've read a metric ton on the subject but still feel like I'm missing something.
I have the following scenario:
I have data stored in the system which is accessed via various service connections (AIDL) and I need to retrieve data from this system (1-n number of async calls can happen). Rx has helped me a ton in simplifying this code. However, this entire process tends to take a few seconds (upwards of 5 seconds+) therefore I need to cache this data to speed up the native app.
The requirements at this point are:
Initial subscription, the cache will be empty, therefore we have to wait the required time to load. No big deal. After that the data should be cached.
Subsequent loads should pull the data from cache, but then the data should be reloaded and the disk cache should be updated behind the scenes.
The Problem: I have two Observables - A and B. A contains the nested Observables that pull data from the local services (tons going on here). B is much simpler. B simply contains the code to pull the data from disk cache.
Need to solve:
a) Return a cached item (if cached) and continue to re-load the disk cache.
b) Cache is empty, load the data from system, cache it and return it. Subsequent calls go back to "a".
I've had a few folks recommend a few operations such as flatmap, merge and even subjects but for some reason I'm having trouble connecting the dots.
How can I do this?

Here are a couple options on how to do this. I'll try to explain them as best I can as I go along. This is napkin-code, and I'm using Java8-style lambda syntax because I'm lazy and it's prettier. :)
Option 1: A subject, like AsyncSubject, would be perfect if you could keep these as instance states in memory, although it sounds like you need to store these to disk. However, I think this approach is worth mentioning just in case you are able to. Also, it's just a nifty technique to know.
AsyncSubject is an Observable that only emits the LAST value published to it (a Subject is both an Observer and an Observable), and it only starts emitting after onCompleted has been called. Thus, anything that subscribes after completion still receives that last value.
In this case, you could have (in an application class or other singleton instance at the app level):
public class MyApplication extends Application {
    private final AsyncSubject<Foo> foo = AsyncSubject.create();

    /** Asynchronously gets foo and stores it in the subject. */
    public void fetchFooAsync() {
        // Gets the observable that does all the heavy lifting.
        // It should emit one item and then complete.
        FooHelper.getTheFooObservable().subscribe(foo);
    }

    /** Provides the foo for any consumers who need a foo. */
    public Observable<Foo> getFoo() {
        return foo;
    }
}
Option 2: Deferring the Observable. Observable.defer lets you wait to create an Observable until it is subscribed to. You can use this to allow the disk cache fetch to run in the background, and then return the cached version or, if not in cache, make the real deal.
This version assumes that your getter code, both cache fetch and non-cache creation, consists of blocking calls, not observables, and that the defer does the work in the background. For example:
public Observable<Foo> getFoo() {
    return Observable.defer(() -> {
        if (FooHelper.isFooCached()) {
            return Observable.just(FooHelper.getFooFromCacheBlocking());
        }
        return Observable.just(FooHelper.createNewFooBlocking());
    }).subscribeOn(Schedulers.io());
}
Option 3: Use concatWith and take. Here we assume our method to get the Foo from the disk cache either emits a single item and completes or else just completes without emitting, if empty.
public Observable<Foo> getFoo() {
    return FooHelper.getCachedFooObservable()
            .concatWith(FooHelper.getRealFooObservable())
            .take(1);
}
That method should only attempt to fetch the real deal if the cached observable finished empty.
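The same cache-first idea can be sketched outside RxJava. The following is a plain-Java analogy, not the RxJava operators themselves: firstAvailable and realCallsWhen are hypothetical names, and an empty Optional plays the role of a cached observable that completes without emitting. The real source is only consulted when the cache comes back empty.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class CacheFirstDemo {

    // Returns the cached value if there is one; otherwise asks the real source.
    static <T> Optional<T> firstAvailable(Supplier<Optional<T>> cache,
                                          Supplier<Optional<T>> real) {
        Optional<T> cached = cache.get();
        if (cached.isPresent()) {
            return cached;
        }
        return real.get();
    }

    // Counts how often the (expensive) real source is invoked for a given cache state.
    static int realCallsWhen(Optional<String> cachedValue) {
        AtomicInteger realCalls = new AtomicInteger();
        CacheFirstDemo.<String>firstAvailable(
                () -> cachedValue,
                () -> {
                    realCalls.incrementAndGet();
                    return Optional.of("fresh");
                });
        return realCalls.get();
    }

    public static void main(String[] args) {
        System.out.println(realCallsWhen(Optional.of("cached"))); // 0: cache hit, real source skipped
        System.out.println(realCallsWhen(Optional.empty()));      // 1: cache miss, real source used
    }
}
```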
Option 4: Use amb or ambWith. This is probably one of the craziest solutions, but fun to point out. amb basically takes a couple (or more with the overloads) of observables and waits until one of them emits an item, then it completely discards the other observable and just takes the one that won the race. The only way this would be useful is if it's possible for the computation step of creating a new Foo to be faster than fetching it from disk. In that case, you could do something like this:
public Observable<Foo> getFoo() {
    return Observable.amb(
            FooHelper.getCachedFooObservable(),
            FooHelper.getRealFooObservable());
}
I kinda prefer Option 3. As far as actually caching it, you could have something like this at one of the entry points (preferably before we're gonna need the Foo, since as you said this is a long-running operation). Later consumers should get the cached version as long as it has finished writing. Using an AsyncSubject here may help as well, to make sure we don't trigger the work multiple times while waiting for it to be written. The consumers would only get the completed result, but again, that only works if it can be reasonably kept around in memory.
if (!FooHelper.isFooCached()) {
    getFoo()
            .subscribeOn(Schedulers.io())
            .subscribe((foo) -> FooHelper.cacheTheFoo(foo));
}
Note that you should either keep around a single-thread scheduler meant for disk writing (and reading) and pass it to .observeOn(...) after .subscribeOn(...), or otherwise synchronize access to the disk cache to prevent concurrency issues.
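One way to picture the single-thread option: every cache read and write is submitted to one single-thread executor, so the operations run one at a time in submission order. The sketch below is plain Java with hypothetical names (DiskCacheGate, and an in-memory map standing in for the disk). In RxJava 1.x, such an executor could be wrapped with Schedulers.from(executor) and used as the scheduler for the cache work.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DiskCacheGate {

    // All cache access is funnelled through this one thread.
    private final ExecutorService diskExecutor = Executors.newSingleThreadExecutor();

    // Stand-in for the disk cache; only ever touched on diskExecutor.
    private final Map<String, String> fakeDisk = new HashMap<>();

    Future<?> write(String key, String value) {
        Runnable task = () -> fakeDisk.put(key, value);
        return diskExecutor.submit(task);
    }

    Future<String> read(String key) {
        Callable<String> task = () -> fakeDisk.get(key);
        return diskExecutor.submit(task);
    }

    void shutdown() {
        diskExecutor.shutdown();
    }

    // Two writes followed by a read: the read runs after both writes have finished.
    static String writeThenRead() {
        DiskCacheGate gate = new DiskCacheGate();
        try {
            gate.write("foo", "v1");
            gate.write("foo", "v2");
            return gate.read("foo").get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            gate.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead()); // v2
    }
}
```

Because the map is never touched from any other thread, no extra locking is needed in this model; the alternative mentioned above would be to keep multiple threads and synchronize access instead.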

I've recently published a library on GitHub for Android and Java, called RxCache, which meets your needs for caching data using observables.
RxCache implements two caching layers (memory and disk) and provides several annotations to configure the behaviour of every provider.
It is highly recommended for use with Retrofit for data retrieved from HTTP calls. Using a lambda expression, you can write something like the following:
rxCache.getUser(retrofit.getUser(id), () -> true).flatmap(user -> user);
I hope you will find it interesting :)

Take a look at the project below. This is my personal take on things and I have used this pattern in a number of apps.
https://github.com/zsiegel/rxandroid-architecture-sample
Take a look at the PersistenceService. Rather than hitting the database (or MockService in the example project) you could simply have a local list of users that are updated with the save() method and just return that in the get().
Let me know if you have any questions.
