I had two classes, Error and Alert, and a bunch of Rx streams of type Error.
For simplicity, let's assume there are two streams.
private val exampleErrorStream1 = PublishSubject.create<Error>()
private val exampleErrorStream2 = PublishSubject.create<Error>()
The whole point is to map the error streams to alert streams with corresponding names:
private val exampleAlertStream1 = PublishSubject.create<Alert>()
private val exampleAlertStream2 = PublishSubject.create<Alert>()
Also, I declared a map where:
the key is a pair of those streams
the value is the mapper function for that stream transformation
private val errorToAlerts = mutableMapOf<Pair<Subject<Error>, Subject<Alert>>, (Error) -> Alert>(
Pair(exampleErrorStream1, exampleAlertStream1) to { Alert(it.message, 1)},
Pair(exampleErrorStream2, exampleAlertStream2) to { Alert(it.message, 2)}
)
Finally, I run this method once on app start to map those streams:
fun mapErrorsToAlerts() {
errorToAlerts.forEach { (streams, toAlert) ->
val (errorStream, alertStream) = streams
errorStream.map(toAlert).throttleByPriority()
.doOnNext {
Log.d("Alert","New alert: $it")
}
.subscribe(alertStream)
}
}
The only problem I have is throttling alerts depending on one of the Alert fields.
throttleByPriority() is an extension function:
private fun Observable<Alert>.throttleByPriority(): Observable<Alert> {
return this.flatMap {
val time = if (it.priority == 1) 5L else 10L
throttleFirst(time, TimeUnit.SECONDS)
}
}
The throttling does not work the way I imagined.
I have a 10-second window, but the same events are still emitted one after another.
I assume the problem is with flatMap, where all previous Observables are kept alive, but I'm not sure about that and I don't know how to achieve this differently.
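One way to read the symptom: inside flatMap, throttleFirst is called on the whole upstream (this), so every incoming alert subscribes to the source again with a fresh throttle window. A minimal sketch of an alternative is to split the stream with groupBy on the priority and throttle each group separately. It assumes RxJava 3 package names and a simplified Alert data class. The scheduler parameter and the throttleDemo function are hypothetical additions that make the demo run on virtual time.

```kotlin
import io.reactivex.rxjava3.core.Observable
import io.reactivex.rxjava3.core.Scheduler
import io.reactivex.rxjava3.schedulers.Schedulers
import io.reactivex.rxjava3.schedulers.TestScheduler
import io.reactivex.rxjava3.subjects.PublishSubject
import java.util.concurrent.TimeUnit

// Simplified stand-in for the Alert class in the question (assumption).
data class Alert(val message: String, val priority: Int)

// One throttle window per priority: 5 s for priority 1, 10 s otherwise.
fun Observable<Alert>.throttleByPriority(
    scheduler: Scheduler = Schedulers.computation()
): Observable<Alert> =
    groupBy { it.priority }
        .flatMap { group ->
            val seconds = if (group.key == 1) 5L else 10L
            group.throttleFirst(seconds, TimeUnit.SECONDS, scheduler)
        }

// Hypothetical demo driven by virtual time instead of real waiting.
fun throttleDemo(): List<String> {
    val scheduler = TestScheduler()
    val source = PublishSubject.create<Alert>()
    val seen = mutableListOf<String>()
    source.throttleByPriority(scheduler).subscribe { seen.add(it.message) }
    source.onNext(Alert("a", 1))                 // passes, opens the 5 s window
    source.onNext(Alert("b", 1))                 // dropped, window still open
    source.onNext(Alert("c", 2))                 // passes, opens the 10 s window
    scheduler.advanceTimeBy(6, TimeUnit.SECONDS)
    source.onNext(Alert("d", 1))                 // passes, the 5 s window is over
    source.onNext(Alert("e", 2))                 // dropped, the 10 s window is still open
    return seen
}
```

Each priority gets exactly one long-lived throttle window here, instead of a new one per event.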
I have a connection to a Bluetooth device that emits data every 250 ms.
In my ViewModel I wish to subscribe to that data, run some suspending code (which takes approximately 1000 ms to run) and then present the result.
The following is a simple example of what I'm trying to do.
Repository:
class Repo() : CoroutineScope {
private val supervisor = SupervisorJob()
override val coroutineContext: CoroutineContext = supervisor + Dispatchers.Default
private val _dataFlow = MutableSharedFlow<Int>()
private var dataJob: Job? = null
val dataFlow: Flow<Int> = _dataFlow
init {
launch {
var counter = 0
while (true) {
counter++
Log.d("Repo", "emitting $counter")
_dataFlow.emit(counter)
delay(250)
}
}
}
}
The ViewModel:
class VM(app:Application):AndroidViewModel(app) {
private val _reading = MutableLiveData<String>()
val latestReading: LiveData<String> = _reading
init {
viewModelScope.launch(Dispatchers.Main) {
repo.dataFlow
.map {
validateData() //this is where some validation happens it is very fast
}
.flowOn(Dispatchers.Default)
.onEach { // Flow has no forEach operator; onEach runs the block for every value
delay(1000) //this is to simulate the work that is done,
}
.flowOn(Dispatchers.IO)
.map {
transformData() //this will transform the data to be human readable
}
.flowOn(Dispatchers.Default)
.collect {
_reading.postValue(it)
}
}
}
}
As you can see, when data arrives I first validate it to make sure it is not corrupt (on the Default dispatcher). Then I perform some operation on it (saving it and running a long algorithm, on the IO dispatcher). Then I transform it so the application user can understand it (switching back to the Default dispatcher). Finally, I post it to a MutableLiveData so that any subscriber in the UI layer can see the current data (on the Main dispatcher).
I have two questions
a) If validateData fails how can I cancel the current emission and move on to the next one?
b) Is there a way for the dataFlow subscriber working on the viewModel to generate new threads so the delay parts can run in parallel?
the timeline right now looks like the first part, but I want it to run like the second one
Is there a way to do this?
I've tried using buffer(), which as the documentation states "Buffers flow emissions via channel of a specified capacity and runs collector in a separate coroutine." But when I set it to BufferOverflow.SUSPEND I get the behaviour of the first part, and when I set it to BufferOverflow.DROP_OLDEST or BufferOverflow.DROP_LATEST I lose emissions.
I have also tried using .conflate() like so:
repo.dataFlow
.conflate()
.map { ....
Even though the emissions start one after the other, the part with the delay still waits for the previous one to finish before starting the next one.
When I use .flowOn(Dispatchers.Default) for that part, I lose emissions, and when I use .flowOn(Dispatchers.IO) or something like Executors.newFixedThreadPool(4).asCoroutineDispatcher(), each one always waits for the previous one to finish before starting.
Edit 2:
After about 3 hours of experiments this seems to work
viewModelScope.launch(Dispatchers.Default) {
repo.dataFlow
.map {
validateData(it)
}
.flowOn(Dispatchers.Default)
.map {
async {
delay(1000)
it
}
}
.flowOn(Dispatchers.IO) // NOTE (A)
.map {
val result = it.await()
transformData(result)
}
.flowOn(Dispatchers.Default)
.collect {
_reading.postValue(it)
}
}
However, I still haven't figured out how to cancel the emission if validateData fails.
Also, for some reason it only works if I use Dispatchers.IO, Executors.newFixedThreadPool(20).asCoroutineDispatcher() or Dispatchers.Unconfined where I put note (A). Dispatchers.Main does not seem to work (which I expected), but Dispatchers.Default also does not seem to work and I don't know why.
First question: You cannot recover from an exception in the sense of continuing
the collection of the flow. As per the docs, "Flow collection can complete with an exception when an emitter or code inside the operators throw an exception." So once an exception has been thrown, the collection is completed (exceptionally). You can, however, handle the exception by either wrapping your collection in a try/catch block or using the catch() operator.
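As a rough sketch of the catch() variant (the validator, the sample values and the function names below are made up for illustration): the exception is handled, but the values after the failing one are never collected.

```kotlin
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.runBlocking

// Hypothetical validator that throws on a corrupt (negative) reading.
fun validateOrThrow(reading: Int): Int {
    require(reading >= 0) { "corrupt reading: $reading" }
    return reading
}

fun collectUntilFailure(): List<String> = runBlocking {
    val seen = mutableListOf<String>()
    flowOf(1, 2, -1, 4)
        .map { validateOrThrow(it) }
        .catch { e -> seen.add("failed: ${e.message}") } // handles the upstream exception
        .collect { seen.add("value $it") }
    seen // 4 never shows up: the flow ended when -1 failed
}
```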
Second question: You cannot. While the producer (emitting side) can be made concurrent
by using the buffer() operator, collection is always sequential.
As per your diagram, you need fan-out (one producer, many consumers), and you cannot
achieve that with flows. Flows are cold: each time you collect from them, they start
emitting from the beginning.
Fan-out can be achieved using channels, where you can have one coroutine producing
values and many coroutines consuming those values.
Edit: Oh, you meant that the validation failed, not the function itself. In that case you can use the filter() operator.
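A minimal sketch of that idea, with a hypothetical isValid check standing in for validateData(): invalid readings are simply skipped and collection continues with the next value.

```kotlin
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.runBlocking

// Hypothetical check standing in for validateData(); negative readings count as corrupt.
fun isValid(reading: Int): Boolean = reading >= 0

fun validReadings(): List<Int> = runBlocking {
    flowOf(1, -1, 2, -5, 3)
        .filter { isValid(it) } // drops invalid values without ending the flow
        .toList()
}
```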
BroadcastChannel and ConflatedBroadcastChannel are being deprecated. SharedFlow cannot help in your use case because it emits values in a broadcast fashion, meaning the producer waits until all consumers have consumed each value before producing the next one. That is still sequential, and you need parallelism. You can achieve it using the produce() channel builder.
A simple example:
val scope = CoroutineScope(Job() + Dispatchers.IO)
val producer: ReceiveChannel<Int> = scope.produce {
var counter = 0
val startTime = System.currentTimeMillis()
while (isActive) {
counter++
send(counter)
println("producer produced $counter at ${System.currentTimeMillis() - startTime} ms from the beginning")
delay(250)
}
}
val consumerOne = scope.launch {
val startTime = System.currentTimeMillis()
for (x in producer) {
println("consumerOne consumed $x at ${System.currentTimeMillis() - startTime}ms from the beginning.")
delay(1000)
}
}
val consumerTwo = scope.launch {
val startTime = System.currentTimeMillis()
for (x in producer) {
println("consumerTwo consumed $x at ${System.currentTimeMillis() - startTime}ms from the beginning.")
delay(1000)
}
}
val consumerThree = scope.launch {
val startTime = System.currentTimeMillis()
for (x in producer) {
println("consumerThree consumed $x at ${System.currentTimeMillis() - startTime}ms from the beginning.")
delay(1000)
}
}
Observe production and consumption times.
I'm investigating the use of Kotlin Flow within my current Android application.
My application retrieves its data from a remote server via Retrofit API calls.
Some of these APIs return 50,000 data items in 500-item pages.
Each API response contains an HTTP Link header with the complete URL of the next page.
These calls can take up to 2 seconds to complete.
In an attempt to reduce the elapsed time, I have employed a Kotlin Flow to process each page
of data while concurrently making the next page API call.
My flow is defined as follows:
private val persistenceThreadPool = Executors.newFixedThreadPool(3).asCoroutineDispatcher()
private val internalWorkWorkState = MutableStateFlow<Response<List<MyPage>>?>(null)
private val workWorkState = internalWorkWorkState.asStateFlow()
private val myJob: Job
init {
myJob = GlobalScope.launch(persistenceThreadPool) {
workWorkState.collect { page ->
if (page != null) managePage(page)
}
}
}
My recursive function, which fetches all pages, is defined as follows:
private suspend fun managePages(accessToken: String, response: Response<List<MyPage>>) {
when {
result != null -> return
response.isSuccessful -> internalWorkWorkState.emit(response)
else -> {
manageError(response.errorBody())
result = Result.failure()
return
}
}
response.headers().filter { it.first == HTTP_HEADER_LINK && it.second.contains(REL_NEXT) }.forEach {
val parts = it.second.split(OPEN_ANGLE, CLOSE_ANGLE)
if (parts.size >= 2) {
managePages(accessToken, service.myApiCall(accessToken, parts[1]))
}
}
}
private suspend fun managePage(response: Response<List<MyPage>>) {
val pages = response.body()
pages?.let {
persistResponse(it)
}
}
private suspend fun persistResponse(myPage: List<MyPage>) {
val myPageDOs = ArrayList<MyPageDO>()
myPage.forEach { page ->
myPageDOs.add(page.mapDO())
}
database.myPageDAO().insertAsync(myPageDOs)
}
My issues are:
This code does not insert all the data items that I retrieve.
How do I complete the flow when all data items have been retrieved?
How do I complete the GlobalScope job once all the data items have been retrieved and persisted?
UPDATE
By making the following changes I have managed to insert all the data:
private val persistenceThreadPool = Executors.newFixedThreadPool(3).asCoroutineDispatcher()
private val completed = CompletableDeferred<Int>()
private val channel = Channel<Response<List<MyPage>>?>(UNLIMITED)
private val channelFlow = channel.consumeAsFlow().flowOn(persistenceThreadPool)
private val frank: Job
init {
frank = GlobalScope.launch(persistenceThreadPool) {
channelFlow.collect { page ->
if (page == null) {
completed.complete(totalItems)
} else managePage(page)
}
}
}
...
...
...
channel.send(null)
completed.await()
return result ?: Result.success(outputData)
I do not like having to rely on a CompletableDeferred. Is there a better approach to know when the Flow has completed everything?
You are looking for the flow builder and Flow.buffer():
fun getData(): Flow<Data> = flow {
var pageData: List<Data>
var pageUrl: String? = "bla"
while (pageUrl != null) {
TODO("fetch pageData from pageUrl and change pageUrl to the next page")
emitAll(pageData.asFlow()) // emitAll takes a Flow, so convert the page first
}
}
.flowOn(Dispatchers.IO /* no need for a thread pool executor, IO does it automatically */)
.buffer(3)
You can use it just like a normal Flow, iterate, etc. If you want to know the total length of the output, you should calculate it on the consumer with a mutable closure variable. Note you shouldn't need to use GlobalScope anywhere (ideally ever).
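A small sketch of counting on the consumer side. The paged source below is hypothetical (three in-memory pages instead of network calls). Note that collect only returns once the flow has completed, which also tells you when everything has been processed.

```kotlin
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.runBlocking

// Hypothetical paged source: three pages of two items each, no network involved.
fun pagedItems(): Flow<Int> = flow {
    val pages = listOf(listOf(1, 2), listOf(3, 4), listOf(5, 6))
    for (page in pages) emitAll(page.asFlow())
}.buffer(3)

fun countItems(): Int = runBlocking {
    var total = 0                      // mutable closure variable owned by the consumer
    pagedItems().collect { total += 1 }
    total                              // reached only after the flow has completed
}
```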
There are a few ways to achieve the desired behaviour. I would suggest using coroutineScope, which is designed specifically for parallel decomposition. It also provides good cancellation and error handling behaviour out of the box. In conjunction with Channel.close, it makes the implementation pretty simple. Conceptually, the implementation may look like this:
suspend fun fetchAllPages() {
coroutineScope {
val channel = Channel<MyPage>(Channel.UNLIMITED)
launch(Dispatchers.IO){ loadData(channel) }
launch(Dispatchers.IO){ processData(channel) }
}
}
suspend fun loadData(sendChannel: SendChannel<MyPage>){
while(hasMoreData()){
sendChannel.send(loadPage())
}
sendChannel.close()
}
suspend fun processData(channel: ReceiveChannel<MyPage>){
for(page in channel){
// process page
}
}
It works in the following way:
coroutineScope suspends until all of its children have finished, so you don't need CompletableDeferred anymore.
loadData() loads pages in a loop and sends them into the channel. It closes the channel as soon as all pages have been loaded.
processData() receives items from the channel one by one and processes them. The loop finishes once all items have been processed and the channel has been closed.
In this implementation the producer coroutine works independently, with no back-pressure, so it can use a lot of memory if processing is slow. Limit the buffer capacity to make the producer coroutine suspend when the buffer is full.
It might also be a good idea to use the channel's fan-out behaviour and launch multiple processors to speed up the computation.
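The last two points can be sketched together: a bounded channel for back-pressure plus several consumers for fan-out. Pages are plain Ints here, the capacity of 2 and the worker count are arbitrary, and fetchAndProcess / processAllPages are hypothetical names.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.atomic.AtomicInteger

// Pages are plain Ints; real loading and persisting are left out (assumption).
suspend fun fetchAndProcess(pageCount: Int, workers: Int): Int {
    val processed = AtomicInteger(0)
    coroutineScope {
        // Bounded buffer: send() suspends while 2 pages are already waiting.
        val channel = Channel<Int>(capacity = 2)
        launch {
            for (page in 1..pageCount) channel.send(page)
            channel.close() // lets the consumer loops below terminate
        }
        repeat(workers) {
            launch(Dispatchers.Default) {
                // Fan-out: each page is received by exactly one worker.
                for (page in channel) processed.incrementAndGet()
            }
        }
    } // returns only after the producer and all workers have finished
    return processed.get()
}

fun processAllPages(): Int = runBlocking { fetchAndProcess(pageCount = 10, workers = 3) }
```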
I am writing an Android app using RxJava. The app simply collects sensor data and stores it in files when a buffer is full. For this purpose, I am using a PublishProcessor that emits a value every time a sensor event is detected.
I have the following helper classes:
interface RxVariableInterface<T,R>{
var value : T
val observable : R
}
//Value is received on subscribing
class ProcessablePublishVariable<T> (defaultValue: T) : RxVariableInterface<T,PublishProcessor<T>>{
override var value: T = defaultValue
set(value) {
field = value
observable.onNext(value)
}
override val observable = PublishProcessor.create<T>()
}
...
var imuProcessablePublishVariable : ProcessablePublishVariable<SensorProto6> = ProcessablePublishVariable(SensorProto6( ... ))
When a sensor event occurs I just do the following:
imuProcessablePublishVariable.value = SensorProto6(...)
In the listener side, I've created an Observer which does the data packing in text files and subscribes with a buffer operator:
class MySubscriber : Subscriber<List<SensorProto6>> {
var subscription : Subscription? = null
override fun onError(e: Throwable) {
Log.e("RxJavaHAndlerProcessor","${e.stackTrace}")
}
override fun onSubscribe(s: Subscription) {
subscription = s
subscription!!.request(1)
}
override fun onComplete() { ... }
override fun onNext(t: List<SensorProto6>) {
// Post the buffer list to the writer thread
mWorkerHandler?.post(WriterRunnable(t, mContext,SensorProtos.SensorHeader.SensorType.IMU))
subscription!!.request(1)
}
}
...
imuProcessablePublishVariable.observable
.subscribeOn(Schedulers.io())
.buffer(500)
.observeOn(Schedulers.io())
.subscribe(mySubscriber)
Everything is working as expected: I receive lists of sensor readings containing 500 elements. Is there a way to flush the buffer and emit a partial list?
E.g. if the user stops the app when the buffer is 70% full, I would like to retrieve the pending list without waiting for the buffer to be full. Is there another way to implement this functionality?
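One behaviour that may help here: buffer(count) emits whatever it has collected so far when the upstream completes. A minimal sketch, assuming RxJava 3 package names and a buffer of 3 instead of 500 (flushDemo is a hypothetical name):

```kotlin
import io.reactivex.rxjava3.processors.PublishProcessor

fun flushDemo(): List<List<Int>> {
    val processor = PublishProcessor.create<Int>()
    val batches = mutableListOf<List<Int>>()
    processor.buffer(3).subscribe { batches.add(it) }
    (1..4).forEach { processor.onNext(it) } // the first full batch [1, 2, 3] is emitted here
    processor.onComplete()                  // completing flushes the partial batch [4]
    return batches
}
```

Keep in mind that a completed processor cannot be reused, so a new one would be needed when recording starts again.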
My app collects sensor values from the accelerometer at the highest possible sample rate (~200 Hz on my device) and saves the values in a Room database. I also want to frequently update some graphs with the latest measurements, let's say at a refresh rate of 5 times per second. Since the app started collecting the linear acceleration (without g) as well, also at ~200 Hz (so two sensors, each inserting values into the database at roughly 200 Hz), I have noticed a strong decrease in the app's performance, and there is a lag of a few seconds between the collected acceleration values and them showing up in the plot.
From the profiler, my guess is that the RxComputationThread is the bottleneck, since it is active almost all the time due to the Flowables.
I use sample() to limit the receiver updates, since my graphs do not need to update very often. This led to acceptable performance when I collected just one sensor. I saw that RxJava provides an interval() method to limit the emit frequency on the emitter side, but that does not seem to be available to me (Unresolved reference).
Maybe someone has an idea how to improve the performance? I like the concepts of RxJava and Room in general and would like to stick with them, but I am pretty much stuck at this point.
Here is the code I use to observe the Room SQL table and update the graphs:
// Observe changes to the datasource and create a new subscription if necessary
sharedViewModel.dataSource.observe(viewLifecycleOwner, Observer { source ->
Log.d("TAG", "Change observed!")
when (source) {
"acc" -> {
val disposableDataSource =
sharedViewModel.lastSecondsAccelerations
.sample(200, TimeUnit.MILLISECONDS)
.onBackpressureDrop()
.subscribeOn(Schedulers.io())
.subscribe { lastMeasurements ->
Log.d("TAG", Thread.currentThread().name)
if (sharedViewModel.isReset.value == true && lastMeasurements.isNotEmpty()) {
val t =
lastMeasurements.map { (it.time.toDouble() * 1e-9) - (lastMeasurements.last().time.toDouble() * 1e-9) }
val accX = lastMeasurements.map { it.accX.toDouble() }
val accY = lastMeasurements.map { it.accY.toDouble() }
val accZ = lastMeasurements.map { it.accZ.toDouble() }
// Update plots
updatePlots(t, accX, accY, accZ)
}
}
compositeDisposable.clear()
compositeDisposable.add(disposableDataSource)
}
"lin_acc" -> {
val disposableDataSource =
sharedViewModel.lastSecondsLinAccelerations
.sample(200, TimeUnit.MILLISECONDS)
.onBackpressureDrop()
.subscribeOn(Schedulers.io())
.subscribe { lastMeasurements ->
Log.d("TAG", Thread.currentThread().name)
if (sharedViewModel.isReset.value == true && lastMeasurements.isNotEmpty()) {
val t =
lastMeasurements.map { (it.time.toDouble() * 1e-9) - (lastMeasurements.last().time.toDouble() * 1e-9) }
val accX = lastMeasurements.map { it.accX.toDouble() }
val accY = lastMeasurements.map { it.accY.toDouble() }
val accZ = lastMeasurements.map { it.accZ.toDouble() }
// Update plots
updatePlots(t, accX, accY, accZ)
}
}
compositeDisposable.clear()
compositeDisposable.add(disposableDataSource)
}
}
})
The query for getting the last 10 seconds of measurements:
@Query("SELECT * FROM acc_measurements_table WHERE time > ((SELECT MAX(time) from acc_measurements_table)- 1e10)")
fun getLastAccelerations(): Flowable<List<AccMeasurement>>
Thanks for your comments. I have now figured out what the bottleneck was: the huge number of insertion calls, which is not too surprising. It is possible to improve the performance by using some kind of buffer to insert multiple rows at a time.
This is what I added, in case someone runs into the same situation:
class InsertHelper(private val repository: Repository){
var compositeDisposable = CompositeDisposable()
private val measurementListAcc: FlowableList<AccMeasurement> = FlowableList()
private val measurementListLinAcc: FlowableList<LinAccMeasurement> = FlowableList()
fun insertAcc(measurement: AccMeasurement) {
measurementListAcc.add(measurement)
}
fun insertLinAcc(measurement: LinAccMeasurement) {
measurementListLinAcc.add(measurement)
}
init {
val disposableAcc = measurementListAcc.subject
.buffer(50)
.subscribe {measurements ->
GlobalScope.launch {
repository.insertAcc(measurements)
}
measurementListAcc.remove(measurements as ArrayList<AccMeasurement>)
}
val disposableLinAcc = measurementListLinAcc.subject
.buffer(50)
.subscribe {measurements ->
GlobalScope.launch {
repository.insertLinAcc(measurements)
}
measurementListLinAcc.remove(measurements as ArrayList<LinAccMeasurement>)
}
compositeDisposable.add(disposableAcc)
compositeDisposable.add(disposableLinAcc)
}
}
// Dynamic list that can be subscribed on
class FlowableList<T> {
private val list: MutableList<T> = ArrayList()
val subject = PublishSubject.create<T>()
fun add(value: T) {
list.add(value)
subject.onNext(value)
}
fun remove(value: ArrayList<T>) {
list.removeAll(value)
}
}
I basically use a dynamic list to buffer a few dozen measurement samples, then insert them as a whole into the Room database and remove them from the dynamic list. Here is also some information on why batch insertion is faster: https://hackernoon.com/squeezing-performance-from-sqlite-insertions-with-room-d769512f8330
I'm still quite new to Android development, so if you see any mistakes or have suggestions, I appreciate every comment :)
I used a PublishSubject and I was sending messages to it and also I was listening for results. It worked flawlessly, but now I'm not sure how to do the same thing with Kotlin's coroutines (flows or channels).
private val subject = PublishProcessor.create<Boolean>()
...
fun someMethod(b: Boolean) {
subject.onNext(b)
}
fun observe() {
subject.debounce(500, TimeUnit.MILLISECONDS)
.subscribe { /* value received */ }
}
Since I need the debounce operator, I wanted to do the same thing with flows. I created a channel, then tried to create a flow from that channel and listen for changes, but I'm not getting any results.
private val channel = Channel<Boolean>()
...
fun someMethod(b: Boolean) {
channel.send(b)
}
fun observe() {
flow {
channel.consumeEach { value ->
emit(value)
}
}.debounce(500, TimeUnit.MILLISECONDS)
.onEach {
// value received
}
}
What is wrong?
Flow is a cold asynchronous stream, just like an Observable.
Transformations on the flow, such as map and filter, do not trigger flow collection or execution; only terminal operators (e.g. single) trigger it.
The onEach method is just a transformation, so you should replace it with the terminal flow operator collect. You could also use a BroadcastChannel to get cleaner code:
private val channel = BroadcastChannel<Boolean>(1)
suspend fun someMethod(b: Boolean) {
channel.send(b)
}
suspend fun observe() {
channel
.asFlow()
.debounce(500)
.collect {
// value received
}
}
Update: When the question was asked, debounce had an overload with two parameters (as used in the question). That overload no longer exists; there is now one that takes a single argument in milliseconds (Long).
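To illustrate the point about terminal operators, here is a small sketch (the log list and the function name are only for demonstration): declaring the pipeline with onEach runs nothing, and the values only flow once a terminal operator such as collect is called.

```kotlin
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.runBlocking

fun terminalOperatorDemo(): List<String> = runBlocking {
    val log = mutableListOf<String>()
    val pipeline = flowOf(true, false).onEach { log.add("saw $it") }
    log.add("declared")   // nothing has run yet: the flow is cold
    pipeline.collect()    // the terminal operator starts the flow
    log.add("collected")
    log
}
```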
For PublishProcessor/PublishRelay, the equivalent is SharedFlow/MutableSharedFlow:
private val _myFlow = MutableSharedFlow<Boolean>(
replay = 0,
extraBufferCapacity = 1, // you can increase
BufferOverflow.DROP_OLDEST
)
val myFlow = _myFlow.asSharedFlow()
// ...
fun someMethod(b: Boolean) {
_myFlow.tryEmit(b)
}
fun observe() {
myFlow.debounce(500)
.onEach { }
// flowOn(), catch{}
.launchIn(coroutineScope)
}
And StateFlow/MutableStateFlow for BehaviorProcessor/BehaviorRelay:
private val _myFlow = MutableStateFlow<Boolean>(false)
val myFlow = _myFlow.asStateFlow()
// ...
fun someMethod(b: Boolean) {
_myFlow.value = b // same as _myFlow.emit(b) or _myFlow.tryEmit(b)
}
fun observe() {
myFlow.debounce(500)
.onEach { }
// flowOn(), catch{}
.launchIn(coroutineScope)
}
StateFlow must have an initial value. If you don't want that, this is a workaround:
private val _myFlow = MutableStateFlow<Boolean?>(null)
val myFlow = _myFlow.asStateFlow()
.filterNotNull()
MutableStateFlow compares a new value with the current one using equals, so it does not emit the same value again and again, much like applying distinctUntilChanged.
So MutableStateFlow ≈ BehaviorProcessor.distinctUntilChanged(). If you want the exact BehaviorProcessor behavior, you can use this:
private val _myFlow = MutableSharedFlow<Boolean>(
replay = 1,
extraBufferCapacity = 0,
BufferOverflow.DROP_OLDEST
)
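A small sketch of the difference, run on runBlocking's single-threaded event loop (duplicateDemo is a hypothetical name): the same sequence 1, 1, 2 is written to both flows, and only the shared flow delivers the repeated 1.

```kotlin
import kotlinx.coroutines.CoroutineStart
import kotlinx.coroutines.channels.BufferOverflow
import kotlinx.coroutines.flow.MutableSharedFlow
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

fun duplicateDemo(): Pair<List<Int>, List<Int>> = runBlocking {
    val state = MutableStateFlow(0)
    val shared = MutableSharedFlow<Int>(
        replay = 1,
        extraBufferCapacity = 0,
        onBufferOverflow = BufferOverflow.DROP_OLDEST
    )
    val fromState = mutableListOf<Int>()
    val fromShared = mutableListOf<Int>()
    // UNDISPATCHED so that both collectors are subscribed before the first write.
    val stateJob = launch(start = CoroutineStart.UNDISPATCHED) {
        state.collect { fromState.add(it) }
    }
    val sharedJob = launch(start = CoroutineStart.UNDISPATCHED) {
        shared.collect { fromShared.add(it) }
    }
    for (v in listOf(1, 1, 2)) {
        state.value = v      // ignored when equal to the current value
        shared.tryEmit(v)    // always delivered, duplicates included
        yield()              // give the collectors a turn
    }
    stateJob.cancel()
    sharedJob.cancel()
    fromState to fromShared
}
```

The state flow also delivers its initial value 0 to the new collector, while the shared flow starts with an empty replay cache.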
ArrayBroadcastChannel in Kotlin coroutines is the one most similar to PublishSubject.
Like PublishSubject, an ArrayBroadcastChannel can have multiple
subscribers, and all the active subscribers are immediately notified.
Like PublishSubject, events pushed to this channel are lost if there are no active subscribers at the moment.
Unlike PublishSubject, backpressure is built into coroutine channels, and that is where the buffer capacity comes in. The right number depends on the use case the channel is used for. For most normal use cases I just go with 10, which should be more than enough. If you push events to this channel faster than the receivers consume them, you can give it more capacity.
Actually, BroadcastChannel is already obsolete; JetBrains changed their approach to use SharedFlow instead, which is a lot cleaner, easier to implement and solves a lot of pain points.
Essentially, you can achieve the same thing like this:
class BroadcastEventBus {
private val _events = MutableSharedFlow<Event>()
val events = _events.asSharedFlow() // read-only public view
suspend fun postEvent(event: Event) {
_events.emit(event) // suspends until subscribers receive it
}
}
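A usage sketch under a few assumptions: events are plain Strings because the Event type is not shown, and StringEventBus / eventBusDemo are hypothetical names. It shows the PublishSubject-like property mentioned above, where an event posted before anyone subscribes is lost.

```kotlin
import kotlinx.coroutines.CoroutineStart
import kotlinx.coroutines.flow.MutableSharedFlow
import kotlinx.coroutines.flow.asSharedFlow
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

// Same shape as the bus above, with String standing in for Event (assumption).
class StringEventBus {
    private val _events = MutableSharedFlow<String>()
    val events = _events.asSharedFlow()
    suspend fun postEvent(event: String) = _events.emit(event)
}

fun eventBusDemo(): List<String> = runBlocking {
    val bus = StringEventBus()
    val received = mutableListOf<String>()
    bus.postEvent("early") // no subscriber yet, so this event is lost
    val job = launch(start = CoroutineStart.UNDISPATCHED) {
        bus.events.collect { received.add(it) }
    }
    bus.postEvent("late")  // delivered to the active subscriber
    yield()
    job.cancel()           // a shared flow never completes, so stop collecting explicitly
    received
}
```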
To read more about it, check out Roman's Medium article:
"Shared flows, broadcast channels" by Roman Elizarov