Is it safe to use unsynchronized mutable state with Kotlin Flow?

Is the following code safe, and why?
val flow: Flow<String> = ...
val allStrings = mutableListOf<String>()
var sum = 0

flow.transform {
    allStrings += it
    emit(it.toInt())
}.collect {
    sum += it
}
The following test demonstrates the collect {} body being called from different threads:
val ctx = newFixedThreadPoolContext(32, "my-context")
runBlocking(ctx) {
    val f = flow<Int> {
        (1..1000).forEach {
            emit(it)
        }
    }
    var t: Thread? = null
    f.collect {
        delay(1)
        // this requirement will fail
        if (t == null) t = Thread.currentThread() else require(t == Thread.currentThread())
    }
}
And another one that tests publication:
fun main(args: Array<String>) {
    val ctx = newFixedThreadPoolContext(32, "my-context")
    runBlocking(ctx) {
        val f = flow<Int> {
            (1..1000000).forEach {
                emit(it)
            }
        }
        var c = 0
        f.transform {
            c += 1
            boo()
            c += 1
            emit(it)
        }.collect {
            c += 1
            boo()
            c += 1
        }
        println(c) // prints 5_000_000
    }
}

suspend fun boo() {
    withContext(Dispatchers.IO) {
    }
}
Therefore it seems Kotlin Flow ensures publication between coroutine invocations, but is this intentional (or even documented) or just an implementation side effect?

Yes, the code is safe. Flows are sequential by default.
Each individual collection of a flow is performed sequentially unless special operators that operate on multiple flows are used. The collection works directly in the coroutine that calls a terminal operator (collect). No new coroutines are launched by default. Each emitted value is processed by all the intermediate operators from upstream to downstream and is then delivered to the terminal operator after.
Kotlin Flows are based on suspending functions and they are completely sequential, but not single-threaded.
Therefore it seems Kotlin Flow ensures publication between coroutine invocations, but is this intentional (or even documented) or just an implementation side effect?
Here it says:
a flow is a type that can emit multiple values sequentially
According to this, I understand that no new value will be collected until the previous value has been processed, no matter how many threads are involved.
As Roman mentions in his comment, here is a good article about sequential execution. A great quote from it:
Even though a coroutine in Kotlin can execute on multiple threads it is just like a thread from a standpoint of mutable state. No two actions in the same coroutine can be concurrent. And just like with threads you should avoid sharing your mutable state between coroutines or you’ll have to worry about synchronization yourself.
Avoid sharing mutable state. Confine each mutable object to either a single thread or to a single coroutine and sleep well.
And this is applicable to Flows, because collection of a Flow works directly in the coroutine that calls the terminal operator; no new coroutines are launched by default.
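To make this concrete, here is a minimal, self-contained sketch in the spirit of the question's code (not from the original post): the mutable state is confined to the single collecting coroutine, so no synchronization is needed even though the coroutine resumes on different worker threads.

import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

fun main() = runBlocking(Dispatchers.Default) { // multi-threaded dispatcher, as in the question
    val flow = flow {
        (1..1000).forEach { emit(it.toString()) }
    }

    // Both variables are confined to this one collecting coroutine, so plain
    // unsynchronized access is safe even though the withContext hop below makes
    // the coroutine resume on different worker threads.
    val allStrings = mutableListOf<String>()
    var sum = 0

    flow.transform {
        allStrings += it
        withContext(Dispatchers.IO) { } // force a thread hop, like boo() in the question
        emit(it.toInt())
    }.collect {
        sum += it
    }

    println("collected ${allStrings.size} strings, sum = $sum") // 1000 strings, sum = 500500
}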

Related

kotlin, how to run sequential background threads [duplicate]

I have an instance of CoroutineScope and a log() function which look like the following:
private val scope = CoroutineScope(Dispatchers.IO)

fun log(message: String) = scope.launch { // launching a coroutine
    println("$message")
    TimeUnit.MILLISECONDS.sleep(100) // some blocking operation
}
And I use this test code to launch coroutines:
repeat(5) { item ->
    log("Log $item")
}
The log() function can be called from anywhere, on any thread, but not from a coroutine.
After a couple of runs I can see a non-sequential result like the following:
Log 0
Log 2
Log 4
Log 1
Log 3
The order of the printed logs can differ. If I understand correctly, the execution of coroutines is not guaranteed to be sequential, which means the coroutine for item 2 can run before the coroutine for item 0.
I want the coroutines to be launched sequentially for each item, and "some blocking operation" to execute sequentially, so that I always get the following logs:
Log 0
Log 1
Log 2
Log 3
Log 4
Is there a way to make launching coroutines sequential? Or maybe there are other ways to achieve what I want?
Thanks in advance for any help!
One possible strategy is to use a Channel to join the launched jobs in order. You need to launch the jobs lazily so they don't start until join is called on them. trySend always succeeds when the Channel has unlimited capacity. You need to use trySend so it can be called from outside a coroutine.
private val lazyJobChannel = Channel<Job>(capacity = Channel.UNLIMITED).apply {
    scope.launch {
        consumeEach { it.join() }
    }
}

fun log(message: String) {
    lazyJobChannel.trySend(
        scope.launch(start = CoroutineStart.LAZY) {
            println("$message")
            TimeUnit.MILLISECONDS.sleep(100) // some blocking operation
        }
    )
}
Since Flows are sequential, we can use MutableSharedFlow to collect and handle the data sequentially:
class Info {
    // Make sure replay (in case some items are emitted before sharedFlow is collected
    // and would otherwise be lost) and extraBufferCapacity are large enough to handle
    // all the items. If some items are lost, try increasing either of the values.
    private val sharedFlow = MutableSharedFlow<String>(replay = 10, extraBufferCapacity = 10)
    private val scope = CoroutineScope(Dispatchers.IO)

    init {
        sharedFlow.onEach { message ->
            println("$message")
            TimeUnit.MILLISECONDS.sleep(100) // some blocking or suspend operation
        }.launchIn(scope)
    }

    fun log(message: String) {
        sharedFlow.tryEmit(message)
    }
}
fun test() {
    val info = Info()
    repeat(10) { item ->
        info.log("Log $item")
    }
}
It always prints the logs in the correct order:
Log 0
Log 1
Log 2
...
Log 9
It works for all cases, but you need to make sure the replay and extraBufferCapacity parameters of MutableSharedFlow are large enough to hold all the items.
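A small addition that is not part of the original answer: MutableSharedFlow.tryEmit() returns false when the buffer is full, so the log() function of the Info class above can at least detect a dropped message instead of losing it silently (names are taken from that sketch).

fun log(message: String) {
    val accepted = sharedFlow.tryEmit(message)
    if (!accepted) {
        // replay + extraBufferCapacity are exhausted; the message would otherwise be lost
        println("Log buffer full, dropped: $message")
    }
}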
Another approach is using Dispatchers.IO.limitedParallelism(1) as the context for the CoroutineScope. It makes coroutines run sequentially if they don't contain calls to suspend functions and are launched from the same thread, e.g. the Main thread. So this solution works only with blocking (not suspend) operations inside the launch coroutine builder:
private val scope = CoroutineScope(Dispatchers.IO.limitedParallelism(1))

fun log(message: String) = scope.launch { // launching a coroutine from the same thread, e.g. the Main thread
    println("$message")
    TimeUnit.MILLISECONDS.sleep(100) // only a blocking operation, not a `suspend` operation
}
It turns out that the single-threaded dispatcher is a FIFO executor, so limiting the CoroutineScope to at most one coroutine running at a time solves the problem.
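To illustrate the caveat about suspend calls, here is a minimal standalone sketch that is not part of the original answer: once a body suspends (delay below), the single worker of limitedParallelism(1) is released and the next queued coroutine starts, so whole bodies no longer execute strictly one after another.

import kotlinx.coroutines.*

fun main() = runBlocking {
    val scope = CoroutineScope(Dispatchers.IO.limitedParallelism(1))
    val jobs = List(5) { item ->
        scope.launch {
            println("start $item")
            delay(100)           // suspends: the single worker moves on to the next queued coroutine
            println("end $item") // so later "start" lines print before earlier "end" lines
        }
    }
    jobs.joinAll() // with Thread.sleep(100) instead of delay(100) the output stays strictly ordered
}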


How do you make a subscriber to a Kotlin SharedFlow run operations in parallel?

I have a connection to a Bluetooth device that emits data every 250ms.
In my ViewModel I wish to subscribe to said data, run some suspending code (which takes approximately 1000ms to run) and then present the result.
The following is a simple example of what I'm trying to do.
Repository:
class Repo() : CoroutineScope {
    private val supervisor = SupervisorJob()
    override val coroutineContext: CoroutineContext = supervisor + Dispatchers.Default

    private val _dataFlow = MutableSharedFlow<Int>()
    private var dataJob: Job? = null

    val dataFlow: Flow<Int> = _dataFlow

    init {
        launch {
            var counter = 0
            while (true) {
                counter++
                Log.d("Repo", "emitting $counter")
                _dataFlow.emit(counter)
                delay(250)
            }
        }
    }
}
The ViewModel:
class VM(app: Application) : AndroidViewModel(app) {
    private val repo = Repo() // the Repo shown above

    private val _reading = MutableLiveData<String>()
    val latestReading: LiveData<String> = _reading

    init {
        viewModelScope.launch(Dispatchers.Main) {
            repo.dataFlow
                .map {
                    validateData(it) // this is where some validation happens; it is very fast
                }
                .flowOn(Dispatchers.Default)
                .onEach {
                    delay(1000) // this is to simulate the work that is done
                }
                .flowOn(Dispatchers.IO)
                .map {
                    transformData(it) // this will transform the data to be human readable
                }
                .flowOn(Dispatchers.Default)
                .collect {
                    _reading.postValue(it)
                }
        }
    }
}
As you can see, when data arrives I first validate it to make sure it is not corrupt (on the Default dispatcher), then I perform some operation on it (saving and running a long algorithm on the IO dispatcher), then I transform it so the application user can understand it (switching back to the Default dispatcher), and then I post it to the MutableLiveData so that any subscriber from the UI layer can see the current data (on the Main dispatcher).
I have two questions:
a) If validateData fails, how can I cancel the current emission and move on to the next one?
b) Is there a way for the dataFlow subscriber working in the ViewModel to use new threads so the delay parts can run in parallel?
The timeline right now looks like the first part of the diagram, but I want it to run like the second one.
Is there a way to do this?
I've tried using buffer(), which, as the documentation states, "Buffers flow emissions via channel of a specified capacity and runs collector in a separate coroutine." But when I set it to BufferOverflow.SUSPEND I get the behaviour of the first part, and when I set it to BufferOverflow.DROP_OLDEST or BufferOverflow.DROP_LATEST I lose emissions.
I have also tried using .conflate() like so:
repo.dataFlow
    .conflate()
    .map { ....
And even though the emissions start one after the other, the part with the delay still waits for the previous one to finish before starting the next one.
When I use .flowOn(Dispatchers.Default) for that part, I lose emissions, and when I use .flowOn(Dispatchers.IO) or something like Executors.newFixedThreadPool(4).asCoroutineDispatcher(), they always wait for the previous one to finish before starting a new one.
Edit 2:
After about 3 hours of experiments, this seems to work:
viewModelScope.launch(Dispatchers.Default) {
    repo.dataFlow
        .map {
            validateData(it)
        }
        .flowOn(Dispatchers.Default)
        .map {
            async {
                delay(1000)
                it
            }
        }
        .flowOn(Dispatchers.IO) // NOTE (A)
        .map {
            val result = it.await()
            transformData(result)
        }
        .flowOn(Dispatchers.Default)
        .collect {
            _readings.postValue(it)
        }
}
However, I still haven't figured out how to cancel the emission if validateData fails.
Also, for some reason it only works if I use Dispatchers.IO, Executors.newFixedThreadPool(20).asCoroutineDispatcher() or Dispatchers.Unconfined where I put note (A). Dispatchers.Main does not seem to work (which I expected), but Dispatchers.Default also does not seem to work and I don't know why.
First question: you cannot recover from an exception in the sense of continuing the collection of the flow. As per the docs, "Flow collection can complete with an exception when an emitter or code inside the operators throw an exception." Therefore, once an exception has been thrown, the collection is completed (exceptionally). You can, however, handle the exception by either wrapping your collection inside a try/catch block or using the catch() operator.
Second question: you cannot. While the producer (emitting side) can be made concurrent by using the buffer() operator, collection is always sequential.
As per your diagram, you need fan-out (one producer, many consumers), and you cannot achieve that with flows. Flows are cold: each time you collect from them, they start emitting from the beginning.
Fan-out can be achieved using channels, where you can have one coroutine producing values and many coroutines consuming those values.
Edit: Oh, you meant that the validation failed, not the function itself. In that case you can use the filter() operator.
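For example, here is a minimal standalone sketch, not from the original answer, with a hypothetical isValid() check: filter() simply drops invalid emissions, so collection keeps running instead of completing exceptionally.

import kotlinx.coroutines.flow.*
import kotlinx.coroutines.runBlocking

// Hypothetical validation rule: only even readings count as valid in this sketch.
fun isValid(value: Int): Boolean = value % 2 == 0

fun main() = runBlocking {
    flowOf(1, 2, 3, 4, 5)
        .filter { isValid(it) }  // invalid emissions are skipped, collection continues
        .map { "reading=$it" }   // stand-in for transformData()
        .collect { println(it) } // prints reading=2 and reading=4
}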
BroadcastChannel and ConflatedBroadcastChannel are being deprecated. SharedFlow cannot help you in your use case either, as it emits values in a broadcast fashion, meaning the producer waits until all consumers consume each value before producing the next one. That is still sequential; you need parallelism. You can achieve it using the produce() channel builder.
A simple example:
val scope = CoroutineScope(Job() + Dispatchers.IO)

val producer: ReceiveChannel<Int> = scope.produce {
    var counter = 0
    val startTime = System.currentTimeMillis()
    while (isActive) {
        counter++
        send(counter)
        println("producer produced $counter at ${System.currentTimeMillis() - startTime} ms from the beginning")
        delay(250)
    }
}

val consumerOne = scope.launch {
    val startTime = System.currentTimeMillis()
    for (x in producer) {
        println("consumerOne consumed $x at ${System.currentTimeMillis() - startTime} ms from the beginning.")
        delay(1000)
    }
}

val consumerTwo = scope.launch {
    val startTime = System.currentTimeMillis()
    for (x in producer) {
        println("consumerTwo consumed $x at ${System.currentTimeMillis() - startTime} ms from the beginning.")
        delay(1000)
    }
}

val consumerThree = scope.launch {
    val startTime = System.currentTimeMillis()
    for (x in producer) {
        println("consumerThree consumed $x at ${System.currentTimeMillis() - startTime} ms from the beginning.")
        delay(1000)
    }
}
Observe production and consumption times.

Parallel request with Retrofit, Coroutines and Suspend functions

I'm using Retrofit to make some network requests. I'm also using coroutines in combination with suspend functions.
My question is: is there a way to improve the following code? The idea is to launch multiple requests in parallel and wait for them all to finish before continuing the function.
lifecycleScope.launch {
    try {
        itemIds.forEach { itemId ->
            withContext(Dispatchers.IO) { itemById[itemId] = MyService.getItem(itemId) }
        }
    } catch (exception: Exception) {
        exception.printStackTrace()
    }
    Log.i(TAG, "All requests have been executed")
}
(Note that "MyService.getItem()" is a 'suspend' function.)
I guess that there is something nicer than a foreach in this case.
Anyone with an idea?
I've prepared three approaches to solving this, from the simplest to the most correct one. To simplify the presentation of the approaches, I have extracted this common code:
lifecycleScope.launch {
    val itemById = try {
        fetchItems(itemIds)
    } catch (exception: Exception) {
        exception.printStackTrace()
    }
    Log.i(TAG, "Fetched these items: $itemById")
}
Before I go on, a general note: your getItem() function is suspendable, so you have no need to submit it to the IO dispatcher. All your coroutines can run on the main thread.
Now let's see how we can implement fetchItems(itemIds).
1. Simple forEach
Here we take advantage of the fact that all the coroutine code can run on the main thread:
suspend fun fetchItems(itemIds: Iterable<Long>): Map<Long, Item> {
    val itemById = mutableMapOf<Long, Item>()
    coroutineScope {
        itemIds.forEach { itemId ->
            launch { itemById[itemId] = MyService.getItem(itemId) }
        }
    }
    return itemById
}
coroutineScope will wait for all the coroutines you launch inside it. Even though they all run concurrently to each other, the launched coroutines still dispatch to the single (main) thread, so there is no concurrency issue with updating the map from each of them.
2. Thread-Safe Variant
The fact that it leverages the properties of a single-threaded context can be seen as a limitation of the first approach: it doesn't generalize to threadpool-based contexts. We can avoid this limitation by relying on the async-await mechanism:
suspend fun fetchItems(itemIds: Iterable<Long>): Map<Long, Item> = coroutineScope {
    itemIds.map { itemId -> async { itemId to MyService.getItem(itemId) } }
        .map { it.await() }
        .toMap()
}
Here we rely on two non-obvious properties of Collection.map():
It performs all the transformation eagerly, so the first transformation to a collection of Deferred<Pair<Long, Item>> is completely done before entering the second stage, where we await on all of them.
It is an inline function, which allows us to write suspendable code in it even though the function itself is not a suspend fun and gets a non-suspendable lambda (Deferred<T>) -> T.
This means that all the fetching is done concurrently, but the map gets assembled in a single coroutine.
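As a side note that is not part of the original answer, the same shape can be written with awaitAll(), which also fails fast as soon as any request throws (Item and MyService.getItem() are assumed from the question):

suspend fun fetchItems(itemIds: Iterable<Long>): Map<Long, Item> = coroutineScope {
    itemIds.map { itemId -> async { itemId to MyService.getItem(itemId) } }
        .awaitAll() // suspends until every request is done and rethrows the first failure right away
        .toMap()
}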
3. Flow-Based Approach with Improved Concurrency Control
The above solved the concurrency for us, but it lacks any backpressure. If your input list is very large, you'll want to put a limit on how many simultaneous network requests you're making.
You can do this with a Flow-based idiom:
suspend fun fetchItems(itemIds: Iterable<Long>): Map<Long, Item> = itemIds
    .asFlow()
    .flatMapMerge(concurrency = MAX_CONCURRENT_REQUESTS) { itemId ->
        flow { emit(itemId to MyService.getItem(itemId)) }
    }
    .toMap()
Here the magic is in the .flatMapMerge operation. You give it a function (T) -> Flow<R> and it will execute it sequentially on all the input, but then it will concurrently collect all the flows it got. Note that I couldn't simplify flow { emit(getItem()) } to just flowOf(getItem()), because getItem() must be called lazily, while collecting the flow.
Flow.toMap() is not currently provided in the standard library, so here it is:
suspend fun <K, V> Flow<Pair<K, V>>.toMap(): Map<K, V> {
    val result = mutableMapOf<K, V>()
    collect { (k, v) -> result[k] = v }
    return result
}
If you are looking for just a nicer way to write it and eliminate the forEach:
lifecycleScope.launch {
    try {
        itemIds.asFlow()
            .flowOn(Dispatchers.IO)
            .collect { itemId -> itemById[itemId] = MyService.getItem(itemId) }
    } catch (exception: Exception) {
        exception.printStackTrace()
    }
    Log.i(TAG, "All requests have been executed")
}
Also, please look at lifecycleScope; I suspect it is using Dispatchers.Main. If that is the case, you can remove the extra .flowOn(Dispatchers.IO) dispatcher declaration.
For more info: Kotlin Asynchronous Flow

doAsync Kotlin-android doesn't work well

I am using a callback function when the async work ends, but it doesn't work well :(
My case:
fun function1(callback: (obj1: List<ObjT1>, obj2: List<ObjT1>) -> Unit?) {
    doAsync {
        // long task
        uiThread { callback(result1, result2) }
    }
}
The callback is called, but result1 and result2 (lists) are empty. I checked the content of the lists beforehand.
EDIT:
PROBLEM: my callback is a function that receives 2 objects, result1 and result2. The problem is that the callback sometimes receives the results empty; I checked their content beforehand and it is not empty.
It may be because you've declared the return type as Unit? but are returning two values. A quick fix would be to put result1 and result2 in an array.
This question is now about a deprecated Kotlin library.
I recommend using coroutines instead.
Consider using Kotlin's coroutines. Coroutines are a newer feature in Kotlin. They are still technically in their experimental phase, but JetBrains has told us that they are very stable.
Read more here: https://kotlinlang.org/docs/reference/coroutines.html
Here is some sample code:
fun main(args: Array<String>) = runBlocking { // runBlocking is only needed here because I am calling join below
    val job = launch(UI) { // The launch function allows you to execute suspending functions
        val async1 = doSomethingAsync(250)
        val async2 = doSomethingAsync(50)
        println(async1.await() + async2.await()) // The code within launch will be paused here
                                                 // until both async1 and async2 have finished
    }
    job.join() // Just wait for the coroutines to finish before stopping the program
}

// Note: this is a suspending function (which means it can be "paused")
suspend fun doSomethingAsync(param1: Long) = async {
    delay(param1) // pause this "thread" (not really a thread)
    println(param1)
    return@async param1 * 2 // return twice the input... just for fun
}
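That snippet uses the old experimental coroutines API (top-level async, launch(UI)). As a hedged sketch, not from the original answer, a rough equivalent with current kotlinx.coroutines and structured concurrency could look like this:

import kotlinx.coroutines.*

fun main() = runBlocking {
    // async is now an extension on CoroutineScope (here: the runBlocking scope),
    // so the two calls below run concurrently and are awaited explicitly.
    val async1 = async { doSomething(250) }
    val async2 = async { doSomething(50) }
    println(async1.await() + async2.await()) // prints 600
}

// A plain suspend function; the caller decides about concurrency via async.
suspend fun doSomething(param1: Long): Long {
    delay(param1) // suspends this coroutine without blocking the thread
    println(param1)
    return param1 * 2
}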
