As part of my ML project, I want to generate training data for analyzing the faces of multiple individuals from different images, using the face detector from the Google Firebase ML Kit face detection library. I created a very simple service class to encapsulate the initialization and start the face detection process:
class FaceDetectorService(private val act: MainActivity) {
    private var opts: FirebaseVisionFaceDetectorOptions? = null
    private var detector: FirebaseVisionFaceDetector? = null
    init {
        FirebaseApp.initializeApp(act)
        opts = FirebaseVisionFaceDetectorOptions.Builder()
            .setPerformanceMode(FirebaseVisionFaceDetectorOptions.ACCURATE)
            .setLandmarkMode(FirebaseVisionFaceDetectorOptions.NO_LANDMARKS)
            .setClassificationMode(FirebaseVisionFaceDetectorOptions.NO_CLASSIFICATIONS)
            .setContourMode(FirebaseVisionFaceDetectorOptions.ALL_CONTOURS)
            .build()
        detector = FirebaseVision.getInstance()
            .getVisionFaceDetector(opts!!)
    }
    suspend fun analyzeAsync(cont: Context, uri: Uri) : Pair<String, Task<List<FirebaseVisionFace>>> {
        val image = FirebaseVisionImage.fromFilePath(cont, uri)
        // this is for the UI thread
        withContext(Main){
            act.addItemToAnalyze(uri.lastPathSegment)
        }
        // return the filename too
        return Pair(uri.lastPathSegment, detector!!.detectInImage(image))
    }
}
The function detector!!.detectInImage (FirebaseVisionFaceDetector.detectInImage) returns a Task that represents an asynchronous operation.
In the onResume() function of my MainActivity, inside a CoroutineScope, I fire up the library and start iterating over the images, converting each one to a Uri first and then passing it to the face detector:
CoroutineScope(IO).launch {
val executeTime = measureTimeMillis {
for (uri in uris){
val fileNameUnderAnalysis = uri.lastPathSegment
//val tsk = withContext(IO) {
// detector!!.analyzeAsync(act, uri)
//}
val tsk = detector!!.analyzeAsync(act, uri)
tsk.second.addOnCompleteListener { task ->
if (task.isSuccessful && task.result!!.isNotEmpty()) {
try {
// my best
} catch (e: IllegalArgumentException) {
// fire
}
} else if (task.result!!.isEmpty()) {
// not today :(
}
}
tsk.second.addOnFailureListener { e ->
// on error
}
}
}
Log.i("MILLIS", executeTime.toString())
}
Now, although my implementation runs concurrently (that is, the operations are all started at about the same time), what I actually want is to run them in parallel (executing at the same time on separate threads, of which I have 4 on my emulator). My goal is to take the number of available threads and assign an analysis operation to each of them, cutting the execution time to roughly a quarter.
What I tried so far is, inside the CoroutineScope(IO).launch block, encapsulating the call to the library in a task:
val tsk = async {
detector!!.analyzeAsync(act, uri)
}
val result = tsk.await()
and a job:
val tsk = withContext(IO) {
detector!!.analyzeAsync(act, uri)
}
but the async operations I start manually only last as long as it takes to start the Firebase tasks; they do not wait for the inner task to run to completion. I also tried adding different withContext(...) and launch { ... } variations inside the FaceDetectorService class, but to no avail.
I'm obviously very new to Kotlin coroutines, so I think I'm missing something very basic here, but I just cannot wrap my head around it.
(PS: please do not comment on the sloppiness of the code, this is just a prototype :) )
analyzeAsync() is a suspend fun and also returns a future-like Task object. Instead it should return the result of Task.await(), which you can easily implement basically by factoring out your addOnCompleteListener call:
suspend fun <T> Task<T>.await(): T = suspendCancellableCoroutine { cont ->
    addOnCompleteListener {
        val e = exception
        when {
            e != null -> cont.resumeWithException(e)
            isCanceled -> cont.cancel()
            else -> cont.resume(result)
        }
    }
}
(An optimized version is available in the kotlinx-coroutines-play-services module.)
Since the face detection API is already async, it means the thread you call it on is irrelevant and it handles its computation resources internally. Therefore you don't need to launch in the IO dispatcher, you can use Main and freely do GUI work at any point, with no context switches.
As for your main point: I couldn't find explicit details on it, but it is highly likely that a single face detection call already uses all the available CPU cores, or even the dedicated ML circuits that are now appearing in smartphones, which means there's nothing to parallelize from the outside. A single face detection request already has all the resources working on it.
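To make the await() idea concrete without depending on Firebase, here is a minimal sketch. FakeDetector, detectSuspending and detectAll are hypothetical names invented for this illustration (they are not part of ML Kit); the fake detector simply reports the length of the file name as its "result". The point is only that a suspend function wrapping the callback returns the finished result, so the loop naturally waits for each detection to complete:

```kotlin
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.suspendCancellableCoroutine
import java.util.concurrent.Executors
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException

// Hypothetical stand-in for a callback-based detector; this is NOT the ML Kit API.
class FakeDetector {
    private val worker = Executors.newSingleThreadExecutor()

    fun detect(name: String, onSuccess: (Int) -> Unit, onFailure: (Throwable) -> Unit) {
        worker.execute {
            if (name.isEmpty()) onFailure(IllegalArgumentException("empty name"))
            else onSuccess(name.length) // pretend the face count equals the name length
        }
    }

    fun shutdown() {
        worker.shutdown()
    }
}

// Suspends the caller until one of the callbacks fires, then returns the value or throws.
suspend fun FakeDetector.detectSuspending(name: String): Int =
    suspendCancellableCoroutine { cont ->
        detect(
            name,
            onSuccess = { faces -> cont.resume(faces) },
            onFailure = { error -> cont.resumeWithException(error) }
        )
    }

// Runs the detections one after another; each call returns only when its result is ready.
fun detectAll(names: List<String>): Map<String, Int> = runBlocking {
    val detector = FakeDetector()
    try {
        names.associateWith { name -> detector.detectSuspending(name) }
    } finally {
        detector.shutdown()
    }
}

fun main() {
    println(detectAll(listOf("a.jpg", "portrait.png")))
}
```

With the real library, the same shape would apply to a function that returns the awaited detection result instead of the Task itself.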
I have an Android app that uses Couchbase Lite. I'm trying to save a document and get the acknowledgement using a coroutine channel; the reason I use a channel is to make sure every operation is done in the same scope.
Here is my attempt, based on the selected answer to this question:
How to properly have a queue of pending operations using Kotlin Coroutines?
object DatabaseQueue {
private val scope = CoroutineScope(IOCoroutineScope)
private val queue = Channel<Job>(Channel.UNLIMITED)
init {
scope.launch(Dispatchers.Default) {
for (job in queue) job.join()
}
}
fun submit(
context: CoroutineContext = EmptyCoroutineContext,
block: suspend CoroutineScope.() -> Unit
) {
val job = scope.launch(context, CoroutineStart.LAZY, block)
queue.trySendBlocking(job)
}
fun submitAsync(
context: CoroutineContext = EmptyCoroutineContext,
id: String,
database: Database
): Deferred<Document?> {
val job = scope.async(context, CoroutineStart.LAZY) {
database.getDocument(id)
}
queue.trySendBlocking(job)
return job
}
fun cancel() {
queue.cancel()
scope.cancel()
}
}
fun Database.saveDocument(document: MutableDocument) {
    DatabaseQueue.submit {
        Timber.tag("quechk").d("saving :: ${document.id}")
        this@saveDocument.save(document)
    }
}
fun Database.getDocumentQ(id: String): Document? {
    return runBlocking {
        DatabaseQueue.submitAsync(id = id, database = this@getDocumentQ).also {
            Timber.tag("quechk").d("getting :: $id")
        }.await()
    }
}
My issue is that when I have many DB operations to write and read, the reads complete before the writes, which gives me null results. So what I need to know is:
Is this the best way to do it, or is there a better solution?
How can I process the job and return the result from the channel in order to avoid the null result?
By modifying the original solution you actually made it work improperly. The whole idea was to create an inactive coroutine for each submitted block of code and then start executing these coroutines one by one. In your case you exposed a Deferred to a caller, so the caller is able to start executing a coroutine and as a result, coroutines no longer run sequentially, but concurrently.
The easiest way to fix this while keeping almost the same code would be to introduce another Deferred, which is not directly tied to the queued coroutine:
fun submitAsync(
context: CoroutineContext = EmptyCoroutineContext,
id: String,
database: Database
): Deferred<Document?> {
val ret = CompletableDeferred<Document?>()
val job = scope.launch(context, CoroutineStart.LAZY) {
ret.completeWith(runCatching { database.getDocument(id) })
}
queue.trySendBlocking(job)
return ret
}
However, depending on your case it may be overkill. For example, if you don't need to guarantee strict FIFO ordering, a simple Mutex would be enough. Also, please note that the classic approach of returning futures/deferreds only to await them is an anti-pattern in coroutines. We should simply use a suspend function and call it directly.
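A minimal sketch of the Mutex variant with plain suspend functions, assuming an in-memory map in place of the real database. SerializedStore and writeThenRead are hypothetical names made up for this example. Every operation takes the same Mutex, so operations never overlap, and a caller that suspends on save() before calling get() is guaranteed to see its own write:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock

// Hypothetical in-memory stand-in for the database; only one operation runs at a time.
class SerializedStore {
    private val mutex = Mutex()
    private val docs = HashMap<String, String>()

    suspend fun save(id: String, body: String) {
        mutex.withLock { docs[id] = body }
    }

    suspend fun get(id: String): String? = mutex.withLock { docs[id] }

    suspend fun size(): Int = mutex.withLock { docs.size }
}

// Launches `count` concurrent writes, waits for all of them, then reads.
fun writeThenRead(count: Int): Pair<Int, String?> = runBlocking {
    val store = SerializedStore()
    // coroutineScope returns only after every launched write has finished.
    coroutineScope {
        repeat(count) { i ->
            launch(Dispatchers.Default) { store.save("doc$i", "body$i") }
        }
    }
    store.size() to store.get("doc0")
}

fun main() {
    println(writeThenRead(100))
}
```

The design choice here is that callers express ordering simply by calling suspend functions in sequence, instead of submitting jobs to a queue and awaiting a Deferred.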
I still have a little trouble putting together all the information about the thread-safety of using coroutines to launch network requests.
Let's say we have the following use case: there is a list of users we get, and for each of those users I do a specific check that has to go through a network request to the API, which gives me back some information about this user.
The userCheck happens inside a library, which doesn't expose suspend functions but rather still uses a callback.
Inside of this library, I have seen code like this to launch each of the network requests:
internal suspend fun <T> doNetworkRequest(request: suspend () -> Response<T>): NetworkResult<T> {
return withContext(Dispatchers.IO) {
try {
val response = request.invoke()
...
According to the documentation, Dispatchers.IO can use multiple threads for the execution of the code, also the request function is simply a function from a Retrofit API.
So what I did is to launch the request for each user, and use a single resultHandler object, which will add the results to a list and check if the length of the result list equals the length of the user list, if so, then all userChecks are done and I know that I can do something with the results, which need to be returned all together.
val userList: List<String>? = getUsers()
val userCheckResultList = mutableListOf<UserCheckResult>()
val handler = object : UserCheckResultHandler {
override fun onResult(
userCheckResult: UserCheckResult?
) {
userCheckResult?.let {
userCheckResultList.add(
it
)
}
if (userCheckResultList.size == userList?.size) {
doSomethingWithResultList()
print("SUCCESS")
}
}
}
userList?.forEach {
checkUser(it, handler)
}
My question is: Is this implementation thread-safe? As far as I know, Kotlin objects should be thread safe, but I have gotten feedback that this is possibly not the best implementation :D
But in theory, even if multiple requests are launched asynchronously at the same time, only one at a time can access the lock of the thread the result handler is running on, so there will be no race condition or problems with adding items to the list and comparing the sizes.
Am I wrong about this?
Is there any way to handle this scenario in a better way?
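If you are executing multiple requests in parallel, it's not: a MutableList is not thread-safe. There is a simple fix for that, though. Create a lock object and wrap your operations on the list in it. Note that onResult is a regular, non-suspending callback, so a coroutine Mutex does not fit here (its withLock is a suspend function); a java.util.concurrent.locks.ReentrantLock together with kotlin.concurrent.withLock does the job:
val lock = ReentrantLock()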
val userList: List<String>? = getUsers()
val userCheckResultList = mutableListOf<UserCheckResult>()
val handler = object : UserCheckResultHandler {
override fun onResult(
userCheckResult: UserCheckResult?
) {
lock.withLock {
userCheckResult?.let {
userCheckResultList.add(
it
)
}
if (userCheckResultList.size == userList?.size) {
doSomethingWithResultList()
print("SUCCESS")
}
}
}
}
userList?.forEach {
checkUser(it, handler)
}
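A minimal, runnable sketch of the locking idea, assuming the library invokes the callback on arbitrary worker threads (simulated here with a thread pool). Because the callback is a regular function rather than a suspend function, the sketch uses java.util.concurrent.locks.ReentrantLock with kotlin.concurrent.withLock. collectFromCallbacks and the "checked:" results are hypothetical stand-ins for the real user check:

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executors
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

// Collects one result per user from callbacks that fire on different pool threads.
fun collectFromCallbacks(users: List<String>): List<String> {
    val lock = ReentrantLock()
    val results = mutableListOf<String>()
    val allDone = CountDownLatch(users.size)
    val pool = Executors.newFixedThreadPool(4)
    try {
        users.forEach { user ->
            pool.execute {
                // This block plays the role of the library callback.
                lock.withLock { results.add("checked:$user") }
                allDone.countDown()
            }
        }
        // Block until every callback has reported its result.
        allDone.await()
    } finally {
        pool.shutdown()
    }
    // Take a snapshot under the same lock that guards the writes.
    return lock.withLock { results.toList() }
}

fun main() {
    val results = collectFromCallbacks((1..200).map { "user$it" })
    println(results.size)
}
```

The order of the collected results depends on thread scheduling, so callers should not rely on it.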
I have to add that this whole solution seems very hacky. I would go a completely different route. Run all of your requests wrapped in async { /* network request */ }, each of which returns a Deferred object. Collect these objects in a list, then wait for all of them using awaitAll(). Like this:
// this must run inside a coroutine scope, e.g. within coroutineScope { ... }
val jobs = mutableListOf<Deferred<UserCheckResult>>()
userList?.forEach {
    // I assume checkUser is a suspend function here that returns its result
    jobs += async { checkUser(it) }
}
// wait for all requests; awaitAll() returns the results in the same order as the list
val results = jobs.awaitAll()
// After that you can access each result like this:
val resultOfJob0 = results[0]
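A self-contained sketch of the async/awaitAll approach, under the assumption that the check can be expressed as a suspend function. The UserCheckResult fields, the checkUser body (a short delay standing in for the network call) and checkAll are hypothetical and only illustrate the shape:

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Hypothetical result type for this sketch.
data class UserCheckResult(val user: String, val ok: Boolean)

// Hypothetical suspending check; the delay stands in for a network request.
suspend fun checkUser(user: String): UserCheckResult {
    delay(10)
    return UserCheckResult(user, ok = user.isNotBlank())
}

// Starts all checks concurrently and suspends until every one has finished.
// awaitAll() returns the results in the same order as the input list.
suspend fun checkAll(users: List<String>): List<UserCheckResult> = coroutineScope {
    users.map { user -> async { checkUser(user) } }.awaitAll()
}

fun main() = runBlocking {
    println(checkAll(listOf("ann", "bob", "")))
}
```

No shared mutable list is needed in this version, because each result travels back through its own Deferred.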
I'm trying to run multiple processes using ML Kit. I've already searched, and the only solutions I found were either to run all the tasks in succession (how?) or to use RxJava and its zip utility, but that doesn't seem to match what I need.
I've tried writing the following code, but I'm unsure how good it is. Would this be a good way of doing it?
override fun analyze(image: ImageProxy) {
val inputImage = InputImage.fromMediaImage(image.image!!, image.imageInfo.rotationDegrees)
val tasks = mutableListOf<Task<*>>()
val onComplete = { t: Task<*> ->
tasks.remove(t)
if (tasks.isEmpty()) {
image.close()
}
}
barcodeScanner?.process(inputImage)
?.addOnSuccessListener {
// Do stuff with the result
}
?.addOnCompleteListener(onComplete)
?.also {
tasks.add(it)
}
imageLabeler?.process(inputImage)
?.addOnSuccessListener {
// Do stuff with the result
}
?.addOnCompleteListener(onComplete)
?.also {
tasks.add(it)
}
faceDetector?.process(inputImage)
?.addOnSuccessListener {
// Do stuff with the result
}
?.addOnCompleteListener(onComplete)
?.also {
tasks.add(it)
}
}
Your current approach will make the different inference tasks queue up on one background thread. If you want multi-threading, you can use XxxOptions.Builder#setExecutor to assign a different thread to each detector.
https://developers.google.com/android/reference/com/google/mlkit/vision/barcode/BarcodeScannerOptions.Builder#setExecutor(java.util.concurrent.Executor)
I'm investigating the use of Kotlin Flow within my current Android application.
My application retrieves its data from a remote server via Retrofit API calls.
Some of these APIs return 50,000 data items in 500-item pages.
Each API response contains an HTTP Link header containing the next page's complete URL.
These calls can take up to 2 seconds to complete.
In an attempt to reduce the elapsed time, I have employed a Kotlin Flow to process each page of data concurrently while also making the next page's API call.
My flow is defined as follows:
private val persistenceThreadPool = Executors.newFixedThreadPool(3).asCoroutineDispatcher()
private val internalWorkWorkState = MutableStateFlow<Response<List<MyPage>>?>(null)
private val workWorkState = internalWorkWorkState.asStateFlow()
private val myJob: Job
init {
myJob = GlobalScope.launch(persistenceThreadPool) {
workWorkState.collect { page ->
if (page == null) {
} else managePage(page!!)
}
}
}
My recursive function, which fetches all pages, is defined as follows:
private suspend fun managePages(accessToken: String, response: Response<List<MyPage>>) {
when {
result != null -> return
response.isSuccessful -> internalWorkWorkState.emit(response)
else -> {
manageError(response.errorBody())
result = Result.failure()
return
}
}
response.headers().filter { it.first == HTTP_HEADER_LINK && it.second.contains(REL_NEXT) }.forEach {
val parts = it.second.split(OPEN_ANGLE, CLOSE_ANGLE)
if (parts.size >= 2) {
managePages(accessToken, service.myApiCall(accessToken, parts[1]))
}
}
}
private suspend fun managePage(response: Response<List<MyPage>>) {
val pages = response.body()
pages?.let {
persistResponse(it)
}
}
private suspend fun persistResponse(myPage: List<MyPage>) {
val myPageDOs = ArrayList<MyPageDO>()
myPage.forEach { page ->
myPageDOs.add(page.mapDO())
}
database.myPageDAO().insertAsync(myPageDOs)
}
My numerous issues are:
This code does not insert all the data items that I retrieve.
How do I complete the flow once all data items have been retrieved?
How do I complete the GlobalScope job once all the data items have been retrieved and persisted?
UPDATE
By making the following changes I have managed to insert all the data
private val persistenceThreadPool = Executors.newFixedThreadPool(3).asCoroutineDispatcher()
private val completed = CompletableDeferred<Int>()
private val channel = Channel<Response<List<MyPage>>?>(UNLIMITED)
private val channelFlow = channel.consumeAsFlow().flowOn(persistenceThreadPool)
private val frank: Job
init {
frank = GlobalScope.launch(persistenceThreadPool) {
channelFlow.collect { page ->
if (page == null) {
completed.complete(totalItems)
} else managePage(page!!)
}
}
}
...
...
...
channel.send(null)
completed.await()
return result ?: Result.success(outputData)
I do not like having to rely on a CompletableDeferred, is there a better approach than this to know when the Flow has completed everything?
You are looking for the flow builder and Flow.buffer():
fun getData(): Flow<Data> = flow {
    var pageData: List<Data>
    var pageUrl: String? = "bla"
    while (pageUrl != null) {
        TODO("fetch pageData from pageUrl and change pageUrl to the next page")
        // emitAll() expects a Flow or a channel, so convert the list first
        emitAll(pageData.asFlow())
    }
}
    .flowOn(Dispatchers.IO /* no need for a thread pool executor, IO does it automatically */)
    .buffer(3)
You can use it just like a normal Flow, iterate, etc. If you want to know the total length of the output, you should calculate it on the consumer with a mutable closure variable. Note you shouldn't need to use GlobalScope anywhere (ideally ever).
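A runnable sketch of the paging flow, assuming a fake fetchPage that serves three pages of two items each. Page, fetchPage, allItems and the "page/N" URL scheme are all hypothetical and stand in for the Retrofit call and the Link header parsing:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.buffer
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.flowOn
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// Hypothetical page type: the items plus the URL of the next page, if any.
data class Page(val items: List<Int>, val nextUrl: String?)

// Hypothetical fetch: serves pages "page/0".."page/2" with two items each.
suspend fun fetchPage(url: String): Page {
    delay(10) // stands in for the network round trip
    val index = url.removePrefix("page/").toInt()
    val next = if (index < 2) "page/${index + 1}" else null
    return Page(listOf(index * 2, index * 2 + 1), next)
}

// Emits every item of every page; the flow completes when there is no next page.
fun allItems(firstUrl: String): Flow<Int> = flow {
    var url: String? = firstUrl
    while (url != null) {
        val page = fetchPage(url)
        page.items.forEach { emit(it) }
        url = page.nextUrl
    }
}
    .flowOn(Dispatchers.IO) // fetching runs on the IO dispatcher
    .buffer(3)              // lets fetching run ahead of a slow collector

fun main() = runBlocking {
    println(allItems("page/0").toList())
}
```

Because the flow completes on its own after the last page, a terminal operator such as toList() or collect returns exactly when everything has been processed, with no CompletableDeferred needed.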
There are a few ways to achieve the desired behaviour. I would suggest using coroutineScope, which is designed specifically for parallel decomposition. It also provides good cancellation and error handling behaviour out of the box. In conjunction with the Channel.close behaviour, it makes the implementation pretty simple. Conceptually the implementation may look like this:
suspend fun fetchAllPages() {
coroutineScope {
val channel = Channel<MyPage>(Channel.UNLIMITED)
launch(Dispatchers.IO){ loadData(channel) }
launch(Dispatchers.IO){ processData(channel) }
}
}
suspend fun loadData(sendChannel: SendChannel<MyPage>){
while(hasMoreData()){
sendChannel.send(loadPage())
}
sendChannel.close()
}
suspend fun processData(channel: ReceiveChannel<MyPage>){
for(page in channel){
// process page
}
}
It works in the following way:
coroutineScope suspends until all children are finished. So you don't need CompletableDeferred anymore.
loadData() loads pages in a loop and posts them into the channel. It closes the channel as soon as all pages have been loaded.
processData() fetches items from the channel one by one and processes them. The loop finishes as soon as all the items have been processed (and the channel has been closed).
In this implementation the producer coroutine works independently, with no back-pressure, so it can take a lot of memory if the processing is slow. Limit the buffer capacity to have the producer coroutine suspend when the buffer is full.
It might also be a good idea to use the channel's fan-out behaviour and launch multiple processors to speed up the computation.
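The last two points can be sketched together: a bounded channel gives back-pressure, and several consumers reading from the same channel give fan-out. This is a minimal sketch in which pages are plain integers and "processing" just counts them; processAll, the capacity of 4 and the worker count are assumptions chosen for illustration:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.atomic.AtomicInteger

// Sends `pageCount` fake pages through a bounded channel to `workers` consumers
// and returns how many pages were processed.
suspend fun processAll(pageCount: Int, workers: Int): Int {
    val processed = AtomicInteger()
    coroutineScope {
        // Bounded buffer: send() suspends once 4 pages are waiting (back-pressure).
        val channel = Channel<Int>(capacity = 4)

        // Producer: closes the channel when done so the consumers' loops terminate.
        launch(Dispatchers.IO) {
            try {
                repeat(pageCount) { page -> channel.send(page) }
            } finally {
                channel.close()
            }
        }

        // Fan-out: each page is delivered to exactly one of the consumers.
        repeat(workers) {
            launch(Dispatchers.Default) {
                for (page in channel) {
                    processed.incrementAndGet() // stands in for real processing
                }
            }
        }
    }
    // coroutineScope has returned, so the producer and all consumers are finished.
    return processed.get()
}

fun main() = runBlocking {
    println(processAll(pageCount = 50, workers = 3))
}
```

Note that the count is read only after coroutineScope returns; reading it inside the block would race with the still-running consumers.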
I am facing a weird issue while unit testing Coroutines. There are two tests on the class, when run individually, they both pass and when I run the complete test class, one fails with assertion error.
I am using MainCoroutineRule to use the TestCoroutineScope and relying on the latest Coroutine Testing Library
Here is the test :
@Test
fun testHomeIsLoadedWithShowsAndFavorites() {
runBlocking {
// Stubbing network and repository calls
whenever(tvMazeApi.getCurrentSchedule("US", currentDate))
.thenReturn(getFakeEpisodeList())
whenever(favoriteShowsRepository.allFavoriteShowIds())
.thenReturn(arrayListOf(1, 2))
}
mainCoroutineRule.runBlockingTest {
// call home viewmodel
homeViewModel.onScreenCreated()
// Check if loader is shown
assertThat(LiveDataTestUtil.getValue(homeViewModel.getHomeViewState())).isEqualTo(Loading)
// Observe on home view state live data
val homeViewState = LiveDataTestUtil.getValue(homeViewModel.getHomeViewState())
// Check for success data
assertThat(homeViewState is Success).isTrue()
val homeViewData = (homeViewState as Success).homeViewData
assertThat(homeViewData.episodes).isNotEmpty()
// compare the response with fake list
assertThat(homeViewData.episodes).hasSize(getFakeEpisodeList().size)
// compare the data and also order
assertThat(homeViewData.episodes).containsExactlyElementsIn(getFakeEpisodeViewDataList(true)).inOrder()
}
}
The other test is almost similar which tests for Shows without favorites. I am trying to test HomeViewModel method as:
homeViewStateLiveData.value = Loading
val coroutineExceptionHandler = CoroutineExceptionHandler { _, exception ->
onError(exception)
}
viewModelScope.launch(coroutineExceptionHandler) {
// Get shows from network and favorites from room db on background thread
val favoriteShowsWithFavorites = withContext(Dispatchers.IO) {
val favoriteShowIds = favoriteShowsRepository.allFavoriteShowIds()
val episodes = tvMazeApi.getCurrentSchedule(COUNTRY_US, currentDate)
getShowsWithFavorites(episodes, favoriteShowIds)
}
// Return the combined result on main thread
withContext(Dispatchers.Main) {
onSuccess(favoriteShowsWithFavorites)
}
}
}
I cannot find the actual cause of why the tests pass when run separately, while one of them fails when the complete class is tested. Please help if I am missing something.
Retrofit and Room come with coroutine support: they own their suspend functions and move the work off the UI thread on their own, which greatly reduces the hassle of handling thread callbacks for developers. Initially, I was explicitly moving the network and DB suspend calls to the IO dispatcher via Dispatchers.IO. This was unnecessary and also caused unwanted context switching, which led to the flaky test. Since the libraries do this automatically, all that was left was handling the data back on the UI thread when it became available.
viewModelScope.launch(coroutineExceptionHandler) {
    // Get favorite shows from db; Room's suspend function runs the query off the main thread
    val favoriteShowIds = favoriteShowsRepository.allFavoriteShowIds()
    // Get shows from network; Retrofit's suspend function performs the call off the main thread
    val episodes = tvMazeApi.getCurrentSchedule(COUNTRY_US, currentDate)
    // Publish the result on the main thread, which is where viewModelScope runs by default
    homeViewStateLiveData.value = Success(HomeViewData(getShowsWithFavorites(episodes, favoriteShowIds)))
}