Best usage of CameraX for MLKit Text Recognition on Android

I need to implement text recognition using MLKit on Android, and I have decided to use the new CameraX API as the camera library. I am struggling with the correct "pipeline" of classes, or data flow of the image, because CameraX is quite new and there are not many resources out there. The use case is that I take a picture, crop it in the middle using some bounds that are visible in the UI, and then pass the cropped image to MLKit, which will process it.
Given that, is there any place for the ImageAnalysis.Analyzer API? From my understanding, this analyzer is used only for preview frames and not for the captured image.
My first idea was to use the takePicture method that accepts an OnImageCapturedCallback, but when I tried to access e.g. ImageProxy.height, the app crashed with the exception java.lang.IllegalStateException: Image is already closed, and I could not find a fix for it.
Then I decided to use another overload of takePicture: now I save the image to a file, read it into a Bitmap, and crop it, which gives me an image that can be passed to MLKit. But FirebaseVisionImage, which is what gets passed to FirebaseVisionTextRecognizer, has a factory method that accepts the Image I get from OnImageCapturedCallback, so it seems I am doing some unnecessary steps.
So my questions are:
Is there some class (CaptureProcessor?) that will take care of the cropping of taken image? I suppose that then I could use OnImageCapturedCallback where I would receive already cropped image.
Should I even use ImageAnalysis.Analyzer if I am not doing realtime processing and I am doing post processing?
I suppose that I can achieve what I want with my current approach but I am feeling that I could use much more of CameraX than I currently am.
Thanks!

Is there some class (CaptureProcessor?) that will take care of the
cropping of taken image?
You can set the crop aspect ratio after you build the ImageCapture use case by using the setCropAspectRatio(Rational) method. This method crops from the center of the rotated output image. So basically what you'd get back after calling takePicture() is what I think you're asking for.
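To make "crops from the center" concrete, here is a minimal sketch of the arithmetic behind a centered crop to a given aspect ratio. It is not CameraX's actual implementation; the class CenterCrop and the method centerCropRect are hypothetical names used only for illustration, and the image is assumed to be already rotated upright.

```java
import java.util.Arrays;

// Hypothetical helper: computes the largest centered rectangle with a given
// aspect ratio inside an image. This only illustrates the geometry of a
// center crop; it is not taken from the CameraX source.
final class CenterCrop {

    /** Returns {left, top, width, height} of the centered crop for ratioW:ratioH. */
    static int[] centerCropRect(int imageWidth, int imageHeight, int ratioW, int ratioH) {
        if (imageWidth <= 0 || imageHeight <= 0 || ratioW <= 0 || ratioH <= 0) {
            throw new IllegalArgumentException("All arguments must be positive");
        }
        int cropWidth = imageWidth;
        int cropHeight = imageHeight;
        // Compare imageWidth/imageHeight with ratioW/ratioH by cross-multiplying
        // (long arithmetic avoids int overflow for large images).
        long lhs = (long) imageWidth * ratioH;
        long rhs = (long) imageHeight * ratioW;
        if (lhs > rhs) {
            // Image is wider than the target ratio: keep the height, trim the width.
            cropWidth = (int) (rhs / ratioH);
        } else if (lhs < rhs) {
            // Image is taller than the target ratio: keep the width, trim the height.
            cropHeight = (int) (lhs / ratioW);
        }
        // Center the crop rectangle inside the image.
        int left = (imageWidth - cropWidth) / 2;
        int top = (imageHeight - cropHeight) / 2;
        return new int[] {left, top, cropWidth, cropHeight};
    }

    public static void main(String[] args) {
        // A 4000x3000 landscape photo cropped to a 1:1 square.
        System.out.println(Arrays.toString(centerCropRect(4000, 3000, 1, 1)));
        // prints [500, 0, 3000, 3000]
    }
}
```

If you end up cropping a Bitmap yourself (as in the question), the four values map directly onto the x, y, width and height arguments of Bitmap.createBitmap(source, x, y, width, height).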
Should I even use ImageAnalysis.Analyzer if I am not doing realtime
processing and I am doing post processing?
No, it wouldn't make sense in your scenario. As you mentioned, only when doing real-time image processing would you want to use ImageAnalysis.Analyzer.
ps: I'd be interested in seeing the code you use for takePicture() that caused the IllegalStateException.
[Edit]
Taking a look at your code
imageCapture?.takePicture(executor, object : ImageCapture.OnImageCapturedCallback() {
    override fun onCaptureSuccess(image: ImageProxy) {
        // 1
        super.onCaptureSuccess(image)
        // 2
        Log.d("MainActivity", "Image captured: ${image.width}x${image.height}")
    }
})
At (1), if you take a look at the implementation of super.onCaptureSuccess(image), it actually closes the image passed to the method. Accessing the image's width and height at (2) then throws an exception, which is expected since the image has already been closed. The documentation states:
The application is responsible for calling ImageProxy.close() to close
the image.
So when using this callback, you should probably not call super..., just use the imageProxy, then before returning from the method, manually close it (ImageProxy.close()).

Related

How to most efficiently use and apply Android CameraX Image Analysis setTargetRotation

Similar to what is laid out in the tutorial I am initializing CameraX's Image Analysis use case with code:
ImageAnalysis imageAnalysis =
    new ImageAnalysis.Builder()
        .setTargetResolution(new Size(1280, 720))
        .setTargetRotation(Surface.ROTATION_90)
        .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
        .build();
imageAnalysis.setAnalyzer(executor, new ImageAnalysis.Analyzer() {
    @Override
    public void analyze(@NonNull ImageProxy image) {
        int rotationDegrees = image.getImageInfo().getRotationDegrees();
        // insert your code here.
    }
});
cameraProvider.bindToLifecycle((LifecycleOwner) this, cameraSelector, imageAnalysis, preview);
I am trying to use setTargetRotation method but I am not clear as to how I am supposed to apply this rotation to the output image as vaguely described in the docs:
The rotation value of ImageInfo will be the rotation, which if applied to the output image, will make the image match target rotation specified here.
If I set a breakpoint in the analyze() method shown above, the image object does not get rotated when I change the setTargetRotation value, so I assume the docs are telling me to grab the orientation with getTargetRotation(), in the sense that these two pieces of code (builder vs. analyzer) are written separately and this information can be passed between the two without actually applying any rotation. Did I understand this correctly? This really doesn't make sense to me, as the setTargetResolution method actually changes the size of the image sent via the ImageProxy. I'd think setTargetRotation should also apply said rotation, but it appears not to.
If my understanding is correct, is there an optimal efficient way to rotate these ImageProxy objects after entering the analyze method? Right now I'm doing it after converting to Bitmap via
// decodeStream (not decodeResource) is the BitmapFactory method that takes an InputStream
Bitmap myImg = BitmapFactory.decodeStream(someInputStream);
Matrix matrix = new Matrix();
matrix.postRotate(30);
Bitmap rotated = Bitmap.createBitmap(myImg, 0, 0, myImg.getWidth(), myImg.getHeight(),
        matrix, true);
The above idea came from here, but I'd think this is not the most efficient way to do this. I'm sure I could also come up with a way to transpose the arrays, but this could get tedious and messy quickly. Isn't there any way to set up the ImageAnalysis Builder to send the rotated ImageProxy directly, rather than having to make a bitmap of everything?

The rotation value of ImageInfo will be the rotation, which if applied
to the output image, will make the image match target rotation
specified here.
One way to understand the definition is to assume the target rotation matches the device's orientation. Applying the returned rotation to the output image will result in the image being upright, i.e. matching the device's orientation. So if the device is in its natural portrait position, the rotated output image will also be in that orientation.
Output image + Rotated by rotation value --> Upright output image
CameraX's documentation includes a section about rotations, since it can be a confusing topic. You can check it out here.
Going back to your question about setTargetRotation and the ImageAnalysis use case: it isn't meant to rotate the images passed to the Analyzer, but it does affect the rotation information of the images, i.e. ImageProxy.getImageInfo().getRotationDegrees(). Transforming images (rotating, cropping, etc.) can be an expensive operation, so CameraX does not modify the analysis frames. Instead, it provides the metadata required to make sense of the output images, and that metadata can then be passed to the image processors that analyze the frames.
If you need to rotate each analysis frame, using the Bitmap approach is one way, but it can be costly. Another more performant way may be to do it in native code.
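As a rough illustration of the "transpose the arrays" idea from the question, here is a minimal sketch that rotates a single 8-bit image plane by 90 degrees clockwise without going through a Bitmap. It assumes you have already copied the luminance (Y) plane of a YUV_420_888 frame into a byte array (for example from image.getPlanes()[0].getBuffer()) and that you know its row stride (getRowStride()). The class PlaneRotation and its method are hypothetical names, and the chroma planes, which can have a pixel stride greater than 1, are deliberately not handled.

```java
import java.util.Arrays;

// Hypothetical helper: rotates one 8-bit plane (pixel stride 1) by 90 degrees
// clockwise. A sketch of the idea only, not a complete YUV rotation.
final class PlaneRotation {

    /**
     * @param src       source plane; each row occupies rowStride bytes, of which
     *                  only the first width bytes are pixels (the rest is padding)
     * @param width     number of pixels per source row
     * @param height    number of source rows
     * @param rowStride number of bytes between the starts of consecutive rows
     * @return a tightly packed plane that is height pixels wide and width pixels tall
     */
    static byte[] rotate90Clockwise(byte[] src, int width, int height, int rowStride) {
        if (rowStride < width || src.length < (height - 1) * rowStride + width) {
            throw new IllegalArgumentException("Buffer too small for the given dimensions");
        }
        byte[] dst = new byte[width * height];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                // Source pixel (x, y) lands in output row x, column (height - 1 - y);
                // the output rows are `height` pixels wide.
                dst[x * height + (height - 1 - y)] = src[y * rowStride + x];
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        // A 3x2 plane with one byte of padding per row (rowStride = 4).
        byte[] plane = {
            1, 2, 3, 0,
            4, 5, 6, 0,
        };
        System.out.println(Arrays.toString(rotate90Clockwise(plane, 3, 2, 4)));
        // prints [4, 1, 5, 2, 6, 3]
    }
}
```

A per-pixel Java loop like this can still be slow on full-resolution frames, which is why native code may be the better fit if you need to do this for every frame.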

How do you take a picture with camerax?

I'm still practicing with Kotlin and Android development. As far as I understand, the Camera class has been deprecated, and Android recommends using CameraX instead, because this high-level library is device-independent and simplifies the process of implementing cameras in apps.
I've tried to read the documentation (https://developer.android.com/training/camerax), but it's written so badly that I barely understood what they are trying to explain.
So I went to read the entire sample code given in the documentation itself (https://github.com/android/camera-samples/tree/main/CameraXBasic).
The CameraFragment code is about 500 lines long (ignoring imports and various comments).
Do I really need to write 500 lines of code to simply take a picture?
How is this supposed to be considered "simpler than before"?
I mean, Android programming is at the point where I only need to write 4 lines of code to ask the user to select an image from their storage, retrieve it, and show it in an ImageView.
Is there a TRUE simple way to take a picture, or do I really need to stop and lose a whole day of work to write all those lines of code?
EDIT:
Take this page of the documentation:
https://developer.android.com/training/camerax/architecture#kotlin
It starts with this piece of code.
val preview = Preview.Builder().build()
val viewFinder: PreviewView = findViewById(R.id.previewView)
// The use case is bound to an Android Lifecycle with the following code
val camera = cameraProvider.bindToLifecycle(lifecycleOwner, cameraSelector, preview)
cameraProvider comes out of nowhere. What is this supposed to be? I've found out it's a ProcessCameraProvider, but how am I supposed to initialize it?
Should it be a lateinit var or has it already been initialized somewhere else?
Because if I try to write val cameraProvider = ProcessCameraProvider() I get an error, so what am I supposed to do?
What is cameraSelector parameter? It's not defined before. I've found out it's the selector for the front or back camera, but how am I supposed to know it reading that page of the documentation?
How could this documentation have been released with these kinds of gaps?
How is someone supposed to learn with ease?

Before you can interact with the device's cameras using CameraX, you need to initialize the library. The initialization process is asynchronous, and involves things like loading information about the device's cameras.
You interact with the device's cameras using a ProcessCameraProvider. It's a singleton, so the first time you get an instance of it, CameraX performs its initialization.
val cameraProviderFuture: ListenableFuture<ProcessCameraProvider> = ProcessCameraProvider.getInstance(context)
Getting the ProcessCameraProvider singleton returns a Future because it might need to initialize the library asynchronously. The first time you get it, it might take some time (usually well under a second). Subsequent calls, though, will return immediately, as the initialization will have already been performed.
With a ProcessCameraProvider in hand, you can start interacting with the device's cameras. You choose which camera to interact with using a CameraSelector, which wraps a set of filters for the camera you want to use. Typically, if you're just trying to use the main back or front camera, you'd use CameraSelector.DEFAULT_BACK_CAMERA or CameraSelector.DEFAULT_FRONT_CAMERA.
Now that you've defined which camera you'll use, you build the use cases you'll need. For example, you want to take a picture, so you'll use the ImageCapture use case. It allows taking a single capture frame (typically a high quality one) using the camera, and providing it either as a raw buffer, or storing it in a file. To use it, you can configure it if you'd wish, or you can just let CameraX use a default configuration.
val imageCapture = ImageCapture.Builder().build()
In CameraX, a camera's lifecycle is controlled by a LifecycleOwner, meaning that when the LifecycleOwner's lifecycle starts, the camera opens, and when it stops, the camera closes. So you'll need to choose a lifecycle that will control the camera. If you're using an Activity, you'd typically want the camera to start when the Activity starts and stop when it stops, so you'd use the Activity instance itself as the LifecycleOwner. If you're using a Fragment, you might want to use its view lifecycle (Fragment.getViewLifecycleOwner()).
Lastly, you need to put the pieces of the puzzle together.
processCameraProvider.bindToLifecycle(
    lifecycleOwner,
    cameraSelector,
    imageCapture
)
An app typically includes a viewfinder that displays the camera's preview, so you can use a Preview use case, and bind it with the ImageCapture use case. The Preview use case allows streaming camera frames to a Surface. Since setting up the Surface and correctly drawing the preview on it can be complex, CameraX provides PreviewView, a View that can be used with the Preview use case to display the camera preview. You can check out how to use them here.
// Just like ImageCapture, you can configure the Preview use case if you'd wish.
val preview = Preview.Builder().build()

// Provide PreviewView's Surface to CameraX. The preview will be drawn on it.
val previewView: PreviewView = findViewById(...)
preview.setSurfaceProvider(previewView.surfaceProvider)

// Bind both the Preview and ImageCapture use cases
processCameraProvider.bindToLifecycle(
    lifecycleOwner,
    cameraSelector,
    imageCapture,
    preview
)
Now, to actually take a picture, you use one of ImageCapture's takePicture methods. One provides a JPEG raw buffer of the captured image; the other saves it in a file that you provide (make sure you have the necessary storage permissions if you need any).
imageCapture.takePicture(
    ContextCompat.getMainExecutor(context), // Defines where the callbacks are run
    object : ImageCapture.OnImageCapturedCallback() {
        override fun onCaptureSuccess(imageProxy: ImageProxy) {
            // imageProxy.image is nullable, hence the Image? type
            val image: Image? = imageProxy.image // Do what you want with the image
            imageProxy.close() // Make sure to close the image
        }

        override fun onError(exception: ImageCaptureException) {
            // Handle exception
        }
    }
)
// You can store the image in the cache, for example, using `cacheDir.absolutePath` as a path.
val imageFile = File("somePath/someName.jpg")
val outputFileOptions = ImageCapture.OutputFileOptions
    .Builder(imageFile)
    .build()
imageCapture.takePicture(
    outputFileOptions,
    ContextCompat.getMainExecutor(context), // Same main-thread executor as above
    object : ImageCapture.OnImageSavedCallback {
        override fun onImageSaved(outputFileResults: ImageCapture.OutputFileResults) {
            // The image has been written to imageFile
        }

        override fun onError(exception: ImageCaptureException) {
            // Handle exception
        }
    }
)
Do I really need to write 500 lines of code to simply take a picture?
How is this supposed to be considered "simpler than before"?
CameraXBasic is not as "basic" as its name might suggest x) It's more of a complete example of CameraX's 3 use cases. Even though the CameraFragment is long, it explains things nicely so that it's more accessible to everyone.
CameraX is "simpler than before", before referring mainly to Camera2, which was a bit more challenging to get started with at least. CameraX provides a more developer-friendly API with its approach to using use cases. It also handles compatibility, which was a big issue before. Ensuring your camera app works reliably on most of the Android devices out there is very challenging.

Photo capture failed The completer object was garbage collected this future would otherwise never complete

I'm setting the target resolution like below
var imageResolution = Size(480, 640)
imageCapture = ImageCapture.Builder()
.setTargetResolution(imageResolution)
.build()
Now, I need to change the resolution. So, I tried
var imageResolution = Size(1200, 1600)
imageCapture?.updateSuggestedResolution(imageResolution)
but it gives the following errors:
error 1
UseCase.updateSuggestedResolution can only be called from within the same library group (groupId=androidx.camera)
error 2
Photo capture failed: The completer object was garbage collected - this future would otherwise never complete. The tag was: issueTakePicture[stage=0]
I couldn't figure out when errors 1 and 2 occur, but I noticed that error 2 prevents the image from being saved.
Also, if I tried to change the resolution at least once, every image that was successfully saved came out at 1080*1080. If I never tried to change it after the ImageCapture.Builder() step, the images kept the resolution I specified.
Why does this happen, and how can I avoid it?

OK, I had the same issue today (error 2).
I was using the old (deprecated) way of handling onActivityResult. After receiving the result, the app opens the camera and takes the picture. For some reason, if the camera is started after the activity result, it throws the garbage-collected error.
To solve that, I switched to the new way of handling activity results.
These links helped me:
OnActivityResult method is deprecated, what is the alternative?
https://developer.android.com/training/basics/intents/result#java

Android cwac-camera to take multiple photos?

The title may be unclear, but I'm using this awesome library by CommonsWare (nice meeting you at DroidCon, btw) to deal with the notorious issues with Android's fragmented camera API.
I want to take 5 photos, or frames..but not simultaneously. Each frame should capture another shot a few milliseconds apart, or presumably after the previous photo has been successfully captured. Can this be done?
I'm following the standalone implementation in the demos, and simply taking a photo using
mCapture.setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View view) {
        try {
            takePicture(true, false);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
});
I pass true to takePicture() because I will need the resulting Bitmap. I also disabled single-shot mode, since I want to take another photo right after the previous one has been snapped and the preview has resumed:
By default, the result of taking a picture is to return the
CameraFragment to preview mode, ready to take the next picture.
If, instead, you only need the one picture, or you want to send the
user to some other bit of UI first and do not want preview to start up
again right away, override useSingleShotMode() in your CameraHost to
return true. Or, call useSingleShotMode() on your
SimpleCameraHost.Builder, passing in a boolean to use by default. Or,
call useSingleShotMode() on your PictureTransaction, to control this
for an individual picture.
I was looking for a callback like onPictureTaken() or something similar inside CameraHost that would allow me to go ahead and snap another photo right away before releasing the camera, but I don't see anything like this. Has anyone ever done something like this using this library? Could the illustrious CommonsWare please shed some light on this as well (if you see this)?
Thank you!

Read past the quoted paragraph to the next one, which begins with:
You will then probably want to use your own saveImage() implementation in your CameraHost to do whatever you want instead of restarting the preview. For example, you could start another activity to do something with the image.
If what you want is possible, you would call takePicture() again in saveImage() of your CameraHost, in addition to doing something with the image you received.
However:
Even with large heap enabled, you may not have enough heap space for what you are trying to do. You may need to explicitly choose a lower resolution image for the pictures.
This isn't exactly within the scope of the library. It may work, and I don't have a problem with it working, but being able to take N pictures in M seconds isn't part of the library's itch that I am (very very slowly) scratching. In particular, I don't think I have tested taking a picture with the preview already off, and there may be some issues in my code in that area.
Long-term, you may be better served with preview frame processing, rather than actually taking pictures.

CWAC Camera - What's the best way to customize the ImageCleanupTask?

I'm using the cwac-camera library to take photos with a custom in-app camera.
I'm overriding adjustPreviewParameters in SimpleCameraHost and am setting the JPEG quality.
@Override
public Parameters adjustPreviewParameters(Parameters parameters) {
    super.adjustPreviewParameters(parameters);
    parameters.setJpegQuality(80);
    return (parameters);
}
Unfortunately, as per this question, the setJpegQuality method doesn't work on some devices (e.g. the S3).
I can see that the cwac-camera ImageCleanupTask always saves the manipulated image at 100% JPEG quality.
What's the best way to customize the ImageCleanupTask?
Should I expose a setJpegQuality method in PictureTransaction? Or do we want a more versatile solution (like allowing the ImageCleanupTask to be injected)?

I can see that the cwac-camera ImageCleanupTask always saves the manipulated image at 100% JPEG quality.
Ideally, that would be configurable. There are lots of things the library would ideally do. :-)
What's the best way to customize the ImageCleanupTask?
If you mean "how would one get the JPEG percentage in there?", augment PictureTransaction.
Should I expose a setJpegQuality method in PictureTransaction?
I would do jpegQuality(), as PictureTransaction uses the builder/fluent API pattern.
Note that with this change, you would want to remove parameters.setJpegQuality(80); from your existing code. Otherwise, the image will be degraded twice, once on capture (for devices that support it) and once when the image is written to disk, and that's probably not what you want.
