How do you take a picture with CameraX? - android

I'm still practicing with Kotlin and Android development. As far as I understand, the Camera class has been deprecated, and Android invites us to use CameraX instead, because this high-level library is device-independent and makes it simpler to implement cameras in apps.
I've tried to read the documentation (https://developer.android.com/training/camerax), but it's written so badly that I barely understood what they are trying to explain.
So I went to read the entire sample code given in the documentation itself (https://github.com/android/camera-samples/tree/main/CameraXBasic).
The CameraFragment code is about 500 lines long (ignoring imports and various comments).
Do I really need to write 500 lines of code to simply take a picture?
How is this supposed to be considered "simpler than before"?
I mean, Android programming is at the point where I only need to write 4 lines of code to ask the user to select an image from their storage, retrieve it, and show it in an ImageView.
Is there a TRUE simple way to take a picture, or do I really need to stop and lose a whole day of work to write all those lines of code?
EDIT:
Take this page of the documentation:
https://developer.android.com/training/camerax/architecture#kotlin
It starts with this piece of code.
val preview = Preview.Builder().build()
val viewFinder: PreviewView = findViewById(R.id.previewView)
// The use case is bound to an Android Lifecycle with the following code
val camera = cameraProvider.bindToLifecycle(lifecycleOwner, cameraSelector, preview)
cameraProvider comes out of nowhere. What is this supposed to be? I've found out it's a ProcessCameraProvider, but how am I supposed to initialize it?
Should it be a lateinit var, or has it already been initialized somewhere else?
Because if I try to write val cameraProvider = ProcessCameraProvider() I get an error, so what am I supposed to do?
What is the cameraSelector parameter? It's not defined before. I've found out it's the selector for the front or back camera, but how am I supposed to know that by reading that page of the documentation?
How could this documentation have been released with these kinds of gaps?
How is someone supposed to learn with ease?

Before you can interact with the device's cameras using CameraX, you need to initialize the library. The initialization process is asynchronous, and involves things like loading information about the device's cameras.
You interact with the device's cameras using a ProcessCameraProvider. It's a singleton, so the first time you get an instance of it, CameraX performs its initialization.
val cameraProviderFuture: ListenableFuture<ProcessCameraProvider> = ProcessCameraProvider.getInstance(context)
Getting the ProcessCameraProvider singleton returns a Future because the library might need to initialize asynchronously. The first time you get it, it might take some time (usually well under a second); subsequent calls will return immediately, as the initialization will have already been performed.
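For example, a minimal sketch of waiting for the Future's result (assuming you're in an Activity or Fragment, so a Context and a main-thread executor are available) could look like this:

val cameraProviderFuture = ProcessCameraProvider.getInstance(context)
cameraProviderFuture.addListener({
    // At this point the Future has completed, so get() returns immediately without blocking
    val cameraProvider = cameraProviderFuture.get()
    // You can now bind your use cases (see below)
}, ContextCompat.getMainExecutor(context))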
With a ProcessCameraProvider in hand, you can start interacting with the device's cameras. You choose which camera to interact with using a CameraSelector, which wraps a set of filters for the camera you want to use. Typically, if you're just trying to use the main back or front camera, you'd use CameraSelector.DEFAULT_BACK_CAMERA or CameraSelector.DEFAULT_FRONT_CAMERA.
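For example, to select the main back camera (this is the cameraSelector used in the snippets below):

val cameraSelector = CameraSelector.DEFAULT_BACK_CAMERA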
Now that you've defined which camera you'll use, you build the use cases you'll need. For example, you want to take a picture, so you'll use the ImageCapture use case. It allows taking a single capture frame (typically a high-quality one) using the camera, and providing it either as a raw buffer or by storing it in a file. You can configure it if you wish, or just let CameraX use a default configuration.
val imageCapture = ImageCapture.Builder().build()
In CameraX, a camera's lifecycle is controlled by a LifecycleOwner, meaning that when the LifecycleOwner's lifecycle starts, the camera opens, and when it stops, the camera closes. So you'll need to choose a lifecycle that will control the camera. If you're using an Activity, you'd typically want the camera to start when the Activity starts and stop when it stops, so you'd use the Activity instance itself as the LifecycleOwner. If you were using a Fragment, you might want to use its view lifecycle (Fragment.getViewLifecycleOwner()).
Lastly, you need to put the pieces of the puzzle together.
processCameraProvider.bindToLifecycle(
    lifecycleOwner,
    cameraSelector,
    imageCapture
)
An app typically includes a viewfinder that displays the camera's preview, so you can use a Preview use case and bind it together with the ImageCapture use case. The Preview use case streams camera frames to a Surface. Since setting up the Surface and correctly drawing the preview on it can be complex, CameraX provides PreviewView, a View that can be used with the Preview use case to display the camera preview. You can see how to use them together below.
// Just like ImageCapture, you can configure the Preview use case if you'd wish.
val preview = Preview.Builder().build()
// Provide PreviewView's Surface to CameraX. The preview will be drawn on it.
val previewView: PreviewView = findViewById(...)
preview.setSurfaceProvider(previewView.surfaceProvider)
// Bind both the Preview and ImageCapture use cases
processCameraProvider.bindToLifecycle(
    lifecycleOwner,
    cameraSelector,
    imageCapture,
    preview
)
Now, to actually take a picture, you use one of ImageCapture's takePicture methods. One provides a raw JPEG buffer of the captured image; the other saves it to a file that you provide (make sure you have the necessary storage permissions if you need any).
imageCapture.takePicture(
    ContextCompat.getMainExecutor(context), // Defines where the callbacks are run
    object : ImageCapture.OnImageCapturedCallback() {
        override fun onCaptureSuccess(imageProxy: ImageProxy) {
            val image: Image? = imageProxy.image // Do what you want with the image
            imageProxy.close() // Make sure to close the image
        }

        override fun onError(exception: ImageCaptureException) {
            // Handle exception
        }
    }
)
val imageFile = File("somePath/someName.jpg") // You can store the image in the cache, for example, by using `cacheDir.absolutePath` as the path.
val outputFileOptions = ImageCapture.OutputFileOptions
    .Builder(imageFile)
    .build()

imageCapture.takePicture(
    outputFileOptions,
    ContextCompat.getMainExecutor(context), // Defines where the callbacks are run
    object : ImageCapture.OnImageSavedCallback {
        override fun onImageSaved(outputFileResults: ImageCapture.OutputFileResults) {
            // The image was saved to imageFile
        }

        override fun onError(exception: ImageCaptureException) {
            // Handle exception
        }
    }
)
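For reference, here is a rough sketch of all of the above combined into a single Activity. It's not a definitive implementation: runtime camera permission handling is omitted, and the layout and view ids (activity_camera, previewView, takePictureButton) are assumptions for the example.

class CameraActivity : AppCompatActivity() {

    private var imageCapture: ImageCapture? = null

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_camera) // Assumed layout containing a PreviewView and a Button

        val previewView: PreviewView = findViewById(R.id.previewView)

        val cameraProviderFuture = ProcessCameraProvider.getInstance(this)
        cameraProviderFuture.addListener({
            val cameraProvider = cameraProviderFuture.get()

            // Preview use case, drawn on the PreviewView
            val preview = Preview.Builder().build().also {
                it.setSurfaceProvider(previewView.surfaceProvider)
            }
            // ImageCapture use case, triggered by the button below
            val imageCapture = ImageCapture.Builder().build()
            this.imageCapture = imageCapture

            cameraProvider.unbindAll()
            cameraProvider.bindToLifecycle(
                this,
                CameraSelector.DEFAULT_BACK_CAMERA,
                preview,
                imageCapture
            )
        }, ContextCompat.getMainExecutor(this))

        findViewById<Button>(R.id.takePictureButton).setOnClickListener {
            val imageFile = File(cacheDir, "picture.jpg")
            val outputFileOptions = ImageCapture.OutputFileOptions.Builder(imageFile).build()
            imageCapture?.takePicture(
                outputFileOptions,
                ContextCompat.getMainExecutor(this),
                object : ImageCapture.OnImageSavedCallback {
                    override fun onImageSaved(outputFileResults: ImageCapture.OutputFileResults) {
                        // The picture was saved to imageFile
                    }

                    override fun onError(exception: ImageCaptureException) {
                        // Handle the error
                    }
                }
            )
        }
    }
}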
Do I really need to write 500 lines of code to simply take a picture?
How is this supposed to be considered "simpler than before"?
CameraXBasic is not as "basic" as its name might suggest x) It's more of a complete example of CameraX's 3 use cases. Even though the CameraFragment is long, it explains things nicely so that it's more accessible to everyone.
CameraX is "simpler than before", before referring mainly to Camera2, which was a bit more challenging to get started with at least. CameraX provides a more developer-friendly API with its approach to using use cases. It also handles compatibility, which was a big issue before. Ensuring your camera app works reliably on most of the Android devices out there is very challenging.

Related

How to disable noise reduction with CameraX

So I have an application that uses the CameraX ImageCapture use case to take a selfie picture that is then passed to an AI algorithm to do some stuff with it.
Now, I have a user with a Samsung Galaxy S21 who, when taking pictures in one specific place with specific light conditions, gets an image that doesn't work as expected with the AI algorithm. I have examined these images myself and noticed that the problem seems to be that ImageCapture applies strong noise reduction, so strong that even to the human eye it looks wrong, as if it were a painting instead of a photograph.
I sent this user a modified version of the app that captures the image from the ImageAnalysis use case instead, and the produced image does not have that problem. So whatever it is, it seems to be some post-processing done by the ImageCapture use case that is not done in the ImageAnalysis use case.
Now, I can't seem to find a way to tweak this post-processing in CameraX; in fact, I haven't even found how to do it with Camera2 either. At first I thought it might be HDR, and I found there are some extensions to enable HDR, Night Mode and such in CameraX, but all of these are disabled by default according to the documentation, and as long as you use DEFAULT_FRONT_CAMERA none should be applied, which is what I'm using.
CameraSelector.DEFAULT_FRONT_CAMERA
In any case, it's clear that some heavy post-processing is being done to these images in the ImageCapture use case, so I'm wondering how I could disable it.
BTW, I tried initialising the ImageCapture use case with CAPTURE_MODE_MINIMIZE_LATENCY in the hope that this flag would reduce the post-processing and remove the noise reduction, but that didn't work.
imageCapture = new ImageCapture.Builder()
        .setTargetResolution(resolution)
        .setCaptureMode(ImageCapture.CAPTURE_MODE_MINIMIZE_LATENCY)
        .build();
Any ideas on how to go beyond this to get that noise reduction filter disabled?
Thanks,
Fran
I found a way using Camera2Interop.Extender:
private void initImageCapture(Size resolution) {
    Log.d(TAG, "initCameraCapture: ");

    ImageCapture.Builder imageCaptureBuilder = new ImageCapture.Builder();
    imageCaptureBuilder
            .setTargetResolution(resolution)
            .setCaptureMode(ImageCapture.CAPTURE_MODE_MINIMIZE_LATENCY);

    Camera2Interop.Extender extender = new Camera2Interop.Extender(imageCaptureBuilder);
    extender.setCaptureRequestOption(CaptureRequest.NOISE_REDUCTION_MODE, CaptureRequest.NOISE_REDUCTION_MODE_OFF);

    imageCapture = imageCaptureBuilder.build();
}
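For anyone doing the same from Kotlin, the equivalent might look roughly like this (a sketch; depending on your CameraX version you may need the camera-camera2 artifact on the classpath and an @ExperimentalCamera2Interop opt-in):

private fun buildImageCapture(resolution: Size): ImageCapture {
    val builder = ImageCapture.Builder()
        .setTargetResolution(resolution)
        .setCaptureMode(ImageCapture.CAPTURE_MODE_MINIMIZE_LATENCY)

    // Pass a raw Camera2 capture request option through the interop extender
    Camera2Interop.Extender(builder)
        .setCaptureRequestOption(
            CaptureRequest.NOISE_REDUCTION_MODE,
            CaptureRequest.NOISE_REDUCTION_MODE_OFF
        )

    return builder.build()
}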

What is the distinct difference between an ImageAnalyzer and VisionProcessor in Android MLKit, if any?

I'm new to MLKit.
One of the first things I've noticed from looking at the docs, as well as the sample MLKit apps, is that there seem to be multiple ways to attach/use image processors/analyzers.
In some cases they demonstrate using the ImageAnalysis.Analyzer API: https://developers.google.com/ml-kit/vision/image-labeling/custom-models/android
private class YourImageAnalyzer : ImageAnalysis.Analyzer {

    override fun analyze(imageProxy: ImageProxy) {
        val mediaImage = imageProxy.image
        if (mediaImage != null) {
            val image = InputImage.fromMediaImage(mediaImage, imageProxy.imageInfo.rotationDegrees)
            // Pass image to an ML Kit Vision API
            // ...
        }
    }
}
It seems like analyzers can be bound to the lifecycle of CameraProviders
cameraProvider.bindToLifecycle(this, cameraSelector, preview, imageCapture, imageAnalyzer)
In other cases shown in MLKit showcase apps, the CameraSource has a frame processor that can be set.
cameraSource?.setFrameProcessor(
    if (PreferenceUtils.isMultipleObjectsMode(this)) {
        MultiObjectProcessor(graphicOverlay!!, workflowModel!!)
    } else {
        ProminentObjectProcessor(graphicOverlay!!, workflowModel!!)
    }
)
So are these simply two different approaches of doing the same thing? Can they be mixed and matched? Are there performance benefits in choosing one over the other?
As a concrete example: if I wanted to use the MLKit ImageLabeler, should I wrap it in a processor and set it as the ImageProcessor for CameraSource, or use it in the Image Analysis callback and bind that to the CameraProvider?
Lastly in the examples where CameraSource is used (MLKit Material showcase app) there is no use of CameraProvider... is this simply because CameraSource makes it irrelevant and unneeded? In that case, is binding an ImageAnalyzer to a CameraProvider not even an option? Would one simply set different ImageProcessors to the CameraSource on demand as they ran through different scenarios such as ImageLabelling, Object Detection, Text Recognition etc ?
The difference is due to the underlying camera implementation. The analyzer interface is from CameraX, while the processor needs to be written by the developer for Camera1.
If you want to use android.hardware.Camera, you need to follow the example to create a processor and feed the camera output to MLKit.
If you want to use CameraX, you can follow the example in the vision sample app and look at CameraXLivePreviewActivity.
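To make the ImageLabeler case concrete, wrapping it in an ImageAnalysis.Analyzer and binding it alongside the other use cases could look roughly like this. This is a sketch, assuming the standalone ML Kit image-labeling dependency, with cameraProvider, lifecycleOwner, cameraSelector and preview set up as usual:

val labeler = ImageLabeling.getClient(ImageLabelerOptions.DEFAULT_OPTIONS)

val imageAnalysis = ImageAnalysis.Builder()
    .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
    .build()

imageAnalysis.setAnalyzer(ContextCompat.getMainExecutor(context)) { imageProxy ->
    val mediaImage = imageProxy.image
    if (mediaImage == null) {
        imageProxy.close()
        return@setAnalyzer
    }
    val inputImage = InputImage.fromMediaImage(mediaImage, imageProxy.imageInfo.rotationDegrees)
    labeler.process(inputImage)
        .addOnSuccessListener { labels ->
            // Use the labels (a List<ImageLabel>)
        }
        .addOnCompleteListener {
            imageProxy.close() // Close the frame so the next one can be delivered
        }
}

cameraProvider.bindToLifecycle(lifecycleOwner, cameraSelector, preview, imageAnalysis)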

How to take a picture where all settings are set manually including the flash without missing the image that contains the full flash?

I used the latest Camera2Basic sample program as a source for my trials:
https://github.com/android/camera-samples.git
Basically, I configure the CaptureRequest before calling the capture() function in the takePhoto() function, like this:
private fun prepareCaptureRequest(captureRequest: CaptureRequest.Builder) {
    // Set all needed camera settings here
    captureRequest.set(CaptureRequest.CONTROL_MODE, CaptureRequest.CONTROL_MODE_OFF)
    captureRequest.set(CaptureRequest.CONTROL_AF_MODE, CaptureRequest.CONTROL_AF_MODE_OFF)
    //captureRequest.set(CaptureRequest.CONTROL_AF_TRIGGER, CaptureRequest.CONTROL_AF_TRIGGER_CANCEL)
    //captureRequest.set(CaptureRequest.CONTROL_AWB_LOCK, true)
    captureRequest.set(CaptureRequest.CONTROL_AWB_MODE, CaptureRequest.CONTROL_AWB_MODE_OFF)
    captureRequest.set(CaptureRequest.CONTROL_AE_MODE, CaptureRequest.CONTROL_AE_MODE_OFF)
    //captureRequest.set(CaptureRequest.CONTROL_AE_LOCK, true)
    //captureRequest.set(CaptureRequest.CONTROL_AE_PRECAPTURE_TRIGGER, CaptureRequest.CONTROL_AE_PRECAPTURE_TRIGGER_CANCEL)
    //captureRequest.set(CaptureRequest.NOISE_REDUCTION_MODE, CaptureRequest.NOISE_REDUCTION_MODE_FAST)

    // Flash
    if (mState == CaptureState.PRECAPTURE) {
        //captureRequest.set(CaptureRequest.CONTROL_AE_MODE, CaptureRequest.CONTROL_AE_MODE_OFF)
        captureRequest.set(CaptureRequest.FLASH_MODE, CaptureRequest.FLASH_MODE_OFF)
    }
    if (mState == CaptureState.TAKEPICTURE) {
        //captureRequest.set(CaptureRequest.FLASH_MODE, CaptureRequest.FLASH_MODE_SINGLE)
        //captureRequest.set(CaptureRequest.CONTROL_AE_MODE, CaptureRequest.CONTROL_AE_MODE_ON_ALWAYS_FLASH)
        captureRequest.set(CaptureRequest.FLASH_MODE, CaptureRequest.FLASH_MODE_SINGLE)
    }

    val iso = 100
    captureRequest.set(CaptureRequest.SENSOR_SENSITIVITY, iso)

    val fractionOfASecond = 750.toLong()
    captureRequest.set(CaptureRequest.SENSOR_EXPOSURE_TIME, 1000.toLong() * 1000.toLong() * 1000.toLong() / fractionOfASecond)
    //val exposureTime = 133333.toLong()
    //captureRequest.set(CaptureRequest.SENSOR_EXPOSURE_TIME, exposureTime)

    //val characteristics = cameraManager.getCameraCharacteristics(cameraId)
    //val configs: StreamConfigurationMap? = characteristics[CameraCharacteristics.SCALER_STREAM_CONFIGURATION_MAP]
    //val frameDuration = 33333333.toLong()
    //captureRequest.set(CaptureRequest.SENSOR_FRAME_DURATION, frameDuration)

    val focusDistanceCm = 20.0.toFloat() // 20cm
    captureRequest.set(CaptureRequest.LENS_FOCUS_DISTANCE, 100.0f / focusDistanceCm)

    //captureRequest.set(CaptureRequest.COLOR_CORRECTION_MODE, CameraMetadata.COLOR_CORRECTION_MODE_FAST)
    captureRequest.set(CaptureRequest.COLOR_CORRECTION_MODE, CaptureRequest.COLOR_CORRECTION_MODE_TRANSFORM_MATRIX)

    val colorTemp = 8000.toFloat()
    val rggb = colorTemperature(colorTemp)
    //captureRequest.set(CaptureRequest.COLOR_CORRECTION_TRANSFORM, colorTransform)
    captureRequest.set(CaptureRequest.COLOR_CORRECTION_GAINS, rggb)
}
but the picture that is returned is never the one where the flash is at its brightest. This is on a Google Pixel 2 device.
As I only take one picture, I am also not sure how to check the CaptureResult states to find the correct frame, as there is only one.
I already looked at other solutions to similar problems here, but they were either never really solved or somehow took the picture during the capture preview, which I don't want.
Another strange observation is that on other devices the images are taken (though not always at the right moment), but then the manual values I set are not reflected in the JPEG metadata of the image.
If needed I can put my git fork on github.
Long exposure time in combination with flash seems to be the basic issue, and when the results are not good, it means the timing of your preset isn't right. You'd have to tune the exposure time's duration in relation to the flash's timing (just check the EXIF of some photos for example values). You could measure the luminosity with an ImageAnalysis.Analyzer (this has been removed from the sample application, but older revisions still have an example). I've also tried the default Motorola camera app; there the photo also seems to be taken shortly after the flash, when the brightness is already decaying (in order to avoid the dazzling brightness). That's the CaptureState.PRECAPTURE, where you switch the flash off. Flashing in two stages is rather the default, and this might yield better results.
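For reference, a minimal luminosity analyzer along those lines might look like this (a sketch, not the exact code that was removed from the sample; it averages the Y plane of each frame):

class LuminosityAnalyzer(private val onLuma: (Double) -> Unit) : ImageAnalysis.Analyzer {

    override fun analyze(image: ImageProxy) {
        // For YUV_420_888 frames, plane 0 is the luminance (Y) plane
        val buffer = image.planes[0].buffer
        val bytes = ByteArray(buffer.remaining()).also { buffer.get(it) }
        val luma = bytes.map { it.toInt() and 0xFF }.average()
        onLuma(luma)
        image.close() // Required so the next frame can be delivered
    }
}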
If you want it to be dazzlingly bright (even if this is generally not desired), you could as well first switch on the torch, take the image, then switch off the torch again (I use something like this, but only for barcode scanning). This would at least prevent any exposure/flash timing issues.
When changed values are not represented in the EXIF data, you'd need to use ExifInterface to update them (there's an example that updates the orientation, but you can update any value).
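A minimal sketch of rewriting EXIF values with the AndroidX ExifInterface (assuming imageFile points at the saved JPEG; the tag values below are placeholders, not the values from your capture):

val exif = ExifInterface(imageFile.absolutePath)
// EXIF attributes are stored as strings; exposure time is in seconds
exif.setAttribute(ExifInterface.TAG_EXPOSURE_TIME, (1.0 / 750.0).toString())
exif.setAttribute(ExifInterface.TAG_PHOTOGRAPHIC_SENSITIVITY, "100")
exif.saveAttributes()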

Best usage of CameraX for MLKit Text Recognition on Android

I need to implement text recognition using MLKit on Android, and I have decided to use the new CameraX API as the camera library. I am struggling with the correct "pipeline" of classes or data flow of the image, because CameraX is quite new and not many resources are out there. The use case is that I take the picture, crop it in the middle by some bounds that are visible in the UI, and then pass this cropped image to MLKit, which will process it.
Given that, is there a place for the ImageAnalysis.Analyzer API? From my understanding, this analyzer is used only for previews and not for the captured image.
My first idea was to use the takePicture method that accepts an OnImageCapturedCallback, but when I tried to access e.g. ImageProxy.height, the app crashed with the exception java.lang.IllegalStateException: Image is already closed, and I could not find any fix for that.
Then I decided to use another overload of the takePicture method, so now I save the image to a file, read it into a Bitmap, crop it, and end up with an image that can be passed to MLKit. But when I look at the FirebaseVisionImage that is passed to the FirebaseVisionTextRecognizer, it has a factory method to which I can pass the Image I get from OnImageCapturedCallback, so it seems I am doing some unnecessary steps.
So my questions are:
Is there some class (CaptureProcessor?) that will take care of cropping the taken image? I suppose then I could use OnImageCapturedCallback and receive an already cropped image.
Should I even use ImageAnalysis.Analyzer if I am not doing realtime processing and I am doing post processing?
I suppose I can achieve what I want with my current approach, but I feel I could be getting much more out of CameraX than I currently am.
Thanks!
Is there some class (CaptureProcessor?) that will take care of the
cropping of taken image?
You can set the crop aspect ratio after you build the ImageCapture use case by using the setCropAspectRatio(Rational) method. This method crops from the center of the rotated output image. So basically what you'd get back after calling takePicture() is what I think you're asking for.
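For example (the 1:1 Rational here is just an illustration; use whatever aspect ratio matches the bounds you show in your UI):

val imageCapture = ImageCapture.Builder().build().apply {
    // Crops from the center of the rotated output image
    setCropAspectRatio(Rational(1, 1))
}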
Should I even use ImageAnalysis.Analyzer if I am not doing realtime
processing and I am doing post processing?
No, it wouldn't make sense in your scenario. As you mentioned, only when doing real-time image processing would you want to use ImageAnalysis.Analyzer.
ps: I'd be interested in seeing the code you use for takePicture() that caused the IllegalStateException.
[Edit]
Taking a look at your code
imageCapture?.takePicture(executor, object : ImageCapture.OnImageCapturedCallback() {
    override fun onCaptureSuccess(image: ImageProxy) {
        // 1
        super.onCaptureSuccess(image)
        // 2
        Log.d("MainActivity", "Image captured: ${image.width}x${image.height}")
    }
})
At (1), if you take a look at super.onCaptureSuccess(imageProxy)'s implementation, it actually closes the imageProxy passed to the method. Accessing the image's width and height in (2) therefore throws an exception, which is normal, since the image has been closed. The documentation states:
The application is responsible for calling ImageProxy.close() to close
the image.
So when using this callback, you should not call super.onCaptureSuccess(); just use the ImageProxy, and then, before returning from the method, manually close it with ImageProxy.close().
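Applied to your snippet, that would look something like this:

imageCapture?.takePicture(executor, object : ImageCapture.OnImageCapturedCallback() {
    override fun onCaptureSuccess(image: ImageProxy) {
        // Don't call super.onCaptureSuccess(image): it would close the ImageProxy
        Log.d("MainActivity", "Image captured: ${image.width}x${image.height}")
        // ... use the image, then close it yourself
        image.close()
    }
})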

Android CameraX - face detection while recording video

I'm using the new CameraX library with Firebase ML Kit on Android, detecting faces on every frame the device can process.
So I set CameraX like that:
CameraX.bindToLifecycle(this, preview, imageCapture, faceDetectAnalyzer)
Everything works flawlessly. Now, while I'm doing that, I want to record a video.
So basically, I want to detect faces while recording a video.
I tried:
CameraX.bindToLifecycle(this, preview, imageCapture, faceDetectAnalyzer, videoCapture)
But I'm getting an error saying that there are too many parameters, so I guess that's not the right way.
I know this library is still in alpha, but I guess there is a way to do that.
Even if there isn't yet, what's another way to implement face detection while recording a video with Firebase ML?
I haven't used CameraX a lot, but I usually work with the Camera2 API and Firebase ML Kit.
To use both APIs together, you should get the Image callbacks from your preview-size ImageReader. In that callback, you can use those Images to create a FirebaseVisionFace through the API and do whatever you want with it.
Using Kotlin and coroutines, it should look like this:
private val options: FirebaseVisionFaceDetectorOptions = FirebaseVisionFaceDetectorOptions.Builder()
    .setContourMode(FirebaseVisionFaceDetectorOptions.ALL_CONTOURS)
    .build()

val detector = FirebaseVision.getInstance().getVisionFaceDetector(options)

suspend fun processImage(image: Image, rotation: Int): List<FirebaseVisionFace> {
    // Note: metadata like this is only needed when building the image from a raw NV21 buffer
    // (FirebaseVisionImage.fromByteBuffer/fromByteArray); it isn't used with fromMediaImage.
    val metadata = FirebaseVisionImageMetadata.Builder()
        .setWidth(image.width) // 480x360 is typically sufficient for image recognition
        .setHeight(image.height)
        .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
        .setRotation(rotation)
        .build()

    // fromMediaImage requires the rotation (FirebaseVisionImageMetadata.ROTATION_* constants)
    val visionImage = FirebaseVisionImage.fromMediaImage(image, rotation)
    return detector.detectInImage(visionImage).await()
}
If you want to use the await method for coroutine support, you can take a look at https://github.com/FrangSierra/Firebase-Coroutines-Android
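If you'd rather not add another dependency, an await bridge over a Play Services Task can be sketched like this (kotlinx-coroutines-play-services also ships a ready-made version):

suspend fun <T> Task<T>.await(): T = suspendCancellableCoroutine { cont ->
    addOnSuccessListener { result -> cont.resume(result) }
    addOnFailureListener { exception -> cont.resumeWithException(exception) }
    addOnCanceledListener { cont.cancel() }
}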
