YUV (NV21) to BGR conversion on mobile devices (Native Code) - android

I'm developing a mobile application that runs on Android and iOS. It is capable of real-time processing of a video stream. On Android, I get the camera's preview video stream via android.hardware.Camera.PreviewCallback.onPreviewFrame. I decided to use the NV21 format, since it has to be supported by all Android devices, whereas RGB formats aren't guaranteed (or only RGB565).
For my algorithms, which are mostly for pattern recognition, I need grayscale images as well as color information. Grayscale is not a problem, but the color conversion from NV21 to BGR takes far too long.
As described, I use the following method to capture the images:
In the app, I override the onPreviewFrame handler of the Camera. This is done in CameraPreviewFrameHandler.java:
@Override
public void onPreviewFrame(byte[] data, Camera camera) {
    try {
        AvCore.getInstance().onFrame(data, _prevWidth, _prevHeight, AvStreamEncoding.NV21);
    } catch (NativeException e) {
        e.printStackTrace();
    }
}
The onFrame function then calls a native function that fetches the data from the Java objects as local references, converts it to an unsigned char* byte stream, and calls the following C++ function, which uses OpenCV to convert from NV21 to BGR:
void CoreManager::api_onFrame(unsigned char* rImageData, avStreamEncoding_t vImageFormat, int vWidth, int vHeight)
{
    // rImageData is a local JNI reference to the Java byte array "data" from onPreviewFrame
    Mat bgrMat;  // Holds the converted image
    Mat origImg; // Holds the original image (OpenCV wrapper around rImageData)
    double ts;   // for profiling

    switch(vImageFormat)
    {
    // other formats
    case NV21:
        origImg = Mat(vHeight + vHeight/2, vWidth, CV_8UC1, rImageData); // fast, only creates a header around rImageData
        bgrMat = Mat(vHeight, vWidth, CV_8UC3); // Prepare Mat for the target image
        ts = avUtils::gettime(); // PROFILING START
        cvtColor(origImg, bgrMat, CV_YUV2BGR_NV21); // 3-channel BGR to match the CV_8UC3 target
        _onFrameBGRConversion.push_back(avUtils::gettime()-ts); // PROFILING END
        break;
    }

    [...APPLICATION LOGIC...]
}
As the comments in the code suggest, I already profiled the conversion, and it turned out that it takes ~30ms on my Nexus 4, which is unacceptably long for such a "trivial" pre-processing step. (My profiling methods are double-checked and work properly for real-time measurement.)
Now I'm desperately trying to find a faster implementation of this color conversion from NV21 to BGR. This is what I've already done:
1. Adapted the "convertYUV420_NV21toRGB8888" code provided in this topic to C++ (took a multiple of the OpenCV conversion time)
2. Modified the code from 1 to use only integer operations (double the conversion time of the OpenCV solution)
3. Browsed through a couple of other implementations, all with similar conversion times
4. Checked the OpenCV implementation; they use a lot of bit-shifting to get performance. I guess I can't do better on my own.
Do you have suggestions, know of good implementations, or even have a completely different way to work around this problem? I somehow need to capture RGB/BGR frames from the Android camera, and it should work on as many Android devices as possible.
Thanks for your replies!

Did you try libyuv? I used it in the past, and if you compile it with NEON support it uses assembly code optimized for ARM processors. You can start from there and further optimize it for your particular situation.
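For reference, a minimal sketch of what that could look like (the wrapper function name and the assumption of a contiguous, unpadded NV21 buffer are mine; NV21ToARGB is a real libyuv entry point):

#include <cstdint>
#include "libyuv/convert_argb.h" // libyuv::NV21ToARGB

// Minimal sketch, assuming the NV21 buffer from onPreviewFrame is contiguous
// and has no row padding. Note that libyuv's "ARGB" is byte order B,G,R,A in
// memory (little-endian), which matches OpenCV's BGRA channel order.
void nv21ToBgra(const uint8_t* nv21, int width, int height, uint8_t* dstBgra)
{
    const uint8_t* srcY  = nv21;                  // full-resolution luma plane
    const uint8_t* srcVU = nv21 + width * height; // interleaved V/U plane
    libyuv::NV21ToARGB(srcY,  width,              // Y plane and its stride
                       srcVU, width,              // VU plane and its stride
                       dstBgra, width * 4,        // output buffer and stride
                       width, height);
}

The result can then be wrapped in a Mat(height, width, CV_8UC4, dstBgra) without a copy, so the only remaining cost is the NEON-accelerated conversion itself.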

Related

Is it possible to grab one pixel from CameraPreview?

I followed the Android Studio tutorial to get the CameraPreview to work (Camera API Android Developer Guide). This works fine for me, and I can view the camera stream in my FrameLayout.
But I would like to get the RGB values of a specific pixel in the preview every time it changes. I did not find a method that gives me the preview image as a bitmap, and I was not able to understand the usage of the onPreviewFrame method:
@Override
public void onPreviewFrame(byte[] data, Camera camera) {}
How can I get the RGB values of a camera preview pixel?
If you are using the Camera2 API, you can implement the ImageReader.OnImageAvailableListener class in your application. After that, you override the onImageAvailable function, which gets an ImageReader as an argument. Then you can access the image just recorded with imageReader.acquireNextImage().
With either API, you need to handle processing YUV data yourself, unfortunately.
Camera devices natively produce YUV data, not RGB, so the API doesn't spend extra resources to auto-convert the data. The main easy exception is piping data to the GPU, where the GPU driver auto-converts YUV to RGB for you within your pixel shader.
But if you're just in regular app code, you need to parse the data.
For the deprecated android.hardware.Camera API, the output is NV21 by default, and you can usually select YV12 as another option.
The wikipedia article on YUV is relatively helpful: https://en.wikipedia.org/wiki/YUV
But it does have the wrong conversion coefficients for YUV->RGB conversion; they should be:
R = Y + 1.402 * (Cr - 128)
G = Y - 0.34414 * (Cb - 128) - 0.71414 * (Cr - 128)
B = Y + 1.772 * (Cb - 128)
(Cb = U, Cr = V)
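To make this concrete for the single-pixel case asked about here, a small illustrative sketch in C++ (the helper names are mine; the same indexing applies to the Java byte[] from onPreviewFrame) that samples one pixel from an NV21 buffer with these coefficients:

#include <algorithm>
#include <cstdint>

// In NV21 the full-resolution Y plane is followed by a half-resolution
// interleaved V/U plane, where each V/U pair covers a 2x2 block of luma
// pixels (V comes first).
struct Rgb { uint8_t r, g, b; };

static uint8_t clampToByte(int v) { return (uint8_t)std::min(std::max(v, 0), 255); }

Rgb nv21PixelToRgb(const uint8_t* nv21, int width, int height, int x, int y)
{
    int yVal = nv21[y * width + x];
    // Chroma rows have the same byte width as luma rows (width/2 pairs * 2 bytes).
    int uvIndex = width * height + (y / 2) * width + (x / 2) * 2;
    int cr = nv21[uvIndex] - 128;     // V
    int cb = nv21[uvIndex + 1] - 128; // U

    Rgb out;
    out.r = clampToByte((int)(yVal + 1.402 * cr));
    out.g = clampToByte((int)(yVal - 0.34414 * cb - 0.71414 * cr));
    out.b = clampToByte((int)(yVal + 1.772 * cb));
    return out;
}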
You can also take a look at this stackoverflow post:
Extract black and white image from android camera's NV21 format
which has code that looks to be correct for the conversion.

Android Camera2 take picture while processing frames

I am using the Camera2 API to create a camera component that can scan barcodes and take pictures during scanning. It kind of works, but the preview is flickering: it seems like previous frames, and sometimes green frames, are interrupting the real-time preview.
My code is based on Google's Camera2Basic. I'm just adding one more ImageReader and its surface as a new output and target for the CaptureRequest.Builder. One of the readers uses JPEG and the other YUV. The flickering disappears when I remove the JPEG reader's surface from the outputs (not passing it into createCaptureSession).
There's quite a lot of code, so I created a gist: click. I tried to strip out all completely irrelevant code.
Is the device you're testing on a LEGACY-level device?
If so, any captures targeting a JPEG output may be much slower, since they can run a precapture sequence and may briefly pause the preview as well.
But it should not cause green frames, unless there's a device-level bug.
If anyone ever struggles with this: there is a table in the docs showing that if 3 targets are specified, the YUV ImageReader can only use images with a maximum size equal to the preview size (at most 1920x1080). Reducing this helped!
Yes, you can. This assumes that you configure your preview to feed the ImageReader with YUV frames (you could also put JPEG there, check it out), like so:
mImageReaderPreview = ImageReader.newInstance(mPreviewSize.getWidth(), mPreviewSize.getHeight(), ImageFormat.YUV_420_888, 1);
You can process those frames inside your OnImageAvailable listener:
@Override
public void onImageAvailable(ImageReader reader) {
    Image mImage = reader.acquireNextImage();
    if (mImage == null) {
        return;
    }
    try {
        // Do some custom processing like YUV to RGB conversion, cropping, etc.
        mFrameProcessor.setNextFrame(mImage);
        mImage.close();
    } catch (IllegalStateException e) {
        Log.e("TAG", e.getMessage());
    }
}

Capture image without saving

Based on this thread, is there a way to process an image from the camera in QML without saving it?
Starting from the example in the docs, the capture() function saves the image to the Pictures location.
What I would like to achieve is to process the camera image every second using onImageCaptured, but I don't want to save it to the drive.
I've tried to implement a cleanup operation using the onImageSaved signal, but it's affecting onImageCaptured too.
As explained in this answer, you can bridge C++ and QML via the mediaObject. That can be done via objectName (as in the linked answer) or by using a dedicated Q_PROPERTY (more on that later). In either case you should end up with code like this:
QObject *source;   // QML camera pointer obtained as described above
QObject *cameraRef = qvariant_cast<QMediaObject*>(source->property("mediaObject"));
Once you have the hook to the camera, use it as the source of a QVideoProbe object, i.e.
QVideoProbe *probe = new QVideoProbe;
probe->setSource(cameraRef);
Connect the videoFrameProbed signal to an appropriate slot, i.e.
connect(probe, SIGNAL(videoFrameProbed(QVideoFrame)), this, SLOT(processFrame(QVideoFrame)));
and that's it: you can now process your frames inside the processFrame function. An implementation of such a function could look like this:
void YourClass::processFrame(QVideoFrame frame)
{
    QVideoFrame cFrame(frame);
    cFrame.map(QAbstractVideoBuffer::ReadOnly);
    int w {cFrame.width()};
    int h {cFrame.height()};
    QImage::Format f;
    if((f = QVideoFrame::imageFormatFromPixelFormat(cFrame.pixelFormat())) == QImage::Format_Invalid)
    {
        QImage image(cFrame.size(), QImage::Format_ARGB32);
        // NV21 to ARGB32 conversion!
        //
        // DECODING HAPPENS HERE on "image"
    }
    else
    {
        QImage image(cFrame.bits(), w, h, f);
        //
        // DECODING HAPPENS HERE on "image"
    }
    cFrame.unmap();
}
Two important implementation details here:
Android devices use a YUV format which is currently not supported by QImage, so it has to be converted by hand. I've made the strong assumption here that all invalid formats are YUV; that would be better handled via #ifdef conditionals on the current OS.
The decoding can be quite costly, so you can skip frames (simply add a counter to this method, as in the sketch below) or offload the work to a dedicated thread. That also depends on the pace at which frames arrive. Reducing their size, e.g. processing only a portion of the QImage, can also greatly improve performance.
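A minimal frame-skipping sketch (kProcessEveryNth and m_frameCounter are illustrative names, not Qt API; m_frameCounter would be an int member of the class, initialized to 0):

void YourClass::processFrame(QVideoFrame frame)
{
    static const int kProcessEveryNth = 5; // tune to your frame rate and budget
    if (++m_frameCounter % kProcessEveryNth != 0)
        return; // drop this frame cheaply, without even mapping it

    QVideoFrame cFrame(frame);
    cFrame.map(QAbstractVideoBuffer::ReadOnly);
    // ... decoding as shown above ...
    cFrame.unmap();
}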
For that matter, I would avoid the objectName approach for fetching the mediaObject altogether and instead register a new type so that the Q_PROPERTY approach can be used. I'm thinking of something along these lines:
class FrameAnalyzer : public QObject
{
    Q_OBJECT
    Q_PROPERTY(QObject* source READ source WRITE setSource)
    QObject *m_source; // added for the sake of the READ function
    QVideoProbe probe;
    // ...
public slots:
    void processFrame(QVideoFrame frame);
};
where the setSource is simply:
bool FrameAnalyzer::setSource(QObject *source)
{
    m_source = source;
    return probe.setSource(qvariant_cast<QMediaObject*>(source->property("mediaObject")));
}
Once registered as usual, i.e.
qmlRegisterType<FrameAnalyzer>("FrameAnalyzer", 1, 0, "FrameAnalyzer");
you can directly set the source property in QML as follows:
// other imports
import FrameAnalyzer 1.0

Item {
    Camera {
        id: camera
        // camera stuff here
        Component.onCompleted: analyzer.source = camera
    }

    FrameAnalyzer {
        id: analyzer
    }
}
A great advantage of this approach is readability and the cleaner coupling between the camera code and the processing code. That comes at the expense of a (slightly) higher implementation effort.

How to improve OpenCV face detection performance in android?

I am working on an Android project in which I use OpenCV to detect faces in all the images in the gallery. The face detection runs in a service, which keeps working until all the images are processed. It stores the detected faces in internal storage and also shows them in a grid view if the activity is open.
My code is:
CascadeClassifier mJavaDetector = null;

public void getFaces()
{
    for (int i = 0; i < size; i++)
    {
        File file = new File(urls.get(i));
        imagepath = urls.get(i);
        defaultBitmap = BitmapFactory.decodeFile(imagepath, bitmapFactoryOptions);

        mJavaDetector = new CascadeClassifier(FaceDetector.class.getResource("lbpcascade_frontalface").getPath());

        Mat image = new Mat(defaultBitmap.getHeight(), defaultBitmap.getWidth(), CvType.CV_8UC1);
        Utils.bitmapToMat(defaultBitmap, image);

        MatOfRect faceDetections = new MatOfRect();
        try
        {
            mJavaDetector.detectMultiScale(image, faceDetections, 1.1, 10, 0, new Size(20, 20), new Size(image.width(), image.height()));
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }

        if (faceDetections.toArray().length > 0)
        {
        }
    }
}
Everything works, but it detects faces very slowly. When I debugged the code, I found that the line taking all the time is:
mJavaDetector.detectMultiScale(image, faceDetections, 1.1, 10, 0, new Size(20, 20), new Size(image.width(), image.height()));
I have checked multiple posts about this problem, but I didn't find a solution.
Please tell me what I should do to solve this problem.
Any help would be greatly appreciated. Thank you.
You should pay attention to the parameters of detectMultiScale():
scaleFactor – Parameter specifying how much the image size is reduced at each image scale. This parameter is used to create a scale pyramid. It is necessary because the model has a fixed size during training; without a pyramid, the only detectable size would be this fixed one (which can also be read from the XML). However, face detection can be made scale-invariant by using a multi-scale representation, i.e. detecting large and small faces with the same detection window.
scaleFactor depends on the size of your trained detector, but in practice you want to set it as high as possible while still getting "good" results, so this should be determined empirically.
Your value of 1.1 can be a good choice for this purpose. It means a relatively small step is used for resizing (the size is reduced by 10%), so you increase the chance of finding a size that matches the model. If your trained detector has the size 10x10, then you can detect faces of size 11x11, 12x12 and so on. But a factor of 1.1 requires roughly double the number of pyramid layers (and twice the computation time) compared to 1.2.
minNeighbors – Parameter specifying how many neighbours each candidate rectangle should have to be retained.
The cascade classifier works with a sliding-window approach: you slide a window over the image, then resize the image and search again until you cannot resize it further. In every iteration the true outputs (of the cascade classifier) are stored, but unfortunately it also detects many false positives. To eliminate false positives and get the proper face rectangle out of the detections, a neighbourhood approach is applied. 3-6 is a good value for it. If the value is too high, you can lose true positives too.
minSize – Related to the sliding-window approach of minNeighbors, this is the smallest window the cascade can detect. Objects smaller than this are ignored. Usually cv::Size(20, 20) is enough for face detection.
maxSize – Maximum possible object size. Objects bigger than this are ignored.
Finally, you can try different classifiers based on different features (such as Haar, LBP, HoG). Usually LBP classifiers are several times faster than Haar ones, but also less accurate.
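As an illustration of this tuning, a hedged sketch using OpenCV's C++ API (the Java bindings take the same parameters; the cascade file name and the concrete values are placeholders to be determined empirically, as discussed above):

#include <opencv2/objdetect/objdetect.hpp>
#include <vector>

std::vector<cv::Rect> detectFaces(const cv::Mat& gray)
{
    // Loaded once; re-creating the classifier per image would waste time.
    static cv::CascadeClassifier detector("lbpcascade_frontalface.xml");

    std::vector<cv::Rect> faces;
    detector.detectMultiScale(gray, faces,
                              1.2,                // scaleFactor: fewer pyramid layers than 1.1
                              4,                  // minNeighbors: within the suggested 3-6 range
                              0,                  // flags (ignored by newer cascades)
                              cv::Size(40, 40));  // minSize: skip very small faces
    return faces;
}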
And it is also strongly recommended to look over these questions:
Recommended values for OpenCV detectMultiScale() parameters
OpenCV detectMultiScale() minNeighbors parameter
Instead of reading images as Bitmap and then converting them to Mat via Utils.bitmapToMat(defaultBitmap, image), you can directly use Mat image = Highgui.imread(imagepath). You can check here for the imread() function.
Also, the line below takes a lot of time because the detector is looking for faces of at least Size(20, 20), which is quite small. Check this video for a visualization of face detection using OpenCV.
mJavaDetector.detectMultiScale(image,faceDetections,1.1, 10, 0, new Size(20,20), new Size(image.width(), image.height()));

Grabbing consecutive frames in android using opencv

I am trying to grab consecutive frames on Android using the OpenCV VideoCapture class. I actually want to implement optical flow on Android, for which I need two frames. I implemented optical flow in C first, where I grabbed the frames using cvQueryFrame, and everything worked fine. But on Android, when I call
if (capture.grab())
{
    if (capture.retrieve(mRgba))
        Log.i(TAG, "first frame retrieved");
}
if (capture.grab())
{
    if (capture.retrieve(mRgba2))
        Log.i(TAG, "2nd frame retrieved");
}
and then subtract the matrices using Imgproc.subtract(mRgba, mRgba2, output) and display the output, it gives me a black image, indicating that mRgba and mRgba2 hold the same data. Can anyone help with how to grab two different images? According to the OpenCV documentation, mRgba and mRgba2 should be different.
This question is an exact duplicate of
read successive frames OpenCV using cvQueryframe
You have to copy the image to another memory block, because the capture always returns the same pointer.
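A minimal C++ sketch of that fix (in the Java bindings, the equivalent is calling tmp.copyTo(mRgba) after each retrieve; the function name is mine):

#include <opencv2/highgui/highgui.hpp>

// Clone each retrieved frame so that both Mats own separate pixel buffers
// instead of sharing the capture's internal one.
void grabTwoDistinctFrames(cv::VideoCapture& capture, cv::Mat& first, cv::Mat& second)
{
    cv::Mat tmp;
    if (capture.grab() && capture.retrieve(tmp))
        first = tmp.clone();  // deep copy, detached from the capture buffer
    if (capture.grab() && capture.retrieve(tmp))
        second = tmp.clone(); // now first and second hold different frames
}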
