In Android development, I have obtained the byte array of a YUV video stream and want to perform real-time object detection on it. So far I have tried the following: first, convert the byte array to a Bitmap with the yuvByteArrayToBitmap method; then call yolov5ncnn.Detect, which returns objects containing data such as the coordinates of each detected target; finally, use that data in the showObjects method to display the video frame in an ImageView with the detection boxes drawn over the corresponding targets. Detection does produce the rectangular boxes, but the whole process stutters and is not smooth.
I tried commenting out the detection code so that only the video stream is displayed in the ImageView, but it still stutters badly, so I suspect that displaying the video stream through an ImageView is the slow part. I have heard that a Surface and SurfaceTexture can be used instead. Is it possible to replace each frame in the onSurfaceTextureUpdated method with an image that has the detection rectangles drawn on it? I am not familiar with how to implement this.
Now I want to run object detection more smoothly and draw the rectangles of detected objects on the UI. I already have the byte array of the YUV video stream, and I can get the coordinates of the target rectangles from the detection algorithm, but I don't know how to display the result on the UI smoothly.
Hope to get your help, thank you all.
Related
I am trying to use android MediaPlayer to render frames to the display using a Surface.
I have created an Android SurfaceTexture (from an OpenGL texture), used it to create a Surface, and set that Surface as the render target for the MediaPlayer object.
I have set up a callback on the SurfaceTexture using setOnFrameAvailableListener() to listen for frames, so that whenever a frame is available I can draw it using my shader code.
Everything works perfectly except that the number of onFrameAvailable() callbacks I get is inconsistent. First, it doesn't match the number of frames in the video (it always seems to be more), and the number of callbacks also varies from one invocation to another.
I guess the MediaPlayer tries to adjust the frame rate and pushes extra frames (a wild guess, though).
I need exactly as many callbacks as there are frames in the video, because I have some per-frame data stored offline that I need to match with the frames.
In my application, I am using the Android Camera API to access the device camera. I'm receiving camera frames as a byte array in the onPreviewFrame() callback. I have to process the image byte array and pass it to an OpenGL display.
I'm not configuring the camera with setPreviewDisplay(holder) or setPreviewTexture(surface), so the frames are not rendered to a view directly.
I have been googling but have found no useful reference so far.
Can you suggest useful information or a source on rendering an image buffer with OpenGL?
What you are looking for is how to render a texture, which in this case is your image byte array data. You will want to create a vertex buffer object with attributes - position and texture coordinates. Basically you will be creating a quad and mapping the corners of the texture to it. You can find a good example here
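As a rough illustration of the quad described above, here is a minimal sketch of the vertex data only. The class name TexturedQuad, the interleaved x, y, u, v layout, and the triangle-strip ordering are assumptions made for this example, not something taken from the linked sample. It uses only java.nio, so it does not touch any GL state by itself.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical helper: holds the vertex data for one full-screen textured quad.
final class TexturedQuad {
    static final int FLOATS_PER_VERTEX = 4;                 // x, y, u, v
    static final int STRIDE_BYTES = FLOATS_PER_VERTEX * 4;  // 4 bytes per float

    // Triangle-strip order: bottom-left, bottom-right, top-left, top-right.
    // Positions are in normalized device coordinates. The v coordinate is
    // flipped so that the first row of uploaded image data (the top row of
    // the picture) ends up at the top of the quad.
    static final float[] VERTICES = {
        -1f, -1f,   0f, 1f,
         1f, -1f,   1f, 1f,
        -1f,  1f,   0f, 0f,
         1f,  1f,   1f, 0f,
    };

    // GL needs a direct buffer in native byte order, not a Java float[].
    static FloatBuffer newVertexBuffer() {
        FloatBuffer fb = ByteBuffer.allocateDirect(VERTICES.length * 4)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        fb.put(VERTICES);
        fb.position(0);
        return fb;
    }
}
```

On Android you would typically hand this buffer to GLES20.glVertexAttribPointer() twice with STRIDE_BYTES as the stride (once positioned at float 0 for the position attribute, once at float 2 for the texture coordinates) and draw it with GLES20.glDrawArrays(GLES20.GL_TRIANGLE_STRIP, 0, 4).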
Currently I'm showing the camera preview on screen by providing the camera preview texture via camera.setPreviewTexture(...) (rendered with OpenGL, of course).
I have a native library which takes a byte[] as the input image and returns a byte[], the result image for that input. I want to call it and then draw both the input image and the result on the screen, one on top of the other.
I know that in OpenGL, to get texture data back on the CPU we have to read it with glReadPixels, and after processing I would have to upload the result to a texture again, which will have a big impact on performance if done every frame.
I thought about using camera.setPreviewCallback(...), getting the frame there (calling the process method and passing the result to my SurfaceView), and in parallel continuing to use the texture preview technique for drawing on the screen, but then I'm worried about synchronizing the frames I get in the preview callback with those I get in the texture.
Am I missing anything? Or is there no easy way to solve this issue?
One approach that may be useful is to direct the output of the Camera to an ImageReader, which provides a Surface. Each frame sent to the Surface is made available as YUV data without a copy, which makes it faster than some of the alternatives. The variations in color formats (stride, alignment, interleave) are handled by ImageReader.
Since you want the camera image to be presented simultaneously with the processing output, you can't send frames down two independent paths.
When the frame is ready, you will need to do a color-space conversion and upload the pixels with glTexImage2D(). This will likely be the performance-limiting factor.
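The color-space conversion mentioned above could look roughly like the following CPU sketch. It assumes the frame is in NV21 layout (a full-resolution Y plane followed by interleaved V/U samples at half resolution), that width and height are even, and that full-range BT.601 (JFIF) coefficients are acceptable; the class name Nv21ToArgb is hypothetical. Your camera's actual format and range may differ.

```java
// Hypothetical helper: converts one NV21 frame to packed ARGB pixels.
final class Nv21ToArgb {
    // Assumes nv21.length >= width * height * 3 / 2 and even width/height.
    static int[] convert(byte[] nv21, int width, int height) {
        int frameSize = width * height;
        int[] argb = new int[frameSize];
        for (int row = 0; row < height; row++) {
            // Each chroma row is shared by two luma rows.
            int uvRow = frameSize + (row >> 1) * width;
            for (int col = 0; col < width; col++) {
                int y = nv21[row * width + col] & 0xFF;
                // Each V/U pair is shared by two horizontally adjacent pixels.
                int uvIndex = uvRow + (col & ~1);
                int v = (nv21[uvIndex] & 0xFF) - 128;
                int u = (nv21[uvIndex + 1] & 0xFF) - 128;
                // Fixed-point BT.601 full-range coefficients, scaled by 256.
                int r = clamp(y + ((359 * v) >> 8));
                int g = clamp(y - ((88 * u + 183 * v) >> 8));
                int b = clamp(y + ((454 * u) >> 8));
                argb[row * width + col] = 0xFF000000 | (r << 16) | (g << 8) | b;
            }
        }
        return argb;
    }

    static int clamp(int x) {
        return x < 0 ? 0 : (x > 255 ? 255 : x);
    }
}
```

The int[] result suits Bitmap.createBitmap(int[], int, int, Bitmap.Config); for glTexImage2D() you would write R, G, B, A bytes into a direct ByteBuffer instead. Doing the same arithmetic in a fragment shader avoids the per-pixel CPU loop, at the cost of more GL setup.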
From the comments it sounds like you're familiar with image filtering using a fragment shader; for anyone else who finds this, you can see an example here.
In Android, I need an efficient way of modifying the camera stream before displaying it on the screen. This post discusses a couple of ways of doing so and I was able to implement the first one:
Get frame buffer from onPreviewFrame
Convert frame to YUV
Modify frame
Convert modified frame to jpeg
Display frame to ImageView which was placed on SurfaceView used for the preview
That worked but brought down the 30 fps I was getting with a regular camera preview to 5 fps or so. Converting frames back and forth from different image spaces is also power hungry, which I don't want.
Are there examples on how to get access to the raw frames directly and not have to go through so many conversions? Is using OpenGL the right way of doing this? It must be a very common thing to do but I can't find good examples.
Note: I'd rather avoid using the Camera2 APIs for backward compatibility's sake.
The most efficient form of your CPU-based pipeline would look something like this:
1. Receive frames from the Camera on a Surface, rather than as byte[]. With Camera2 you can send the frames directly to an ImageReader; that will get you CPU access to the raw YUV data from the Camera without copying it or converting it. (I'm not sure how to rig this up with the old Camera API, as it wants either a SurfaceTexture or SurfaceHolder, and ImageReader doesn't provide those. You can run the frames through a SurfaceTexture and get RGB values from glReadPixels(), but I don't know if that'll buy you anything.)
2. Perform your modifications on the YUV data.
3. Convert the YUV data to RGB.
4. Either convert the RGB data into a Bitmap or a GLES texture. glTexImage2D will be more efficient, but OpenGL ES comes with a steep learning curve. Most of the pieces you need are in Grafika (e.g. the texture upload benchmark) if you decide to go that route.
5. Render the image. Depending on what you did in step #4, you'll either render the Bitmap through a Canvas on a custom View, or render the texture with GLES on a SurfaceView or TextureView.
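One practical detail when reading the raw YUV data in the first step: each Image.Plane obtained from an ImageReader reports a row stride and a pixel stride through getRowStride() and getPixelStride(), and rows may be padded or samples interleaved. The sketch below copies one plane into a tightly packed array so later steps can index it as row * width + col. The class name PlanePacker and its parameter list are hypothetical; on Android you would pass plane.getBuffer() and the two stride values.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: removes row padding and sample interleaving from a plane.
final class PlanePacker {
    // rowStride:   bytes between the starts of two consecutive rows.
    // pixelStride: bytes between two consecutive samples within a row.
    // width/height are the dimensions of this plane (chroma planes of a
    // 4:2:0 image are half the image size in each direction).
    static byte[] pack(ByteBuffer plane, int rowStride, int pixelStride,
                       int width, int height) {
        byte[] out = new byte[width * height];
        for (int row = 0; row < height; row++) {
            int rowStart = row * rowStride;
            for (int col = 0; col < width; col++) {
                // Absolute get(): does not change the buffer's position.
                out[row * width + col] = plane.get(rowStart + col * pixelStride);
            }
        }
        return out;
    }
}
```

When rowStride equals width and pixelStride is 1, a single bulk get() on the buffer is faster than this loop; the per-sample version is shown because it covers every case.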
I think the most significant speedup will be from eliminating the JPEG compression and uncompression, so you should probably start there. Convert the output of your frame editor to a Bitmap and just draw it on the Canvas of a TextureView or custom View, rather than converting to JPEG and using ImageView. If that doesn't get you the speedup you want, figure out what's slowing you down and work on that piece of the pipeline.
If you're restricted to the old camera API, then using a SurfaceTexture and doing your processing in a GPU shader may be most efficient.
This assumes whatever modifications you want to do can be expressed reasonably as a GL fragment shader, and that you're familiar enough with OpenGL to set up all the boilerplate necessary to render a single quadrilateral into a frame buffer, using the texture from a SurfaceTexture.
You can then read back the results with glReadPixels from the final rendering output, and save that as a JPEG.
Note that the shader will provide you with RGB data, not YUV, so if you really need YUV, you'll have to convert back to a YUV colorspace before processing.
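If you do need to go back to YUV, the per-pixel arithmetic is small. The following is a minimal sketch that assumes full-range BT.601 (JFIF) coefficients in 8-bit fixed point; the class name RgbToYuv is hypothetical, and a real pipeline would also subsample the chroma and write into whatever plane layout its consumer expects.

```java
// Hypothetical helper: converts one packed ARGB pixel to Y, U, V (each 0..255).
final class RgbToYuv {
    static int[] convert(int argb) {
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        // Coefficients are the BT.601 weights scaled by 256; +128 rounds.
        int y = clamp((77 * r + 150 * g + 29 * b + 128) >> 8);
        int u = clamp(((-43 * r - 85 * g + 128 * b + 128) >> 8) + 128);
        int v = clamp(((128 * r - 107 * g - 21 * b + 128) >> 8) + 128);
        return new int[] {y, u, v};
    }

    static int clamp(int x) {
        return x < 0 ? 0 : (x > 255 ? 255 : x);
    }
}
```

Neutral grays map to U = V = 128, which is a quick sanity check when wiring this into a pipeline.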
If you can use camera2, as fadden says, ImageReader or Allocation (for Java/JNI-based processing or Renderscript, respectively) become options as well.
And if you're only using the JPEG to get to a Bitmap to place on an ImageView, and not because you want to save it, then again as fadden says you can skip the encode/decode step and draw to a view directly. For example, if using the Camera->SurfaceTexture->GL path, you can just use a GLSurfaceView as the output destination and render directly into a GLSurfaceView if that's all you need to do with the data.
I am working on H.264 video rendering in an Android application using a SurfaceView. One feature is taking a snapshot while the video is rendering on the SurfaceView. Whenever I take a snapshot, I only get a transparent/black screen. I use the getDrawingCache() method to capture the screen, but it only returns null. I use the code below to capture the screen.
SurfaceView mSUrfaceView = new SurfaceView(this); //Member variable
if(mSUrfaceView!=null)
mSUrfaceView.setDrawingCacheEnabled(true); // After video render on surfaceview i enable the drawing cache
Bitmap bm = mSUrfaceView.getDrawingCache(); // return null
Unless you're rendering H.264 video frames in software with Canvas onto a View, the drawing-cache approach won't work (see e.g. this answer).
You cannot read pixels from the Surface part of the SurfaceView. The basic problem is that a Surface is a queue of buffers with a producer-consumer interface, and your app is on the producer side. The consumer, usually the system compositor (SurfaceFlinger), is able to capture a screen shot because it's on the other end of the pipe.
To grab snapshots while rendering video you can render video frames to a SurfaceTexture, which provides both producer and consumer within your app process. You can then render the texture for display with GLES, optionally grabbing pixels with glReadPixels() for the snapshot.
The Grafika app demonstrates various pieces, though none of the activities specifically solves your problem. For example, "continuous capture" directs the camera preview to a SurfaceTexture and then renders it twice (once for display, once for video encoding), which is similar to what you want to do. The GLES utility classes include a saveFrame() function that shows how to use glReadPixels() to create a bitmap.
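For the snapshot itself, the pixels that come back from glReadPixels() usually need a little repacking before they are convenient to use on the CPU. The sketch below assumes the buffer was filled with GL_RGBA / GL_UNSIGNED_BYTE, in which case the rows start at the bottom of the framebuffer; it reorders them top-down and packs each pixel as ARGB. The class name ReadPixelsConverter is hypothetical and this is not Grafika's code. Whether the vertical flip is wanted depends on how the frame was drawn.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: turns a glReadPixels() RGBA buffer into ARGB ints.
final class ReadPixelsConverter {
    // rgba must hold width * height * 4 bytes, bottom row first.
    static int[] toArgbTopDown(ByteBuffer rgba, int width, int height) {
        int[] argb = new int[width * height];
        for (int row = 0; row < height; row++) {
            int srcRow = height - 1 - row;   // flip vertically
            for (int col = 0; col < width; col++) {
                int i = (srcRow * width + col) * 4;
                int r = rgba.get(i) & 0xFF;
                int g = rgba.get(i + 1) & 0xFF;
                int b = rgba.get(i + 2) & 0xFF;
                int a = rgba.get(i + 3) & 0xFF;
                argb[row * width + col] = (a << 24) | (r << 16) | (g << 8) | b;
            }
        }
        return argb;
    }
}
```

The resulting array can be passed to Bitmap.createBitmap(int[], int, int, Bitmap.Config) with ARGB_8888 to get a Bitmap for saving or display.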
See also the Android System-Level Graphics Architecture document.