Is it possible to detect an object in a static image using TensorFlow? Most of the tutorials I found on the internet use a live camera. I'm currently working on an Android app that should detect an object after taking a photo. I'm wondering if it is possible.
TIA.
Definitely. All basic object detection runs on images only. A live camera feed or a video file is taken frame by frame for processing with object detection methods. Unless temporal analysis is used, object detection is simply running inference on each individual frame of the video.
Yes, you can run detection on captured images. Here is a demo link where this is demonstrated for prototyping:
https://colab.research.google.com/github/tensorflow/hub/blob/master/examples/colab/object_detection.ipynb
GitHub link:
https://github.com/tensorflow/hub/blob/master/examples/colab/object_detection.ipynb
You can run this in Colab to test it with any image of your choice.
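On Android itself the same idea carries over to a single captured photo. Here is a minimal Kotlin sketch, assuming a TensorFlow Lite detection model bundled in the app's assets (the file name ssd_mobilenet_v1.tflite is just a placeholder) and using the TFLite Task Library's ObjectDetector:

```kotlin
import android.content.Context
import android.graphics.Bitmap
import org.tensorflow.lite.support.image.TensorImage
import org.tensorflow.lite.task.vision.detector.ObjectDetector

// Sketch: run object detection on a single captured photo (a Bitmap).
// Assumes a TFLite detection model in the app's assets, e.g.
// "ssd_mobilenet_v1.tflite" (placeholder name).
fun detectOnPhoto(context: Context, photo: Bitmap) {
    val options = ObjectDetector.ObjectDetectorOptions.builder()
        .setMaxResults(5)
        .setScoreThreshold(0.5f)
        .build()

    val detector = ObjectDetector.createFromFileAndOptions(
        context, "ssd_mobilenet_v1.tflite", options
    )

    // The detector works on any image; no camera stream is required.
    val results = detector.detect(TensorImage.fromBitmap(photo))
    for (detection in results) {
        val category = detection.categories.firstOrNull() ?: continue
        println("${category.label}: ${category.score} at ${detection.boundingBox}")
    }
    detector.close()
}
```

The detect() call takes any TensorImage, so whether the Bitmap came from a just-taken photo, the gallery, or a video frame makes no difference to the model.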
Related
I am trying to develop a camera app that captures an image and processes it using OpenCV, in Kotlin. I am developing it for an Android Things Odroid N2+ board.
For now, I am struggling with the Camera2 API.
I have a question: for image processing with OpenCV, can we use the Camera2 API, or does OpenCV provide its own library/tools for capturing and processing images on Android?
I have no experience with the OpenCV library, but I have heard that the VideoCapture class is used for this purpose in Python.
The processing part involves first capturing a reference image and then comparing other images with that reference image.
How should I approach camera capture for image processing?
I am working on an Android project which should capture an image using a camera API and then do some image processing to classify the image and give the result to the user (I use OpenCV to process the image and classify the result). My question is: which is the best camera API? Should I use the Java camera view in OpenCV, use the Camera API via an intent, or use the Camera2 API, which gives me control over some characteristics related to ambient conditions?
Please clear up my confusion and suggest which one is best for controlling the quality of the image and the other conditions that affect the captured image.
Native camera:
higher frame rate
captures RGBA, so there is no need to convert from the Android YUV format
and many more features
So I would say use the standard Camera API.
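For reference, this is roughly what the OpenCV camera view from the first option in the question looks like; a minimal Kotlin sketch, assuming the OpenCV Android SDK has been loaded and a JavaCameraView is declared in the layout. The frames already arrive as RGBA Mats, which is the "no YUV conversion" point above:

```kotlin
import org.opencv.android.CameraBridgeViewBase
import org.opencv.android.CameraBridgeViewBase.CvCameraViewFrame
import org.opencv.core.Mat

// Sketch of OpenCV's camera view listener: frames arrive already as RGBA Mats,
// so no manual YUV -> RGBA conversion is needed (assumes the OpenCV library
// has been loaded and a JavaCameraView is set up in the layout).
class CameraListener : CameraBridgeViewBase.CvCameraViewListener2 {

    override fun onCameraViewStarted(width: Int, height: Int) {}

    override fun onCameraViewStopped() {}

    override fun onCameraFrame(inputFrame: CvCameraViewFrame): Mat {
        val rgba = inputFrame.rgba()   // RGBA frame, ready for processing
        // ... run your OpenCV processing on `rgba` here ...
        return rgba                    // whatever is returned is drawn on screen
    }
}
```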
Try the gpuImagePlus library, which is available for both Android and iOS.
Here is the link to the Android version:
https://github.com/wysaid/android-gpuimage-plus
I have been working on an app that uses the Tesseract API to support OCR. This is done with a SurfaceView that shows the camera output (Camera2 API) and an ImageReader instance that is used to get the images from the camera. The camera is set up with setRepeatingRequest, so new images are available very frequently. When I call the getUTF8Text() method to get the readable text in an image, it makes the camera preview shown on the SurfaceView lag.
Are there any settings in the Tesseract API that can be changed to speed up the getUTF8Text() call, or anything else I can do so the preview SurfaceView does not lag?
Any help or guidance is appreciated!
Most of the things that you can do to speed up performance occur separately from the Tesseract API itself:
Run the OCR on a separate, non-UI thread
Grab a new image to start OCR on after OCR has finished on the last image. Try capture instead of setRepeatingRequest.
Downsample the image before OCR, so that it's smaller
Experiment with different Tesseract page segmentation modes to see what's the fastest on your data
Re-train the Tesseract trained data file to use fewer characters and a smaller dictionary, depending on what your app is used for
Modify Tesseract to perform only recognition pass #1
Don't forget to consider OpenCV or other approaches altogether
You didn't say what Tesseract API settings you're using now, and you didn't describe what your app does in a general sense, so it's hard to tell you where to start, but these points should get you started.
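As a rough illustration of the first few points, here is a minimal Kotlin sketch using the tess-two TessBaseAPI; the single-thread executor, the half-size scaling factor, and the dataPath/onResult parameters are my own placeholder choices, not something from your setup:

```kotlin
import android.graphics.Bitmap
import com.googlecode.tesseract.android.TessBaseAPI
import java.util.concurrent.Executors

// Sketch: keep getUTF8Text() off the UI thread and shrink the image first.
// `dataPath` must point at a folder containing tessdata/eng.traineddata.
private val ocrExecutor = Executors.newSingleThreadExecutor()

fun recognizeAsync(frame: Bitmap, dataPath: String, onResult: (String) -> Unit) {
    ocrExecutor.execute {
        // Downsample: OCR on a half-size image is usually noticeably faster.
        val scaled = Bitmap.createScaledBitmap(
            frame, frame.width / 2, frame.height / 2, true
        )

        val tess = TessBaseAPI()
        tess.init(dataPath, "eng")
        tess.setImage(scaled)
        val text = tess.getUTF8Text()   // the slow call, now off the UI thread
        tess.end()

        onResult(text)
    }
}
```

Grabbing the next image only after onResult fires would also cover the "capture instead of setRepeatingRequest" point, since you never queue up more OCR work than you can finish.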
There are a few other things that you can try.
Init Tesseract with OEM_TESSERACT_ONLY
Instead of using full-blown training data, use a faster alternative from https://github.com/tesseract-ocr/tessdata_fast.
Move the recognition to the computation thread.
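A minimal Kotlin sketch of the first two suggestions, assuming a tess-two style TessBaseAPI and an eng.traineddata file from tessdata_fast copied into dataPath/tessdata (the path and language are placeholders):

```kotlin
import com.googlecode.tesseract.android.TessBaseAPI

// Sketch: initialize Tesseract with the OEM_TESSERACT_ONLY engine mode and
// point it at a "fast" traineddata file from tesseract-ocr/tessdata_fast.
// `dataPath` is a placeholder; it must contain tessdata/eng.traineddata
// downloaded from the tessdata_fast repository.
fun createFastTess(dataPath: String): TessBaseAPI {
    val tess = TessBaseAPI()
    tess.init(dataPath, "eng", TessBaseAPI.OEM_TESSERACT_ONLY)
    return tess
}
```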
I am testing imaging algorithms using an Android phone's camera as input, and I need a way to test the algorithms consistently. Ideally I want to take a pre-recorded video feed and have the phone 'pretend' that the video feed is live video from the camera.
My ideal solution would be one where the app running the algorithms has no knowledge that the video is pre-recorded. I do not want to load the video file directly into the app, but rather read it in as sensor data, if at all possible.
Is this approach possible? If so, any pointers in the right direction would be extremely helpful, as Google searches have failed me so far.
Thanks!
Edit: To clarify, my understanding is that the Camera class uses a camera service to read video from the hardware. Rather than do something application-side, I would like to create a custom camera service that reads from a video file instead of the hardware. Is that doable?
When you are doing processing on a live Android video feed, you will need to build your own custom camera application that feeds you individual frames via the PreviewCallback interface that Android provides.
Now, simulating this would be a little tricky, seeing as the preview frames will generally be in the NV21 format. If you are using a pre-recorded video, I don't think there is any clear way of reading frames one by one unless you try the getFrameAtTime method, which will give you Bitmaps in an entirely different format.
This leads me to suggest that you could probably test with these Bitmaps from the getFrameAtTime method (though I'm really not sure what you are trying to do here). For this code to then work on a live camera preview, you would need to convert your NV21 frames from the PreviewCallback interface into the same format as the Bitmaps from getFrameAtTime, or you could adapt your algorithm to process NV21 frames directly. NV21 is a pretty neat format, presenting color and luminance data separately, but it can be tricky to use.
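A rough Kotlin sketch of the getFrameAtTime route using MediaMetadataRetriever; the 100 ms step and the processFrame callback are placeholders of mine, not anything from your code:

```kotlin
import android.graphics.Bitmap
import android.media.MediaMetadataRetriever

// Sketch: step through a pre-recorded video file and hand each frame to the
// same processing code you would normally feed from the camera preview.
// `videoPath`, the 100 ms step, and `processFrame` are illustrative choices.
fun replayVideo(videoPath: String, processFrame: (Bitmap) -> Unit) {
    val retriever = MediaMetadataRetriever()
    retriever.setDataSource(videoPath)

    val durationMs = retriever
        .extractMetadata(MediaMetadataRetriever.METADATA_KEY_DURATION)!!
        .toLong()

    var positionMs = 0L
    while (positionMs < durationMs) {
        // getFrameAtTime takes microseconds; OPTION_CLOSEST picks the nearest frame.
        val frame = retriever.getFrameAtTime(
            positionMs * 1000, MediaMetadataRetriever.OPTION_CLOSEST
        )
        frame?.let(processFrame)
        positionMs += 100   // roughly 10 "frames" per second of video time
    }
    retriever.release()
}
```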
In an Android project, I'm trying to capture video and process it in real time (like a Kinect). I tried two methods: using OpenCV, by repeatedly calling mCamera.grab() and capture.retrieve(mRgba, Highgui.CV_CAP_ANDROID_COLOR_FRAME_RGBA); or using Android's Camera by repeatedly capturing images.
I feel that the OpenCV camera captures images faster than the Android one. But why?
OpenCV uses a hack to get low-level access to the Android camera. It allows OpenCV to avoid several data copies and transitions between the native and managed layers.
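For comparison, this is roughly the per-frame conversion a plain Camera PreviewCallback pipeline has to do before OpenCV can work on the frame as RGBA, which is one of the copies the OpenCV camera path avoids (a sketch; width and height are whatever preview size you configured):

```kotlin
import org.opencv.core.CvType
import org.opencv.core.Mat
import org.opencv.imgproc.Imgproc

// Sketch: what a PreviewCallback-based pipeline has to do with each NV21
// preview frame before OpenCV can process it as RGBA.
fun nv21ToRgba(nv21: ByteArray, width: Int, height: Int): Mat {
    // NV21 stores the Y plane followed by interleaved VU, hence height * 3 / 2 rows.
    val yuv = Mat(height + height / 2, width, CvType.CV_8UC1)
    yuv.put(0, 0, nv21)

    val rgba = Mat()
    Imgproc.cvtColor(yuv, rgba, Imgproc.COLOR_YUV2RGBA_NV21)
    yuv.release()
    return rgba
}
```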