I'm a beginner in Android.
I'm working on a project where I'm supposed to convert smartphone movement into mouse movement using the smartphone's camera on Android. The smartphone moves on a checkerboard surface and the movement information is sent to a computer over Bluetooth. Should I use image processing techniques to do that? Does anyone have relevant experience or similar code to help me out?
If I understand correctly, image processing would be a good way to detect movement on a 2D plane. The checkerboard pattern should make for relatively easy pixel comparison between consecutive frames.
You could implement this with simple object detection, but for your approach you will need an optical flow analysis algorithm.
Optical mice internally use a similar technique called digital image correlation: they capture video frames continuously and compare consecutive frames to detect motion.
You should read about optical flow detection techniques on Wikipedia, and have a look at this slide deck.
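To make the digital image correlation / optical flow idea more concrete, here is a minimal, untested sketch using the OpenCV Java bindings for Android (assuming OpenCV is already loaded in your app). It tracks a handful of corners between consecutive grayscale frames with Lucas-Kanade optical flow and averages their displacement, giving a per-frame motion vector you could scale and send over Bluetooth as the mouse delta:

    import org.opencv.core.*;
    import org.opencv.imgproc.Imgproc;
    import org.opencv.video.Video;

    public class FrameMotionEstimator {
        private Mat prevGray;

        /** Average pixel displacement between the previous frame and the current one. */
        public Point estimateMotion(Mat currentGray) {
            Point motion = new Point(0, 0);
            if (prevGray != null) {
                // Pick strong corners in the previous frame (a checkerboard gives plenty).
                MatOfPoint corners = new MatOfPoint();
                Imgproc.goodFeaturesToTrack(prevGray, corners, 50, 0.01, 10);

                if (!corners.empty()) {
                    MatOfPoint2f prevPts = new MatOfPoint2f(corners.toArray());
                    MatOfPoint2f nextPts = new MatOfPoint2f();
                    MatOfByte status = new MatOfByte();
                    MatOfFloat err = new MatOfFloat();

                    // Track the corners into the current frame (Lucas-Kanade optical flow).
                    Video.calcOpticalFlowPyrLK(prevGray, currentGray, prevPts, nextPts, status, err);

                    Point[] p0 = prevPts.toArray();
                    Point[] p1 = nextPts.toArray();
                    byte[] ok = status.toArray();
                    int tracked = 0;
                    for (int i = 0; i < ok.length; i++) {
                        if (ok[i] == 1) {
                            motion.x += p1[i].x - p0[i].x;
                            motion.y += p1[i].y - p0[i].y;
                            tracked++;
                        }
                    }
                    if (tracked > 0) {
                        motion.x /= tracked;
                        motion.y /= tracked;
                    }
                }
            }
            prevGray = currentGray.clone();
            return motion; // Scale this delta and send it over Bluetooth as the mouse movement.
        }
    }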
I'm working on a robot that is controlled via a VR headset and sends a real-time video feed to the headset.
I've chosen to go the native way on Android and now have everything I need to receive the video stream and encode it (using GStreamer) and also to send the control data to the robot via UDP.
The last thing to do (and the one I struggle with most, as I have no prior experience with computer graphics) is to draw the image (the encoded camera feed) to the screen. In the last few days I've been reading up on how Vulkan and OpenGL work, and I've also gone through the examples provided in the Oculus Mobile SDK (mainly VRCubeWorld_SurfaceView), but that's way too complex for what I need. I tried to simplify it so I could just draw two images, but then I thought:
Do I even need any of that? And this question might sound stupid, but I really don't have any prior experience doing this.
I mean, the example uses OpenGL to basically compute all the layers of the 3D scene, apply colors, and then fuse them together into a final frame that is passed to the VR API via the function:
vrapi_SubmitFrame2(appState.Ovr, &frameDesc);
Can I just take those images, and somehow force them into the frameDesc structure to skip the whole OpenGL pipeline? If so, can anyone knowledgeable enough point me to a working solution?
I don't need any kind of panning over the images, just to render them. Later I'll be using head sensor data, but it won't actually do anything with the "scene".
I want to create a 3D (360-degree) view of an object captured with the camera, like the Fyuse or Phogy apps do. I researched this but did not find anything useful to start with.
I have some questions like:
What tool should I use for this, e.g. Unity, or is Android Studio enough?
Should I use an SDK (like Rajawali for 3D rendering) together with some other tool, or can this be implemented without any third-party SDK?
Can this be implemented by capturing a video of the object, extracting its frames, and then combining them to show a 360-degree view?
Can anyone please guide me on this? Any help is appreciated.
In fact, those apps are not really 3D.
You can get similar results by recording a video together with data from the motion/pose sensors, so that you can assign a phone pose to every frame.
Then you can control the playback according to the actual phone rotation.
This project might help you: https://github.com/e-lab/VideoSensors
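As a rough, untested illustration of that idea (the class name is made up, and it assumes you have already extracted the recorded video into an ordered list of bitmaps), the sketch below maps the phone's current yaw from the rotation-vector sensor to a frame index; a real implementation would match against the pose you actually recorded with each frame rather than the raw azimuth:

    import android.graphics.Bitmap;
    import android.hardware.Sensor;
    import android.hardware.SensorEvent;
    import android.hardware.SensorEventListener;
    import android.hardware.SensorManager;
    import android.widget.ImageView;
    import java.util.List;

    /** Scrubs through pre-extracted video frames as the phone rotates. */
    public class PoseDrivenPlayback implements SensorEventListener {
        private final List<Bitmap> frames;   // frames extracted from the recorded video, in order
        private final ImageView view;        // where the current frame is displayed
        private final float[] rotation = new float[9];
        private final float[] orientation = new float[3];

        public PoseDrivenPlayback(List<Bitmap> frames, ImageView view, SensorManager sm) {
            this.frames = frames;
            this.view = view;
            Sensor rv = sm.getDefaultSensor(Sensor.TYPE_ROTATION_VECTOR);
            sm.registerListener(this, rv, SensorManager.SENSOR_DELAY_UI);
        }

        @Override
        public void onSensorChanged(SensorEvent event) {
            if (frames.isEmpty()) return;
            SensorManager.getRotationMatrixFromVector(rotation, event.values);
            SensorManager.getOrientation(rotation, orientation);
            // Azimuth is in [-PI, PI]; normalize to [0, 1] and map it to a frame index.
            float normalized = (float) ((orientation[0] + Math.PI) / (2 * Math.PI));
            int index = Math.min(frames.size() - 1, (int) (normalized * frames.size()));
            view.setImageBitmap(frames.get(index));
        }

        @Override
        public void onAccuracyChanged(Sensor sensor, int accuracy) { }
    }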
I'm building an Android app that has to identify, in real time, a mark/pattern that will be on the four corners of a visiting card. I'm using the preview stream of the phone's rear camera as input.
I want to overlay a small circle on the screen where the mark is present. This is similar to how a QR reader shows reference dots on screen at the corner points of the QR code in the preview.
I'm aware of how to get the frames from the camera using the native Android SDK, but I have no clue about the processing that needs to be done, or how to optimize it for real-time detection. I tried messing around with OpenCV and there seems to be a bit of lag in its preview frames.
So I'm trying to write a native algorithm using raw pixel values from the frame. Is this advisable? The mark/pattern will always be the same in my case. Please guide me on which algorithm to use to find the pattern.
The image below shows my pattern along with some details (ratios) about it (the same finder pattern used in QR codes, but I'm placing it at 4 corners instead of 3).
I think one approach is to look for black and white pixels in the ratio mentioned below to detect the mark and find the coordinates of its center, but I have no idea how to code that on Android. I'm looking for an optimized approach for real-time recognition and display.
Any help is much appreciated! Thanks
Detecting patterns on four corners of a visiting card:
Assuming the background is white, you can simply try this method.
Processing that needs to be done, and optimization for real-time detection:
Yes, you need OpenCV.
Here is an example of real-time marker detection on Google Glass using OpenCV.
In that example, the image shown on the tablet lags (Bluetooth); the Google Glass preview is much faster than the tablet's, but there is still some lag.
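Just to illustrate the ratio idea from the question (a rough, untested sketch with the OpenCV Java bindings, not a production detector): threshold the grayscale frame, then scan rows for black/white run lengths that match the 1:1:3:1:1 finder-pattern ratio. A real detector would also cross-check the column direction, the way ZXing does:

    import org.opencv.core.Mat;
    import org.opencv.core.Point;
    import org.opencv.imgproc.Imgproc;
    import java.util.ArrayList;
    import java.util.List;

    public class FinderPatternScanner {

        /** Candidate mark centers found by a 1:1:3:1:1 run-length check on each row. */
        public List<Point> findCandidates(Mat gray) {
            Mat bin = new Mat();
            // Otsu picks the black/white threshold automatically.
            Imgproc.threshold(gray, bin, 0, 255, Imgproc.THRESH_BINARY | Imgproc.THRESH_OTSU);

            List<Point> candidates = new ArrayList<>();
            byte[] row = new byte[bin.cols()];

            for (int y = 0; y < bin.rows(); y += 4) {            // skip rows for speed
                bin.get(y, 0, row);
                int[] runs = new int[5];                          // last five run lengths
                int runCount = 0;
                int runStart = 0;
                boolean black = (row[0] & 0xFF) < 128;
                for (int x = 1; x <= row.length; x++) {
                    boolean isBlack = (x < row.length) && ((row[x] & 0xFF) < 128);
                    if (x == row.length || isBlack != black) {
                        // The run [runStart, x) just ended; slide it into the 5-run window.
                        System.arraycopy(runs, 1, runs, 0, 4);
                        runs[4] = x - runStart;
                        runCount++;
                        // The pattern ends on a black run (black:white:black:white:black).
                        if (runCount >= 5 && black && matchesRatio(runs)) {
                            int len = runs[0] + runs[1] + runs[2] + runs[3] + runs[4];
                            candidates.add(new Point(x - len / 2.0, y));
                        }
                        runStart = x;
                        black = isBlack;
                    }
                }
            }
            return candidates;
        }

        /** True if the five run lengths are roughly 1:1:3:1:1. */
        private boolean matchesRatio(int[] r) {
            float unit = (r[0] + r[1] + r[2] + r[3] + r[4]) / 7.0f;
            float tol = unit / 2.0f;
            return Math.abs(r[0] - unit) < tol && Math.abs(r[1] - unit) < tol
                    && Math.abs(r[2] - 3 * unit) < 3 * tol
                    && Math.abs(r[3] - unit) < tol && Math.abs(r[4] - unit) < tol;
        }
    }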
The requirement is to create an Android application running on one specific mobile device that records video of a human eye pupil dilating in response to a bright light (which is physically attached to the mobile device). The video is then post-processed frame by frame on the device to detect and measure the diameter of the pupil AND the iris in each frame. Note that the image processing does NOT need to be done in real time. The end result will be a dataset describing the changes in pupil (and iris) size over time. It's expected that the iris size can be used to increase confidence in the pupil diameter data (e.g. removing pupil size data that's wildly wrong), but also as a relative measure of how dilated the eye is at any point.
I am familiar with developing Android mobile apps, but my experience with image processing is very limited. I've researched solutions and it seems that the answer may lie with the OpenCV/JavaCV libraries, which should provide shape detection (e.g. http://opencvlover.blogspot.co.uk/2012/07/hough-circle-in-javacv.html), but can anyone provide guidance on these specific questions:
Am I right to think it can detect the two circle shapes within a bitmap, one inside the other? I.e. shapes inside each other are not a problem.
Is it true that JavaCV can detect a circle and return a position and radius/diameter, i.e. it doesn't return a set of vertices that then require further processing to compare against a circle? It seems to have a HoughCircles method, so I think yes.
What processing of each frame is typically done before shape detection? For example, an algorithm to enhance edges, smooth noise, or remove colour?
Can I use it not just to detect the presence of circles, but to measure the diameter of the detected circles (in pixels, which can easily be converted to real-world measurements because known hardware is being used)? I think yes, but it would be great to hear confirmation from those more familiar with it.
This project is a non-commercial charitable project, so any help especially appreciated.
I would really suggest using the NDK, as it is a bit richer in features. It also allows you to run and test your algorithms on a laptop with still images before pushing them to a device, which speeds up development.
Pre-processing steps:
Typically one would use thresholding or Canny edge detection, plus morphological operations like erode and dilate.
For detecting the iris/pupil, HoughCircles is not a very good method; feature detection methods like MSER work better for not-so-well-defined circles. Here is another answer I wrote on the same topic which has code that could help.
If you are looking to measure the regions, I would suggest going through this blog. It has a clear explanation of the steps involved for a reasonably accurate measurement; a rough sketch of these steps follows below.
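A rough, untested sketch of those steps with the OpenCV Java bindings (class name, kernel sizes and the circularity threshold are all placeholders that would need tuning on real eye images): smooth and clean the grayscale frame, let MSER propose stable regions, and take the most circular region's enclosing circle as the pupil diameter in pixels:

    import org.opencv.core.*;
    import org.opencv.features2d.MSER;
    import org.opencv.imgproc.Imgproc;
    import java.util.ArrayList;
    import java.util.List;

    public class PupilMeasurer {

        /** Diameter (in pixels) of the most circular stable region, or -1 if none is found. */
        public double measurePupilDiameter(Mat grayEyeFrame) {
            // Typical pre-processing: smooth, then clean up small artefacts with morphology.
            Mat blurred = new Mat();
            Imgproc.GaussianBlur(grayEyeFrame, blurred, new Size(5, 5), 0);
            Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(5, 5));
            Mat cleaned = new Mat();
            Imgproc.morphologyEx(blurred, cleaned, Imgproc.MORPH_OPEN, kernel);

            // MSER finds stable blobs (like the dark pupil) even when the edge is not a crisp circle.
            MSER mser = MSER.create();
            List<MatOfPoint> regions = new ArrayList<>();
            MatOfRect boxes = new MatOfRect();
            mser.detectRegions(cleaned, regions, boxes);

            double bestDiameter = -1;
            double bestCircularity = 0;
            for (MatOfPoint region : regions) {
                MatOfPoint2f pts = new MatOfPoint2f(region.toArray());
                Point center = new Point();
                float[] radius = new float[1];
                Imgproc.minEnclosingCircle(pts, center, radius);

                // Circularity: pixels in the region vs. area of its enclosing circle.
                double circleArea = Math.PI * radius[0] * radius[0];
                double circularity = region.rows() / Math.max(circleArea, 1.0);
                if (circularity > 0.6 && circularity > bestCircularity) {
                    bestCircularity = circularity;
                    bestDiameter = 2.0 * radius[0];
                }
            }
            return bestDiameter;
        }
    }

The iris could be measured along the same lines by looking for a second, larger near-concentric region around the chosen pupil candidate.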
It seems I've found myself in the deep weeds of the Google Vision API for barcode scanning. Perhaps my mind is a bit fried after looking at all sorts of alternative libraries (ZBar, ZXing, and even some for-cost third party implementations), but I'm having some difficulty finding any information on where I can implement some sort of scan region limiting.
The use case is a pretty simple one: if I'm a user pointing my phone at a box with multiple barcodes of the same type (think shipping labels here), I want to explicitly point some little viewfinder or alignment straight-edge on the screen at exactly the thing I'm trying to capture, without having to worry about anything outside that area of interest giving me some scan results I don't want.
The above case is handled in most other Android libraries I've seen, taking in either a Rect with relative or absolute coordinates, and this is also a part of iOS' AVCapture metadata results system (it uses a relative CGRect, but really the same concept).
I've dug pretty deep into the sample app for the barcode-reader here, but the implementation is a tad opaque; it's hard to get anything beyond the high-level details out of it.
It seems like an ugly patch to simply no-op on barcodes detected outside the area of interest after a successful detection anywhere within the camera's preview frame, since the device is still working hard to process those full frames.
Am I missing something very simple and obvious on this one? Any ideas on a way to implement this cleanly, otherwise?
Many thanks for your time in reading through this!
The API currently does not have an option to limit the detection area. But you could crop the preview image before it gets passed into the barcode detector. See here for an outline of how to wrap a detector with your own class:
Mobile Vision API - concatenate new detector object to continue frame processing
You'd implement the "detect" method to take the frame received from the camera, create a cropped version of the frame, and pass that through to the underlying detector.
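Something along these lines should work (an untested sketch of that outline; the class name is made up, and it assumes the frame's grayscale buffer holds the full NV21 data coming from a CameraSource): wrap the stock detector, crop the centre region out of each frame, and delegate detection to the wrapped detector.

    import android.graphics.Bitmap;
    import android.graphics.BitmapFactory;
    import android.graphics.ImageFormat;
    import android.graphics.Rect;
    import android.graphics.YuvImage;
    import android.util.SparseArray;
    import com.google.android.gms.vision.Detector;
    import com.google.android.gms.vision.Frame;
    import com.google.android.gms.vision.barcode.Barcode;
    import java.io.ByteArrayOutputStream;

    /** Delegating detector that only looks at a centred box of the preview frame. */
    public class BoxDetector extends Detector<Barcode> {
        private final Detector<Barcode> delegate;
        private final int boxWidth, boxHeight;   // size of the region of interest, in pixels

        public BoxDetector(Detector<Barcode> delegate, int boxWidth, int boxHeight) {
            this.delegate = delegate;
            this.boxWidth = boxWidth;
            this.boxHeight = boxHeight;
        }

        @Override
        public SparseArray<Barcode> detect(Frame frame) {
            int width = frame.getMetadata().getWidth();
            int height = frame.getMetadata().getHeight();
            Rect box = new Rect((width - boxWidth) / 2, (height - boxHeight) / 2,
                    (width + boxWidth) / 2, (height + boxHeight) / 2);

            // Crop the NV21 buffer by compressing just the box to JPEG and decoding it back.
            YuvImage yuv = new YuvImage(frame.getGrayscaleImageData().array(),
                    ImageFormat.NV21, width, height, null);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            yuv.compressToJpeg(box, 100, out);
            byte[] jpeg = out.toByteArray();
            Bitmap cropped = BitmapFactory.decodeByteArray(jpeg, 0, jpeg.length);

            Frame croppedFrame = new Frame.Builder()
                    .setBitmap(cropped)
                    .setRotation(frame.getMetadata().getRotation())
                    .build();
            return delegate.detect(croppedFrame);
        }

        @Override
        public boolean isOperational() {
            return delegate.isOperational();
        }

        @Override
        public boolean setFocus(int id) {
            return delegate.setFocus(id);
        }
    }

You'd then construct it with something like new BoxDetector(new BarcodeDetector.Builder(context).build(), boxWidth, boxHeight) and hand that wrapper to your CameraSource.Builder instead of the plain BarcodeDetector; barcodes that never enter the box are never seen by the underlying detector.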