Moving an image in 3 dimensions in an Android application

I want to move an image in a 3-dimensional way in my Android application according to my device's movement. I am getting the x, y, z coordinate values through SensorEvent, but I am unable to find APIs to move the image in 3 dimensions. Could anyone please suggest a way (any APIs) to achieve this?

Depending on the particulars of your application, you could consider using OpenGL ES for manipulations in three dimensions. A quite common approach then would be to render the image onto a 'quad' (basically a flat surface consisting of two triangles) and manipulate that using matrices you construct based on the accelerometer data.
An alternative might be to look into extending the standard ImageView, which out of the box supports manipulation by 3x3 matrices. For rotation this will be sufficient, but you will obviously need an extra dimension for translation - which is probably what you're after, given your remark about 'moving' the image.
If you decide to go with the first suggestion, this example code should be quite useful to start with. You'll probably be able to plug your sensor data straight into that and simply add the required math for the matrix manipulations.
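For a rough idea of what the sensor side of that could look like, here is a minimal sketch (the class name and scale factors are my own assumptions for illustration; the actual quad rendering in the GLSurfaceView.Renderer is omitted):

import android.hardware.Sensor;
import android.hardware.SensorEvent;
import android.hardware.SensorEventListener;
import android.opengl.Matrix;

// Minimal sketch: turn accelerometer readings into a model matrix for
// a textured quad. The scale factors are arbitrary tuning values.
public class TiltListener implements SensorEventListener {
    private final float[] modelMatrix = new float[16];

    @Override
    public void onSensorChanged(SensorEvent event) {
        if (event.sensor.getType() != Sensor.TYPE_ACCELEROMETER) return;
        float x = event.values[0];
        float y = event.values[1];
        float z = event.values[2];

        Matrix.setIdentityM(modelMatrix, 0);
        Matrix.translateM(modelMatrix, 0, x * 0.1f, y * 0.1f, z * 0.1f);
        Matrix.rotateM(modelMatrix, 0, x * 5f, 0f, 1f, 0f); // around y
        Matrix.rotateM(modelMatrix, 0, y * 5f, 1f, 0f, 0f); // around x
        // Hand modelMatrix to the shader as the model matrix when
        // drawing the quad in the GLSurfaceView.Renderer.
    }

    @Override
    public void onAccuracyChanged(Sensor sensor, int accuracy) { }
}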

Related

OpenCV: Finding the pixel width from squares of known size

In OpenCV I use the camera to capture a scene containing two squares a and b, both at the same distance from the camera, whose known real sizes are, say, 10cm and 30cm respectively. I find the pixel widths of each square, which let's say are 25 and 40 pixels (to get the 'pixel-width' OpenCV detects the squares as cv::Rect objects and I read their width field).
Now I remove square a from the scene and change the distance from the camera to square b. The program gets the width of square b now, which let's say is 80. Is there an equation, using the configuration of the camera (resolution, DPI?), that I can use to work out what the corresponding pixel width of square a would be if it were placed back in the scene at the same distance as square b?
The math you need for your problem can be found in chapter 9 of "Multiple View Geometry in Computer Vision", which happens to be freely available online: https://www.robots.ox.ac.uk/~vgg/hzbook/hzbook2/HZepipolar.pdf.
The short answer to your problem is: no, not in this exact form. Since you are working in a 3D world, you have one degree of freedom left. As a result, you need more information to eliminate this degree of freedom (e.g. the depth and/or the relation of the two squares to each other, the movement of the camera, ...). This mainly depends on your specific situation. In any case, reading and understanding chapter 9 of the book should help you out here.
PS: to me it seems like your problem fits into the broader category of "baseline matching" problems. Reading around about this, in addition to epipolar geometry and the fundamental matrix, might help you out.
Since you write of "squares" with just a "width" in the image (as opposed to "trapezoids" with some wonky vertex coordinates) I assume that you are considering an ideal pinhole camera and ignoring any perspective distortion/foreshortening - i.e. there is no lens distortion and your planar objects are exactly parallel to the image/sensor plane.
Then it is a very simple 2D projective geometry problem, and no separate knowledge of the camera geometry is needed. Just write down the projection equations for the first situation: you have 4 unknowns (the camera focal length, the common depth of the squares, and the horizontal positions of, say, their left sides) and 4 equations (the projections of the left and right sides of each square). Solve the system and keep the focal length and the relative distance between the squares. Do the same in the second image, but now with known focal length, and compute the new depth and horizontal location of square b. Then add the previously computed relative distance to find where square a would be.
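To make the key relation explicit: under the ideal pinhole model an object of real width W at depth Z projects to w = f * W / Z pixels, so for two objects at a common depth the focal length and depth cancel out and the pixel widths are simply proportional to the real widths. A minimal sketch of that last step (plain Java; the names are mine):

public class PinholeWidth {
    // Ideal pinhole model: pixelWidth = f * realWidth / depth. At a
    // common depth, pixel widths are proportional to real widths.
    static double predictedPixelWidth(double pixelWidthB,
                                      double realWidthA,
                                      double realWidthB) {
        return pixelWidthB * (realWidthA / realWidthB);
    }

    public static void main(String[] args) {
        // Square b (30 cm) measures 80 px at the new distance, so
        // square a (10 cm) there would measure about 26.7 px.
        System.out.println(predictedPixelWidth(80, 10, 30));
    }
}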
In order to understand the transformations the camera performs to project the 3D world onto the 2D image, you need to know its calibration parameters. These are basically divided into two sets:
Intrinsic parameters: these are fixed parameters specific to each camera. They are normally represented by a matrix called K.
Extrinsic parameters: these depend on the camera's position in the 3D world. They are normally represented by two matrices, R and T, where the first represents the rotation and the second the translation.
In order to calibrate a camera you need some pattern (basically a set of 3D points whose coordinates are known). There are several examples of this in the OpenCV library, which provides support for performing camera calibration:
http://docs.opencv.org/doc/tutorials/calib3d/camera_calibration/camera_calibration.html
Once you have your camera calibrated you can transform from 3D to 2D easily by the following equation:
Pimage = K · R · T · P3D (with the points expressed in homogeneous coordinates)
So it will not only depend on the position of the camera; it depends on all the calibration parameters. The following presentation goes through the details of camera calibration and the different steps and equations used in the 3D <-> image transformations.
https://www.cs.umd.edu/class/fall2013/cmsc426/lectures/camera-calibration.pdf
With this in mind you can project any 3D point into the image and get its coordinates there. The reverse transformation is not unique: going back from 2D to 3D gives you a line rather than a unique point.
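For reference, the OpenCV Java bindings expose this projection directly. A minimal sketch, assuming rvec, tvec, cameraMatrix and distCoeffs come from a prior Calib3d.calibrateCamera run:

import org.opencv.calib3d.Calib3d;
import org.opencv.core.Mat;
import org.opencv.core.MatOfDouble;
import org.opencv.core.MatOfPoint2f;
import org.opencv.core.MatOfPoint3f;
import org.opencv.core.Point;
import org.opencv.core.Point3;

public class ProjectPointDemo {
    // Projects one known 3D point into the image using the camera's
    // calibration parameters (intrinsics K, extrinsics R and T).
    static Point project(Mat rvec, Mat tvec, Mat cameraMatrix,
                         MatOfDouble distCoeffs, Point3 p3d) {
        MatOfPoint3f objectPoints = new MatOfPoint3f(p3d);
        MatOfPoint2f imagePoints = new MatOfPoint2f();
        Calib3d.projectPoints(objectPoints, rvec, tvec, cameraMatrix,
                distCoeffs, imagePoints);
        return imagePoints.toArray()[0]; // 2D pixel coordinate
    }
}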

Displaying a route on a floor plan image

I have a floor plan on which the walls are black, the doors are orange and the target is red. What I want is to make an app where, given a specific point on the image, the route to the target is calculated and displayed. I already have a routing method, but it is in Matlab, each position and object is defined in the code, and it doesn't use an image. What I would like to know is how to scan the image to identify the walls, the doors and the target by color, in order to apply the routing method and then display the route over the image of the map (I guess I should use a drawable for that).
These are some steps to implement a pathfinding algorithm starting from an image:
Load your image.
Apply an HSV color-detection algorithm to obtain the objects separately (in real life it is easiest to cope with lighting changes in this color space).
Build a binary matrix with 1 for your floor and 0 for the obstacles.
Apply an occupancy-grid algorithm to that binary matrix (this reduces the size of your matrix, which keeps the processing cost of the pathfinding step manageable).
Run your pathfinding algorithm; a sketch follows below. I recommend Dijkstra's algorithm or A*; in both cases you need to construct an adjacency matrix.
Some graph theory will help you understand this better. Good luck!
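To make the pathfinding step concrete, here is a minimal A* sketch over such a binary grid (plain Java; the grid representation and all names are illustrative assumptions, not a fixed API):

import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedList;
import java.util.List;
import java.util.PriorityQueue;

// Minimal A* sketch over a binary occupancy grid (true = free floor,
// false = obstacle), 4-connected, with a Manhattan heuristic.
public class GridAStar {

    static List<int[]> findPath(boolean[][] free, int[] start, int[] goal) {
        int rows = free.length, cols = free[0].length;
        int[][] gCost = new int[rows][cols];
        for (int[] row : gCost) Arrays.fill(row, Integer.MAX_VALUE);
        int[][][] parent = new int[rows][cols][];

        // Queue entries are {row, col, g}; ordered by g + heuristic.
        PriorityQueue<int[]> open = new PriorityQueue<>(
                Comparator.comparingInt((int[] e) -> e[2]
                        + Math.abs(e[0] - goal[0])
                        + Math.abs(e[1] - goal[1])));
        gCost[start[0]][start[1]] = 0;
        open.add(new int[]{start[0], start[1], 0});
        int[][] moves = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};

        while (!open.isEmpty()) {
            int[] cur = open.poll();
            if (cur[2] > gCost[cur[0]][cur[1]]) continue; // stale entry
            if (cur[0] == goal[0] && cur[1] == goal[1]) {
                // Walk the parent chain back to recover the route.
                LinkedList<int[]> path = new LinkedList<>();
                for (int[] c = new int[]{cur[0], cur[1]}; c != null;
                        c = parent[c[0]][c[1]]) {
                    path.addFirst(c);
                }
                return path;
            }
            for (int[] m : moves) {
                int r = cur[0] + m[0], c = cur[1] + m[1];
                if (r < 0 || c < 0 || r >= rows || c >= cols) continue;
                if (!free[r][c]) continue;
                int g = cur[2] + 1;
                if (g < gCost[r][c]) {
                    gCost[r][c] = g;
                    parent[r][c] = new int[]{cur[0], cur[1]};
                    open.add(new int[]{r, c, g});
                }
            }
        }
        return Collections.emptyList(); // no route to the target
    }
}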
You can work in the Processing IDE for rapid prototyping and later migrate the Processing core to Eclipse: you need to include the PApplet class in your Eclipse project, and you can then compile your app for Android.
I would use some kind of occupancy grid/map where each grid cell = one pixel (or possibly a small collection of pixels, like 2x2, 3x3, etc.) and just do k-means clustering on the image. There are a few choices for k:
k=2
walls are one group (the black lines)
everything else is considered open space (this assumes doors can be opened).
You will need to know where the red point is located, but it doesn't need to be visible in your map. It is just another open space that your program internally knows is the endpoint.
k=4
a group for each color: black = walls (occupied), orange = doors (which may or may not count as occupied cells, depending on whether they can be opened), red = target (unoccupied), white = open space (unoccupied).
In both cases you can generate labels for your clusters and use those in your map. I'm not sure what exactly your pathfinding algorithm is, but typically the goal is to minimize some cost function, and as such you assign an extremely high cost to walls (so they will never be crossed) and possibly a medium cost to doors (in case they can't be opened). Just some ideas, good luck.
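If you go the OpenCV route on Android, a minimal k-means sketch along these lines might look as follows (the file name and k value are assumptions; OpenCV's Java bindings are assumed to be set up):

import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.TermCriteria;
import org.opencv.imgcodecs.Imgcodecs;

public class FloorPlanClusters {
    // Returns one cluster label per pixel, arranged as an
    // img.rows() x img.cols() grid of k cluster indices.
    static Mat clusterColors(String file, int k) {
        Mat img = Imgcodecs.imread(file);

        // One row per pixel, one column per color channel, as floats,
        // which is the layout Core.kmeans expects.
        Mat samples = img.reshape(1, img.rows() * img.cols());
        samples.convertTo(samples, CvType.CV_32F);

        Mat labels = new Mat();
        Mat centers = new Mat();
        TermCriteria criteria = new TermCriteria(
                TermCriteria.EPS + TermCriteria.MAX_ITER, 20, 1.0);
        Core.kmeans(samples, k, labels, criteria, 3,
                Core.KMEANS_PP_CENTERS, centers);

        // Map the labels back to the image shape; translate each
        // cluster into an occupancy cost afterwards.
        return labels.reshape(1, img.rows());
    }
}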

Android: How to detect these objects in images? (Image included). Tried OpenCV and metaio SDK, but neither is working well enough

I have recently been working on object detection/recognition in images captured from an Android device camera.
The objects I am trying to detect are all kinds of buttons that look like this:
Picture of buttons
So far I have tried OpenCV and the metaio SDK. Results:
OpenCV always detected something, but gave lots of false hits, and it is too much work to collect all the pictures needed for what I have in mind. I tried three approaches with OpenCV:
Feature detection (SURF, ORB and so on) -> way too slow, and not enough features on my objects.
Template matching -> seems to only work when the template is exactly a part of the scene image.
Training classifiers -> this worked best so far, but it is too much work for my goal, and it still gives too many false detections.
The metaio SDK worked OK when I took my reference images (the icon part of each button) out of a picture like the one shown above, printed the full image, and pointed my Android device camera at the printout. But when I tried it with the real buttons (not a picture of them), almost nothing was detected any more. The metaio documentation says that reference images need to have lots of features and color differences, and should not consist only of white text. Well, as you can see, my reference images are exactly the opposite of what they should be. But that's just how the buttons look ;)
So my question is: does anyone have a suggestion about what else I could try in order to detect and recognize each of those buttons when I point my Android camera at them?
As a suggestion, you could try the following approach:
Class-Specific Hough Forest for Object Detection
They provide a C code implementation. Compile and run it, look at the results, and then replace the positive and negative training images with your own, according to the following rules:
You will need to define the following three areas:
target region (the image you provided is a good representation of a target region)
nearby working area (this area carries information about the target's relative location); an area 3-5 times the size of the target region, around the target, can be a good working area
everything outside the above can be used as negative images
then,
Use "many" positive images (100-1000) at different viewing angles (-30 to +30 degrees) and various distances.
You will have to make assumptions about the viewing angles and distances at which your users will use the application. The stricter they are, the better performance you will get. A simple "hint" camera overlay can give people a good idea of what you expect the working area to be.
Use a negative image set a few times (3-5) larger, which includes pictures of things that might appear in the camera view but should not contribute any target position information.
Do not use big images; somewhere around 100-300 px in width should be enough.
Assemble the database, and modify the configuration file that the code comes with. Run the program, see if performance is OK for your needs.
The program will return a voting map (a cloud of votes) for the object you are looking for. Apply a Gaussian blur to it, then apply a threshold (you will have to make another assumption for this threshold value).
The extracted mask will define the area you are looking for. The size of the masked region gives a good estimate of the object's scale. Given this information, it is much easier to select the proper template and perform template matching.
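A minimal OpenCV (Java) sketch of that blur-and-threshold step; the kernel size and threshold value are assumptions you would have to tune:

import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.imgproc.Imgproc;

public class VotingMapMask {
    // Blur the detector's voting map, then threshold it into a mask.
    static Mat extractMask(Mat votingMap) {
        Mat blurred = new Mat();
        Imgproc.GaussianBlur(votingMap, blurred, new Size(9, 9), 0);

        Mat mask = new Mat();
        double thresh = 0.5; // assumption: tune this on your own data
        Imgproc.threshold(blurred, mask, thresh, 1.0,
                Imgproc.THRESH_BINARY);
        return mask; // non-zero region marks the candidate object
    }
}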
(Some further thoughts) You could also try a small trick: use the goodFeaturesToTrack function with the mask you obtained to get a set of locations, and compare them with the corresponding locations on a template. Construct an SSD (sum of squared differences) and solve it for rotation, scale and translation parameters by minimizing the alignment error (though I am not sure this approach will work).

Collision detection for rotated bitmaps on Android

I need pixel-perfect collision detection for my Android game. I've written some code to detect collision with "normal" bitmaps (not rotated); it works fine. However, I can't get it working for rotated bitmaps. Unfortunately, Java doesn't have a class for rotated rectangles, so I implemented one myself. It holds the positions of the four corners in relation to the screen and describes the exact location/layer of its bitmap; I call it "itemSurface". My plan for solving the detection was to:
Detect intersections of the different itemSurfaces
Calculate the overlapping areas
Relate these areas back to their parent itemSurface/bitmap
Compare each pixel with the corresponding pixel of the other bitmap
Well, I'm having trouble with the first and second steps. Does anybody have an idea, or some code? Maybe there is already something in the Java/Android libraries and I just didn't find it.
I understand that you want collision detection between rectangles (rotated in different ways). You don't need to calculate the overlapping area, and comparing every pixel would be inefficient.
Implement a static boolean isCollision function which tells you whether there is a collision between one rectangle and another. Before that, take a piece of paper and do some geometry to work out the exact formulas (a separating-axis sketch is given below, after the pseudo code). For performance reasons, do not wrap a rectangle in some Rectangle class; just use primitive types like doubles etc.
Then (pseudo code):
for (every rectangle a)
    for (every rectangle b)
        if (a != b && isCollision(a, b))
            bounce(a, b)
This is O(n^2), where n is the number of rectangles; there are better algorithms if you need more performance. The bounce function changes the velocity vectors of the moving rectangles to imitate a collision. If the weights of the objects are the same (you can approximate weight by rectangle size), you just need to swap the two speed vectors.
To bounce elements correctly you may need to store an auxiliary table, boolean alreadyBounced[][], to mark which rectangles have already been bounced in this collision and so do not need their vectors changed again.
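As for the paper-and-pencil geometry mentioned above: the usual result here is the separating axis theorem. A minimal sketch for rectangles given as four corner points, using primitive arrays as suggested (the corner-array layout is my own assumption):

public class RotatedRectCollision {
    // Two rotated rectangles, each given by 4 corners in order:
    // (ax[i], ay[i]) and (bx[i], by[i]). They collide if no
    // separating axis exists for either rectangle's edges.
    static boolean isCollision(double[] ax, double[] ay,
                               double[] bx, double[] by) {
        return !hasSeparatingAxis(ax, ay, bx, by)
            && !hasSeparatingAxis(bx, by, ax, ay);
    }

    // Tests the axes perpendicular to the edges of the first rectangle.
    static boolean hasSeparatingAxis(double[] ax, double[] ay,
                                     double[] bx, double[] by) {
        for (int i = 0; i < 4; i++) {
            int j = (i + 1) % 4;
            double nx = ay[j] - ay[i]; // normal of edge i -> j
            double ny = ax[i] - ax[j];
            double[] a = project(ax, ay, nx, ny);
            double[] b = project(bx, by, nx, ny);
            if (a[1] < b[0] || b[1] < a[0]) return true; // gap found
        }
        return false;
    }

    // Projects 4 corners onto the axis (nx, ny); returns {min, max}.
    static double[] project(double[] xs, double[] ys,
                            double nx, double ny) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < 4; i++) {
            double p = xs[i] * nx + ys[i] * ny;
            min = Math.min(min, p);
            max = Math.max(max, p);
        }
        return new double[]{min, max};
    }
}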
One more tip:
If you are making a game on Android, you have to watch out not to allocate memory during gameplay, because that will invoke the GC more often, which takes a long time and slows down your game. I recommend watching this video and related ones. Good luck.

Best Strategy for Storing Handwriting

I am writing a mobile app (Android) that will allow the user to 'write' on a canvas using a single-touch device with 1-pixel accuracy. The app will run on a tablet of approximately standard 8 1/2" x 11" size. My strategy is to store the 'text' as vector data: each stroke of the input device will essentially be a vector consisting of a start point, an end point, and some number of intermediate points that help define the shape of the vector (generated by the touchscreen/OS on touch movement). This should let me keep track of the order in which the strokes were put down (to support undo, etc.) and be flexible enough to allow this text to be resized, etc., like any other vector graphic.
However, doing some very rough back-of-the-envelope calculations, with a highly accurate input device and a screen large enough to emulate a standard paper notepad one-for-one, that means ~1,700 strokes per full page of text. Figuring, worst case, that each stroke could be composed of ~20-30 individual points (a point for every pixel or so of the stroke), that means ~50,000 data points per page: way too big for SQLite/Android to handle with any expectation of reliability when a page is reloaded and the vector strokes are recreated (I have to imagine that pulling 50,000+ results from the SQLite db will exceed the CursorWindow limit of 1 MB).
I know I could break up the data retrieval into multiple queries, or could modify the stroke data such that I only add an intermediate point to help define the stroke vector shape if it is more than X pixels from a start, finish or other intermediate pixel, but I am wondering if I need to rethink this strategy from the ground up...
Any suggestions on how to tackle this problem in a more efficient way?
Thanks!
Paul
Is there any reason for using vector data in the first place? Without knowing your other requirements, it seems to me that you could just store the data as a raster/bitmap and compress it with regular compression methods such as PNG/zip/djvu (or, if performance suffers, simple things like run-length encoding/RLE).
EDIT: sorry, I didn't read the question carefully. However, if you only need things like "undo" and "resize", you can take a snapshot of the bitmap for every stroke (of course, you only need to snapshot the regions that change).
Also it might be possible to take a hybrid approach where you display a snapshot bitmap first while waiting for the (real) vector images to load.
Furthermore, I am not familiar with the Android cursor limit, but SQL queries can always be rewritten to split the result into pieces (via LIMIT ... OFFSET), as in the sketch below.
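For instance, a minimal paging sketch on Android (the table and column names here are assumptions; each page stays well under the CursorWindow limit):

import android.database.Cursor;
import android.database.sqlite.SQLiteDatabase;

public class StrokeLoader {
    // Reads stroke points in fixed-size pages via LIMIT ... OFFSET.
    static void loadStrokes(SQLiteDatabase db, int pageSize) {
        int offset = 0;
        while (true) {
            Cursor c = db.rawQuery(
                    "SELECT stroke_id, x, y FROM stroke_points "
                            + "ORDER BY stroke_id, seq LIMIT ? OFFSET ?",
                    new String[]{String.valueOf(pageSize),
                                 String.valueOf(offset)});
            try {
                if (!c.moveToFirst()) break; // no more rows
                do {
                    // ... rebuild the stroke vectors from each row ...
                } while (c.moveToNext());
            } finally {
                c.close();
            }
            offset += pageSize;
        }
    }
}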
The solution I am using for now, although I would be open to any further suggestions!
Create a canvas View that can both convert SVG paths to Android Paths and intercept motion events, converting them to Android Paths while also storing them as SVG paths (a rough sketch follows this list)
Display the Android Paths on the screen in onDraw()
Write the SVG paths out to an .svg file
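Here is a rough sketch of such a View (my own minimal illustration of the idea above; stroke lists, undo and file output are omitted):

import android.content.Context;
import android.graphics.Canvas;
import android.graphics.Paint;
import android.graphics.Path;
import android.view.MotionEvent;
import android.view.View;

// Converts motion events to an android.graphics.Path for drawing and,
// in parallel, appends them to an SVG path string for storage.
public class SketchView extends View {
    private final Path path = new Path();
    private final StringBuilder svgPath = new StringBuilder();
    private final Paint paint = new Paint();

    public SketchView(Context context) {
        super(context);
        paint.setStyle(Paint.Style.STROKE);
    }

    @Override
    public boolean onTouchEvent(MotionEvent event) {
        float x = event.getX(), y = event.getY();
        switch (event.getAction()) {
            case MotionEvent.ACTION_DOWN:
                path.moveTo(x, y);
                svgPath.append("M ").append(x).append(' ').append(y);
                break;
            case MotionEvent.ACTION_MOVE:
                path.lineTo(x, y);
                svgPath.append(" L ").append(x).append(' ').append(y);
                break;
        }
        invalidate(); // trigger onDraw
        return true;
    }

    @Override
    protected void onDraw(Canvas canvas) {
        canvas.drawPath(path, paint);
        // svgPath.toString() is the "d" attribute for the .svg file.
    }
}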
Paul
