I need to take a measurement (Euclidean distance) between selected points on an image. To make this simpler, I broke the whole process down into steps:
Taking/Loading a photo (with a known pattern in it)
Recognizing the pattern to calibrate the measurement
Selecting points "from" and "to" in order to measure the distance between them
For my first iteration I will:
load a picture (half of number 1),
select the pattern manually (a rough approximation of number 2), and
select points to measure distances between them.
I'm just beginning with OpenCV; do I need it for my first iteration?
For the described steps alone, OpenCV is too massive a library. But if you plan some kind of evolution for your project (recognition, detection, complicated image processing tasks), I think OpenCV can speed up your development process.
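If you do skip OpenCV for the first iteration, the measurement itself is plain math. Here is a minimal sketch, assuming the pattern endpoints, the pattern's real length and the two measurement points are hypothetical placeholders you would replace with the user's manual selections:

    import math

    # All coordinates and the real pattern length below are hypothetical
    # placeholders; in practice they come from the user's manual selections.

    def euclidean(p, q):
        """Euclidean distance between two (x, y) points."""
        return math.hypot(q[0] - p[0], q[1] - p[1])

    # Rough step 2: two manually selected endpoints of the known pattern,
    # plus its real length, give a pixels-to-centimeters scale factor.
    pattern_a, pattern_b = (120, 340), (520, 340)      # hypothetical clicks
    pattern_length_cm = 10.0                           # known pattern size
    scale = pattern_length_cm / euclidean(pattern_a, pattern_b)

    # Step 3: measure between two manually selected points.
    p_from, p_to = (150, 100), (450, 500)              # hypothetical clicks
    print("distance: %.2f cm" % (euclidean(p_from, p_to) * scale))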
I am trying to use Dynamic Time Warping (DTW) to detect gestures performed with a smartphone, using the accelerometer sensor. I have already implemented a simple DTW algorithm.
So basically I am comparing arrays of accelerometer data (x,y,z) with DTW. One array contains my predefined gesture, the other should contain the measured values. My problem is that the accelerometer sensor continuously measures new values and I don't know when to start the comparison with my predefined value sequence.
I would need to know when the gesture starts and when it ends, but this might be different with different gestures. In my case all supported gestures start and end at the same point, but as far as I know I can't calculate the traveled distance from acceleration reliably.
So to sum things up: How would you determine the right time to compare my arrays using DTW?
Thanks in advance!
The answer is: you compare your predefined gesture to EVERY subsequence.
You can do this much faster than real time (see [a]).
You need to z-normalize EVERY subsequence, and z-normalize your predefined gesture.
So, by analogy, if your stream was...
NOW IS THE WINTER OF OUR DISCONTENT, MADE GLORIOUS SUMMER..
And your predefined word was MADE, you can compare it with every marked word beginning (denoted by whitespace):
DTW(MADE,NOW)
DTW(MADE,IS)
DTW(MADE,THE)
DTW(MADE,WINTER)
etc
In your case, you don’t have markers, you have this...
NOWISTHEWINTEROFOURDISCONTENTMADEGLORIOUSSUMMER..
So you just test every offset
DTW(MADE,NOWI)
DTW(MADE, OWIS)
DTW(MADE, WIST)
DTW(MADE, ISTH)
::
DTW(MADE, TMAD)
DTW(MADE, MADE) // Success!
eamonn
[a] https://www.youtube.com/watch?v=d_qLzMMuVQg
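For illustration, here is a rough Python/NumPy sketch of the idea above: z-normalize the template and every candidate subsequence, and compute DTW at every offset. It works on a single 1-D signal (e.g. the acceleration magnitude, or one axis at a time); the window length and the match threshold are assumptions you would have to tune.

    import numpy as np

    def znorm(x):
        """Z-normalize a sequence (zero mean, unit variance)."""
        s = x.std()
        return (x - x.mean()) / s if s > 0 else x - x.mean()

    def dtw(a, b):
        """Classic O(len(a)*len(b)) DTW with absolute point distance."""
        n, m = len(a), len(b)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = abs(a[i - 1] - b[j - 1])
                cost[i, j] = d + min(cost[i - 1, j],
                                     cost[i, j - 1],
                                     cost[i - 1, j - 1])
        return cost[n, m]

    def scan_stream(stream, template, window, threshold):
        """Compare the template against every subsequence of length `window`."""
        t = znorm(np.asarray(template, dtype=float))
        s = np.asarray(stream, dtype=float)
        hits = []
        for offset in range(len(s) - window + 1):
            if dtw(t, znorm(s[offset:offset + window])) < threshold:
                hits.append(offset)   # gesture candidate starting here
        return hits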
You want to apply DTW not only to a time series, but to a continuously evolving stream. Therefore you will have to use a sliding window of n recent data points.
This is exactly what eamonn described in his second example. His target pattern consists of 4 events (M,A,D,E) and therefore he uses a sliding window of length 4.
Yet in this case he assumes that the data stream contains no distortions, such as (M,A,A,D,E). The advantage of DTW is that it allows this kind of distortion and still recognizes the distorted target pattern as a match. In your case, distortions in time are likely to happen. I assume that you want equal gestures, performed either slowly or quickly, to be recognized as the same gesture.
Thus, the length of the sliding window must be greater than the length of the target pattern (to be able to detect a slow target gesture). This is computationally expensive.
Finally, my point is: I want to recommend this paper to you,
the SPRING algorithm by Sakurai, Faloutsos and Yamamuro.
They optimized the DTW algorithm for data streams. You no longer need n*n computations per incoming event, but only n. It is basically DTW, but it cuts out all unnecessary computations and only takes the best possible alignment of the template onto the stream into account.
p.s. most of what I know about time-series and pattern matching, I learned by reading what Eamonn Keogh provided. Thanks a lot, Mr. Keogh.
I have a floor plan on which the walls are black, the doors are orange and the target is red. What I want is to make an app where given a specific point on the image, the route to the target is calculated and displayed. I already have a routing method, but it is in matlab and each position and object is defined in the code and it doesn't use an image. What I would like to know is how to scan the image to identify the walls, the doors and the target by color in order to apply the routing method and then display the route over the image of the map (I guess I should use drawable for that).
These are some steps to implement a pathfinding algorithm from an image:
Load your image.
Apply a color detection algorithm in HSV space (in real life it is easier to handle lighting changes in this format) to obtain each object separately.
Build a binary matrix with 1 for your floor and 0 for the obstacles.
Apply an occupancy grid algorithm to that binary matrix (this reduces your matrix, which matters because the pathfinding algorithm needs the processing).
Now run your pathfinding algorithm. I recommend Dijkstra or the A* algorithm; in both cases you need to construct an adjacency matrix. (A rough sketch of steps 1-4 follows below.)
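A minimal Python/OpenCV sketch of steps 1-4, under assumptions: the file name and the HSV ranges for walls/doors/target are placeholders you would tune for your own floor plan, and the occupancy grid is just a coarse block downsampling.

    import cv2
    import numpy as np

    img = cv2.imread("floorplan.png")              # step 1: load the image
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)     # step 2: work in HSV

    # Assumed HSV ranges; tune them for your image. Note that red also
    # appears near hue 170-180, so handle the hue wrap-around if needed.
    walls  = cv2.inRange(hsv, (0, 0, 0),      (180, 255, 60))    # dark pixels
    doors  = cv2.inRange(hsv, (10, 100, 100), (25, 255, 255))    # orange-ish
    target = cv2.inRange(hsv, (0, 100, 100),  (10, 255, 255))    # red-ish

    # Step 3: binary matrix, 1 = free floor, 0 = obstacle (walls).
    free = (walls == 0).astype(np.uint8)

    # Step 4 (rough): downsample into a coarser occupancy grid, e.g. 8x8
    # pixel blocks, so the pathfinding graph stays small.
    cell = 8
    h, w = free.shape
    grid = free[:h - h % cell, :w - w % cell] \
            .reshape(h // cell, cell, w // cell, cell).min(axis=(1, 3))
    # grid[i, j] == 1 means the whole cell is free; feed this to Dijkstra/A*.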
Graph theory will help you understand this better. Good luck!
You can work in the Processing IDE for rapid prototyping and later migrate the Processing core to Eclipse; you need to implement the PApplet class in your Eclipse project, and then you can compile your app for Android.
I would use some kind of occupancy grid/map where each grid cell = one pixel (or possibly a small collection of pixels, like 2x2, 3x3, etc.) and just do k-means clustering on the image. There are a few choices for k:
k=2
walls are one group (the black lines)
everything else is considered open space (this assumes doors can be opened).
You will need to know where the red point is located, but it doesn't need to be visible in your map. It is just another open cell that your program internally knows is the endpoint.
k=4
a group for each color: black = walls (occupied), orange = doors (may or may not count as occupied cells, depending on whether or not they can be opened), red = target (unoccupied), white = open space (unoccupied).
In both cases you can generate labels for your clusters and use those in your map. I'm not sure what exactly your pathfinding algorithm is, but typically the goal is to minimize some cost function, and as such you assign an extremely high cost to walls (so they will never be crossed) and possibly a medium cost to doors (in case they can't be opened). Just some ideas, good luck.
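As a rough sketch of the k=4 variant (the file name is a placeholder): cluster all pixel colors with cv2.kmeans, then decide from the cluster centers which label means wall, door, target or open space.

    import cv2
    import numpy as np

    img = cv2.imread("floorplan.png")
    pixels = img.reshape(-1, 3).astype(np.float32)   # one BGR sample per pixel

    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
    _, labels, centers = cv2.kmeans(pixels, 4, None, criteria, 5,
                                    cv2.KMEANS_RANDOM_CENTERS)

    label_map = labels.reshape(img.shape[:2])   # one cluster label per pixel
    print("cluster centers (BGR):")
    print(centers)  # inspect these to decide which label is walls, doors, ...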
This is my first post on this forum and I'm very new to programming. I want to build an application where I can see exactly where some GPS values are on my phone. I know a lot of applications, like junaio, mixare and others, but they only show the direction to the objects and they are not very accurate (they don't aim to project it onto the exact position on screen), so I want to build it myself. I program on Android, but I think it would be the same on iPhone.
I followed the steps suggested by dabhaid:
There are three steps.
1) Determine your position and orientation using sensors.
2) Convert from GPS coordinate space to a planar coordinate space by determining the relative position and bearing of known GPS coordinates using e.g. great circle distance and bearing. (Your device stays at the origin of the coordinate space with this scheme.)
3) Do a perspective projection http://en.wikipedia.org/wiki/3D_projection#Perspective_projection to figure out where on the plane that is your display (ok, your camera sensor) the objects should appear, so you can augment them.
Step 1: easy, I have the GPS position and all orientations from my mobile device (x,y,z). For further refinement, I can use some algorithm to smooth these values (average, low-pass filter, whatever).
Step 2: I don't know what is meant exactly by planar coordinate space. I have some different approaches to convert my GPS coordinate space. One of them is ECEF (earth-centered), where 0,0,0 is the center of the earth. Somehow this doesn't look good to me, because every little change along ONE axis results in changes of the other two axes. So if I change the altitude, all three axes will change. I don't know if I can follow step 3 with this coordinate system.
Step 2 mentions using haversine - this would give me the distance to the point, but I don't get x,y,z from it. Do I have to calculate x,y using trigonometry (bearing (alpha) + distance (hypotenuse))?
Step 3: This method looks really cool! If I have my coordinate space from step 2, I can calculate d_x,d_y,d_z using the formula on Wikipedia. But after this step I'm not finished yet, because I just have the coordinates, and for projecting onto my screen I only need two coordinates? The Wikipedia text continues by calculating b_x,b_y. They use e_x,e_y,e_z, which is the viewer's position relative to the display surface -> how can I get these values from my mobile device (Android/iOS)? Another approach suggested on Wikipedia is calculating b_x,b_y using the formula mentioned there. In this formula they use s_x,s_y, which is the screen size, and r_x,r_y, which is the recording surface size. Again, how can I get the recording surface size from my mobile device?
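To make the question concrete, here is a rough sketch of what I mean by steps 2 and 3, under strong simplifications: a flat-earth (equirectangular) approximation for short distances, a level phone pointing along a known compass heading, and a focal length derived from an assumed horizontal field of view. All numeric values are placeholders.

    import math

    EARTH_R = 6371000.0  # meters

    def gps_to_local_xy(lat0, lon0, lat, lon):
        """Object position in meters, x = east, y = north, relative to the device."""
        x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_R
        y = math.radians(lat - lat0) * EARTH_R
        return x, y

    def project_to_screen(x, y, heading_deg, screen_w, screen_h, hfov_deg=60.0):
        """Very simplified pinhole projection onto the camera image plane."""
        # Rotate into camera coordinates: z forward (along heading), x right.
        h = math.radians(heading_deg)
        cam_x = x * math.cos(h) - y * math.sin(h)
        cam_z = x * math.sin(h) + y * math.cos(h)
        if cam_z <= 0:
            return None                      # object is behind the camera
        f = (screen_w / 2) / math.tan(math.radians(hfov_deg) / 2)  # focal length in px
        sx = screen_w / 2 + f * cam_x / cam_z
        sy = screen_h / 2                    # altitude ignored in this sketch
        return sx, sy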
I can't find anything for it on the internet. It seems that nobody on android/ios has ever implemented a perspective projection before...
Thank you very much for all of your answers! Also, links to useful sites would help!
I think you can find many answers in this other thread: Transform GPS-Points to Screen-Points with Perspective Projection in Android.
Hope it helped, bye!
Here's a simple solution I did on this issue.
A: Mapping GPS locations on the camera preview in Android
Hope it helped. :D
So what I want to do is write an application that, at least in the future, could be ported to mobile platforms (such as Android) that can scan an image of a protein gel and return data such as the number of bands (i.e. weights) in a column, relative concentration (thickness of the band), and the weights of each band in each column.
For those who aren't familiar, mixtures of denatured proteins (basically, molecules made completely straight) are loaded into each column, and with the use of electricity the proteins are pulled through a gel (because the proteins are polar molecules). The end columns of each side of this image http://i52.tinypic.com/205cyrl.gif are where you place a mixture of proteins of known weights (so if you have 4 different weights, the band on top is the largest weight, and the weight/size of the protein decreases the further it travels down). Is something like this possible to analyze using OpenCV? The given image is a really clean-looking gel; they can often get really messy (see Google Images). I figured if I allowed a user to enter the number of columns, which columns contain known weight markers and their actual weights, as well as provide an adjustable rectangle to size around the edges of the gel, that maybe it would be possible to scan and extract data from the images of these gels? I skimmed through a textbook on OpenCV but I didn't see any obvious and reliable way I could approach this. Any ideas? Maybe a different library would be better suited?
I believe you can do this using OpenCV
My approach would be a color-based separation, and then counting the separate components.
In big steps your app would do the following steps:
Load the image; rotate and scale it manually through the GUI of your app to match your needs.
Create a second grayscale image in which each pixel contains a value in [0,255] that represents how well the color of the original pixel matches the target color (in the case of this image, the shade of blue).
In one of my experiments I used the concept of fuzzy sets and alpha cuts to extract objects of a certain color. The triangular membership function gave me pretty good results. This simply means that I defined triangular functions for all three RGB color channels and summed their results for each input color. If the values of the color were close to the centers of the triangles, I had a strong similarity. Plus, by controlling the width of the triangles you can define the tolerance of the matches. (Another option would be to use trapezoidal membership functions.)
At this point you have a grayscale image where the background (gel) is black and the proteins are gray/white. If you wish to clear up some noise, use the morphological operators (page 127) erode and dilate (cvErode and cvDilate in OpenCV).
After that, you can use this great OpenCV-based blob extraction library to extract the bounding boxes of the remaining gray areas, which represent the proteins.
Having the coordinates of all the bounding boxes, you can apply your own algorithms to extract whatever data you wish.
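A condensed sketch of these steps in Python/OpenCV, with cv2.findContours standing in for the blob extraction library; the target BGR color, the triangle width and the file name are assumptions you would tune for real gel images.

    import cv2
    import numpy as np

    img = cv2.imread("gel.png").astype(np.float32)
    target = np.array([200.0, 120.0, 40.0])      # assumed BGR shade of the bands
    width = 80.0                                  # tolerance (width of the triangles)

    # Triangular membership per channel: 1 at the center, 0 beyond `width`.
    membership = np.clip(1.0 - np.abs(img - target) / width, 0.0, 1.0)
    score = (membership.sum(axis=2) / 3.0 * 255).astype(np.uint8)  # grayscale match

    # Threshold and clean up with erode/dilate to remove small noise.
    _, mask = cv2.threshold(score, 128, 255, cv2.THRESH_BINARY)
    kernel = np.ones((3, 3), np.uint8)
    mask = cv2.dilate(cv2.erode(mask, kernel), kernel)

    # Bounding boxes of the remaining white areas, i.e. the protein bands
    # (OpenCV 4 return signature for findContours).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = [cv2.boundingRect(c) for c in contours]   # (x, y, w, h) per band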
In my opinion OpenCV gives you all the necessary tools. However, a fully automated solution might be hard to obtain. But I'm sure you can easily build a GUI where you can set the parameters of the operators you apply during the steps described above.
As for Android: I haven't developed for mobile platforms, but I know that you can create C++ apps for these devices (I have read several questions regarding iPhone & OpenCV), so I think your app would be portable, or at least the image processing part of it (the GUI might be too platform-specific).
I usually play a game called Burako.
It has colored playing pieces with numbers from 1 to 13.
After a match finishes you have to count your points.
For example:
1 == 15 points
2 == 20 points
I want to create an app that takes a picture and count the pieces for me.
So I need something that recognizes an image inside an image.
I was about to read up on OpenCV since there is an Android port, but it feels like there should be something simpler to do this.
What do you think?
I have not used the Android port, but I think it's doable under good lighting conditions.
I would obtain the minimal bounding box of each piece and rotate it accordingly so you can compare it with a model image.
Another way could be to extract the contours of the numbers written on the pieces (which I guess are colored) and do some contour matching against template numbers.
OpenCV is a big and complex framework, but it's also suitable for simple tasks like this.
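As a small illustration of the contour matching idea (the file names are placeholders, and the match threshold would need tuning):

    import cv2

    def largest_contour(path):
        """Binarize an image and return its largest contour."""
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        _, binary = cv2.threshold(gray, 0, 255,
                                  cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        return max(contours, key=cv2.contourArea)

    piece = largest_contour("piece_crop.png")        # digit cut out of the photo
    template = largest_contour("template_7.png")     # reference image of a "7"

    # Lower score = more similar shapes; repeat against each template 1-13
    # and pick the best match.
    score = cv2.matchShapes(piece, template, cv2.CONTOURS_MATCH_I1, 0.0)
    print("shape distance:", score)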