Detecting billiard balls with OpenCV - Android

I'm making an Android app that takes an image of a billiards game in progress and detects the positions of the various balls. The image is taken with someone's phone, so of course I don't have a perfect overhead view of the table. Right now I'm using cvHoughCircles to find the balls, and it does an OK job, but it misses a few balls here and there, and then there are the false positives.
My biggest problem right now is: how do I cut down on the false positives found outside the table? I'm using an ROI to cut off the top portion of the image because it's mostly wasted space, but I can't make it any smaller without risking cutting off portions of the table, since the table is a trapezoidal shape. My current idea is to overlay the guide that the user sees when taking the picture on top of the image, but the problem is that I don't know what the resolution of their cameras will be, so the overlay might cover up the wrong spots. Ideally I'd use cvHoughLines to find the table edges, but when I tried it my app crashed, I believe from a lack of memory. Any ideas?
Here is a link to the results I'm getting:
http://graphiquest.com/cvhoughcircles.html
Here is my code:
IplImage img = cvLoadImage("/sdcard/DCIM/test/picture" + i + ".jpg", 1);
IplImage gray = opencv_core.cvCreateImage(opencv_core.cvSize(img.width(), img.height()), opencv_core.IPL_DEPTH_8U, 1);
cvCvtColor(img, gray, opencv_imgproc.CV_RGB2GRAY);

// Crop off the top 15% and bottom 5% of the frame.
cvSetImageROI(gray, cvRect(0, (int) (img.height() * .15), img.width(), (int) (img.height() * .80)));
cvSmooth(gray, gray, opencv_imgproc.CV_GAUSSIAN, 9, 9, 2, 2);

// cvHoughCircles needs a CvMemStorage, not a raw Pointer.
CvMemStorage storage = CvMemStorage.create();
CvSeq seq = cvHoughCircles(gray, storage, CV_HOUGH_GRADIENT, 2.5d, (double) gray.height() / 30, 70d, 100d, 0, 80);
for (int j = 0; j < seq.total(); j++) {
    CvPoint3D32f point = new CvPoint3D32f(cvGetSeqElem(seq, j));
    CvPoint center = new CvPoint(Math.round(point.x()), Math.round(point.y()));
    int radius = Math.round(point.z());
    // Note: gray is single-channel, so the GREEN/BLUE scalars both render as gray values.
    cvCircle(gray, center, 3, CvScalar.GREEN, -1, 8, 0);
    cvCircle(gray, center, radius, CvScalar.BLUE, 3, 8, 0);
}

File photo = new File("/sdcard/DCIM/test", "picture" + i + "_2.jpg");
if (photo.exists()) {
    photo.delete();
}
cvSaveImage(photo.getAbsolutePath(), gray);

There are some very helpful constraints you could apply. In addition to doing a rectangular region of interest, you should mask your results with the actual trapezoidal shape of the pool table. Use the color information of the image to find the pool table region. You know that the pool table is a solid color. It doesn't have to be green - you can use some histogram techniques in HSV color space to find the most prevalent color in the image, perhaps favoring pixels toward the center. It's very likely to detect the color of the pool table. Select pixels matching this color, perform morphological operations to remove noise, and then you can treat the mask as a contour, and find its convexHull. Fill the hull to remove the holes created by the pool balls.
What I've said so far should suggest a different approach than Hough circles. Hough circles is probably not working too well since the billiard balls are not evenly illuminated. So, another way to find billiard balls is to subtract the pool table color mask from its convexHull. You'll be left with the areas of the table that are obscured by balls.
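A minimal sketch of this mask-and-hull idea, using OpenCV's Java bindings rather than the JavaCV wrappers in the question (the HSV range below is a placeholder; in practice you'd take it from the histogram peak):

import org.opencv.core.*;
import org.opencv.imgproc.Imgproc;
import java.util.ArrayList;
import java.util.List;

static Mat ballRegions(Mat bgr) {
    Mat hsv = new Mat();
    Imgproc.cvtColor(bgr, hsv, Imgproc.COLOR_BGR2HSV);

    // 1. Select pixels close to the (assumed green) table color.
    Mat cloth = new Mat();
    Core.inRange(hsv, new Scalar(40, 60, 60), new Scalar(80, 255, 255), cloth);

    // 2. Morphological open/close to remove speckle noise.
    Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(7, 7));
    Imgproc.morphologyEx(cloth, cloth, Imgproc.MORPH_OPEN, kernel);
    Imgproc.morphologyEx(cloth, cloth, Imgproc.MORPH_CLOSE, kernel);

    // 3. Largest contour = the table bed; fill its convex hull.
    List<MatOfPoint> contours = new ArrayList<>();
    Imgproc.findContours(cloth.clone(), contours, new Mat(),
            Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);
    if (contours.isEmpty()) return Mat.zeros(cloth.size(), CvType.CV_8UC1);
    MatOfPoint biggest = contours.get(0);
    for (MatOfPoint c : contours)
        if (Imgproc.contourArea(c) > Imgproc.contourArea(biggest)) biggest = c;

    MatOfInt hullIdx = new MatOfInt();
    Imgproc.convexHull(biggest, hullIdx);
    Point[] pts = biggest.toArray();
    int[] idx = hullIdx.toArray();
    Point[] hullPts = new Point[idx.length];
    for (int i = 0; i < idx.length; i++) hullPts[i] = pts[idx[i]];

    Mat hullMask = Mat.zeros(cloth.size(), CvType.CV_8UC1);
    Imgproc.fillConvexPoly(hullMask, new MatOfPoint(hullPts), new Scalar(255));

    // 4. Hull minus cloth = the regions the balls occlude.
    Mat balls = new Mat();
    Core.subtract(hullMask, cloth, balls);
    return balls;
}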

I've thought about working on this problem, too, since I play pool and snooker.
A few points:
1. Judging from the Hough circle fits, it looks like you're not filtering the edge points, or your threshold for edge strength isn't high enough. Are you using a simple binary indicator for edge points, or are you selecting them based on edge strength?
2. Can you work in RGB space? That would help with detecting the table bed and the rails, and also with identifying the balls. A blue blob on the table bed could be the 2-ball, the 10-ball, or maybe a hunk of chalk.
3. In your parameter space, you should be able to limit the search to circles within a very limited radius range. This would be helped in part if...
4. ...you detect the table surface and the rails. A Stroke Width Transform could help you find the rails, especially if you search in a color plane (green) in which the rails will have high contrast. You can also use the six pockets (or at least three of them) to help identify the pose (position and orientation) of the table.
5. Once the rails are detected, you can use a perspective transform (homography) to correct for the distortion; an affine transform alone can't undo the keystoning. You'll need to do this anyway to place the balls with any sort of accuracy, especially if you want the placement to satisfy a serious pool player such as someone who plays One Pocket or Straight Pool. Once you have the transform, you can set fairly tight tolerances for radius in your Hough parameter space. (A sketch of this rectification step appears after this list.)
6. Once you've detected the table bed, you could perform an initial segmentation (that is, region labeling or blob finding) and search only for blobs of a certain area and roundness.
7. A strong, even, diffuse overhead light could help eliminate shadows.
8. You can filter edge points by accepting (or at least favoring) ones whose gradients point toward other edge points with anti-parallel gradients. If a local collection of edge point pairs "point" at each other via their edge gradients, they are good candidates for belonging to a circle.
9. Once you've detected a candidate ball, perform further processing to accept or reject it. A ball should have a relatively uniform hue (cue ball, 1 through 8, or a stripe viewed from the right angle), or it should have a detectable color stripe plus white. The ball surface will not be highly textured like the wood grain of the table.
10. Offer an option for the user to take two pictures from slightly different angles. You then have two chances to find each ball, and could conceivably solve the correspondence problem of matching the tables and balls in the two images to locate the balls in the 2D space of the table bed.
11. Consider a second algorithm such as normalized cross-correlation (simple template matching) to help identify balls or at least likely ball locations.
12. Insist that the center point of the image lie somewhere within the table bed. This helps identify the positions of the rails, since you can then search radially outward for their edges; once four (or even just three) rails are found, you can reject edge points at radial distances beyond them.
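To make item 5 concrete, here is a hedged sketch using OpenCV's Java bindings; the four corner coordinates and the output size are made-up placeholders that would come from your rail/pocket detection:

import org.opencv.core.*;
import org.opencv.imgproc.Imgproc;

static Mat rectifyTable(Mat image) {
    MatOfPoint2f srcCorners = new MatOfPoint2f(
            new Point(212, 410),   // top-left rail corner (placeholder)
            new Point(980, 398),   // top-right
            new Point(1190, 830),  // bottom-right
            new Point(60, 845));   // bottom-left
    MatOfPoint2f dstCorners = new MatOfPoint2f(
            new Point(0, 0), new Point(800, 0),
            new Point(800, 400), new Point(0, 400));  // 2:1 table, arbitrary scale

    Mat h = Imgproc.getPerspectiveTransform(srcCorners, dstCorners);
    Mat topDown = new Mat();
    Imgproc.warpPerspective(image, topDown, h, new Size(800, 400));
    // In the top-down view every ball has (nearly) the same pixel radius,
    // so the Hough radius band can be made very tight.
    return topDown;
}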
Good luck! It's a fun problem.
EDIT:
I was reading another StackOverflow post and came across this paper, which gives a much more thorough introduction to the technique I suggested for filtering edge points (item 8):
"Fast Circle Detection Using Gradient Pair Vectors" by Rad, Faez, and Qaragozlou
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.121.9956
I haven't implemented their algorithm myself yet, but it looks promising. Here's the post where the paper was mentioned:
Three Dimensional Hough Space
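For what it's worth, here is my own rough sketch of the core gradient-pair idea (not the paper's full algorithm; thresholds and subsampling are placeholders): two edge points on the same circle have anti-parallel gradients aligned with the chord joining them, so their midpoint votes for a center and half their distance for a radius.

import org.opencv.core.*;
import org.opencv.imgproc.Imgproc;
import java.util.ArrayList;
import java.util.List;

static List<double[]> gradientPairVotes(Mat gray, double magThresh,
                                        double minR, double maxR) {
    Mat gx = new Mat(), gy = new Mat();
    Imgproc.Sobel(gray, gx, CvType.CV_32F, 1, 0);
    Imgproc.Sobel(gray, gy, CvType.CV_32F, 0, 1);

    List<double[]> edges = new ArrayList<>();  // {x, y, gx, gy}
    for (int y = 0; y < gray.rows(); y += 2)   // subsample for speed
        for (int x = 0; x < gray.cols(); x += 2) {
            double dx = gx.get(y, x)[0], dy = gy.get(y, x)[0];
            if (Math.hypot(dx, dy) > magThresh)
                edges.add(new double[]{x, y, dx, dy});
        }

    List<double[]> votes = new ArrayList<>();  // {cx, cy, r}
    for (int i = 0; i < edges.size(); i++)
        for (int j = i + 1; j < edges.size(); j++) {
            double[] p = edges.get(i), q = edges.get(j);
            double ux = q[0] - p[0], uy = q[1] - p[1];
            double d = Math.hypot(ux, uy);
            if (d < 2 * minR || d > 2 * maxR) continue;
            // Gradients must be anti-parallel and aligned with the chord.
            double np = Math.hypot(p[2], p[3]), nq = Math.hypot(q[2], q[3]);
            double cosPQ = (p[2] * q[2] + p[3] * q[3]) / (np * nq);
            double cosChord = (ux * p[2] + uy * p[3]) / (d * np);
            if (cosPQ > -0.95 || Math.abs(cosChord) < 0.95) continue;
            votes.add(new double[]{(p[0] + q[0]) / 2, (p[1] + q[1]) / 2, d / 2});
        }
    return votes;  // cluster these votes to get circle candidates
}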

Related

OpenCV: Finding the pixel width from squares of known size

In OpenCV I use the camera to capture a scene containing two squares, a and b, both at the same distance from the camera, whose known real sizes are, say, 10 cm and 30 cm respectively. I find the pixel widths of each square, which let's say are 25 and 40 pixels (to get the pixel width, OpenCV detects the squares as cv::Rect objects and I read their width field).
Now I remove square a from the scene and change the distance from the camera to square b. The program gets the new pixel width of square b, which let's say is 80. Is there an equation, using the configuration of the camera (resolution, dpi?), which I can use to work out what the corresponding pixel width of square a would be if it were placed back in the scene at the same distance as square b?
The math you need for your problem can be found in chapter 9 of "Multiple View Geometry in Computer Vision", which happens to be freely available online: https://www.robots.ox.ac.uk/~vgg/hzbook/hzbook2/HZepipolar.pdf.
The short answer to your problem is:
No, not in this exact form. Given that you are working in a 3D world, you have one degree of freedom left. As a result, you need more information in order to eliminate this degree of freedom (e.g. by knowing the depth and/or the relation of the two squares with respect to each other, the movement of the camera...). This mainly depends on your specific situation. In any case, reading and understanding chapter 9 of the book should help you out here.
PS: it seems to me that your problem fits into the broader category of "baseline matching" problems. Reading up on this, in addition to epipolar geometry and the fundamental matrix, might help you out.
Since you write of "squares" with just a "width" in the image (as opposed to "trapezoids" with some wonky vertex coordinates), I assume you are considering an ideal pinhole camera and ignoring perspective distortion/foreshortening - i.e. there is no lens distortion and your planar objects are exactly parallel to the image/sensor plane.
Then it is a very simple 2D projective geometry problem, and no separate knowledge of the camera geometry is needed. Just write down the projection equations in the first situation: you have 4 unknowns (the camera focal length, the common depth of the squares, and the horizontal position of, say, the left side of each square) and 4 equations (the projections of the left and right sides of each square). Solve the system and keep the focal length and the relative distance between the squares. Do the same in the second image, but now with a known focal length, and compute the new depth and horizontal location of square b. Then add the previously computed relative distance to find where square a would be.
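A worked example under the same fronto-parallel pinhole assumption: projection gives pixel width w = f·W/Z, so at a shared depth both f and Z cancel out of the ratio.

// w = f*W/Z, so at the same depth: w_a / w_b = W_a / W_b (f and Z cancel).
double Wa = 10.0, Wb = 30.0;   // real widths in cm, from the question
double wb2 = 80.0;             // pixel width of square b in the second scene
double wa2 = wb2 * (Wa / Wb);  // predicted pixel width of square a: ~26.7 px

Note that the first scene's numbers (25 px and 40 px) don't satisfy this 1:3 ratio, which suggests the example figures were arbitrary, or that the squares weren't truly at equal depth and parallel to the sensor.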
In order to understand the transformations the camera performs to project the 3D world onto the 2D image, you need to know its calibration parameters. These are divided into two sets:
Intrinsic parameters: fixed parameters that are specific to each camera. They are normally represented by a matrix called K.
Extrinsic parameters: these depend on the camera's position in the 3D world. They are normally represented by two matrices, R and T, where the first represents the rotation and the second the translation.
In order to calibrate a camera you need a pattern (basically a set of 3D points whose coordinates are known). There are several examples of this in the OpenCV library, which provides support for camera calibration:
http://docs.opencv.org/doc/tutorials/calib3d/camera_calibration/camera_calibration.html
Once you have your camera calibrated, you can easily transform from 3D to 2D with the following equation (in homogeneous coordinates):
p_image = K · [R | T] · P_3D
So it depends not only on the position of the camera but on all the calibration parameters. The following presentation goes through the camera calibration details and the different steps and equations used in the 3D <-> image transformations.
https://www.cs.umd.edu/class/fall2013/cmsc426/lectures/camera-calibration.pdf
With this in mind you can project any 3D point into the image and get its coordinates there. The reverse transformation is not unique: going back from 2D to 3D gives you a ray (a line of possible points) instead of a unique point.
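As a hedged illustration of the equation above, using OpenCV's Java bindings (rvec, tvec, cameraMatrix and distCoeffs are assumed to come from a prior Calib3d.calibrateCamera call):

import org.opencv.calib3d.Calib3d;
import org.opencv.core.*;

// Projects one 3D point into the image with the calibrated parameters.
static Point project(Point3 p, Mat rvec, Mat tvec,
                     Mat cameraMatrix, MatOfDouble distCoeffs) {
    MatOfPoint2f pixels = new MatOfPoint2f();
    Calib3d.projectPoints(new MatOfPoint3f(p), rvec, tvec,
            cameraMatrix, distCoeffs, pixels);
    return pixels.toArray()[0];
}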

opencv detect cubes (corners)

The problem I am trying to solve is to detect cubes and get colours from them. I use live images from the camera, captured by an Android phone. Recognition has to be fast (< 1 s). Example of a cube:
I also have differently coloured cubes. They can be placed randomly (for example, touching each other).
I can easily detect one cube, and in some cases even two, but the problem is when I have 3 or more, or 2 cubes that are really close to each other.
Currently processing looks like this:
blur the image with a Gaussian
convert to HSV and use only the S channel
detect edges with Canny
dilate and erode the edges
use HoughLinesP to get lines
from the lines (I reject lines that are too long or too short) calculate intersection points, and from those get the corners of the cubes
knowing the corners (they must be precise), get the colours
(Result images: one where nothing is detected; one where 2 cubes are detected - red and orange points are corners, cyan points are intersection points, black lines are the lines found by HoughLinesP; and one where nothing is detected but some lines are found.)
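For reference, the pipeline above corresponds roughly to the following sketch in OpenCV's Java bindings; all thresholds are placeholders to tune.

import org.opencv.core.*;
import org.opencv.imgproc.Imgproc;
import java.util.ArrayList;
import java.util.List;

static Mat detectLines(Mat bgr) {
    Mat blurred = new Mat(), hsv = new Mat(), edges = new Mat();
    Imgproc.GaussianBlur(bgr, blurred, new Size(5, 5), 0);

    Imgproc.cvtColor(blurred, hsv, Imgproc.COLOR_BGR2HSV);
    List<Mat> channels = new ArrayList<>();
    Core.split(hsv, channels);
    Mat s = channels.get(1);  // saturation channel only

    Imgproc.Canny(s, edges, 50, 150);
    Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(3, 3));
    Imgproc.dilate(edges, edges, kernel);
    Imgproc.erode(edges, edges, kernel);

    Mat lines = new Mat();  // each row: x1, y1, x2, y2
    Imgproc.HoughLinesP(edges, lines, 1, Math.PI / 180, 50, 30, 10);
    return lines;
}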
Basically, what I need is to find the correct corners of the cubes. I tried using Imgproc.goodFeaturesToTrack and Imgproc.cornerHarris, but they find too many corners, and usually not the most important ones.
I also tried using findContours with no success, even for two objects; findContours was also crashing my app after a minute of running. At some point I tried using Feature Matching + Homography to match a grayscale image of a cube against the camera frame, but the results were messy. Template Matching didn't give me good results either.
Do you have any idea how to make detection more reliable and precise?
Thanks for the help

Nutrition Facts Detection with OpenCV

I would like to detect nutrition facts on food packaging in an Android application, with OpenCV.
So far I have managed to do it with one image of a nutrition table, but of course it only works with that one.
The goal is to detect and retrieve the values of Energy, Proteins, and Carbohydrates per 100 g of product. This information is present in almost every table, which is why I focus only on it for the moment.
So I was wondering if there is a good method to do this. For the moment, I try to detect each block of text, recognise it with Tesseract, and if it matches a word I'm looking for, I get the corresponding column and row in the picture to finally get the value I want.
Is there any way to find the words directly and get the value that fits best in the image (in terms of alignment with the "100g" column)?
Typical image : hpics.li/4231f79
Sorry if my problem is not well explained; just ask if something is not clear or if you want me to explain more about what I've done so far. Also, sorry for my English.
Cheers
Just a few ideas:
1. Convert the image to HSV color space and look only for black and white regions (using the inRange function). Blobs which contain only those 2 colors will probably be your information (but unfortunately some other things too - the barcode, maybe some drawing or a logo). See the sketch after this list.
2. Your regions should be rectangles, so if a blob is not a rectangle, discard it.
3. If a found rectangle is rotated, use an affine transform (warpAffine) to align it vertically - here I've explained how to do it. Note that the rectangle's width and height should stay the same.
4. After the affine transform, your rectangle might still be rotated by 90, 180 or 270 degrees. In the example you provided there is a black region at the top - if that's true for all your images, then finding the top is quite easy: just find the black rectangle within your region. Otherwise finding the top might be harder - a quick idea, which might be worth testing, is to look for black pixels in each white rectangle. In most cases they are aligned to the center (not an interesting case for us) or to the left - if you find the left side of the rectangle, finding the top is obvious :) Alternatively, you may look for characters which are always on the right side: %, g and mg.
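A minimal sketch of idea 1 (the HSV thresholds are placeholders): in HSV, "white" is low saturation with high value and "black" is simply low value.

import org.opencv.core.*;
import org.opencv.imgproc.Imgproc;

static Mat blackAndWhiteMask(Mat bgr) {
    Mat hsv = new Mat();
    Imgproc.cvtColor(bgr, hsv, Imgproc.COLOR_BGR2HSV);

    Mat white = new Mat(), black = new Mat(), mask = new Mat();
    Core.inRange(hsv, new Scalar(0, 0, 180), new Scalar(180, 40, 255), white);
    Core.inRange(hsv, new Scalar(0, 0, 0), new Scalar(180, 255, 60), black);
    Core.bitwise_or(white, black, mask);
    return mask;  // blobs in this mask are candidate label regions
}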
If you have any problems, give us more examples and describe what you have done already - right now it's hard to say much more.

Detection of different shape's dynamically like ( Circle, square and Rectangle ) from the camera? [closed]

I want to create an application to detect the shapes of objects (circles, squares and rectangles - geometric shapes only) without using markerless or edge-based tracking to detect the shape in the augmentation.
I have gone through the procedures of the existing tutorials in the Metaio SDK and in OpenCV:
1) Metaio : http://dev.metaio.com/sdk/tutorials/hello-world/
2) OpenCV : http://docs.opencv.org/doc/tutorials/imgproc/imgtrans/canny_detector/canny_detector.html#canny-detector
These are the things I have tried to implement.
Geometric shapes:
1) A circle in real time could be any circular object
2) A square in real time could be any square object
3) A rectangle in real time could be any rectangular object
How can I achieve this augmentation scenario?
Thanks in advance
Update: This StackOverflow post (with some nice sample pictures included) seems to have solved at least the circle-detection part of your problem. The reference to the excellent write-up it points to can be found on this wiki page (only through the Wayback Machine, unfortunately).
In case that new link doesn't hold either, here is the relevant section:
Detecting Images:
There are a few fiddly bits that need to be taken care of to detect circles in an image. Before you process an image with cvHoughCircles - the function for circle detection - you may wish to first convert it into a gray image and smooth it. The following is the general procedure for the functions you need to use, with examples of their usage.
Create Image
Supposing you have an initial image for processing called 'img', first you want to create an image variable called 'gray' with the same dimensions as img using cvCreateImage.
IplImage* gray = cvCreateImage( cvGetSize(img), 8, 1 );
// allocate a 1 channel byte image
CvMemStorage* storage = cvCreateMemStorage(0);
IplImage* cvCreateImage(CvSize size, int depth, int channels);
size: cvSize(width,height);
depth: pixel depth in bits: IPL_DEPTH_8U, IPL_DEPTH_8S, IPL_DEPTH_16U,
IPL_DEPTH_16S, IPL_DEPTH_32S, IPL_DEPTH_32F, IPL_DEPTH_64F
channels: Number of channels per pixel. Can be 1, 2, 3 or 4. The channels
are interleaved. The usual data layout of a color image is
b0 g0 r0 b1 g1 r1 ...
Convert to Gray
Now you need to convert it to gray using cvCvtColor which converts between colour spaces.
cvCvtColor( img, gray, CV_BGR2GRAY );
cvCvtColor(src,dst,code); // src -> dst
code = CV_<X>2<Y>
<X>/<Y> = RGB, BGR, GRAY, HSV, YCrCb, XYZ, Lab, Luv, HLS
e.g.: CV_BGR2GRAY, CV_BGR2HSV, CV_BGR2Lab
Smooth Image
This is done to prevent a lot of false circles from being detected. You might need to play around with the last two parameters, noting that each of them needs to be an odd number.
cvSmooth( gray, gray, CV_GAUSSIAN, 9, 9 );
// smooth it, otherwise a lot of false circles may be detected
void cvSmooth( const CvArr* src, CvArr* dst,
int smoothtype=CV_GAUSSIAN,
int param1, int param2);
src
The source image.
dst
The destination image.
smoothtype
Type of the smoothing:
CV_BLUR_NO_SCALE (simple blur with no scaling) - summation over a pixel param1×param2 neighborhood. If the neighborhood size is not fixed, one may use cvIntegral function.
CV_BLUR (simple blur) - summation over a pixel param1×param2 neighborhood with subsequent scaling by 1/(param1•param2).
CV_GAUSSIAN (gaussian blur) - convolving image with param1×param2 Gaussian.
CV_MEDIAN (median blur) - finding median of param1×param1 neighborhood (i.e. the neighborhood is square).
CV_BILATERAL (bilateral filter) - applying bilateral 3x3 filtering with color sigma=param1 and space sigma=param2
param1
The first parameter of smoothing operation.
param2
The second parameter of smoothing operation.
In case of simple scaled/non-scaled and Gaussian blur if param2 is zero, it is set to param1
Detect using Hough Circle
The function cvHoughCircles is used to detect circles in the gray image. Again, the last two parameters might need to be fiddled with.
CvSeq* circles =
cvHoughCircles( gray, storage, CV_HOUGH_GRADIENT, 2, gray->height/4, 200, 100 );
CvSeq* cvHoughCircles( CvArr* image, void* circle_storage,
int method, double dp, double min_dist,
double param1=100, double param2=100,
int min_radius=0, int max_radius=0 );
======= End of relevant section =========
The rest of that wiki page is actually very good (although I'm not going to copy it here, since the rest is off-topic to the original question and StackOverflow has a size limit for answers). Hopefully that link to the cached copy on the Wayback Machine will keep working indefinitely.
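For readers on a current OpenCV, the same procedure looks roughly like this in the Java API (the IplImage/CvMemStorage types above are long deprecated; constant names can vary slightly between versions):

import org.opencv.core.*;
import org.opencv.imgproc.Imgproc;

static Mat findCircles(Mat img) {
    Mat gray = new Mat();
    Imgproc.cvtColor(img, gray, Imgproc.COLOR_BGR2GRAY);
    Imgproc.GaussianBlur(gray, gray, new Size(9, 9), 2);
    Mat circles = new Mat();  // each detected circle is (x, y, radius)
    Imgproc.HoughCircles(gray, circles, Imgproc.HOUGH_GRADIENT,
            2, gray.rows() / 4.0, 200, 100);
    return circles;
}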
Previous Answer Before my Update:
Great! Now that you've posted some examples, I can see that you're not only after rectangles, squares, and circles; you also want to find those shapes in a 3D environment, thus potentially hunting for special cases of parallelograms and ovals that, from video frame to video frame, can eventually reveal themselves to be rectangles, squares, and/or circles (depending on how you pan the camera).
Personally, I find it easier to work through a problem myself than to try to understand how to use an existing (often very mature) library. This is not to say that my own work will be better than a mature library; it certainly won't be. It's just that once I can work through a problem myself, it becomes easier for me to understand and use a library (and the library itself will often run much faster and smarter than my own solution).
So the next step I would take is to convert the bitmap's color space to grayscale. A color bitmap is hard to understand and manipulate, especially since there are so many different ways it can be represented, but a grayscale bitmap is much easier on both counts. For a grayscale bitmap, just imagine a grid of values, with each value representing a different light intensity.
And for now, let's limit the scope of the problem to finding parallelograms and ovals inside a static 2D environment (we'll worry about processing 3D environments and moving video frames later, or should I say, you'll worry about that part yourself since that problem is already becoming too complicated for me).
And for now also, don't worry about which tool or language you use. Just use whatever is easiest and most expedient. For instance, just about anything can be scripted to automatically convert an image to grayscale, assuming time is no issue: ImageMagick, Gimp, Marvin, Processing, Python, Ruby, Java, etc.
And with any of those tools, it should be easy to group pixels with similar enough intensities (to make the calculations more manageable) and to sort each pixel's coordinates into a different array for each intensity bucket. In other words, it shouldn't be too difficult to arrange some sort of crude histogram of arrays, sorted by intensity, containing each pixel's x and y position. A sketch of this bucketing follows.
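A small sketch of that bucketing, assuming the grayscale image is already a plain 2D int array (here with 16 hypothetical buckets):

import java.util.ArrayList;
import java.util.List;

static List<List<int[]>> bucketByIntensity(int[][] gray, int buckets) {
    List<List<int[]>> byBucket = new ArrayList<>();
    for (int b = 0; b < buckets; b++) byBucket.add(new ArrayList<>());
    for (int y = 0; y < gray.length; y++)
        for (int x = 0; x < gray[y].length; x++) {
            int bucket = gray[y][x] * buckets / 256;    // 0..buckets-1 for 8-bit values
            byBucket.get(bucket).add(new int[]{x, y});  // record this pixel's position
        }
    return byBucket;
}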
After that, the problem becomes a problem more like this one (which can be found on StackOverflow) and thus can be worked upon with its suggested solution.
And once you're able to work through the problem that way, converting your solution to a language better suited for the task shouldn't be too difficult. It should also be much easier to understand and use the underlying functions of whatever existing library you end up choosing. At least, that's what I'm hoping for, since I'm not familiar enough with the OpenCV libraries themselves to really help you with them.

Randomly Generating Patterns Using Hexagonal Images

Okay so I have these images:
Basically what I'm trying to do is to create a "mosaic" of about 5 to 12 hexagons, with most of them roughly centralised, and where all of the lines meet up.
For example:
I'm aware that I could probably just brute-force it, but as I'm developing for Android I need a faster, more efficient and less processor-intensive way of doing it.
Can anybody provide me with a solution, or even just point me in the right direction?
A random idea that I had is to go with what Deepak said about defining a class that tracks the state of each of its six edges (say, an int[] neighbor array in which neighbor[0] states whether the top edge has a neighbour, neighbor[1] whether the top-right edge has a neighbour, and so on going clockwise).
Then, for each hexagon on screen, convert its array to an integer by treating it as binary. Based on that integer, use a lookup table to determine which hexagon image to use and how it should be oriented/flipped, then assign that image to the hexagon object.
For instance, take the central hexagon with four neighbours in your first screenshot. Its array would be [1, 0, 1, 1, 0, 1] based on the scheme above. Take neighbor[0] to be the least significant bit (2^0) and neighbor[5] to be the most significant bit (2^5), and we have [1, 0, 1, 1, 0, 1] --> 45. Somewhere in a lookup table we would already have defined 45 to mean the 5th hexagon image, flipped horizontally*, among the seven base hexagon icons you've posted.
Yes, brute force is involved, but it's a "smarter" brute force, since you're not rotating images to see whether a hexagon fits; it relies on a more efficient lookup table instead.
*or rotated 120 degrees clockwise if you prefer ;)
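A minimal sketch of the encoding (the lookup-table contents are hypothetical and would be filled in from the actual tile art):

// Pack the six edge flags into an int, neighbor[0] as the least significant bit.
static int edgeCode(int[] neighbor) {  // length 6; 1 = that edge has a neighbour
    int code = 0;
    for (int i = 0; i < 6; i++)
        if (neighbor[i] != 0) code |= 1 << i;
    return code;  // 0..63
}

// Hypothetical variant descriptor: which base image, and how to orient it.
final class TileVariant {
    final int baseImage;    // index into the seven base hexagon icons
    final int rotationDeg;  // 0, 60, ..., 300
    final boolean flipped;
    TileVariant(int baseImage, int rotationDeg, boolean flipped) {
        this.baseImage = baseImage;
        this.rotationDeg = rotationDeg;
        this.flipped = flipped;
    }
}

// 64-entry lookup table; e.g. table[45] = new TileVariant(4, 0, true)
// would encode "5th base image, flipped horizontally" from the example above.
static final TileVariant[] table = new TileVariant[64];

So edgeCode(new int[]{1, 0, 1, 1, 0, 1}) returns 45, matching the worked example.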
Nice and tricky question. You can start by defining an object for each image with attributes that specify which edges have a line attached. Then, while adding the images to the layout, you can rotate each one so that an edge with a line in one image lies adjacent to a neighbouring image's edge with a line. It may be a little complicated, but I hope you can at least start with something like this.
