I'm trying to work out how to go about creating an on-the-fly simplification of incoming RGB values.
I'm trying to write an android app that utilizes a live camera view and sample colors. I've worked out how to detect and save individual color values, but my aim is to simplify these incoming values using clear ranges.
Example: when we detect Firebrick Red (178, 34, 34), the app would recognize that value as falling within a predefined range labeled Red and convert it to a simple 255, 0, 0 when saving the color.
The app is being put together in Unity. If anyone has read a guide that goes over the process, that would be ideal, so I can learn what is going on and how it is achieved. I'm stumped.
Thanks in advance for any help.
So the problem is that it's hard to define what "red" is. It's not just that different people have different definitions; different cultures also affect what we think colors are (some cultures don't consider red and yellow to be different colors, and at least one tribal culture still present today has no words for colors at all. See https://www.sapiens.org/language/color-perception/). So doing this is always a best-effort type of deal.
One simple thing you could do is a least-difference lookup. Keep a set of reference colors, find which one has the smallest delta from the color you're sampling, and treat the sample as that reference color. That will work, kinda, provided your reference set is dense enough that no sample is too far from its nearest reference.
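A minimal sketch of that lookup in Python; the reference palette here is my own made-up example, and plain Euclidean distance in RGB is just the simplest possible delta:

```python
# Hypothetical reference palette; replace with whatever "simple" colors you care about.
REFERENCE_COLORS = {
    "red":   (255, 0, 0),
    "green": (0, 255, 0),
    "blue":  (0, 0, 255),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
}

def snap_to_reference(rgb):
    """Return (name, value) of the reference color with the smallest delta."""
    def dist_sq(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    name = min(REFERENCE_COLORS, key=lambda n: dist_sq(rgb, REFERENCE_COLORS[n]))
    return name, REFERENCE_COLORS[name]

print(snap_to_reference((178, 34, 34)))  # Firebrick -> ('red', (255, 0, 0))
```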
That will only kind of work though: the RGB channels aren't equally distinct to the human eye, and some differences matter more than others; it's non-linear. A difference of 10 in green is more important than a difference of 10 in red, and a difference of 10 in the range [0, 20] may be more or less stark than a difference of 10 in the range [100, 120]. If you need this to work really well, you may need to talk to someone who has studied color and how the human eye works to come up with a custom algorithm. Having worked on printers once upon a time, we had teams of experts figuring out how to map digital colors to ink. It's much the same here.
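If you don't want to go full custom, one common workaround (a rough sketch, not a perceptual model tuned by experts) is to do the nearest-reference comparison in CIELAB instead of raw RGB, since Euclidean distance there tracks perceived difference better:

```python
import cv2
import numpy as np

def lab_distance(rgb_a, rgb_b):
    """Distance between two RGB triples measured in CIELAB (roughly Delta E 1976)."""
    a = np.uint8([[rgb_a]])  # wrap each color as a 1x1 "image" so cvtColor accepts it
    b = np.uint8([[rgb_b]])
    lab_a = cv2.cvtColor(a, cv2.COLOR_RGB2LAB)[0, 0].astype(float)
    lab_b = cv2.cvtColor(b, cv2.COLOR_RGB2LAB)[0, 0].astype(float)
    return float(np.linalg.norm(lab_a - lab_b))
```

Swapping this in for the plain RGB delta in the lookup above is usually a noticeable improvement without needing a color-science team.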
I am implementing TensorFlow object detection in one of my Android apps. I followed the 'TensorFlow for Poets' demo and tutorial and successfully created a model with it.
I need some help with this:
I have a requirement to detect traffic signals. I have a dataset, I have created a model for it, and in the general case it works great.
What I want is to detect which color of the traffic signal is lit, i.e. is it green or red?
I have added a dataset containing two types of images, green-lit and red-lit traffic signals, but the model just detects "a traffic signal".
Can anyone help me with this or guide how can I achieve this?
Why would you like to detect the color of the traffic light?
IMHO it would be more robust to determine which of the lights is shining, e.g. create two classes "red traffic light" and "green traffic light" and train your model on them.
The answer given in the post sladomic referenced in the comments is not robust to noise. Say you have an image in the late afternoon, when the sun sets: you will likely have a reddish-lit environment. So determining the amount of red pixels within the bounding box of your detected traffic light may well fail, because the number of red pixels caused by the environment is larger than the number of green pixels coming from the lit lamp.
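If you do go the pixel route anyway, a more position-based check in the spirit of "determine which of the lights is shining" is usually less sensitive to a colour cast than counting red pixels. A rough sketch; it assumes a vertical signal with red at the top and green at the bottom, which is my assumption, not something the model gives you:

```python
import cv2

def lit_lamp(frame_bgr, box):
    """Guess which lamp of a vertical traffic light is lit by comparing the
    mean brightness of the top and bottom thirds of the detected bounding box."""
    x, y, w, h = box
    roi = frame_bgr[y:y + h, x:x + w]
    v = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)[:, :, 2]  # value (brightness) channel
    top, bottom = v[: h // 3].mean(), v[2 * h // 3:].mean()
    return "red" if top > bottom else "green"
```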
I am using the GrabCut algorithm from OpenCV for background subtraction of an image on Android. The algorithm runs fine, but the result it gives is not accurate.
E.g. My input image is:
The output image looks like:
So how can I increase the accuracy of the GrabCut algorithm?
P.S.: Apologies for not uploading example images due to low reputation :(
I have been battling with the same problem for quite some time now. I have a few tips and tricks for this
1> Improve your seeds. Considering that GrabCut is basically a black box, to which you give seeds and from which you expect the segmented image as output, the seeds are all you can control, so it becomes imperative to select good ones. There are a number of things you can do here if you have some expectations about the image you want to segment. For a few cases, consider these:
a> Will your image have humans? Use a face detector to find the face and mark those pixels as Probable/definite foreground, as you deem fit. You could also use skin colour models within some region of interest to further refine your seeds
b> If you have some data on what kind of foreground you expect after segmentation, you can train colour models and use them as well to mark even more pixels
The list goes on. You need to creatively come up with different ways to add more accurate seeds.
2> Post-processing: try simple post-processing techniques like the opening and closing operations to smooth your fgmask. They will help you get rid of a lot of noise in the final output (see the sketch below, which combines mask-based seeding with this clean-up step).
In general, graph cut (and hence GrabCut) tends to snap to strong edges, so if you have strong edges close to your foreground boundary you can expect inaccuracies in the result.
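A rough OpenCV (Python) sketch combining both tips: seed GrabCut with a mask instead of only a rectangle, feed in any "definite foreground" pixels you got from a face detector or a colour model, then clean the result with opening and closing. The kernel size and iteration count are placeholder values:

```python
import cv2
import numpy as np

def segment(img_bgr, rect, sure_fg_mask=None):
    """GrabCut with mask-based seeds plus morphological post-processing."""
    mask = np.full(img_bgr.shape[:2], cv2.GC_PR_BGD, np.uint8)
    x, y, w, h = rect
    mask[y:y + h, x:x + w] = cv2.GC_PR_FGD          # inside the rect: probably foreground
    if sure_fg_mask is not None:                     # e.g. face / skin-model pixels
        mask[sure_fg_mask > 0] = cv2.GC_FGD          # mark them as definite foreground
    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(img_bgr, mask, None, bgd, fgd, 5, cv2.GC_INIT_WITH_MASK)
    fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 255, 0).astype(np.uint8)
    # Opening removes speckle noise, closing fills small holes in the foreground.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    fg = cv2.morphologyEx(fg, cv2.MORPH_OPEN, kernel)
    fg = cv2.morphologyEx(fg, cv2.MORPH_CLOSE, kernel)
    return fg
```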
Last week I chose my major project. It is a vision-based system to monitor cyclists passing certain points on the course in time-trial events. It should detect the bright yellow race number on a cyclist's back, extract the number from it, and also record the time.
I have done some research and decided to use the Tesseract Android Tools by Robert Theis, called Tess Two. To speed up the recognition I want to use the fact that the number is meant to be extracted from a bright (yellow) rectangle on the cyclist's back, and to focus the actual OCR only on that region. I have not found any code or ideas on how to detect geometric shapes of a specific color. Thank you for any help, and sorry if I made any mistakes; I am pretty new to this website.
Where are the images coming from? I ask because I was asked to provide some technical help for the design of a similar application (we were working with footballers' shirts), and I can tell you that you'll have a few problems:
Use a high quality video feed rather than rely on a couple of digital camera images.
The number will almost certainly be 'curved' or distorted because of the movement of the rider; being able to use a series of images will sometimes allow you to work out what the number really is from a series of 'false reads' (see the sketch after this list).
Train for the font you're using, but also apply as much logic as you can (if the numbers are always two digits and never start with '9', use this information to help you get the right number).
If you have the luxury of being able to position the camera (we didn't!), I would have thought your ideal spot would be above the rider and looking slightly forward so you can capture their back with the minimum of distortions.
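For the "series of false reads" and the digit-logic points above, the aggregation can be as simple as a majority vote over the per-frame OCR output. A tiny sketch; the "two digits, never starting with '9'" rule is just the example from above, not a real race rule:

```python
import re
from collections import Counter

def best_guess(ocr_reads):
    """Keep only plausible race numbers from several frames, return the most common one."""
    plausible = [r for r in ocr_reads if re.fullmatch(r"[1-8][0-9]", r)]
    return Counter(plausible).most_common(1)[0][0] if plausible else None

print(best_guess(["78", "78", "18", "7B", "78"]))  # -> '78'
```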
We found that merging several still-frames from the video into one image gave us the best overall image of the number - however, the technology that was used for this was developed by a third-party and they do not want to release it, I'm afraid :(
Good luck!
I have an application where I want to track 2 objects at a time that are rather small in the picture.
This application should be running on Android and iPhone, so the algorithm should be efficient.
For my customer it is perfectly fine if we deliver, along with the software, some patterns that are attached to the objects to be tracked, so that there is a well-recognizable target.
This means that I can make up a pattern on my own.
As I am not that much into image processing yet, I don't know which objects are easiest to recognize in a picture even if they are rather small.
Color is also possible, although processing several planes separately is not desired because of the generated overhead.
Thank you for any advice!!
Best,
guitarflow
If I get this straight, your object should:
Be printable on an A4
Be recognizable from up to 4 meters
Rotational invariance is not so important (I'm making the assumption that the user will hold the phone +/- upright)
I recommend printing a large checkerboard and using a combination of color matching and corner detection. Try different combinations to see what's faster and more robust at different distances.
Color: if you only want to work on one channel, you can print in red/green/blue and then work only on that respective channel. This already filters a lot and increases contrast "for free".
Otherwise, a histogram backprojection is in my experience quite fast. See here.
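A minimal backprojection sketch with OpenCV's Python bindings, assuming you have a small sample patch of the printed colour; the file names and histogram sizes are placeholders:

```python
import cv2

# A small crop of the printed pattern colour and the current camera frame (both BGR).
patch = cv2.imread("pattern_patch.png")
frame = cv2.imread("camera_frame.png")

hsv_patch = cv2.cvtColor(patch, cv2.COLOR_BGR2HSV)
hsv_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

# Hue/saturation histogram of the patch, then backproject it onto the frame.
hist = cv2.calcHist([hsv_patch], [0, 1], None, [30, 32], [0, 180, 0, 256])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
backproj = cv2.calcBackProject([hsv_frame], [0, 1], hist, [0, 180, 0, 256], 1)

# Threshold the backprojection to get candidate regions of the pattern colour.
_, candidates = cv2.threshold(backproj, 50, 255, cv2.THRESH_BINARY)
```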
Also, let's say you have only 4 squares with RGB + black (see image): it would be easy to get all red contours, then check whether each has the correct neighbouring colors, a patch of blue to its right and a patch of green below it, both of roughly the same area. This alone might be robust enough, and it is equivalent to working on one channel, since at each step you're only accessing one specific channel (search for contours in red, check right in blue, check below in green).
If you're getting a lot of false positives, you can then use corners to filter your hits. In the example image you already have 9 corners, in fact even more if you separate channels, and if that isn't enough you can print a true checkerboard with several squares in order to have more corners. It will probably be sufficient to check how many corners are detected in the ROI in order to reject false positives; otherwise you can also check that the spacing between detected corners in the x and y directions is uniform (i.e. they form a grid).
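A hedged sketch of that neighbour check; the channel thresholds, minimum area, and tolerance are made-up illustration values, and it assumes OpenCV 4's findContours signature:

```python
import cv2
import numpy as np

def find_pattern_candidates(frame_bgr, thresh=200, area_tol=0.5, min_area=100):
    """Find red squares that have a blue patch to the right and a green patch below."""
    b, g, r = cv2.split(frame_bgr)
    red_mask = ((r > thresh) & (g < 100) & (b < 100)).astype(np.uint8) * 255
    contours, _ = cv2.findContours(red_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    hits = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        area = w * h
        if area < min_area:
            continue
        right = b[y:y + h, x + w:x + 2 * w]    # blue channel, region to the right
        below = g[y + h:y + 2 * h, x:x + w]    # green channel, region below
        if right.size and below.size:
            blue_area = int((right > thresh).sum())
            green_area = int((below > thresh).sum())
            if blue_area > area_tol * area and green_area > area_tol * area:
                hits.append((x, y, w, h))
    return hits
```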
Corners: Detecting corners has been greatly explored and there are several methods here. I don't know how efficient each one is, but they are fast enough, and after you've reduced the ROIs based on color, this should not be an issue.
Perhaps the simplest is to erode/dilate with a cross-shaped kernel to find corners. See here.
You'll want to first threshold the image to create a binary map, probably based on color as mentioned above.
Other corner detectors, such as the Harris detector, are well documented.
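For reference, the Harris detector is a one-liner in OpenCV's Python bindings; counting how many pixels give a strong corner response inside a colour-filtered ROI is enough for the rejection test described above (the file name and thresholds are placeholders):

```python
import cv2
import numpy as np

roi_gray = cv2.imread("candidate_roi.png", cv2.IMREAD_GRAYSCALE)  # colour-filtered crop

response = cv2.cornerHarris(np.float32(roi_gray), blockSize=2, ksize=3, k=0.04)
strong = response > 0.01 * response.max()       # pixels with a strong corner response
print("strong corner pixels:", int(strong.sum()))
# A real checkerboard crop should give a predictably large count here;
# very few strong responses suggests a false positive from the colour stage.
```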
Oh, and I don't recommend using Haar classifiers. They seem unnecessarily complicated and not so fast (though very robust for complex objects, i.e. if you can't use your own pattern), not to mention the huge amount of work for training.
Haar training is your friend mate.
This tutorial should get you started: http://note.sonots.com/SciSoftware/haartraining.html
Basically you train something called a classifier based on sample images (2000 or so of the object you want to track). OpenCV already has the tools required to build these classifiers and functions in the library to detect objects.
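Once the cascade is trained, running it is only a couple of calls. A minimal sketch, assuming the tutorial left you with a cascade XML file (the file names here are placeholders):

```python
import cv2

cascade = cv2.CascadeClassifier("race_number_cascade.xml")  # produced by the training step
frame = cv2.imread("rider.jpg")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Each detection is a candidate region to crop and hand to Tesseract / Tess Two.
for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
    number_region = frame[y:y + h, x:x + w]
```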
I am wondering how to do what the title says:
I want to count some objects by reading an image from the camera of a portable device (such as an iPhone or an Android phone).
I need only two specific functions.
Recognize and count the number of objects
Recognize the color of the object (so I can count how many of each color I have).
A very simple example.
I have a stack of LEGO pieces, all of them with the same dimensions. I know they will always be aligned horizontally; sometimes they are not vertically aligned. I need to count how many of each colour I have.
I know that all the pieces have the same dimensions; only the colour changes.
I think I have only 10 colours available.
I can process the image (blur and other operations), but I don't know how to work out how many pieces I have.
Can you give me some ideas on how to do this (and what kind of libraries to use, for both iOS and Android, Android first), or maybe some publications (free PDFs, or books, even published books that aren't free) teaching how to read data from images?
The program should act like this:
I start the program; when it recognizes (using the integrated camera) that it is looking at the specific objects, it takes a picture and processes it, telling me how many of each color I have.
Thanks in advance, any kind of help will be appreciated.
I'll admit it has been 10 years since I last dabbled in computer vision, but back then I used the OpenCV libraries, and these still seem to be going strong, with support on Android:
http://opencv.willowgarage.com/wiki/Android
and iOS:
http://www.eosgarden.com/en/opensource/opencv-ios/overview/
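To give an idea of what that looks like with OpenCV's Python bindings: threshold each brick colour in HSV and count the blobs. The HSV ranges and minimum area below are placeholders you would tune for your pieces and lighting, and it assumes OpenCV 4's findContours signature:

```python
import cv2
import numpy as np

# Placeholder HSV ranges for a few brick colours; add your ~10 colours and tune per lighting.
COLOR_RANGES = {
    "red":    ((0, 120, 70),   (10, 255, 255)),
    "yellow": ((20, 120, 70),  (35, 255, 255)),
    "blue":   ((100, 120, 70), (130, 255, 255)),
}

def count_bricks(image_bgr, min_area=500):
    """Count blobs of each colour; min_area filters out specks of noise."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    counts = {}
    for name, (lo, hi) in COLOR_RANGES.items():
        mask = cv2.inRange(hsv, np.array(lo), np.array(hi))
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        counts[name] = sum(1 for c in contours if cv2.contourArea(c) > min_area)
    return counts
```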