I want to recognize numbers drawn as gestures in my code. I've implemented recognition using the gestures library. Is there any way to recognize numbers perfectly?
Please suggest any sample code.
What do you mean by perfectly? As in successfully detecting the number the user intended to gesture 100% of the time? As long as your users are human, this isn't possible. A 4 can look like a 9, a 1 can look like a 7, and depending on how quickly they swipe, what started out as a 0 can end up looking like a 6 (or vice versa). Every individual gestures differently from everyone else, and every time you gesture a 4, it's going to look a little different from your 9, and from all your other 4s.
One possible solution is to have your app contain a "learning mode" which asks the user to gesture out specific digits several times, so that you can pick up on patterns (where they start, where they stop, how many swipes are included, how big the digit is) and use those to narrow things down when the app is actually used. Sort of like a good spam filter: it won't get you a 100% detection rate, but it'll definitely get you a lot closer than not having a data set to work off of.
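Here's a minimal sketch of what that learning mode might record per attempt, using Android's gestures API. `DigitProfile` and its field names are hypothetical, just to show which features you'd capture:

```java
import android.gesture.Gesture;
import android.gesture.GestureOverlayView;
import android.graphics.RectF;

// Hypothetical per-user profile of how one digit is drawn; these names
// are illustrative, not part of the gestures library.
class DigitProfile {
    int digit;
    int strokeCount;      // how many swipes the digit took
    RectF boundingBox;    // how big the digit is
    float startX, startY; // where the user started drawing
}

class LearningMode implements GestureOverlayView.OnGesturePerformedListener {
    private final int digitBeingTrained;

    LearningMode(int digit) { this.digitBeingTrained = digit; }

    @Override
    public void onGesturePerformed(GestureOverlayView overlay, Gesture gesture) {
        DigitProfile profile = new DigitProfile();
        profile.digit = digitBeingTrained;
        profile.strokeCount = gesture.getStrokesCount();
        profile.boundingBox = gesture.getBoundingBox();
        // First point of the first stroke = where the user started.
        float[] points = gesture.getStrokes().get(0).points;
        profile.startX = points[0];
        profile.startY = points[1];
        // Store the profile somewhere; later, use these features to break
        // ties between similar-looking candidates (e.g. 4 vs 9).
    }
}
```

Recognition would then combine the library's own candidate scores with how well the stroke count, size, and start position match the stored profiles.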
Context
I'm building an app which performs real-time object detection through the device's camera module.
Let's say I try to recognize an apple: most of the time the app will recognize an apple. However, on a few camera frames, it will sometimes recognize the wrong fruit (say, a lemon).
Goal
As the recognition of a fruit triggers an action in my code, my goal is to programmatically prevent a brief wrong recognition from triggering an action, and to only take into account the majority result.
What I've tried
I tried this: if the same fruit is recognized several frames in a row, I assume the result is the right one. But as my device processes image recognition several times per second, even a wrong guess can be recognized several times in a row, which leads to the wrong action.
Question
Are there any known techniques for avoiding this behavior?
I feel like you've already answered your own question. In general, the interpretation of a model's inference is its own tuning step. You know, for example, that in logistic regression tasks the threshold does NOT have to be 0.5. In fact, it's quite common to flex the threshold to see what the recall and precision are at various values, and you can pick a threshold that works given your business/product problem. (Fraud detection might favor high recall if you never want to miss any fraud... or high precision if you don't want to annoy users with lots of false positives.)
In video, this broad concept is extended to multiple frames, as you know. You now have to tune the hyperparameters "how many frames total?" and "how many frames voting [apple]?"
If you are analyzing fruit going down a conveyor belt one by one, and you know each piece of fruit will be in frame for X seconds while you are shooting at 60 fps, maybe you want 60 * X frames. And maybe you want 90% of the frames to agree.
You'll want to visualize how often your detector "flips" detections so you can make a business/product judgement call on what your threshold ought to be.
This answer hasn't been very helpful in giving you a bright-line rule here, but I hope it's helpful in suggesting that there is in fact NO bright-line rule. You have to understand the problem to set the key hyperparameters:
For each frame, is top-1 acc sufficient, or do I need [.75] or higher confidence?
How many frames get to vote? Say [100].
How many correlated votes are necessary to trigger an actual signal? maybe it's [85].
The above algo assumes you take a hardmax after step 1. Another option would be to just average all 100 frames and pick a threshold; that's kind of a soft-label version of the above algo.
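As a concrete illustration, here's a minimal Java sketch of the hardmax-then-vote version, with the two hyperparameters above as constants; the class and method names are made up:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Majority vote over a sliding window of per-frame top-1 labels.
// WINDOW_SIZE ("how many frames get to vote") and VOTES_NEEDED
// ("how many correlated votes trigger a signal") are the knobs to tune.
class DetectionSmoother {
    private static final int WINDOW_SIZE = 100;
    private static final int VOTES_NEEDED = 85;

    private final Deque<String> window = new ArrayDeque<>();

    /** Feed each frame's top-1 label; returns a label once it has
     *  enough votes in the current window, or null otherwise. */
    String addFrame(String topLabel) {
        window.addLast(topLabel);
        if (window.size() > WINDOW_SIZE) {
            window.removeFirst();
        }
        Map<String, Integer> counts = new HashMap<>();
        for (String label : window) {
            counts.merge(label, 1, Integer::sum);
        }
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= VOTES_NEEDED) {
                return entry.getKey(); // stable majority: safe to trigger the action
            }
        }
        return null; // no stable majority yet: suppress the action
    }
}
```

A per-frame confidence floor (step 1 above) would just mean skipping `addFrame` for frames whose top-1 score is below, say, 0.75.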
I am working on a project for which I have to measure the touch surface area. This works on both Android and iOS as long as the surface area is small (e.g. using the thumb). However, when the touch area increases (e.g. using the ball of the hand), the touch events are no longer passed to the application.
On my iPhone X (iOS 14.6), the events were no longer passed to the app when UITouch.majorRadius exceeded 170. And on my Android device (Redmi 9, Android version 10), they stopped when MotionEvent.getPressure exceeded 0.44.
I couldn't find any documentation on this behavior, but I assume it's there to protect against erroneous input.
I looked in the settings of both devices, but I did not find a way to turn this behavior off.
Is there any way to still receive touch events when the touch area is large?
Are there other Android or iOS devices that don't show this behavior?
I would appreciate any help.
So I've actually done some work on touch with unusual contact areas. I was focusing on multitouch, but it's somewhat comparable. The quick answer is no, because at the hardware level there is no such thing as a "touch event".
You have capacitance changes being detected. That data is HEAVILY filtered by the drivers, which try to take capacitance differences and turn them into events. The OS does not deliver raw capacitance data to the apps; it assumes you always want the filtered versions. And even if it did deliver it, the data would be very hardware-specific, and you'd have to reinterpret it into touch events yourself.
Here are a few things you're going to find out about touch:
1) Pressure on Android isn't what you should be looking at. Pressure is meant for things like styluses. You want getSize, which returns the normalized size (see the sketch after this list). Pressure is more about how hard someone is pushing, which really doesn't apply to finger touches these days.
2) Your results will vary GREATLY by hardware. Every different sensor will differ from every other.
3) The OS will confuse large touch areas and multitouch. Part of this is because when you make contact with a large area like the heel of your hand, the contact is not uniform throughout. That means the capacitances will differ, which will make the OS think it's seeing multiple fingers. When doing heavy multitouch, you'll also see the reverse (several nearby fingers look like one large touch). This is because the difference between the two, on a physical level, is hard to tell.
4) We were writing an app that enabled 10-finger multitouch actions on keyboards. We found that we missed high-level multitouch from women (especially Asian women) more than others. Hand size greatly affected this, as did how lightly people hover versus press down. The idea that there were physical capacitance differences in the skin was considered; we believed it was more due to touching the device more lightly, but we can't rule out actual physical differences.
Some of that is just a dump, but I think it's all stuff you'll need to look out for as you continue. I'm not sure exactly what you're trying to do, but best of luck.
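Regarding point 1, here's a minimal sketch of logging getSize alongside getPressure on Android, so you can see how your particular hardware reports large contacts; the listener class is illustrative:

```java
import android.util.Log;
import android.view.MotionEvent;
import android.view.View;

// Log both size and pressure for every pointer so you can compare how
// your specific sensor reports large contacts like the ball of the hand.
class TouchAreaLogger implements View.OnTouchListener {
    @Override
    public boolean onTouch(View v, MotionEvent event) {
        for (int i = 0; i < event.getPointerCount(); i++) {
            float size = event.getSize(i);         // normalized 0..1 touch area
            float pressure = event.getPressure(i); // stylus-oriented, less useful here
            Log.d("Touch", "pointer=" + i + " size=" + size
                    + " pressure=" + pressure);
        }
        return true; // consume so we keep receiving the event stream
    }
}
```

Running this on each of your devices should show you where each one cuts off delivery, and whether getSize gives you a more usable signal than getPressure before that point.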
I have created an application using the gestures library, and it works well when I try to detect numbers from 0 to 9, but now I want to detect from 0 to 99. The application is simple: it asks for an arithmetic operation and the user must draw the correct result on screen. How can I implement two-digit recognition?
You can't make a gesture for two separate symbols unless you use multi-touch, which doesn't make much sense here.
I think the best approach would be to recognize individual digits and then wait a certain amount of time. If another digit arrives within that window, combine the two; otherwise, interpret the input as a single digit. See the sketch below.
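A minimal sketch of that wait-and-combine idea on Android, assuming something else already recognizes single digits; the class, timeout, and listener names are all hypothetical:

```java
import android.os.Handler;
import android.os.Looper;

// Combines two recognized digits if the second arrives within the
// timeout; otherwise treats the first as a single-digit answer.
class TwoDigitCombiner {
    private static final long TIMEOUT_MS = 1500; // tune to how fast users draw

    private final Handler handler = new Handler(Looper.getMainLooper());
    private Integer pendingDigit = null;

    interface ResultListener { void onNumber(int number); }
    private final ResultListener listener;

    TwoDigitCombiner(ResultListener listener) { this.listener = listener; }

    /** Call this each time the gestures library recognizes one digit. */
    void onDigitRecognized(int digit) {
        if (pendingDigit == null) {
            pendingDigit = digit;
            // If no second digit shows up in time, emit the single digit.
            handler.postDelayed(this::flushSingle, TIMEOUT_MS);
        } else {
            handler.removeCallbacksAndMessages(null);
            listener.onNumber(pendingDigit * 10 + digit); // e.g. 4 then 2 -> 42
            pendingDigit = null;
        }
    }

    private void flushSingle() {
        if (pendingDigit != null) {
            listener.onNumber(pendingDigit);
            pendingDigit = null;
        }
    }
}
```

The timeout is the main trade-off: too short and slow writers lose their second digit, too long and single-digit answers feel laggy.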
Last week I chose my major project. It is a vision-based system to monitor cyclists in time trial events passing certain points on the course. It should detect the bright yellow race number on a cyclist's back, extract the number from it, and also record the time.
I did some research and decided to use the Tesseract Android Tools by Robert Theis, called Tess Two. To speed up the process of recognizing the text, I want to use the fact that the number is meant to be extracted from a bright (yellow) rectangle on the cyclist's back, and to focus the actual OCR only on it. I have not found any piece of code or any ideas on how to detect geometric figures with a specific color. Thank you for any help. And sorry if I made any mistakes; I am pretty new on this website.
Where are the images coming from? I ask because I was asked to provide some technical help for the design of a similar application (we were working with footballers' shirts) and I can tell you that you'll have a few problems:
Use a high-quality video feed rather than relying on a couple of digital camera images.
The number will almost certainly be 'curved' or distorted because of the movement of the rider, and being able to use a series of images will sometimes allow you to work out what number it really is based on a series of 'false reads'.
Train for the font you're using, but also apply as much logic as you can (if the numbers are always two digits and never start with '9', use this information to help you get the right number).
If you have the luxury of being able to position the camera (we didn't!), I would have thought your ideal spot would be above the rider and looking slightly forward so you can capture their back with the minimum of distortions.
We found that merging several still-frames from the video into one image gave us the best overall image of the number - however, the technology that was used for this was developed by a third-party and they do not want to release it, I'm afraid :(
Good luck!
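On the original question of finding the bright yellow rectangle: a crude but workable starting point is to scan the frame's pixels for "bright yellow" values and take their bounding box, then run OCR only on that crop. A minimal sketch, where the RGB thresholds and step size are rough guesses you'd tune against real footage:

```java
import android.graphics.Bitmap;
import android.graphics.Color;
import android.graphics.Rect;

// Find the bounding box of "bright yellow" pixels: high red and green,
// low blue. Thresholds are assumptions to tune for your race plates.
class YellowPlateFinder {
    static Rect findYellowRegion(Bitmap bitmap) {
        int minX = bitmap.getWidth(), minY = bitmap.getHeight();
        int maxX = -1, maxY = -1;
        // Step by 4 pixels for speed; the box only needs to be approximate.
        for (int y = 0; y < bitmap.getHeight(); y += 4) {
            for (int x = 0; x < bitmap.getWidth(); x += 4) {
                int c = bitmap.getPixel(x, y);
                int r = Color.red(c), g = Color.green(c), b = Color.blue(c);
                if (r > 180 && g > 160 && b < 110) { // "bright yellow"
                    if (x < minX) minX = x;
                    if (x > maxX) maxX = x;
                    if (y < minY) minY = y;
                    if (y > maxY) maxY = y;
                }
            }
        }
        if (maxX < 0) return null; // no yellow found in this frame
        return new Rect(minX, minY, maxX, maxY);
    }
}
```

You could then crop with Bitmap.createBitmap(bitmap, box.left, box.top, box.width(), box.height()) and hand only that region to Tess Two, which should help both speed and accuracy.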
I'd like to create an application that utilizes the touch-screen as a "pad". There will be 3 small buttons in the bottom area of the touch-screen, and the rest will be used as a mouse-movement area.
The first button will act as the "left-click" of a real mouse, the second one as "scroll", and the last one as "right-click".
When a user makes any movement (a "move", "up", "down" or "cancel" event) in that area, the real mouse pointer on the Windows desktop will also move.
The transmission media will be Bluetooth and Wi-Fi.
So, here are some questions:
1) Is it possible to utilize multi-touch in Froyo? An example for this case is when the user wants to "block" (select) some text. With a real mouse, we just hold left-click and then drag the pointer. On Android, this would be touching the first button while, at the same time, touching the "pad" area and making a movement.
2) How can I turn this application concept into a real application? (A general idea or algorithms.)
You might want to check out RemoteDroid. It's an open-source app which has most of the functionality you described.
http://code.google.com/p/remotedroid/
An app like this is going to have two main parts: an Android app which generates a series of movement vectors or movement data, and a program on your target operating system which receives this data and translates it into software mouse movements. You will also need the Bluetooth stack necessary for that transfer (I get the feeling Wi-Fi won't give you the responsiveness you want without some serious optimization).
When it comes to the Android side of things, I think you'll need to experiment with the best way to capture those mouse movements. I'd think a speed-vector structure might be your best bet; it seems most similar to what I know of mouse movements.
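For illustration, here's a minimal sketch of the Android side capturing per-event deltas and sending them as a crude speed vector; the socket transport and the "MOVE dx dy" wire format are invented for the example:

```java
import android.view.MotionEvent;
import android.view.View;
import java.io.IOException;
import java.io.PrintWriter;
import java.net.Socket;

// Sends per-event touch deltas to the desktop side, which applies them
// to the real cursor. In a real app, do the socket I/O off the UI thread.
class TouchpadSender implements View.OnTouchListener {
    private final PrintWriter out;
    private float lastX, lastY;

    TouchpadSender(String host, int port) throws IOException {
        out = new PrintWriter(new Socket(host, port).getOutputStream(), true);
    }

    @Override
    public boolean onTouch(View v, MotionEvent event) {
        switch (event.getActionMasked()) {
            case MotionEvent.ACTION_DOWN:
                lastX = event.getX();
                lastY = event.getY();
                break;
            case MotionEvent.ACTION_MOVE:
                float dx = event.getX() - lastX;
                float dy = event.getY() - lastY;
                lastX = event.getX();
                lastY = event.getY();
                out.println("MOVE " + dx + " " + dy); // desktop adds this to cursor pos
                break;
        }
        return true;
    }
}
```

The desktop receiver would parse each line and move the cursor by (dx, dy), possibly scaled by an acceleration curve so small flicks feel like a real trackpad.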