We are working on an android application that involves free hand character recognition.
The application requires to student to draw the free hand image of an alphabet on the android screen,and the application process the image drawn and returns the accuracy of the alphabet written.
We are considering two options
a. Using tesseract.
b. Using our own algorithm on which we are still working
Problems
a. Tesseract is not at all helping in recognizing free hand characters.Any pointers on how to train tesseract for the same will be highly appreciated.
b. None of our algorithm are working to our expectation.
Tesseract is actually the wrong approach for recognizing characters written to the screen because it needlessly discards some very valuable info: how the image was originally drawn. How it breaks down into strokes, the order / direction of each stroke, etc - Tesseract has to spend a tremendous amount of time trying to figure out the underlying structure of each character when your software already knows it.
What you want is an actual "handwriting recognition" library, not just an OCR library like Tesseract; you specifically want an "online" handwriting recognition library, "online" meaning that you can capture the actual sequence of points that the user drew to the screen. There are a number of open-source libraries available for this, for example Unipen and LipiTk.
You might want to check out the built-in support for gesture recognition and training tools:
http://developer.android.com/reference/android/gesture/package-summary.html
They offer a pluggable overlay to detect and recognize gestures using a "dictionary" of predefined shapes. Its not automatic like OCR and requires training, but at this point (its been available since Android 1.6) there might be free or commercial handwriting gesture dictionaries available.
Related
I am trying to create an application like Snapchat that applies face filters while recording the video and saves it with the filter on.
I know there are packages like AR core and flutter_camera_ml_vision but these are not helping me.
I want to provide face filters and apply them at the time of video recording on the face, and also save the video with the filter on the face.
Not The Easiest Question To Answer, But...
I'll give it a go, let's see how things turn out.
First of all, you should fill in some more details about the statements given in the question, especially what you're trying to say here:
I know there are packages like AR core and flutter_camera_ml_vision but these are not helping me.
How did you approach the problem and what makes you say that it didn't help you?
In the Beginning...
First of all, let's get some needed basics out of the way to better understand your current situation and level in the prerequisite areas of knowledge:
Do you have any experience using Computer Vision & Machine Learning frameworks in other languages / in other apps?
Do you have the required math skills needed to use this technology?
As you're using Flutter, my guess is that cross-platform compatibility is high priority, have you done much Flutter programming before and what devices are your main targets?
So, What is required for creating a Snapchat-like filter for use in live video recording?
Well, quite a lot of work happens behind the scenes when you apply a filter to live video using any app that implements this in a decent way.
Snapchat uses in-house software that they've built up over years, using technology acquired from multiple multi-million dollar company acquisitions, often established companies that specialized in Computer Vision and AR technology, in addition to their own efforts, and has steadily grown to be quite impressive through the last 5-6 years in particular.
This isn't something you can throw together by yourself as an "all night'er" and expect good results. But there are tools available for easing the general learning curve, but these tools also require a firm understanding of the underlying concepts and technologies being used, and quite a lot of math.
The Technical Detour
OK, I know I may have went a bit overboard here, but this is fundamental building blocks, not so many are aware of the actual amount of computation needed for seemingly "basic" functionality, so please, TLDR; or not, this is fundamental stuff.
To create a good filter for live capture using a camera on something like an iPhone or Android device, you could, and most probably would, use AR as you mentioned you wanted to use in the end, but realize that this is a sub-set of the broad field of Computer Vision (CV) that uses various algorithms from Artificial Intelligence (AI) and Machine Learning (ML) for the main tasks of:
Facial Recognition
Given frames of video content from the live camera, define the area containing a human face (some also works with animals, but let's keep it as simple as possible) and output a rectangle suitable for use as a starting point in (x, y, for width & height).
The analysis phase alone will require a rather complex combination of algorithms / techniques from different parts of the AI universe, and this being video, not a single static image file, this must be continuously updated as the person / camera moves, so it must be done in close to real-time, in the millisecond range.
I believe different implementations combining HOG (Histogram of Oriented Gradients) from Computer Vision and SVMs (Support Vector Machines / Networks) from Machine Learning are still pretty common.
Detection of Facial Landmarks
This is what will define how well a certain effect / filter will adapt to different types of facial features and detect accessories like glasses, hats etc. Also called "facial keypoint detection", "facial feature detection" and other variants in different literature on the subject.
Head Pose Estimation
Once you know a few landmark points, you can also estimate the pose of the head.
This is an important part of effects like "face swap" to correctly re-align one face with another in an acceptable manner. A toolkit like OpenFace (Uses Python, OpenCV, OpenBLAS, Dlib ++) contains a lot of useful functionality, capable of facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation, delivering pretty decent results.
The Compositing of Effects into the Video Frames
After the work with the above is done, the rest involves applying the target filter, dog ears, rabbit teeth, whatever to the video frames, using compositing techniques.
As this answer is starting to look more like an article, I'll leave it to you to go figure out if you want to know more of the details in this part of the process.
Hey, Dude. I asked for AR in Flutter, remember?
Yep.
I know, I can get a bit carried away.
Well, my point is that it takes a lot more than one would usually imagine creating something like you ask for.
BUT.
My best advice if Flutter is your tool of choice would be to learn how to use the Cloud-Based ML services from Google's Firebase suite of tools, Firebase Machine Learning and Google's MLKit.
Add to this some AR-specific plugins, like the ARCore Plugin, and I'm sure you'll be able to get the pieces together if you have the right background and attitude, plus a good amount of aptitude for learning.
Hope this wasn't digressing too far from your core question, but there are no shortcuts that I know of that cut more corners than what I've already mentioned.
You could absolutely use the flutter_camera_ml_vision plugin and it's face recognition which will give you positions for landmarks of a face, such as, nose, eyes etc. Then simply stack the CameraPreview with a CustomPaint(foregroundPainter: widget in which you draw your filters using the different landmarks as coordinates for i.e. glasses, beards or whatever you want at the correct position of the face in the camera preview.
Google ML Kit also has face recognition that produces landmarks and you could write your own flutter plugin for that.
You can capture frames from the live camera preview and reformat them and then it as a byte buffer to ML kit or ML vision. I am currently writing a flutter plugin for ML kit pose detection with live capture so if you have any specific question about that let me know.
You will then have to merge the two surfaces and save to file in appropriate format. This is unknown territory for me so I can not provide any details about this part.
What is a suggested implementation approach for a real time scrolling raster plot on Android?
I'm not looking for a full source code dump or anything, just some implementation guidance or an outline on the "what" and "how".
what: Should I use built in Android components for drawing or go straight to OpenGL ES2? Or maybe something else I haven't heard of. This is my first bout with graphics of any sort, but I'm not afraid to get a little dirty with OpenGL.
how: Given a certain set of drawing components how would I approach implementation? I feel like the plot is basically a texture that needs updating and translating.
Background
I need do design an Android application that as part of its functionality displays a real time scrolling raster plot (i.e. a spectrogram or waterfall plot). The data will first be coming out of libUSB and passing through native C++ where signal processing will happen. Then, I assume, the plotting can happen either in C++ or Kotlin depending on what is easier and whether passing the data over the JNI is a big enough bottleneck or not.
My main concern is drawing the base raster itself in real time and not so much extra things such as zooming, axes, or other added functionality. I'm trying to start simple.
Constraints
I'm limited to free software.
Platform: Android version 7.0+ on modern device
GPU hardware acceleration is preferred as the CPU will be doing a good amount of number crunching bringing streaming data to the plot.
Thanks in advance!
I am new to Android Development. Now I want to integrate Fingerprint lock in my application. Which is the best. Please help me to find good fingerprint lock.
USING CAMERA AS FINGER LOCK
as refernece check this
Fingerprint Scanner using Camera
As someone who's done significant research on this exact problem, I can tell you it's difficult to get a suitable image for templating (feature extraction) using a stock camera found on any current Android device. The main debilitating issue is achieving significant contrast between the finger's ridges and valleys. Commercial optical fingerprint scanners (which you are attempting to mimic) typically achieve the necessary contrast through frustrated total internal reflection in a prism.
FTIR in Biometrics
In this case, light from the ridges contacting the prism are transmitted to the CMOS sensor while light from the valleys are not. You're simply not going to reliably get the same kind of results from an Android camera, but that doesn't mean you can't get something useable under ideal conditions.
I took the image on the left with a commercial optical fingerprint scanner (Futronics FS80) and the right with a normal camera (15MP Cannon DSLR). After cropping, inverting (to match the other scanner's convention), contrasting, etc the camera image, we got the following results.
enter image description here
The low contrast of the camera image is apparent.
enter image description here
But the software is able to accurately determine the ridge flow.
enter image description here
And we end up finding a decent number of matching minutia (marked with red circles.)
Here's the bad news. Taking these types of up close shots of the tip of a finger is difficult. I used a DSLR with a flash to achieve these results. Additionally most fingerprint matching algorithms are not scale invariant. So if the finger is farther away from the camera on a subsequent "scan", it may not match the original.
The software package I used for the visualizations is the excellent and BSD licensed SourceAFIS. No corporate "open source version"/ "paid version" shenanigans either although it's currently only ported to C# and Java (limited).
Non Camera Based Solutions:
For the frightening small number of devices that have hardware that support "USB Host Mode" you can write a custom driver to integrate a fingerprint scanner with Android. I'll be honest, for the two models I've done this for it was a huge pain. I accomplished it by using wireshark to sniff USB packets between the scanner and a linux box that had a working driver and then writing an Android driver based on the sniffed commands.
Cross Compiling FingerJetFX
Once you have worked out a solution for image acquisition (both potential solutions have their drawbacks) you can start to worry about getting FingerJetFX running on Android. First you'll use their SDK to write a self contained C++ program that takes an image and turns it into a template. After that you really have two options.
Compile it to a library and use JNI to interface with it.
Compile it to an executable and let your Android program call it as a subprocess.
For either you'll need the NDK. I've never used JNI so I'll defer to the wisdom of others on how best us it. I always tend to choose route #2. For this application I think it's appropriate since you're only really calling the native code to do one thing, template your image. Once you've got your native program running and cross compiled you can use the answer to this question to package it with your android app and call it from your Android code.
1 ] There is no APIs or Hardware support for finger print detection in Android platform.
2 ] Existing finger print lock systems are not working on finger print pattern matching.
3 ] They are working on pressure comparison , area of finger impression etc.
Reference : Link
I'm writing an Android app to extract a Sudoku puzzle from a picture. For each cell in the 9x9 Sudoku grid, I need to determine whether it contains one of the digits 1 through 9 or is blank. I start off with a Sudoku like this:
I pre-process the Sudoku using OpenCV to extract black-and-white images of the individual digits and then put them through Tesseract. There are a couple of limitations to Tesseract, though:
Tesseract is large, contains lots of functionality I don't need (I.e. Full text recognition), and requires English-language training data in order to function, which I think has to go onto the device's SD card. At least I can tell it to only look for digits using tesseract.setVariable("tessedit_char_whitelist", "123456789");
Tesseract often misinterprets a single digits as a string of digits, often containing newlines. It also sometimes just plain gets it wrong. Here are a few examples from the above Sudoku:
I have three questions:
Is there any way I can overcome the limitations of Tesseract?
If not, what is a useful, accurate method to detect individual digits (not k-nearest neighbours) that would be feasible to implement on Android - this could be a free library or a DIY solution.
How can I improve the pre-processing to target that method? One possibility I've considered is using a thinning algorithm, as suggested by this post, but I'm not going to bother implementing it unless it will make a difference.
I took a class with one of the computer vision superstars who was/is at the top of the digit recognition algorithm rankings. He was really adamant that the best way to do digit recognition is...
1. Get some hand-labeled training data.
2. Run Histogram of Oriented Gradients (HOG) on the training data, and produce one
long, concatenated feature vector per image
3. Feed each image's HOG features and its label into an SVM
4. For test data (digits on a sudoku puzzle), run HOG on the digits, then ask
the SVM classify the HOG features from the sudoku puzzle
OpenCV has a HOGDescriptor object, which computes HOG features. Look at this paper for advice on how to tune your HOG feature parameters. Any SVM library should do the job...the CvSVM stuff that comes with OpenCV should be fine.
For training data, I recommend using the MNIST handwritten digit database, which has thousands of pictures of digits with ground-truth data.
A slightly harder problem is to draw a bounding box around digits that appear in nature. Fortunately, it looks like you've already found a strategy for doing bounding boxes. :)
Easiest thing is to use Normalized Central Moments for digit recognition.
If you have one font (or very similar fonts it works good).
See this solution: https://github.com/grzesiu/Sudoku-GUI
In core there are things responsible for digit recognition, extraction, moments training.
First time application is run operator must provide information what number is seen. Then moments of image (extracted square roi) are assigned to number (operator input). Application base on comparing moments.
Here first youtube movie shows how application works: http://synergia.pwr.wroc.pl/2012/06/22/irb-komunikacja-pc/
Last week i have chosen my major project. It is a vision based system to monitor cyclists in time trial events passing certain points on the course. It should detect the bright yellow race number on a cyclist's back and extract the number from it, and besides record the time.
I done some research about it and i decided to use Tesseract Android Tools by Robert Theis called Tess Two. To speed up the process of recognizing the text i want to use a fact that the number is mend to be extracted from bright (yellow) rectangle on the cyclist back and to focus the actual OCR only on it. I have not found any piece of code or any ideas how to detect the geometric figures with specific color. Thank you for any help. And sorry if i made any mistakes I am pretty new on this website.
Where are the images coming from? I ask because I was asked to provide some technical help for the design of a similar application (we were working with footballer's shirts) and I can tell you that you'll have a few problems:
Use a high quality video feed rather than rely on a couple of digital camera images.
The number will almost certainly be 'curved' or distorted because of the movement of the rider and being able to use a series of images will sometimes allow you to work out what number it really is based on a series of 'false reads'
Train for the font you're using but also apply as much logic as you can (if the numbers are always two digits and never start with '9', use this information to help you get the right number
If you have the luxury of being able to position the camera (we didn't!), I would have thought your ideal spot would be above the rider and looking slightly forward so you can capture their back with the minimum of distortions.
We found that merging several still-frames from the video into one image gave us the best overall image of the number - however, the technology that was used for this was developed by a third-party and they do not want to release it, I'm afraid :(
Good luck!