I am reading a card using OCR on Android (or iOS). When the image is not upside down, the process succeeds; when it is rotated, the characters come out wrong and the process fails. I am using Tesseract and OpenCV algorithms.
For example, an image like this one. How can I detect the text orientation and rotate the image accordingly?
If the OCR technology you are using does not have a dedicated auto-rotate function (most do, so double-check), then the technique I use is to check either character confidence or for dictionary words. ABBYY OCR, for example, has a dedicated auto-rotate setting. The OCR-IT API also has auto-rotate, and can return flags such as IsWordFromDictionary in the XML result. Every OCR technology may work differently.
If you expect only 4 possible rotations, then the algorithm is:
Perform OCR. Check confidence, or dictionary words, or even just capitalization (an incorrect rotation will produce a mess like this: DioOpUllltG). Set a threshold over which you accept the result, such as 50%. You are hoping that your first OCR pass is on an image in the correct orientation (a statistical approach).
If the quality is lower than your threshold, then either you have a low-quality image in the correct orientation, or the orientation is wrong. Rotate and check the remaining three orientations, and pick the best one (a sketch of this loop follows).
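Here is a minimal Kotlin sketch of that loop, assuming the tess-two wrapper (com.googlecode.tesseract.android.TessBaseAPI) with an already-initialized engine; the function name and the 50% threshold are just illustrative:

    import android.graphics.Bitmap
    import android.graphics.Matrix
    import com.googlecode.tesseract.android.TessBaseAPI

    // Try all four 90-degree orientations; keep the text with the best confidence.
    fun ocrBestOrientation(tess: TessBaseAPI, src: Bitmap, threshold: Int = 50): String? {
        var bestText: String? = null
        var bestConf = -1
        for (angle in intArrayOf(0, 90, 180, 270)) {
            val rotated = if (angle == 0) src else {
                val m = Matrix().apply { postRotate(angle.toFloat()) }
                Bitmap.createBitmap(src, 0, 0, src.width, src.height, m, true)
            }
            tess.setImage(rotated)
            val text = tess.getUTF8Text()    // runs the actual recognition
            val conf = tess.meanConfidence() // mean word confidence, 0..100
            if (conf > bestConf) {
                bestConf = conf
                bestText = text
            }
            if (bestConf >= threshold) break // good enough: likely the correct orientation
        }
        return if (bestConf >= threshold) bestText else null
    }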
In some projects where images may be at unpredictable extreme angles, such as 30 degrees, OCR will fail on all four flips. In those cases I usually run an OCR pass at every 10 degrees of rotation (36 OCR passes) and pick the best result.
I am using com.google.mlkit:barcode-scanning:17.0.2 to detect QR codes in pictures.
After getting a URI from the gallery, I create an InputImage and then process it with BarcodeScanner to find QR codes. When I select a photo of a QR code on paper, the code is found. But when I take a photo of a QR code on a monitor screen, the code is never found. What should I do to be able to detect a QR code in a photo of a monitor screen?
(When I use the same scanner with CameraX to do live QR code detection, it does find the code on the monitor screen.)
val image = InputImage.fromFilePath(context, uri)
val scanOptions =
    BarcodeScannerOptions.Builder()
        .setBarcodeFormats(
            Barcode.FORMAT_QR_CODE,
        )
        .build()
val scanner = BarcodeScanning.getClient(scanOptions)
scanner.process(image)
    .addOnSuccessListener {
        val code = it.getOrNull(0)?.rawValue
        if (code == null) {
            // code NOT found
        } else {
            // code was found
        }
    }
Example of a QR code on paper, which is found
Example of a QR code on the monitor screen, which is NOT found
Chances are that you're fighting the Moiré effect. Depending on the QR detection algorithm, the high frequencies introduced by Moiré patterns can throw the detector off track. Frustratingly, it is often the better QR code detectors that are defeated by Moiré patterns.
A good workaround (sketched in code after this list) is:
take the picture at the highest resolution you can
perform a blurring of the picture
increase contrast to the max, if possible
(optionally) run a sigma thresholding, or just rewrite all pixels with a luma component below 32 to 0 and all those above 224 to 255.
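As a rough illustration, here is how those steps could look with OpenCV's Java bindings from Kotlin (the kernel size and the 32/224 clamp values come from the list above; tune them for your images):

    import org.opencv.core.Core
    import org.opencv.core.Mat
    import org.opencv.core.Size
    import org.opencv.imgproc.Imgproc

    // Blur away the Moiré frequencies, stretch contrast, then clamp the extremes.
    fun preprocessForQr(src: Mat): Mat {
        val gray = Mat()
        Imgproc.cvtColor(src, gray, Imgproc.COLOR_BGR2GRAY)
        // 1. Blur to suppress the high-frequency Moiré pattern.
        Imgproc.GaussianBlur(gray, gray, Size(5.0, 5.0), 0.0)
        // 2. Stretch the contrast to the full 0..255 range.
        Core.normalize(gray, gray, 0.0, 255.0, Core.NORM_MINMAX)
        // 3. Clamp: luma below 32 becomes 0 ...
        Imgproc.threshold(gray, gray, 31.0, 255.0, Imgproc.THRESH_TOZERO)
        // ... and luma above 224 becomes 255 (bright mask, then per-pixel max).
        val bright = Mat()
        Imgproc.threshold(gray, bright, 224.0, 255.0, Imgproc.THRESH_BINARY)
        Core.max(gray, bright, gray)
        return gray
    }

The result can then be converted back with Utils.matToBitmap and handed to InputImage.fromBitmap before scanning.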
Another way of doing approximately the same operation is:
take the picture at the highest resolution you can
increase contrast to the max, if possible
downsample the picture to a resolution which is way lower
The second method gives worse results, but it can usually be implemented with device primitives.
Another source of problems with monitors (not in your picture, as far as I can see) is the refresh rate. Sometimes you'll find that the QR code is actually an overexposed QR code in the upper half of the picture and an underexposed QR code in the bottom half. Neither is recognized. This effect is due to the monitor's refresh rate and strategy, and it is not easy to solve: you can try lowering the monitor's luminosity to increase the exposure time until it exceeds 1/50th or 1/25th of a second, or take the picture from farther away and use digital zoom. Modern monitors have higher refresh rates and actually refresh at more than their own dwell time, so this should not happen; with old analog monitors, however, it will happen every time.
A third, crazy way
This was discovered half by chance, but it works really well even on cheap hardware, provided the QR SDK or library supplies a few small extra frills.
Take a video of about 1 second length at the highest frame rate you can get (25 fps?).
From the middle frame (e.g. the 13th), extract the three QR "waypoints" - there might be a low-level function in your SDK, called something like "containsQRCode()", that does this. If it returns true, the waypoints were found and their coordinates are returned, allowing scaling and perspective estimates. It might also return a confidence figure ("this picture seems to contain a QR code with probability X%"). These are the APIs that apps use to draw a frame or red dots around candidate QR codes. If your SDK doesn't have these APIs, sorry... you're out of luck.
Get the frames immediately before and after (12th and 14th), then the 11th and 15th, and so on. If any of these returns a valid QR code, you're home free.
If the QR code is found (even if not correctly decoded) in enough frames, but the waypoint coordinates vary a lot, the hand is not steady: say so to the user.
If you have enough frames with coordinates that vary little, you can center and align the frames on those waypoints and average them, then run the real QR code recognition on the resulting image (see the sketch below). This gets rid of 100% of the Moiré effect and also drastically reduces monitor dwell noise with next to no information loss. The results are far better than those from the resolution change, which isn't easy to perform on (some) devices that reset the camera upon a resolution change.
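Assuming the SDK really does hand back the three waypoint coordinates per frame (a hypothetical API, as noted above), the align-and-average step could look roughly like this with OpenCV's Java bindings:

    import org.opencv.core.Core
    import org.opencv.core.CvType
    import org.opencv.core.Mat
    import org.opencv.core.MatOfPoint2f
    import org.opencv.core.Point
    import org.opencv.core.Scalar
    import org.opencv.imgproc.Imgproc

    // Align each grayscale (CV_8UC1) frame onto the first one using the three QR
    // waypoints, then average the aligned frames to cancel Moiré and dwell noise.
    fun averageAligned(frames: List<Mat>, waypoints: List<Array<Point>>): Mat {
        val ref = MatOfPoint2f(*waypoints[0])
        val acc = Mat.zeros(frames[0].size(), CvType.CV_32F)
        for (i in frames.indices) {
            // Affine transform mapping this frame's waypoints onto the reference.
            val warp = Imgproc.getAffineTransform(MatOfPoint2f(*waypoints[i]), ref)
            val aligned = Mat()
            Imgproc.warpAffine(frames[i], aligned, warp, frames[i].size())
            Imgproc.accumulate(aligned, acc) // running sum in 32-bit floats
        }
        Core.divide(acc, Scalar(frames.size.toDouble()), acc) // mean of all frames
        val out = Mat()
        acc.convertTo(out, CvType.CV_8U)
        return out
    }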
This worked on a $19 ESP32 IoT device operating in a noisy, vibration-rich environment (it acquires QR codes from a camera image of carton boxes on a moving transport ribbon).
I am trying to detect LSB steganography using the real-time camera on a mobile phone. So far I haven't had much luck detecting the LSB steganography, whether on printed material or on a PC screen.
I tried using OpenCV to convert each frame to RGB and then read the bits from each pixel, but that never detects the steganography.
I also tried using the Camera functionality and checking in onFrame, pixel by pixel, whether the starting string is recognized, so that I can read the actual hidden data from the remaining pixels.
This produced a positive result a few times, but then reading the data was impossible.
Any suggestions on how to approach this?
A little more information on the hidden data:
1. It is spread all over the image, and I know the algorithm works, since if I just read the exact image through a Bitmap in the app, the steganography is detected and decoded; but when I try to use the camera, no such luck.
2. It is embedded in a repeating 8x5-pixel grid across the whole image, so it is not confined to one specific area that the camera view might miss.
I can post some code as well if needed.
Thanks.
You still haven't clarified the specifics of how you do it, but I assume you do some flavour of the following:
embed a secret in a digital image,
print this stego image or have it displayed on a pc, and
take a photograph of that and detect the embedded secret.
For all practical purposes, this can't work. LSB pixel embedding is a very fragile steganography technique. You require a perfect copy of the stego image's pixels for the extraction to work. Even a simple digital manipulation is enough to destroy your secret: scaling, cropping and rotation, to name a few. Then you have to worry about the angle you take the photo at and the ambient light. And we're not even touching upon how colours are rendered on a PC monitor or in a printed photo.
The only reason you get positives for the starting sequence is that you use a short one, so you're bound to get lucky. Assuming the photographed stego image deviates randomly at each pixel from its true value, you'll still get lucky sometimes. Imagine the first pixel had the value 250 and after being photographed it's 248. The LSB in both cases is still 0.
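A toy simulation makes the point: perturb each pixel by a few grey levels, the way a photograph would, and count how often the LSB happens to survive (plain Kotlin; the noise range is an arbitrary illustration):

    import kotlin.random.Random

    fun main() {
        val trials = 100_000
        var preserved = 0
        repeat(trials) {
            val original = Random.nextInt(256)
            // Simulate photographic noise: shift the value by up to +/- 3 levels.
            val noisy = (original + Random.nextInt(-3, 4)).coerceIn(0, 255)
            if ((original and 1) == (noisy and 1)) preserved++
        }
        // Prints a rate near chance level: the embedded bits are effectively gone.
        println("LSB preserved: ${100.0 * preserved / trials}%")
    }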
On top of that, some sequences are more likely to come up than others. In most photos neighbouring pixels are correlated, because colour gradients are smooth. This means that if the top left of a photo is dark and the top right is bright, the colour changes slowly across it. For example, the first 4 pixels have the value 10, the next few have 11, and so on. In terms of LSBs, you get the pattern 00001111, and as I've just explained, that's likely to come up fairly frequently regardless of what image you photograph.
I am new to image processing. I have a data set of images and I want to perform calibration on those images based on a target image. I have searched a lot for image calibration, but the majority of the results are about camera calibration. I am confused as to whether these are the same or different things. Can anybody explain the difference between these two terms?
On reading through one of the results on image calibration, I got to know that there are three steps that I need to perform:
Bias Frame Calibration
Dark Frame Calibration
Flat Field Frame Calibration
Also, I need to perform this in Android. For that, I have figured out that I will need to use OpenCV or JavaCV.
So, I want to know if these 3 steps will be possible using OpenCV/JavaCV or not?
Calibration is a process that exploits some knowledge about the data to reconstruct measurements more accurately, or to suit a specific need. As we have no idea what the desired result of your calibration is, it is hard to say more.
In general the difference is as follows:
Camera calibration
You have a camera and want the captured images to satisfy some condition. This process usually means taking images of predefined objects such as colour markers, a geometry checkerboard, LASER sweeps, etc. This way you can obtain the camera parameters needed to reconstruct some specific feature of the image for any other image taken (assuming the important parameters, like camera position or exposure time, do not change over time ...).
Image calibration
This is similar, but the input images can be obtained from different sources (different cameras, renders, simulations, etc.) or under different circumstances (exposure, lighting, etc.). In this case we do not have the luxury of a calibration process, so instead we need to find some known feature in the images themselves and correct the rest of the image against it (for example an object of known size, colour, temperature, etc.).
So the difference is: camera calibration is when you have a single imaging device as the source of images, and image calibration is when you have multiple (often unknown) image sources.
I am not using OpenCV, but as people use this lib for such tasks, it should have support for operations like this.
Here is a small example of such an operation:
OpenCV Birdseye view without loss of data
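For the three steps you listed, a rough sketch with OpenCV's Java bindings might look like this (untested, as I don't use OpenCV myself; it is the standard bias/dark/flat formula, and the master frames are assumed to be pre-built, usually by averaging many individual frames):

    import org.opencv.core.Core
    import org.opencv.core.CvType
    import org.opencv.core.Mat
    import org.opencv.core.Scalar

    // Bias/dark/flat calibration: (light - dark) / normalized(flat - bias).
    fun calibrateFrame(light: Mat, masterBias: Mat, masterDark: Mat, masterFlat: Mat): Mat {
        val l = Mat(); light.convertTo(l, CvType.CV_32F)
        val d = Mat(); masterDark.convertTo(d, CvType.CV_32F)
        val f = Mat(); masterFlat.convertTo(f, CvType.CV_32F)
        val b = Mat(); masterBias.convertTo(b, CvType.CV_32F)
        Core.subtract(l, d, l)              // dark frame removes thermal signal (incl. bias)
        Core.subtract(f, b, f)              // bias frame removes readout offset from the flat
        val meanFlat = Core.mean(f).`val`[0]
        Core.divide(f, Scalar(meanFlat), f) // normalize the flat to mean 1.0
        Core.divide(l, f, l)                // flat-field correction (vignetting, dust)
        val out = Mat()
        l.convertTo(out, light.type())
        return out
    }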
What I am attempting is to use EMGU to perform an AbsDiff of two images.
Given the following conditions:
User starts their webcam and with the webcam stationary takes a picture.
User moves into the frame and takes another picture (WebCam has NOT moved).
AbsDiff works well, but what I'm finding is that the ISO and white balance adjustments made by certain cameras (even on Android and iPhone) are uncontrollable to a degree.
Therefore, instead of fighting a losing battle, I'd like to attempt some image post-processing to see if I can equalize the two.
I found the following thread but it's not helping me much: How do I equalize contrast & brightness of images using opencv?
Can anyone offer specific details of what functions/methods/approach to take using EMGUCV?
I've tried using things like _EqualizeHist(). This yields very poor results.
Instead of equalizing the histograms of each image individually, I'd like to compare the brightness/contrast values and come up with an average that gets applied to both.
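Roughly, what I mean is something like this sketch (written with OpenCV's Java bindings in Kotlin purely for illustration; the equivalent EMGU CV calls should map across):

    import org.opencv.core.Core
    import org.opencv.core.Mat
    import org.opencv.core.MatOfDouble

    // Push both grayscale images toward the average of their mean and standard
    // deviation, so brightness/contrast differences cancel before AbsDiff.
    fun equalizeStats(a: Mat, b: Mat): Pair<Mat, Mat> {
        fun stats(m: Mat): Pair<Double, Double> {
            val mean = MatOfDouble()
            val std = MatOfDouble()
            Core.meanStdDev(m, mean, std)
            return mean.toArray()[0] to std.toArray()[0]
        }
        val (meanA, stdA) = stats(a)
        val (meanB, stdB) = stats(b)
        val targetMean = (meanA + meanB) / 2
        val targetStd = (stdA + stdB) / 2
        fun remap(m: Mat, mean: Double, std: Double): Mat {
            val out = Mat()
            val gain = if (std > 1e-6) targetStd / std else 1.0
            m.convertTo(out, -1, gain, targetMean - gain * mean) // dst = src*gain + offset
            return out
        }
        return remap(a, meanA, stdA) to remap(b, meanB, stdB)
    }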
I'm not looking for someone to do the work for me (although code example would CERTAINLY be appreciated). I'm looking for either exact guidance or some way to point the ship in the right direction.
Thanks for your time.
I am trying to use ZXing to read 1D barcodes and want to be able to read a barcode no matter its orientation, since I am assuming the person may not be looking at the image. I noticed that ZXing can read a barcode rotated up to about 45 degrees. Is there a reason it doesn't test both orientations of the image, and is it possible to make it do this?
If not, are there alternatives that can?
The reason is just that 99.9% of the time people scan a barcode in its natural orientation (or upside down). Scanning for vertical barcodes would usually just be a waste of time, when you could be moving on to another frame to scan. But it's easy to do: just add an extra chunk of code to rotate and re-scan the image, as in the sketch below.
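Here is a rough sketch of that rotate-and-retry loop using ZXing core from Kotlin on Android (the helper function is illustrative, not part of ZXing):

    import android.graphics.Bitmap
    import android.graphics.Matrix
    import com.google.zxing.BinaryBitmap
    import com.google.zxing.MultiFormatReader
    import com.google.zxing.RGBLuminanceSource
    import com.google.zxing.ReaderException
    import com.google.zxing.Result
    import com.google.zxing.common.HybridBinarizer

    // Try decoding at 0, 90, 180 and 270 degrees; return null if nothing decodes.
    fun decodeAnyOrientation(bitmap: Bitmap): Result? {
        val reader = MultiFormatReader()
        var current = bitmap
        repeat(4) {
            val pixels = IntArray(current.width * current.height)
            current.getPixels(pixels, 0, current.width, 0, 0, current.width, current.height)
            val source = RGBLuminanceSource(current.width, current.height, pixels)
            try {
                return reader.decode(BinaryBitmap(HybridBinarizer(source)))
            } catch (e: ReaderException) {
                // Not found at this angle: rotate 90 degrees and try again.
                val m = Matrix().apply { postRotate(90f) }
                current = Bitmap.createBitmap(current, 0, 0, current.width, current.height, m, true)
            }
        }
        return null
    }

For 1D formats, trying just 0 and 90 degrees is normally enough, since ZXing already scans each row in both directions, which covers the upside-down cases.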
@user117 it is not necessary to try all orientations. Any rotation for which a horizontal line still passes through the whole barcode works. You only have to try additional rotations to cover the cases beyond those, and it turns out that 4 is the most that are needed to cover any orientation.