I am using OCR to read black text on a white background. I am frequently running into the following issues:
“8” will be read as “3”
“9” will be read as “3”
A text string that is 1 character long is not recognized at all
I have already tried turning contrast on, setting the “backgroundcolor” to white, and altering the contrast tolerance. Any other ideas for how to improve the accuracy of OCR on these numbers?
I also have these same issues when reading black text on a grey background. I have tried setting the “backgroundcolor” to grey here as well.
I am using eggPlant 14.21 on Red Hat 6. My Unit Under Test is also a Red Hat 6 machine.
These are all common issues with eggPlant v14. The OCR engine was updated and significantly improved in v15 and generally does a better job of reading text. It will recognize a single character where v14 almost never would.
Note too that the “backgroundcolor” property has no effect on OCR; it is only used with the TIG approach to finding text, which is a completely different mechanism. To work with contrast in OCR, you need to set the contrastColor property instead.
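To give a rough picture of what the contrast settings do before the OCR engine sees the image, here is a small Python sketch of tolerance-based binarization. It is not eggPlant code and not the engine's actual implementation: the function name `binarize`, the per-channel comparison, and the sample pixel values are all assumptions made for illustration. The idea is that pixels within the tolerance of the contrast color are treated as background and everything else is treated as text.

```python
def binarize(pixels, contrast_color, tolerance):
    """Return a grid of 0 (background) and 1 (text).

    A pixel counts as background when every RGB channel is within
    `tolerance` of `contrast_color` (per-channel comparison is an
    assumption of this sketch, not a documented eggPlant detail).
    """
    def is_background(pixel):
        return all(abs(c - b) <= tolerance
                   for c, b in zip(pixel, contrast_color))

    return [[0 if is_background(p) else 1 for p in row] for row in pixels]


# Hypothetical sample: one row crossing a black stroke on a grey
# background, with lighter anti-aliased pixels at the stroke's edges.
GREY = (200, 200, 200)
EDGE = (150, 150, 150)
BLACK = (0, 0, 0)
row = [GREY, EDGE, BLACK, EDGE, GREY]

print(binarize([row], GREY, 20))  # [[0, 1, 1, 1, 0]]
print(binarize([row], GREY, 60))  # [[0, 0, 1, 0, 0]]
```

In this toy example a low tolerance keeps the anti-aliased edge pixels as part of the stroke, while a higher tolerance absorbs them into the background and thins the stroke. That is one plausible reason the contrast color and tolerance can change what the OCR reads, so it may be worth trying several tolerance values against the actual background color.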
Upgrading to version 15 has solved the above issues, but it seems version 15 has its own set of OCR issues. This is what I have observed:
Strings that contain only the character 8 will not be recognized at all. Examples: “8”, “88”, “888”
OCR will insert erroneous spaces in strings. I am aware of the IgnoreSpaces option, but this is not a solution for me as I expect some strings to contain spaces.
Using the Contrast, ContrastColor, and ContrastTolerance settings makes OCR substantially worse in my case. Reading text on a light grey background, I set Contrast to on, set ContrastColor to the RGB values of the background, and tried a wide range of ContrastTolerance values with no improvement. I started with ContrastTolerance = 20 as this is recommended here: