First of all, the generic OCR that comes with v11 feels great.
However, I noticed a difference in find times between locating an item via OCR and finding the same location via an image. The VNC connection was the same in both cases.
How can I address this difference when using OCR?
Thanks. We’re glad to hear it.
[quote=“MSchrempp”]However, I noticed a difference in find times between locating an item via OCR and finding the same location via an image. The VNC connection was the same in both cases.
How can I address this difference when using OCR?[/quote]
OCR is a feature that you’ll want to pull out when an image isn’t going to work. What you need to realize about the OCR feature is that OCR is a very intensive process that can’t be performed as quickly as an image search. The image search takes an area of a fixed size and scans it across the screen to see if any other areas of that size resemble it. We’ve optimized that process so that it can quickly rule out areas that don’t match. The OCR engine, by contrast, has no guidelines as to what your text is going to look like – it could be any size and any color on any background. The OCR engine has to look at every pixel and try to determine whether that pixel is part of a character and whether that character is part of a word. And it’s a true OCR engine, so it uses context and dictionary look-ups to resolve whether a character should be an “e” or an “o” – it’s a lot more effort than a straight image search.
The main way to optimize an OCR search is to narrow the search area. OCR is a great feature for reading a text field that can be identified based on an image of its label. If you wanted to use it to script against menus, then you’d want to restrict its search area to the top portion of the screen.
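For example, one way to restrict a menu search is to pass an explicit rectangle of screen coordinates. A rough sketch, assuming a 1920-pixel-wide SUT screen (the coordinates and menu text here are illustrative, not required values):

-- Only search the top 40-pixel strip of the screen for the menu text
click (text:"File", searchRectangle:(0, 0, 1920, 40))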
We’ve made it easier to specify search rectangles in this release (make sure you have version 11.02 to get this feature everywhere rectangles are used). You can now specify a rectangle as two images representing its opposite corners. So to narrow your click command to look in a specific area of the screen, you could say:
click(text:"Users", searchRectangle:("upperLeftImage", "lowerRightImage"))
The readText() function will also accept a single point, or an image representing a single point, as an argument, and will try to read the text in a horizontal region around that point.
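For example, a quick sketch of both forms (the point coordinates, image name, and variable names here are just placeholders):

-- Read the horizontal band of text around an explicit screen point
put readText((400, 300)) into lineOfText

-- Or anchor the read on an image of a field's label and capture the text near it
put readText("usernameLabel") into fieldText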
One other thing to be aware of that may be helpful to you: You can specify the language that the OCR is checking against. So for example, if you’re testing against something that has been localized to German, you might say:
readText(("image1", "image2"), language:"German")
If you do not specify German, then the characters might still be read more or less correctly, but you would not see characters like “ü” in the output; they would just be returned as “u”. And there might be more errors, since the engine would compare prospective words against an English-language dictionary and try to fit them into that vocabulary.