11.2 on Mac 10.6.8 has OCR problems

I’m testing a Silverlight application on multiple Windows OS and Mac OS SUTs using Mac 10.6.8 as the controller. Under Eggplant 11.11 generic text searches work well, sometimes needing a smaller search rectangle.

In upgrading to 11.2 I find a number of text searches are failing or otherwise acting strangely. In watching the script execute in GUI mode:

  • Strings match various other incorrect areas on the screen such as

    • “35” matches " a", "n " or “ss” (depending on search rectangle)
    • “633” or “633” match “ess”
    • “34” matches “01”
  • Some strings seem to match correctly but the green overlay that shows the match is offset to the left.

If I go back to 11.11 the scripts run successfully. With 11.2 they fail.

Jerry:

Thanks for your feedback and sorry for the problems you are seeing. One of the improvements to OCR has resulted in it being overzealous in some matching situations like your Silverlight application.

Starting with 11.2 we are limiting the characters that it finds to the characters in the string you are searching for. This definitely makes it better at reading things that were hard to read before (e.g. a particular number) but is leading to false positive findings in some scenarios.

We’ll be providing an easy way to control that in a future release, but in the meantime you can overcome this in the following ways:

  • For a given search you can add the value ValidCharacters:"" to your command.

  • If OCR is your default text platform then for an entire run you can use this command:
    set the defaultTextPlatform.OCRstyles.default.validCharacters to “”

  • To make all scripts revert to the prior behavior you need to modify your TextPlatforms.plist file to specify a blank ValidCharacters property under TextPlatforms->GenericOCR->OCRStyles->Default. That file is found in ~/Library/Eggplant (MAC/LINUX) or ~/Application Data/Eggplant/TextPlatforms.plist (WINDOWS).

We have attached a modified copy of the standard TextPlatforms.plist that you can use if you don’t have any custom Text Definitions.

I can understand why it’s now finding a match in other places in a large search rectangle, because it’s being forced to find only the characters specified. This implies that we’ll need to specify fairly small search rectangles for each and every text search. This leads to several problems:

  • In general we’ve found that smaller search rectangles help up to a point. Make the rectangles too small (although visually more than big enough) and failures start to increase.

  • Application screen object locations vary a bit between SUT platforms/OS which could be problematic depending on rectangle size.

  • Additional development time to make search rectangles for every text search. Partially offset, at least, by modifications needed to get proper operation in the old way.

I’m not sure the new behavior is preferable.

  • The old way led to failures to find a string that was, in fact, there. A false failure. Script modifications can be made to reduce the false failures. AFAIK I’ve not seen a false positive.

  • The new way seems more likely to find a string that is, in fact, not there. A false positive. For instance, if I’m looking for a hex value “4B” it might force “48” as a match, or “3b0” might match “360”. It can be a lot harder to find and fix these false positives.
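To make the “4B” versus “48” case concrete, here is a toy Python model of what restricting OCR to the characters of the search string could do. It is only a sketch: the similarity scores and the function names (read_glyphs, found) are invented for illustration, and it is not how the Eggplant OCR engine is actually implemented. An empty valid_characters string means "no restriction", mirroring the ValidCharacters:"" workaround.

```python
# Hypothetical similarity scores between a glyph on screen and candidate
# characters (higher = more alike). These numbers are invented for the sketch.
CANDIDATES = {
    "4": {"4": 0.95, "A": 0.30},
    "8": {"8": 0.90, "B": 0.70, "3": 0.50},
    "B": {"B": 0.90, "8": 0.70, "3": 0.40},
}


def read_glyphs(glyphs, valid_characters=""):
    """Pick the best-scoring candidate for each glyph.

    If valid_characters is non-empty, only those characters may be chosen.
    An empty string means any candidate is allowed.
    """
    result = []
    for glyph in glyphs:
        scores = CANDIDATES[glyph]
        allowed = {
            char: score
            for char, score in scores.items()
            if not valid_characters or char in valid_characters
        }
        # "?" marks a glyph that has no allowed candidate at all.
        result.append(max(allowed, key=allowed.get) if allowed else "?")
    return "".join(result)


def found(search, screen_glyphs, restrict):
    """Report a match when the recognized text equals the search string."""
    valid = search if restrict else ""
    return read_glyphs(screen_glyphs, valid) == search


if __name__ == "__main__":
    # The screen really shows "48"; the script searches for "4B".
    print(found("4B", "48", restrict=False))  # unrestricted read
    print(found("4B", "48", restrict=True))   # read limited to "4" and "B"
```

In this model the unrestricted read sees “48” and reports no match, while the restricted read has no “8” available, falls back to “B”, and reports a match that is not really there.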

The false positives would seem to be more “costly”

  • by wrongly allowing failing scripts to pass
  • by leading scripts astray (i.e. clicking in the wrong location)
  • by requiring additional scripting to minimize problems

Since we can force the old behavior the above can be seen as fodder for discussion.

Thanks very much for your constructive discussion points, I hope that the work-arounds won’t be too much trouble.

I agree with your perspective – as a general rule we prefer to err on the side of a false failure instead of a false positive. With that in mind we might choose to revert the default behavior and have this new feature as an “opt-in” choice going forward.

Make sure to look at the green box drawn around the match. With the new default behavior the box is not drawn in the right place, definitely to the left of where it should be. I’m not sure if the found image coordinates are also offset or if it’s just the green rectangle that’s off.

This command doesn’t seem to work.
set the defaultTextPlatform.styles.default.validCharacters to “”

I put it at the beginning of the script and also right before the function call that looks for some text, and in both cases it matched the wrong place.

For now I’m rolling back to 11.11.

You’ll need to use “OCRStyles” instead of “Styles”. Try this command instead and see if that works:

set the defaultTextPlatform.OCRstyles.default.validCharacters to “”

I guess spelling is important :roll:

This definitely has helped as it’s no longer matching the wrong thing. I continue to have some problems with the green matching box being offset on some matches and occasionally reporting the wrong location for the match. Am working with Jonathan on this.

Thanks very much for your reports, we really appreciate them. We recognize that consistent script behavior is paramount to your testing efforts and that even with new OCR matching options you need for the script behavior to be predictable.

Based on feedback from yourself and others we have decided to make the new matching features of the OCR algorithms opt-in for version 11.21 as described in the release notes. That version is available for download now from our downloads page.

Please let us know if 11.21 does not resolve your issue and thanks for using eggPlant.

We’ll be issuing a fix for this shortly. We determined that the problem with the “green box” is that its size is based on the length of the string without spaces, so the box is shorter than it should be and its horizontal middle is shifted to the left of where it would be if the box were the length of the actual string.
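The arithmetic behind that offset can be sketched in a few lines of Python. This is a simplified illustration only: it assumes every character has the same width and that the box is anchored at its left edge, and the function name box_for and the pixel values are hypothetical, not taken from the product.

```python
def box_for(text, left, char_width, count_spaces=True):
    """Return (left, right, center_x) of a highlight box for `text`.

    Assumes fixed-width characters and a box anchored at `left`.
    With count_spaces=False the width ignores spaces, which is the
    behavior described for the misplaced green box.
    """
    length = len(text) if count_spaces else len(text.replace(" ", ""))
    right = left + length * char_width
    return left, right, (left + right) / 2


if __name__ == "__main__":
    term = "UNITED STATES"  # 13 characters, 12 without the space
    print(box_for(term, left=100, char_width=10))
    # (100, 230, 165.0)
    print(box_for(term, left=100, char_width=10, count_spaces=False))
    # (100, 220, 160.0)
```

Under these assumptions each space in the search term makes the box one character width too short and moves its center left by half a character width, so terms with more spaces show a larger offset.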

Working on the same app that Jerry is working on automating. (eggPlant v. 11.21-Mac (1203090113) )
I am currently ‘fixing’ a script that was working correctly before. The scenario is a drop-down list of countries with ‘UNITED STATES’ listed at the top and everything else listed below. I have a search rectangle set to the size of the drop-down menu and am doing an imagefound(15,(Text:CountryName)), currently looking for ‘UNITED STATES’. It is unable to find ‘UNITED STATES’; however, if I log readtext from a point near those words, I receive back ‘UNITED STATES’. If I log readtext for the search rectangle points, I receive back a string of country names separated by newlines:

3/12/12 3:32:42 PM	set		SEARCHRECTANGLE = ((535,322),(875,747))
	repeat until imagefound(10,(Text:Country))
3/12/12 3:32:55 PM	imagefound		Unable to Find Image (TEXT:"UNITED STATES") within 10.00 seconds
log readtext (550,330)
3/12/12 3:32:58 PM	log		UNITED STATES
result: (541,334,630,343)
log readtext ((535,322),(875,747))
3/12/12 3:38:45 PM	readtext		(535,322,875,747)
3/12/12 3:38:45 PM	log		UNITED STATES
AFGHANISTAN
&LAND ISLANDS
ALBANIA
ALGERIA
AMERICAN SAMOA
ANDORRA
ANGOLA
ANGUILLA
ANTARCTICA
ANTIGUA AND BARBUDA
ARGENTINA
ARMENIA
ARUBA
AUSTRALIA
AUSTRIA
AZERBAIJAN
BAHAMAS
BAHRAIN

Am I doing something wrong with the imagefound syntax? I could have sworn this used to work properly on 11.11

It’s not you; it did work that way with 11.11.

In 11.20 - we set it to IgnoreSpaces (in search terms) by default, but that caused a variety of false positives.

11.21 switched that default back to where it was in 11.11 but did not completely disable the default IgnoreSpaces setting for TERMS that included a whitespace themselves.

11.22 completely reverts the default behavior, but still allows you to specify IgnoreSpaces: YES in cases where you might find that helpful in the future.
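As a rough illustration of why ignoring spaces can produce false positives, here is a small Python sketch. It assumes, purely for the sake of the example, that ignoring spaces means stripping them from both the search term and the recognized text before comparing; the function name text_found is hypothetical and the real IgnoreSpaces implementation may work differently.

```python
def text_found(search, screen_text, ignore_spaces=False):
    """Toy text search: substring match, optionally with all spaces removed.

    Assumption for this sketch: ignoring spaces strips them from both
    the search term and the text read from the screen.
    """
    if ignore_spaces:
        search = search.replace(" ", "")
        screen_text = screen_text.replace(" ", "")
    return search in screen_text


if __name__ == "__main__":
    # With exact matching, a word split by a space is not the search term.
    print(text_found("ISLAND", "IS LAND AHEAD"))
    # With spaces ignored, the same text now counts as a match.
    print(text_found("ISLAND", "IS LAND AHEAD", ignore_spaces=True))
```

In this model the opt-in setting helps when OCR inserts or drops a space inside a term, at the price of matching text that only looks the same once the spaces are gone.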

Cool, thanks. It is working again with the .22 update