How to parse text

How do I parse text? For example, I read all the text from the screen.
Then I would like to get the text line by line, split on "\n".
I tried the following two approaches, but neither works.
1) put ad split by "\n" into numbers_list
2) set the itemDelimiter to "\n"

Log ReadText((0,0,800,800))
set ad to ReadText((0,0,800,800))

put ad split by "\n" into numbers_list
log numbers_list

set the itemDelimiter to "\n"
put ad into numbers_list2
log numbers_list2

The OCR engine uses Unicode line endings, so the correct way to split is:

put myvar split by numToChar(8232) into myList

Hope this helps.

A couple of other thoughts:

First, be aware that SenseTalk treats quoted text literally, so "\n" in SenseTalk is two characters: the backslash “\” followed by the letter “n”. The only exception to this in Eggplant is the TypeText command, which translates some sequences like "\n" into special keystrokes on the SUT like Return. So your example code is actually looking for occurrences of backslash-n in the text, not for returns.

Next, Allen is correct that text returned by the ReadText function will use the Unicode lineSeparator character between lines of text. To make this easy to work with, beginning with version 11 of Eggplant the defaultLineDelimiter is set by default to be the set of all common line endings: (CRLF, LF, CR, LineSeparator, ParagraphSeparator). So you could get what you want by using the “each line of” expression:

put each line of ad into numbers_list

Finally, Allen loses one SenseTalk style point for using numToChar(8232) instead of the predefined variable LineSeparator which has the same value but is more readable (Allen, what were you thinking?!) :wink:
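
Putting those two points together, here is a minimal sketch of the split using the predefined variable instead of the numeric code. It reuses the ad and numbers_list names from the original question and assumes LineSeparator behaves as described above; treat it as an illustration rather than tested code.

```sensetalk
-- read the text from the screen region used in the question
set ad to ReadText((0,0,800,800))

-- split on the Unicode line separator that ReadText places between lines
put ad split by LineSeparator into numbers_list
log numbers_list
```

On version 11 or later, the “each line of” form shown earlier should give the same list without naming the delimiter at all.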

Thanks. That works.

How do I read the next item in the list?
For example, once I have found “Min”, I want to get the value of the following item. The items are split by "\n", so “Min:” and its value end up on separate lines.

repeat with each line in myList
if the first word of it is “Min”
then put next item into min
end repeat

This is a case where using an iterator would come in handy, as iterators give you much more control over your progress through their values. Iterators include lists, ranges, and custom iterator objects. You can test whether a value can act as an iterator by using the “is an iterator” operator:

put (1,2,3) is an iterator -- true
put 10..15 is an iterator -- true
put "abc" is an iterator -- false

As you can see, a text string is not an iterator so you’ll first have to convert the text into a list in order to use this technique.

put each line of sourceString into myList

Once you have a source value that can act as an iterator, you can iterate over its values using “repeat with each” but you now have the added control of being able to move the iteration pointer forward or backward (or skip it directly to any point in the list) during the repeat. Complete control is given by setting the currentIndex property of the list. But in your case, all you need to do is access the nextValue property to get the next value and advance the iterator. Here’s what it might look like:

set source to {{
Min:
100
Max:
300
}}
put each line of source into myList

repeat with each item of myList
	put myList.currentIndex && it
	if it begins with "Min" then put myList.nextValue into min
end repeat

put "Min is " & min

The example above gives this output:

1 Min:
3 Max:
4 300
Min is 100

So you can see that accessing the nextValue property advanced the currentIndex of the list, causing the repeat loop to proceed with item 3 on the next iteration.

Where can I find Unicode information?
I tried to split a string by a space.
It seems the Unicode value for space is 9248.

But the following statements don’t work.
For example, I have a string “good |” and I want to get the value “good”.
put string1 split by numToChar(9248) into list1
put item 1 of list1

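The space character is available as “space”. (9248 is the code for the visible “symbol for space” glyph, not the space character itself, which is 32.) So the following should work: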

put string1 split by space into list1
put item 1 of list1
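
As an alternative sketch, SenseTalk's word chunks (the same “first word of” form used earlier in this thread) may let you skip the split entirely, since words are separated by whitespace. The sample value of string1 below is taken from the question; whether this fits depends on how your real data is laid out.

```sensetalk
-- sample value from the question
set string1 to "good |"

-- take the first whitespace-delimited word, without building a list first
put the first word of string1
```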

The line ending is really the only thing being returned by an eggPlant function where the Unicode value is a consideration.

@Doug: Don’t blame Allen; I gave him the code. I didn’t realize that LineSeparator included that value.