
How does it work with the match for the part number ABC? When the regular expression engine is at the position immediately before the uppercase A of the part number ABC, it attempts to match an uppercase A. That matches. Next, an attempt is made to match an uppercase B. That too matches. Next, an attempt is made to match an uppercase C. That too matches. At that stage, the first three characters in the regular expression pattern have been matched. Finally, an attempt is made to match the pattern [0-9]{0,2}, which means “Match a minimum of zero and a maximum of two numeric characters.” Zero numeric digits follow the uppercase C in ABC. Because there are exactly zero numeric digits after the uppercase C of ABC, there is a match (zero numeric digits matches the criterion “a minimum of zero numeric digits” specified by the minimum-occurrence specifier of the {0,2} quantifier). Because the final component of the pattern matches, the whole pattern matches.
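
The walk-through above can be reproduced with a short Python sketch. It assumes that the full pattern under discussion is ABC[0-9]{0,2} (the three literal characters followed by the quantified character class), and it uses Python's re module rather than OpenOffice.org Writer, so treat it as an illustration only.

```python
import re

# Assumed full pattern: literal ABC, then zero to two digits.
pattern = re.compile(r"ABC[0-9]{0,2}")

match = pattern.search("ABC")
matched_text = match.group(0)            # text consumed by the whole pattern
digits_matched = len(matched_text) - 3   # characters consumed by [0-9]{0,2}
print(matched_text, digits_matched)
```

The quantified part consumes no characters at all, yet the overall match still succeeds, because zero occurrences satisfies the minimum of the {0,2} quantifier.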

What happens when matching is attempted on the line that contains the part number ABC8899? Why do the first five characters of the part number ABC8899 match? When the regular expression engine is at the position immediately before the A of ABC8899, it attempts to match the next character in the part number with an uppercase A and finds it is a match. Next, an attempt is made to match an uppercase B. That too matches. Then an attempt is made to match an uppercase C, which also matches. At that stage, the first three characters in the regular expression pattern have been matched. Finally, an attempt is made to match the pattern [0-9]{0,2}, which means “Match a minimum of zero and a maximum of two numeric characters.” Four numeric digits follow the uppercase C. Only two of those numeric digits are needed for a successful match. Because there are four numeric digits after the uppercase C of ABC8899, there is a match (of two numeric digits, which meets the criterion “a maximum of two numeric digits”), but the final two numeric digits of ABC8899 are not needed to form a match, so they are not highlighted. Because all components of the pattern match, the whole pattern matches.
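
The same behavior can be checked in Python. This is a minimal sketch, again assuming that the full pattern is ABC[0-9]{0,2}; the variable names are chosen here for illustration only.

```python
import re

# Assumed full pattern: literal ABC, then zero to two digits.
pattern = re.compile(r"ABC[0-9]{0,2}")

part_number = "ABC8899"
match = pattern.search(part_number)
matched_text = match.group(0)                # the part that would be highlighted
unmatched_tail = part_number[match.end():]   # digits left outside the match
print(matched_text, unmatched_tail)
```

The quantifier stops after two digits because that is its maximum, so the match covers the first five characters and the remaining two digits are left over.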

{n,m}

The minimum-occurrence specifier in the curly-brace syntax doesn’t have to be 0. It can be any number you like, provided it is not larger than the maximum-occurrence specifier.
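
As a rough illustration of that constraint, the sketch below checks whether a few patterns compile in Python's re module. How a reversed range such as {3,1} is handled is engine-specific; Python rejects it with re.error, and other tools may behave differently. The helper name compiles is hypothetical.

```python
import re

def compiles(pattern_text):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern_text)
        return True
    except re.error:
        return False

minimum_below_maximum = compiles(r"ABC[0-9]{1,3}")   # minimum smaller than maximum
minimum_equals_maximum = compiles(r"ABC[0-9]{2,2}")  # minimum equal to maximum
minimum_above_maximum = compiles(r"ABC[0-9]{3,1}")   # minimum larger than maximum
print(minimum_below_maximum, minimum_equals_maximum, minimum_above_maximum)
```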

Suppose you want to look for one to three occurrences of a numeric digit. You can specify this in a problem definition as follows:

Match an uppercase A. If there is a match, attempt to match an uppercase B. If there is a match, attempt to match an uppercase C. If all three uppercase characters match, attempt to match a minimum of one and a maximum of three numeric digits.

So if you wanted to match one to three occurrences of a numeric digit in Parts.txt, you would use the following pattern:

ABC[0-9]{1,3}

Figure A-20 shows the matches in OpenOffice.org Writer. Notice that the part number ABC does not match, because it has zero numeric digits, and you are looking for matches that have one through three numeric digits. Notice, too, that only the first three numeric digits of ABC8899 form part of the match.
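
The results described for the figure can be approximated in Python. The sample lines below are hypothetical stand-ins, since the actual contents of Parts.txt are not shown here; only ABC and ABC8899 are taken from the text.

```python
import re

# Literal ABC, then one to three digits.
pattern = re.compile(r"ABC[0-9]{1,3}")

# Hypothetical sample lines, not the real contents of Parts.txt.
sample_lines = ["ABC", "ABC1", "ABC12", "ABC123", "ABC8899"]

results = {}
for line in sample_lines:
    match = pattern.search(line)
    # Store the matched text, or None when the line does not match.
    results[line] = match.group(0) if match else None

for line, matched_text in results.items():
    print(line, "->", matched_text)
```

The line containing only ABC yields no match because at least one digit is required, and for ABC8899 the match ends after the third digit.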

The explanation in the preceding section for the {0,m} syntax should be sufficient to help you understand what is happening in this example.

Appendix A: Simple Regular Expressions
