stage, the first three characters in the regular expression pattern have been matched. Finally, an attempt is made to match the pattern [0-9]*, which means “Match zero or more numeric characters.” Because the character after C is a newline character, there are no numeric digits. Because there are exactly zero numeric digits after the uppercase C of ABC, there is a match (of zero numeric digits). Because all components of the pattern match, the whole pattern matches.

Why does the part number ABC8899 also match? When the regular expression engine is at the position immediately before the A of ABC8899, it attempts to match the next character in the part number with an uppercase A. Because the first character of the part number ABC8899 is an uppercase A, there is a match.

Next, attempts are made to match an uppercase B and an uppercase C. These too match. At that stage, the first three characters in the regular expression pattern have been matched. Finally, an attempt is made to match the pattern [0-9]*, which means “Match zero or more numeric characters.” Four numeric digits follow the uppercase C. Because there are exactly four numeric digits after the uppercase C of ABC, there is a match (of four numeric digits, which meets the criterion “zero or more numeric digits”). Because all components of the pattern match, the whole pattern matches.

Work through the other part numbers step by step, and you’ll find that each ought to match the pattern ABC[0-9]*.
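The walkthrough above can be checked with a short sketch. It uses Python's re module as a stand-in for the regular expression engine in OpenOffice.org Writer, which is an assumption made for illustration; the helper name digits_matched is hypothetical and does not come from the text. Only the two part numbers discussed above are used.

```python
import re

# Pattern from the text: literal ABC followed by zero or more digits.
STAR_PATTERN = re.compile(r"ABC[0-9]*")

def digits_matched(part_number):
    """Return the digits consumed by [0-9]*, or None if ABC itself fails.

    Hypothetical helper, written only to illustrate the walkthrough.
    """
    m = STAR_PATTERN.match(part_number)
    if m is None:
        return None
    # Drop the three literal characters "ABC"; the rest is what [0-9]* matched.
    return m.group()[3:]

# The two part numbers discussed in the text.
for part in ["ABC", "ABC8899"]:
    digits = digits_matched(part)
    print(f"{part}: matched, with {len(digits)} numeric digit(s)")
```

Both part numbers match: the first with zero digits after the C, the second with four, which is the behavior the step-by-step description arrives at.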

The + Quantifier

There are many situations where you will want to be certain that a character or group of characters is present at least once but also allow for the possibility that the character occurs more than once. The + cardinality operator is designed for that situation. The + operator means “Match one or more occurrences of the chunk that precedes me.”

Take a look at the example with Parts.txt, but look for matches that include at least one numeric digit. You want to find part numbers that begin with the uppercase characters ABC and then have one or more numeric digits.

You can express the problem definition like this:

Match an uppercase A. If there is a match, attempt to match an uppercase B. If there is a match, attempt to match an uppercase C. If all three uppercase characters match, attempt to match one or more numeric digits.

Use the following pattern to express that problem definition:

ABC[0-9]+
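As a quick check of what this pattern accepts, here is a minimal sketch in Python. The re module stands in for the engine in OpenOffice.org Writer, and the function name has_digits_after_abc is a hypothetical one chosen for this example.

```python
import re

# Pattern from the text: literal ABC followed by one or more digits.
PLUS_PATTERN = re.compile(r"ABC[0-9]+")

def has_digits_after_abc(part_number):
    """Return True if the text contains ABC followed by at least one digit."""
    return PLUS_PATTERN.search(part_number) is not None

print(has_digits_after_abc("ABC"))      # False: no digit follows the C
print(has_digits_after_abc("ABC8899"))  # True: four digits follow the C
```

The bare part number ABC, which a pattern ending in [0-9]* would accept, is rejected here because + requires at least one digit.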

Matching One or More Numeric Digits

1. Open OpenOffice.org Writer, and open the sample file Parts.txt.
2. Use Ctrl+F to open the Find And Replace dialog box.
3. Check the Regular Expressions and Match Case check boxes.
4. Enter the pattern ABC[0-9]+ in the Search For text box; click the Find All button, and inspect the matching part numbers that are highlighted, as shown in Figure A-18.
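For readers without OpenOffice.org Writer at hand, a rough equivalent of the Find All step can be sketched in Python. This is an approximation under stated assumptions: Python's re module is not the Writer engine, the function name find_all_parts is hypothetical, and the sample text below is invented for illustration (only ABC and ABC8899 appear in the discussion above; the real contents of Parts.txt are not reproduced here).

```python
import re

def find_all_parts(text):
    """Return every substring matching ABC[0-9]+, case-sensitively.

    Python's re is case-sensitive by default, which plays the role of
    the Match Case check box.
    """
    return re.findall(r"ABC[0-9]+", text)

# Hypothetical stand-in for Parts.txt, one part number per line.
sample_text = """ABC
ABC123
abc456
ABC8899
ABD12
"""

print(find_all_parts(sample_text))  # ['ABC123', 'ABC8899']
```

The lowercase abc456 is skipped because matching is case-sensitive, and the bare ABC is skipped because no digit follows it.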

Appendix A: Simple Regular Expressions
