Regular expressions derive their name from the fact that the strings they recognize are (in a formal computer science sense) “regular.” This implies that there are certain kinds of strings that are very hard, if not impossible, to recognize with regular expressions; the classic example is text with arbitrarily deep nesting, such as balanced parentheses. Luckily, these strings are not often encountered and usually arise only in parsing things like source code or natural language. If you can’t come up with a regular expression for a particular task, chances are that an expert could. However, there is a slight chance that what you want to do is actually impossible, so it never hurts to ask someone more knowledgeable than yourself.
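As a rough illustration of a task better handled in code than in a pattern, the sketch below checks whether parentheses are balanced to any depth by keeping a counter. The function name isBalanced is hypothetical and chosen only for this example.

```javascript
// Hypothetical helper: returns true when every "(" has a matching ")".
// A counter tracks nesting depth, which a JavaScript regular expression
// cannot do for arbitrarily deep nesting.
function isBalanced(text) {
  let depth = 0;
  for (const ch of text) {
    if (ch === "(") {
      depth += 1;
    } else if (ch === ")") {
      depth -= 1;
      // A closing parenthesis with nothing open can never be repaired later.
      if (depth < 0) return false;
    }
  }
  return depth === 0;
}

console.log(isBalanced("((a)(b))")); // true
console.log(isBalanced("(()"));      // false
```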
Another issue to keep in mind is that some regular expressions can have exponential complexity. In plain words, this means that it is possible to craft regular expressions that take a really, really long time to test strings against. This usually happens when the alternation operator (|) is used to give many complex, overlapping options, or when quantifiers are nested inside one another. If regular expressions are slowing down your script, consider simplifying them.
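The following sketch shows one way the problem can arise, assuming a backtracking regular expression engine of the kind browsers typically use. The two alternatives in slowPattern overlap, so when a match fails the engine may try a very large number of ways to split the input. The names slowPattern, fastPattern, and timeMatch are hypothetical, and actual timings will vary by engine and machine.

```javascript
// Overlapping alternatives: a run of "a" characters can be split into
// "a" and "aa" pieces in many different ways.
const slowPattern = /^(a|aa)+$/;
// Accepts exactly the same strings with no overlapping alternatives.
const fastPattern = /^a+$/;

// Hypothetical helper: reports the result and the elapsed milliseconds.
function timeMatch(pattern, input) {
  const start = Date.now();
  const matched = pattern.test(input);
  return { matched, ms: Date.now() - start };
}

// The trailing "b" guarantees failure, which is what forces the backtracking.
const input = "a".repeat(30) + "b";
console.log(timeMatch(slowPattern, input));
console.log(timeMatch(fastPattern, input));
```

Lengthening the run of "a" characters by even a few more makes the first pattern noticeably slower, while the simplified pattern is unaffected.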
A common gotcha when performing form validation with regular expressions is validating e-mail addresses. Most people aren’t aware of the variety of forms e-mail addresses can take. Valid e-mail addresses can contain punctuation characters like ! and +, and they can employ IP addresses instead of domain names (like user@[192.0.2.1]). You’ll need to do a bit of research and some experimentation to ensure that the regexps you create will be robust enough to match the types of strings you’re interested in. There are two lessons here. First, when performing form validation, always err on the side of being too permissive rather than too restrictive. Second, educate yourself on the formats the data you’re validating can take. For example, if you’re validating phone numbers, be sure to research common formats for phone numbers in other countries.
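To make the first lesson concrete, here is a minimal sketch comparing a deliberately permissive e-mail check with an overly strict one. Both patterns and their names (loosePattern, strictPattern) are assumptions made for illustration; neither is a complete validator.

```javascript
// Permissive: something, an @ sign, then something, with no whitespace.
const loosePattern = /^[^\s@]+@[^\s@]+$/;
// Too strict: allows only word characters and a single dot in the domain.
const strictPattern = /^\w+@\w+\.\w+$/;

const samples = [
  "user+tag@example.com",
  "o'brien!@example.org",
  "user@[192.0.2.1]",
  "not-an-address",
];

for (const address of samples) {
  console.log(address, loosePattern.test(address), strictPattern.test(address));
}
// user+tag@example.com true false
// o'brien!@example.org true false
// user@[192.0.2.1] true false
// not-an-address false false
```

The strict pattern turns away three addresses that a real user might legitimately enter, while the permissive one still catches the obviously malformed entry.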
And finally, it is important to remember that even the best-crafted pattern cannot test for semantic validity. For example, you might be able to verify that a credit card number has the proper format, but without more complicated server-side functionality, your script has no way to check whether the card is truly valid. Still, associating a syntax checker with forms to look at user-entered data such as credit card numbers is a convenient way to catch common errors before submission to the server.
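A format check of this kind might look like the sketch below. It assumes card numbers are 13 to 19 digits long and may be typed with spaces or hyphens; the function name looksLikeCardNumber is hypothetical. It says nothing about whether the card exists or is usable, which only the server can determine.

```javascript
// Hypothetical helper: checks format only, not whether the card is real.
function looksLikeCardNumber(input) {
  // Drop the spaces and hyphens people commonly type between digit groups.
  const digits = input.replace(/[\s-]/g, "");
  // Assumption: a plausible card number has between 13 and 19 digits.
  return /^\d{13,19}$/.test(digits);
}

console.log(looksLikeCardNumber("4111 1111 1111 1111")); // true
console.log(looksLikeCardNumber("4111-1111"));           // false
console.log(looksLikeCardNumber("4111 1111 1111 111x")); // false
```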