Monday, July 23, 2007

Little RegEx Examples in .NET

If you want to use any of the special characters (viz. [, \, ^, $, ., |, ?, *, +, ( and )) as a literal in a regex, you need to escape them with a backslash.

e.g. The dot before the top-level domain in the email pattern is escaped as \. so that it matches only a literal dot:

\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b
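To illustrate the escaping rule, here is a minimal sketch. It uses Python's re module purely as a stand-in (the post itself targets .NET, where Regex.Escape plays the role of re.escape), and the sample strings are made up for the example.

```python
import re

# Unescaped dot: matches any character, so "1x5" is accepted by mistake.
loose = re.compile(r"1.5")
# Escaped dot: matches only a literal ".".
strict = re.compile(r"1\.5")

print(bool(loose.fullmatch("1x5")))   # True
print(bool(strict.fullmatch("1x5")))  # False
print(bool(strict.fullmatch("1.5")))  # True

# re.escape adds the backslashes for you when the literal text comes from data.
literal = "a+b (c)"
escaped = re.escape(literal)
print(bool(re.fullmatch(escaped, literal)))  # True
```

Without escaping, the same text used as a pattern would treat + and the parentheses as operators and fail to match itself.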

Most regular expression flavors treat the brace { as a literal character, unless it is part of a repetition operator like {2,6}.
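A small sketch of the brace behaviour, shown with Python's re module as an assumed stand-in for .NET. Other flavors may differ in edge cases, so escaping the brace is the safe choice when in doubt.

```python
import re

# "{2,4}" after a token is a repetition operator: two to four digits.
quantified = re.compile(r"\d{2,4}")
# Here "{" does not start a valid quantifier, so it is taken literally.
placeholder = re.compile(r"{name}")
# Escaping the braces is always safe and makes the intent explicit.
explicit = re.compile(r"\{name\}")

print(bool(quantified.fullmatch("123")))           # True
print(bool(quantified.fullmatch("1")))             # False
print(bool(placeholder.search("Hello {name}")))    # True
print(explicit.search("Hello {name}").group())     # {name}
```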

Using character classes, you can match text made up of a particular set of characters.

e.g. To allow alphanumeric characters and underscores, use [A-Za-z_0-9]*
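As a rough sketch of how this class could be used for validation (Python's re module is an assumed stand-in for .NET, and the helper name is_word is hypothetical):

```python
import re

# [A-Za-z_0-9]* : zero or more letters, digits, or underscores.
identifier = re.compile(r"[A-Za-z_0-9]*")

def is_word(text):
    """Return True when every character in text is alphanumeric or "_"."""
    return identifier.fullmatch(text) is not None

print(is_word("user_name42"))  # True
print(is_word("user-name"))    # False
print(is_word(""))             # True: "*" also allows the empty string
```

If the empty string should be rejected, replacing * with + requires at least one character.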

The only special characters or metacharacters inside a character class are the closing bracket (]), the backslash (\), the caret (^) and the hyphen (-).
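The following sketch shows this rule in action, again using Python's re module as an assumed stand-in for .NET. The sample strings are invented for the example.

```python
import re

# Inside [...] the dot, dollar, star and plus are ordinary characters.
punctuation = re.compile(r"[.$*+]")
print(punctuation.findall("a.b$c*d+e"))  # ['.', '$', '*', '+']

# The four special ones are escaped here so they are matched literally.
special = re.compile(r"[\]\\\^\-]")
print(special.findall(r"x]y\z^w-v"))  # [']', '\\', '^', '-']

# An unescaped hyphen between two characters forms a range,
# and a leading caret negates the class.
print(re.findall(r"[a-c]", "abcd-"))   # ['a', 'b', 'c']
print(re.findall(r"[^a-c]", "abcd-"))  # ['d', '-']
```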

Examples

1) Email Regular Expression (use it with RegexOptions.IgnoreCase, because the pattern lists only uppercase letters)

\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b
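A quick way to try the email pattern is sketched below. Python's re module stands in for .NET here (re.IGNORECASE corresponds to RegexOptions.IgnoreCase), and the addresses are made-up samples.

```python
import re

# Pattern from the post; IGNORECASE is needed because it lists only A-Z.
EMAIL = re.compile(r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b", re.IGNORECASE)

text = "Contact alice@example.com or bob.smith@mail.example.org today."
print(EMAIL.findall(text))
# ['alice@example.com', 'bob.smith@mail.example.org']

# {2,4} limits the last label to four letters, so longer ones are rejected.
print(EMAIL.fullmatch("user@example.museum"))  # None
```

The {2,4} limit is worth keeping in mind: the pattern rejects addresses whose top-level domain is longer than four letters.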

2) HTML Tag matching - Regular Expression

<([A-Z][A-Z0-9]*)[^>]*>(.*?)</\1>

This is a very simple example and can't be used in real scenarios, as it does not handle nested HTML tags correctly.
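The sketch below tries the tag pattern on two invented snippets, using Python's re module as an assumed stand-in for .NET. The second snippet shows the nesting limitation: the lazy (.*?) stops at the first closing tag with the same name.

```python
import re

# Pattern from the post; \1 refers back to the tag name captured by group 1.
TAG = re.compile(r"<([A-Z][A-Z0-9]*)[^>]*>(.*?)</\1>", re.IGNORECASE)

m = TAG.search('<p class="note">Hello <b>world</b></p>')
print(m.group(1))  # p
print(m.group(2))  # Hello <b>world</b>

# Nested tags with the same name: the match ends at the first </div>.
nested = TAG.search("<div>outer <div>inner</div> tail</div>")
print(nested.group(0))  # <div>outer <div>inner</div>
```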

3) Do you want to write a RegEx for parsing code written in Classic ASP 3.0 style?

The following pattern can help with the matching. It is also a beautiful example of using named captures in regular expressions.

(?<tbefore>.*?)(?<allcode><%(?<equals>=)*\s*(?<code>.*?)%>)(?<endtext>.*)

Here, the group named tbefore captures all text before the opening code tag <%.

allcode captures the whole code block, including the <% and %> delimiters.

equals captures the = sign when the block is written as <%= ... %>.

code captures only the code between <% and %>.

endtext captures all text after this code block.
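To see what each named group holds, here is a minimal sketch. It is written in Python as an assumed stand-in, so the .NET syntax (?<name>...) becomes (?P<name>...); in .NET the values would be read with match.Groups["code"].Value instead. The sample line of ASP markup is invented.

```python
import re

# Same pattern as in the post, with Python's (?P<name>...) group syntax.
ASP_BLOCK = re.compile(
    r"(?P<tbefore>.*?)"
    r"(?P<allcode><%(?P<equals>=)*\s*(?P<code>.*?)%>)"
    r"(?P<endtext>.*)"
)

sample = "<p>Hello <%= userName %>, welcome!</p>"
m = ASP_BLOCK.match(sample)

print(repr(m.group("tbefore")))  # '<p>Hello '
print(repr(m.group("allcode")))  # '<%= userName %>'
print(repr(m.group("equals")))   # '='
print(repr(m.group("code")))     # 'userName ' (trailing space is kept)
print(repr(m.group("endtext")))  # ', welcome!</p>'
```

Note that . does not match a newline by default, so a code block spanning several lines would need the single-line option (RegexOptions.Singleline in .NET, re.DOTALL in Python).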
