Wednesday, October 14, 2015

Regular Expression Primer

I feel like I've finally cracked the code on regular expressions.  This is my attempt to document the essentials of regular expressions using JavaScript.  A regular expression essentially describes a pattern of text.

Generally, you use a regular expression (regex) in one of two ways:

  1. Test to see whether a specified string matches a regex exactly.  This is mainly used for validating input.
  2. Extract all instances of the pattern from a larger string of text.  The main use here is to replace the instances of the pattern with something else.
The easiest way to perform the first use is to write the regular expression inline as a literal and then call its test method like so:

/^[0-9]+$/.test(inputToTest)

The test method returns true or false, based on whether the input string was a match.  The forward slashes simply delimit the regular expression.  The ^ indicates the beginning of the string, and the $ indicates the end of the string.  [0-9] matches any single digit, and the plus means "one or more of the preceding item".  So, only strings made up entirely of one or more digits will match the regex.
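Here is a quick sketch of that check in action.  The isAllDigits name and the sample inputs are just mine for illustration; the last line shows what happens if you leave off the ^ and $ anchors.

```javascript
// isAllDigits is a made-up helper name for this example.
function isAllDigits(inputToTest) {
  // ^ and $ force the whole string to consist of digits.
  return /^[0-9]+$/.test(inputToTest);
}

console.log(isAllDigits("12345"));    // true
console.log(isAllDigits("12a45"));    // false: contains a letter
console.log(isAllDigits(""));         // false: + needs at least one digit

// Without the anchors, any run of digits anywhere in the string is enough.
console.log(/[0-9]+/.test("12a45"));  // true
```

That last case is why the anchors matter for validating input.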

The second use often involves replacing the matching text.  You can use a regex as a parameter when calling the string.replace method.  Let's say you want to replace all numbers in a string with the letter x.  Here is how you do it:

var results = inputString.replace(/[0-9]+/g, "x");

What's the "g" for?

That's the global flag.  It says to replace every match in the string with "x".  Without it, only the first match is replaced.
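To see the difference the flag makes, here is a small side-by-side sketch.  The sample string and variable names are made up for the example.

```javascript
// Sample text is invented for illustration.
var roomText = "Room 101, floor 3";

// No g flag: only the first run of digits is replaced.
var firstOnly = roomText.replace(/[0-9]+/, "x");

// With the g flag: every run of digits is replaced.
var allMatches = roomText.replace(/[0-9]+/g, "x");

console.log(firstOnly);   // Room x, floor 3
console.log(allMatches);  // Room x, floor x
```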

Now, let's talk about parentheses.  Wrapping part of a regex in parentheses means you can extract just the text matched by that part.  Let's say you want to find every word wrapped in curly braces, and replace each one with just the text inside the curly braces.

var results = inputString.replace(/\{([A-Za-z]+)\}/g, "$1");

The $1 refers to the text matched by the first pair of parentheses.  So we would replace "{test}" with just "test".  Notice that the backslash is used to escape the curly brace character.
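Here is the same replacement run against a sample string, as a rough sketch.  The template text and variable names are invented for the example; the second call shows that $1 can sit inside a longer replacement, and the exec call pulls out the captured text directly.

```javascript
// Sample text is invented for illustration.
var template = "Hello {name}, welcome to {place}!";
var braces = /\{([A-Za-z]+)\}/g;

// Keep only the text captured inside the braces.
var stripped = template.replace(braces, "$1");

// $1 can be combined with other replacement text.
var bracketed = template.replace(braces, "[$1]");

// exec returns an array: index 0 is the whole match, index 1 is the first group.
var inner = /\{([A-Za-z]+)\}/.exec("{test}")[1];

console.log(stripped);   // Hello name, welcome to place!
console.log(bracketed);  // Hello [name], welcome to [place]!
console.log(inner);      // test
```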

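Sometimes a parenthesis is just a parenthesis and doesn't have to be referred to later.  You can use a pair just to group part of the pattern, for example so that a plus applies to the whole group instead of a single character.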
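A small sketch of grouping, with made-up patterns and variable names.  It also uses JavaScript's (?: ) syntax, which I haven't covered above: it groups the same way but doesn't capture anything, so there is no $1 to refer to.

```javascript
// Parentheses make the + apply to the whole "ha", not just the "a".
var groupedRepeat = /^(ha)+$/;

// Without parentheses, + applies only to the "a".
var ungrouped = /^ha+$/;

// (?: ) groups without capturing.
var nonCapturing = /^(?:ha)+$/;

console.log(groupedRepeat.test("hahaha"));  // true
console.log(ungrouped.test("hahaha"));      // false
console.log(ungrouped.test("haaa"));        // true
console.log(nonCapturing.test("hahaha"));   // true
```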

These are really dumb examples to keep things simple, but these techniques are very powerful.  The best reference I've found for dealing with regex is here.

