RegExp in JavaScript
Regular expressions are patterns used to match character combinations in strings.
Constructing a RegExp
var re = /ab+c/i;var re = new RegExp("ab+c", "i");
Lookahead
x(?=y)(Positive Lookahead)Matches
'x'only if'x'is followed by'y'.For example,
/Jack(?=Sprat)/matches'Jack'only if it is followed by'Sprat'./Jack(?=Sprat|Frost)/matches'Jack'only if it is followed by'Sprat'or'Frost'. However, neither ‘Sprat’ nor ‘Frost’ is part of the match results.
x(?!y)(Negative Lookahead)Matches
'x'only if'x'is not followed by'y'.For example,
/\d+(?!\.)/matches a number only if it is not followed by a decimal point. The regular expression/\d+(?!\.)/.exec("3.141")matches'141'but not'3.141'.
Lookahead does not consume characters in the string. It only asserts whether a match is possible or not.
As soon as the lookaround condition is satisfied, the regex engine forgets about everything inside the lookaround.
// Intersaction
// Match a 6+ letter password with at least:
// one number, one letter, and one symbol
var re = /^(?=.*\d)(?=.*[a-z])(?=.*[\W_]).{6,}$/i;
// Subtraction
// Any number that's NOT divisible by 5
var re = /\b(?!\d+[05])\d+\b/;
// Negation
// Anything that doesn't contain 'foo'
var re = /^(?!.*foo).+$/;Lookbehind
Lookbehind works backwards. It tells the regex engine to temporarily check backwards in the string.
(?<=y)x(Positive Lookbehind)Matches
'x'only if'x'is preceded by'y'.
(?<!y)x(Negative Lookbehind)Matches
'x'only if'x'is not preceded by'y'.For example,
(?<!\\)(\\\\)*\\$matches odd numbers of consecutive\.
Note: Lookbehind is not available in JavaScript.
Backreferences
(x)Matches
'x'and remembers the match. The parentheses are called capturing parentheses.
(?:x)Matches
'x'but does not remember the match. The parentheses are called non-capturing parentheses.In expression
/(?:foo){1,2}/. Without the non-capturing parentheses, the{1,2}characters would apply only to the last'o'in'foo'. With the capturing parentheses, the{1,2}applies to the entire word'foo'.
Including parentheses in a regular expression pattern causes the corresponding submatch to be remembered.
// match quote string "hello"
/('|").+?('|")/g.exec('"hello"');
// but false positive: "hello' or 'hello"
/('|").+?('|")/g.exec('"hello\'');
// better: match quote string "hello 'world'"
/('|").+?\1/g.exec('"hello \'world\'"');
// but false positive: "hello \"world\""
/('|").+?\1/g.exec('"hello "world""');
// best: can match quote string "hello \"world\""
/('|")(\\?.)*?\1/g.exec('"hello \\"world\\""');Furthermore, in string#replace method. the script can uses $1 and $2
to denote the first and second parenthesized substring matches.
"John Smith".replace(/(\w+)\s(\w+)/, "$2, $1");
// => "Smith, John"Special Characters
?
If used immediately after any of the quantifiers *, +, ?, or {},
makes the quantifier non-greedy.
For example, applying /\d+/ to "123abc" matches "123".
But applying /\d+?/ to that same string matches only the "1".
\b
Matches a word boundary. A word boundary matches the position where a word character is not followed or preceeded by another word-character. Note that a matched word boundary is not included in the match.
Examples:
/\bm/matches the'm'in"moon";/oo\b/does not match the'oo'in"moon", because'oo'is followed by'n'which is a word character;/oon\b/matches the'oon'in"moon", because'oon'is the end of the string, thus not followed by a word character;/\w\b\w/will never match anything, because a word character can never be followed by both a non-word and a word character.