A regular expression is a sequence of characters that defines a search pattern for text.
Table of Contents
- Literal Characters
- Meta characters
- Quantifiers
- Position
- Flags
- Character Classes
- Alternation
- Capturing groups
- Back References
- Test and Match
- References
Literal Characters
- Characters such as a-z and A-Z, which match themselves
Meta characters
- Characters with a special meaning (anything other than literals)
Meta Character | WTF is it? |
\d | any digit from 0-9 |
\w | A-Z a-z 0-9 _ (alphanumeric and underscore) |
\W | anything other than A-Z a-z 0-9 _ |
\s | whitespace |
\S | anything other than whitespace |
. | any character except a line break |
* | 0 or more of the preceding token |
.* | wildcard (matches any run of characters on a line) |
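A quick way to get a feel for the table above is to run each meta character against one string. This is only a sketch: the sample text is made up for illustration, and the `s` flag in the last line is not covered in these notes (it lets `.` match line breaks too).

```javascript
// Hypothetical sample text, not taken from the notes.
const sample = "Room 42, floor_3 B!";

const digits = sample.match(/\d/g);      // every single digit
const wordRuns = sample.match(/\w+/g);   // runs of letters, digits and underscore
const nonWord = sample.match(/\W/g);     // everything \w does not match
const spaces = sample.match(/\s/g);      // whitespace characters

const dotOnNewline = /./.test("\n");     // false: . skips line breaks by default
const dotWithSFlag = /./s.test("\n");    // true: the s flag lets . match them

console.log(digits, wordRuns, nonWord, spaces, dotOnNewline, dotWithSFlag);
```

Note that `floor_3` comes back as one run, because the underscore counts as a word character.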
Quantifiers
Quantifier | WTF is it? |
* | 0 or more |
+ | one or more (at least one) |
? | 0 or one (optional) |
{min,max} | between min and max repetitions |
{n} | exactly n repetitions |
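To compare the quantifiers side by side, here is a small sketch that applies each one to the letter a in the same word list. The words and the `matching` helper are made up for this example. The ^ and $ anchors (explained under Position) force the whole word to match.

```javascript
// Hypothetical word list: "c", some number of "a"s, then "t".
const inputs = ["ct", "cat", "caat", "caaat"];

// Helper (illustrative only): which inputs does a pattern accept?
const matching = (re) => inputs.filter((word) => re.test(word));

const zeroOrMore = matching(/^ca*t$/);     // ct, cat, caat, caaat
const oneOrMore = matching(/^ca+t$/);      // cat, caat, caaat
const optional = matching(/^ca?t$/);       // ct, cat
const exactlyTwo = matching(/^ca{2}t$/);   // caat
const twoToThree = matching(/^ca{2,3}t$/); // caat, caaat

console.log({ zeroOrMore, oneOrMore, optional, exactlyTwo, twoToThree });
```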
Position
Position | WTF is it? |
^ | beginning of the string |
$ | end of the string |
\b | word boundary |
\B | not a word boundary |
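The position tokens match a place in the text rather than a character. A minimal sketch with a made-up string shows the difference between \b and \B:

```javascript
// Hypothetical sample text.
const text = "cat concat cats";

const startsWithCat = /^cat/.test(text); // true
const endsWithCat = /cat$/.test(text);   // false, the string ends with "cats"

// \bcat\b only hits "cat" standing alone as a word.
const wholeWord = text.replace(/\bcat\b/g, "CAT");

// \Bcat only hits "cat" that starts in the middle of a word.
const insideWord = text.replace(/\Bcat/g, "CAT");

console.log(wholeWord);  // CAT concat cats
console.log(insideWord); // cat conCAT cats
```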
Flags
Flag | WTF is it? |
g | global (find every match, not just the first) |
i | case insensitive |
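The effect of the two flags is easiest to see with `match` on the same string. The sample text below is made up for illustration.

```javascript
// Hypothetical sample text.
const text = "Hello hello HELLO";

const noFlags = text.match(/hello/);        // first case-sensitive match only
const globalOnly = text.match(/hello/g);    // all case-sensitive matches
const insensitive = text.match(/hello/i);   // first match, ignoring case
const both = text.match(/hello/gi);         // every match, ignoring case

console.log(noFlags[0], noFlags.index); // hello 6
console.log(globalOnly);                // [ 'hello' ]
console.log(insensitive[0]);            // Hello
console.log(both);                      // [ 'Hello', 'hello', 'HELLO' ]
```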
Character Classes
character class ⇒ [xyz] ⇒ x or y or z
- [.] here the dot is not a meta character, it's a literal dot.
- a literal dash has to come first (or last) in the class: [-.]
- \. is also a literal dot.
- [-] matches -
- [a-z] ⇒ matches any letter from a to z; here the dash marks a range and does not match -
- [^abc] ⇒ any character that is not a, b or c
- \b[a-zA-Z]{4}\b ⇒ only words made of exactly 4 letters
- \b[A-Z][a-z]*\b ⇒ Capital Letter followed by 0 or more small letters
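The character class patterns listed above can be tried out like this. The sentence is invented for the example; only the patterns come from the notes.

```javascript
// Hypothetical sample sentence.
const text = "Alice paid 3.50 to bob-smith for Four used Books";

const dotsAndDashes = text.match(/[-.]/g);             // literal - and .
const fourLetters = text.match(/\b[a-zA-Z]{4}\b/g);    // words of exactly 4 letters
const capitalized = text.match(/\b[A-Z][a-z]*\b/g);    // Capital + lowercase letters
const onlyAbc = "abcd-abc".replace(/[^abc]/g, "");     // drop everything but a, b, c

console.log(dotsAndDashes); // [ '.', '-' ]
console.log(fourLetters);   // [ 'paid', 'Four', 'used' ]
console.log(capitalized);   // [ 'Alice', 'Four', 'Books' ]
console.log(onlyAbc);       // abcabc
```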
Alternation
- (net|com) ⇒ net or com, (a|b) ⇒ a or b. Spaces inside the group are matched literally, so don't pad the |.
Suppose we are looking for email addresses.
- The last part can be .edu, .com, .in, .dev, so we wrap the options in ()
- (\w+\.)?\w+@\w+\.(com|in)
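Here is a sketch of that email pattern in use. The addresses are placeholders made up for the example, and the anchored variant at the end is an addition that is not in the notes: without ^ and $ the pattern is happy as long as some part of the string fits.

```javascript
// Pattern from the notes: optional "name." prefix, then name@domain.(com|in)
const emailPattern = /(\w+\.)?\w+@\w+\.(com|in)/;

// Hypothetical addresses for illustration.
const candidates = ["john@example.com", "john.doe@example.in", "jane@example.org"];
const results = candidates.map((c) => emailPattern.test(c)); // [true, true, false]

// Spaces inside the group are literal: this needs "net " or " com".
const paddedMatchesNet = /(net | com)/.test("net"); // false
const tightMatchesNet = /(net|com)/.test("net");    // true

// Assumption: anchoring is usually wanted when validating a whole string.
const anchored = /^(\w+\.)?\w+@\w+\.(com|in)$/;
const looseOnLongTld = emailPattern.test("john@example.community"); // true
const strictOnLongTld = anchored.test("john@example.community");    // false

console.log(results, paddedMatchesNet, tightMatchesNet, looseOnLongTld, strictOnLongTld);
```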
Capturing groups
- Suppose I have names like this
john, doe
durga, prasad
loki, danne
sai, venkat
- I want to swap the order: doe john, prasad durga and so on
- First we write a pattern for the text and wrap the parts we want to move in groups
- (\w+),\s+(\w+)
- $0 ⇒ gives the whole match (in JavaScript's replace this is written $&)
- $1 ⇒ gives the first group
- $2 ⇒ gives the second group
- So the replacement is
$2 $1
and after replacing we get the desired order.
- Suppose I have markdown links and I want to convert them to HTML links
[google](https://google.com)
[yahoo](https://yahoo.com)
<a href="https://google.com" alt="google"> google </a>
<a href="https://yahoo.com" alt="yahoo"> yahoo </a>
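Both capturing group examples can be sketched in JavaScript with `replace`. The name pattern is the one from the notes. The markdown link pattern is an assumption, since the notes only show the input and the desired output: group 1 grabs the text inside [ ] and group 2 the URL inside ( ).

```javascript
// Swap "first, last" into "last first" using $2 and $1.
const names = "john, doe\ndurga, prasad\nloki, danne\nsai, venkat";
const swapped = names.replace(/(\w+),\s+(\w+)/g, "$2 $1");

// Assumed pattern for markdown links: [text](url)
const markdown = "[google](https://google.com) [yahoo](https://yahoo.com)";
const linkPattern = /\[([^\]]+)\]\(([^)]+)\)/g;
const html = markdown.replace(linkPattern, '<a href="$2" alt="$1"> $1 </a>');

console.log(swapped);
console.log(html);
```

The g flag matters in both calls. Without it only the first name or the first link would be rewritten.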
Back References
- capturing doubled words is one of the best examples
This is is is some text double double words not not sure why why this this is happening
- we need to capture is is, double double, …
- (\b\w+)\s\1\b
- Here \1 refers to the text captured by group 1, and it can be used inside the same regex
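Running the doubled-word pattern over the sample sentence looks like this. The second pattern, which collapses the repeats, is an extension that is not in the notes: it lets the " word" part repeat so that "is is is" shrinks to a single "is".

```javascript
const text =
  "This is is is some text double double words not not sure why why this this is happening";

// Pattern from the notes: a word, whitespace, then the same word again.
const doubled = text.match(/(\b\w+)\s\1\b/g);

// Assumed extension: collapse any run of repeats down to one word.
const cleaned = text.replace(/(\b\w+)(\s\1\b)+/g, "$1");

console.log(doubled);
// [ 'is is', 'double double', 'not not', 'why why', 'this this' ]
console.log(cleaned);
// This is some text double words not sure why this is happening
```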
Test and Match
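const finder = new RegExp("xyz")
const regex = /hello/
regex.test("string")
- returns true if the string contains a match, otherwise false.
string.match(regex)
- gives an array of the matches (all of them with the g flag, otherwise the first match plus its groups), or null when nothing matches.
regex.exec("string")
- returns the next match with its groups, or null when there is none.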
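A minimal sketch of the three methods on one made-up string. The loop at the end relies on the fact that a regex with the g flag remembers where the previous `exec` call stopped.

```javascript
// Hypothetical sample text.
const text = "hello world, hello regex";

const finder = new RegExp("world");       // same as /world/
const hasWorld = finder.test(text);       // true

const first = text.match(/hello/);        // first match, with .index
const all = text.match(/hello/g);         // every match
const none = text.match(/xyz/);           // null when nothing matches

// exec with the g flag steps through the matches one at a time.
const re = /hello (\w+)/g;
const captured = [];
let m;
while ((m = re.exec(text)) !== null) {
  captured.push(m[1]); // m[1] is the first capturing group
}

console.log(hasWorld, first.index, all, none, captured);
```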
split and replace
- string.split(regex)
- returns an array
s.split(/\s/)
- splits at each whitespace character
- split at , or whitespace
s.split(/[\s,]/)
s.split(/[\s,]+/)
- also splits at , or whitespace, but treats a run of them as a single separator
- string.replace(regex, string)
- IT DOES NOT MODIFY THE ORIGINAL STRING
- if we want to update the original variable, assign the result back
- s = s.replace(regExp, str)
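s.replace(/\s/g, "-")
- replaces every whitespace character with -
s.replace(/\w{6,8}/, "dp")
- the first run of 6 to 8 word characters is replaced by dp (add \b on both sides and the g flag to replace every whole word of that length)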
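The split and replace calls above behave like this on some made-up strings. The last two lines illustrate why \b is worth adding to the length pattern: without it, the first 8 characters of a longer word also count as a match.

```javascript
// Hypothetical sample with mixed separators (note the double space).
const s = "one, two,three  four";

const bySpace = s.split(/\s/);          // [ 'one,', 'two,three', '', 'four' ]
const byEither = s.split(/[\s,]/);      // [ 'one', '', 'two', 'three', '', 'four' ]
const byRuns = s.split(/[\s,]+/);       // [ 'one', 'two', 'three', 'four' ]

// replace returns a new string; the original is untouched.
const original = "a b c";
const dashed = original.replace(/\s/g, "-"); // "a-b-c"

const firstOnly = "my password is hunter12".replace(/\w{6,8}/, "dp");   // "my dp is hunter12"
const everyRun = "my password is hunter12".replace(/\w{6,8}/g, "dp");   // "my dp is dp"

const partial = "extraordinary".replace(/\w{6,8}/, "dp");       // "dpinary"
const wholeOnly = "extraordinary".replace(/\b\w{6,8}\b/, "dp"); // unchanged

console.log(bySpace, byEither, byRuns, original, dashed);
console.log(firstOnly, everyRun, partial, wholeOnly);
```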
- Suppose I have a string and every time I encounter a vowel I want to double it
var s = 'egg and bananan'
var r = /([aeiou])/ig
s.replace(r, "$1$1") // "$$1" would insert the literal text $1
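One thing to watch out for in replacement strings: `$$` is an escaped dollar sign, so `"$$1"` inserts the literal text `$1` instead of repeating the group. A short sketch of the difference, plus the `$&` form (the whole match), which needs no capturing group:

```javascript
const s = "egg and bananan";
const vowels = /([aeiou])/gi;

const literalDollar = s.replace(vowels, "$$1");    // $$ becomes a plain "$"
const doubled = s.replace(vowels, "$1$1");         // group 1 written twice
const sameThing = s.replace(/[aeiou]/gi, "$&$&");  // whole match written twice

console.log(literalDollar); // $1gg $1nd b$1n$1n$1n
console.log(doubled);       // eegg aand baanaanaan
console.log(sameThing);     // eegg aand baanaanaan
```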
- if the word is longer than 2 characters make it uppercase, otherwise leave it unchanged
function k(match) {
  // match is the text of the current match
  if (match.length > 2) {
    return match.toUpperCase()
  } else {
    return match
  }
}
let s = " the rainbow is of 7 colors namely vibgyor and I am dp"
let r = /\b\w+\b/g
s.replace(r, k)
// the output is ' THE RAINBOW is of 7 COLORS NAMELY VIBGYOR AND I am dp'
// if we capture groups, the function receives them too: function (match, group1, group2, ...) {}
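When the pattern has capturing groups, the replacer function gets them as extra arguments after the full match. Here is a small sketch that reuses the name pattern from the capturing groups section. The parameter names `first` and `last` are just labels chosen for this example.

```javascript
const names = "john, doe\ndurga, prasad";

// (match, group1, group2): swap the order and uppercase the second name.
const formatted = names.replace(/(\w+),\s+(\w+)/g, (match, first, last) => {
  return last.toUpperCase() + " " + first;
});

console.log(formatted);
// DOE john
// PRASAD durga
```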
References
- Amazing YouTube playlist: https://www.youtube.com/playlist?list=PLRqwX-V7Uu6YEypLuls7iidwHMdCM6o2w
- Amazing website for practicing: https://regexr.com/