RegEx

A regular expression (RegEx) is a sequence of characters that defines a search pattern for text.
 



Literal Characters


  • Characters that match themselves, such as a-z and A-Z

Meta characters


  • Characters with a special meaning (anything other than literals)
Meta Character    WTF is it?
\d                any digit from 0-9
\w                A-Z, a-z, 0-9, and _ (alphanumeric and underscore)
\W                anything other than A-Z, a-z, 0-9, and _
\s                whitespace
\S                anything other than whitespace
.                 any character (except a line break, by default)
*                 0 or more of the preceding item
.*                wildcard (matches anything)
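A minimal JavaScript sketch of the meta characters listed above. The sample string is made up for illustration, and the g flag (covered under Flags) is used only so that match() returns every hit.

```javascript
// Hypothetical sample text, chosen only for illustration.
const sample = "Room 42, floor_3!";

const digits = sample.match(/\d/g);      // each single digit
const wordRuns = sample.match(/\w+/g);   // runs of letters, digits, and underscore
const nonWord = sample.match(/\W/g);     // everything that \w does not match
const spaces = sample.match(/\s/g);      // whitespace characters
const anything = sample.match(/.*/)[0];  // .* swallows the whole line

console.log(digits);   // [ '4', '2', '3' ]
console.log(wordRuns); // [ 'Room', '42', 'floor_3' ]
console.log(nonWord);  // [ ' ', ',', ' ', '!' ]
console.log(spaces);   // [ ' ', ' ' ]
console.log(anything); // Room 42, floor_3!
```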

Quantifiers


Quantifier    WTF is it?
*             0 or more
+             1 or more (at least one)
?             0 or 1 (optional)
{min,max}     between min and max repetitions
{n}           exactly n repetitions
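To see how the quantifiers differ, here is a small JavaScript sketch that runs each one against the same made-up word list. The ^ and $ anchors (see Position) are an addition so that each pattern has to cover the whole word.

```javascript
// Hypothetical inputs: "c", then zero to three "a"s, then "t".
const words = ["ct", "cat", "caat", "caaat"];

const keep = (pattern) => words.filter((word) => pattern.test(word));

const star = keep(/^ca*t$/);       // * allows zero or more "a"
const plus = keep(/^ca+t$/);       // + needs at least one "a"
const optional = keep(/^ca?t$/);   // ? allows zero or one "a"
const range = keep(/^ca{2,3}t$/);  // {2,3} needs two or three "a"
const exact = keep(/^ca{2}t$/);    // {2} needs exactly two "a"

console.log(star);     // [ 'ct', 'cat', 'caat', 'caaat' ]
console.log(plus);     // [ 'cat', 'caat', 'caaat' ]
console.log(optional); // [ 'ct', 'cat' ]
console.log(range);    // [ 'caat', 'caaat' ]
console.log(exact);    // [ 'caat' ]
```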

Position


Position    WTF is it?
^           beginning
$           end
\b          word boundary
\B          not a word boundary
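A short JavaScript sketch of the position markers, using a made-up string. The replace() calls only bracket the spot where each pattern matched, which makes the boundary behaviour easier to see.

```javascript
// Hypothetical text: "cat" as a whole word, at the end of a word, and at the start of one.
const text = "cat concat cats";

const startsWithCat = /^cat/.test(text); // ^ anchors at the beginning
const endsWithCat = /cat$/.test(text);   // $ anchors at the end; the text ends with "cats"

// \b on both sides: "cat" only when it stands alone as a word.
const wholeWord = text.replace(/\bcat\b/g, "[cat]");

// \B in front: "cat" only when a word character comes right before it.
const insideWord = text.replace(/\Bcat/g, "[cat]");

console.log(startsWithCat); // true
console.log(endsWithCat);   // false
console.log(wholeWord);     // [cat] concat cats
console.log(insideWord);    // cat con[cat] cats
```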

Flags


Flag    WTF is it?
g       global (find every match instead of stopping at the first)
i       case insensitive
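The effect of the two flags can be sketched in JavaScript like this. The sample string is hypothetical.

```javascript
// Hypothetical text containing the same word in three different casings.
const line = "Cat cat CAT";

const first = line.match(/cat/);           // no flags: first case-sensitive match only
const allLower = line.match(/cat/g);       // g: every case-sensitive match
const firstAnyCase = line.match(/cat/i);   // i: first match, ignoring case
const allAnyCase = line.match(/cat/gi);    // g and i combined

console.log(first[0], first.index); // cat 4
console.log(allLower);              // [ 'cat' ]
console.log(firstAnyCase[0]);       // Cat
console.log(allAnyCase);            // [ 'Cat', 'cat', 'CAT' ]
```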

Character Classes


character class ⇒ [xyz] ⇒ x or y or z
  • [.] ⇒ inside a class the dot is not a meta character; it is a literal dot.
  • To match a literal dash, put it first (or last) in the class: [-.] matches - or .
  • \. is also a literal dot.
  • [-] matches -
  • [a-z] ⇒ matches any letter from a to z; here the dash marks a range, not a literal -
  • [^abc] ⇒ any character that is not a, b, or c
  • \b[a-zA-Z]{4}\b ⇒ only words made of exactly 4 letters
  • \b[A-Z][a-z]*\b ⇒ a capital letter followed by 0 or more lowercase letters
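The character class rules above can be tried out with a small JavaScript sketch. Both input strings are made up for illustration.

```javascript
// Hypothetical inputs.
const symbols = "a.b-c";
const sentence = "Mary had a tiny lamb named Bo";

const dots = symbols.match(/[.]/g);          // literal dot inside a class
const dashOrDot = symbols.match(/[-.]/g);    // dash placed first, so it is literal
const notAbc = symbols.match(/[^abc]/g);     // anything except a, b, or c

const fourLetters = sentence.match(/\b[a-zA-Z]{4}\b/g);  // words of exactly 4 letters
const capitalized = sentence.match(/\b[A-Z][a-z]*\b/g);  // capital, then lowercase letters

console.log(dots);        // [ '.' ]
console.log(dashOrDot);   // [ '.', '-' ]
console.log(notAbc);      // [ '.', '-' ]
console.log(fourLetters); // [ 'Mary', 'tiny', 'lamb' ]
console.log(capitalized); // [ 'Mary', 'Bo' ]
```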

Alternation


  • (net|com) ⇒ net or com; (a|b) ⇒ a or b
Suppose we are looking for email addresses.
  • The last part can be .edu, .com, .in, or .dev, so we wrap the alternatives in () and separate them with |
  • (\w+\.)?\w+@\w+\.(com|in)
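Here is a rough JavaScript sketch of the email pattern from the notes. The addresses are invented for this example (they are not the ones the notes originally referred to), and the pattern is used exactly as written, so it only accepts .com and .in.

```javascript
// Pattern copied from the notes: optional "name." prefix, then name@domain.com or .in
const emailPattern = /(\w+\.)?\w+@\w+\.(com|in)/;

// Hypothetical addresses, made up for this sketch.
const candidates = [
  "john.doe@example.com",
  "asha@example.in",
  "sam@example.org",
];

const matched = candidates.filter((address) => emailPattern.test(address));
console.log(matched); // [ 'john.doe@example.com', 'asha@example.in' ]

// Spaces inside a group are literal, so keep the alternatives tight.
const tight = /(net|com)/.test("example.net");    // true
const spaced = /(net | com)/.test("example.net"); // false: needs "net " or " com"
console.log(tight, spaced);
```

Because the pattern is not anchored, an ending such as .info would also pass (it starts with "in"); adding $ after the group is one way to tighten it.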

Capturing groups


  • Suppose I have names like this
john, doe durga, prasad loki, danne sai, venkat
  • I want to swap the order, like doe john, prasad durga, and so on
  • First we find the pattern and wrap the parts we want to rearrange in groups
  • (\w+),\s+(\w+)
  • $0 ⇒ gives the whole match (JavaScript's replace uses $& for this)
  • $1 ⇒ gives the first group
  • $2 ⇒ gives the second group
  • So we use $2 $1 as the replacement and get the desired order.
    • Suppose I have markdown links and I want to convert them to HTML links
    • [google](https://google.com) [yahoo](https://yahoo.com)
    • I am capturing the text inside the [] as group 1
    • and the link as group 2
    • \[(\w+)\]\((.*?)\) (the lazy .*? stops at the first closing parenthesis; a greedy .* would run on to the last one when two links share a line)
    • now for replacing I will use
    • <a href="$2" alt="$1"> $1 </a>
    • <a href="https://google.com" alt="google"> google </a> <a href="https://yahoo.com" alt="yahoo"> yahoo </a>
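Both replacements from this section can be sketched in JavaScript as follows, reusing the sample data from the notes. The g flag is an addition so that every occurrence is replaced, and the link pattern uses a lazy .*? so that each link is matched on its own when several sit on one line.

```javascript
// Sample data taken from the notes above.
const names = "john, doe durga, prasad loki, danne sai, venkat";
const links = "[google](https://google.com) [yahoo](https://yahoo.com)";

// Group 1 is the word before the comma, group 2 the word after it.
const swapped = names.replace(/(\w+),\s+(\w+)/g, "$2 $1");

// Group 1 is the link text, group 2 the URL.
const html = links.replace(
  /\[(\w+)\]\((.*?)\)/g,
  '<a href="$2" alt="$1"> $1 </a>'
);

console.log(swapped); // doe john prasad durga danne loki venkat sai
console.log(html);
```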

Back References


  • Finding doubled words is one of the best examples
This is is is some text double double words not not sure why why this this is happening
  • We need to capture is is, double double, …
  • (\b\w+)\s\1\b
  • Here \1 refers to the text captured by group 1 and can be used inside the same RegEx
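A JavaScript sketch of the doubled-word search, run on the sample sentence from the notes. The second pattern is a variant that is not in the notes: it repeats the back reference so a whole run of repeats collapses into one word.

```javascript
// Sample sentence taken from the notes above.
const text =
  "This is is is some text double double words not not sure why why this this is happening";

// \1 must repeat exactly what group 1 captured.
const doubled = text.match(/(\b\w+)\s\1\b/g);

// Variant (an assumption, not from the notes): swallow every extra repeat
// and keep just one copy of the word.
const cleaned = text.replace(/\b(\w+)(\s\1\b)+/g, "$1");

console.log(doubled); // [ 'is is', 'double double', 'not not', 'why why', 'this this' ]
console.log(cleaned); // This is some text double words not sure why this is happening
```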

Test and Match


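const finder = new RegExp("xyz")
const regex = /hello/
  • regex.test("string")
    • returns true or false depending on whether there is a match.
  • string.match(regex)
    • gives an array of the matches (null if there are none); with the g flag it holds every match.
  • regex.exec("string")
    • returns an array for the next match, including its captured groups, or null if there is none.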
  • split and replace
    • string.split(regex)
      • returns an array
      • s.split(/\s/)
        • splits at each whitespace character
      • split at , or whitespace
        • s.split(/[\s,]/)
        • s.split(/[\s,]+/) - treats a run of commas and whitespace as one separator, so no empty strings appear between them
    • string.replace(regex, string)
      • IT DOES NOT MODIFY THE ORIGINAL STRING
      • if we want to change the original variable, assign the result back
        • s = s.replace(regExp, str)
      • s.replace(/\s/g, "-")
        • replaces every whitespace character with -
      • s.replace(/\w{6,8}/,"dp")
        • the first run of 6 to 8 word characters is replaced by dp (add the g flag to replace every run)
      • Suppose I have a string and every time I encounter a vowel I want to double it
        • var s = 'egg and bananan'
        • var r = /([aeiou])/ig
        • s.replace(r, "$1$1") ("$$" inserts a literal $, so "$$1" would produce the text $1 instead)
    • One more use case is passing a function as the replacement
      • if the word is longer than 2 characters, make it uppercase; otherwise leave it unchanged
        function k(match) {
          if (match.length > 2) {
            return match.toUpperCase()
          } else {
            return match
          }
        }
        let s = " the rainbow is of 7 colors namely vibgyor and I am dp"
        let r = /\b\w+\b/g
        s.replace(r, k)
        // the output is ' THE RAINBOW is of 7 COLORS NAMELY VIBGYOR AND I am dp'
        // if we capture groups, the function receives them too: function (match, group1, group2, ...) {}
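The methods in this section can be put side by side in one JavaScript sketch. All inputs except the vowel example are hypothetical, and the replacement callback shows the (match, group1, group2) form mentioned above.

```javascript
// Hypothetical text with two "number-number" ranges.
const text = "ranges: 1-5, 10-20";

// test(): is there a match at all?
const hasRange = /\d+-\d+/.test(text);

// match() with the g flag: every matched substring.
const all = text.match(/\d+-\d+/g);

// exec() with the g flag: one match per call, including groups, null when done.
const rangePattern = /(\d+)-(\d+)/g;
const pairs = [];
let found;
while ((found = rangePattern.exec(text)) !== null) {
  pairs.push([found[1], found[2]]);
}

// split(): one separator character at a time versus a whole run of separators.
const single = "a, b".split(/[\s,]/);
const runs = "a, b,,c d".split(/[\s,]+/);

// replace() with a function: the captured groups arrive as extra arguments.
const swapped = text.replace(/(\d+)-(\d+)/g, (match, low, high) => `${high}-${low}`);

// replace() with a pattern string: "$1$1" writes the captured vowel twice.
const doubledVowels = "egg and bananan".replace(/([aeiou])/gi, "$1$1");

console.log(hasRange);      // true
console.log(all);           // [ '1-5', '10-20' ]
console.log(pairs);         // [ [ '1', '5' ], [ '10', '20' ] ]
console.log(single);        // [ 'a', '', 'b' ]
console.log(runs);          // [ 'a', 'b', 'c', 'd' ]
console.log(swapped);       // ranges: 5-1, 20-10
console.log(doubledVowels); // eegg aand baanaanaan
```

One thing to watch: a regex with the g flag remembers its position in lastIndex between test() and exec() calls, which is why the sketch uses a separate pattern for the exec() loop.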
