Regular Expressions: a powerful search pattern language

Original article was published on Artificial Intelligence on Medium

Meet regular expressions (regex), a powerful search pattern language for quickly finding the text you’re looking for.

When you register an account for a social media application or complete an online order, each piece of data you enter into a web form is validated. Did you enter an email address that includes an @ symbol? Did you enter a telephone number 10 digits in length, with or without dashes and parentheses? And then there’s the classic of them all: did your new password meet the requirements for inclusion (and exclusion) of symbols, digits, and both upper and lower case letters?

Image credits: xkcd.com

While getting every field in our digital lives into the proper format can be a hell of a headache, it ensures that our accounts are secure, our packages are delivered correctly, and that we can be reached by telephone and email.

The technology that powers this validation on almost every website and application is the regular expression, commonly abbreviated to regex or regexp. A regular expression is a sequence of characters that describes a pattern of text to be found, or matched, in a string you enter. By matching text, we can determine how often and where certain pieces of text occur, and we also get the chance to replace or update that text if necessary.
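To make the ideas of finding, counting, and replacing concrete, here is a minimal sketch using Python’s built-in re module. The sample sentence and the variable names are made up for illustration; other languages offer equivalent functions under different names.

```python
import re

# Made-up sample text for illustration.
text = "I love pizza. My sister loves pizza too, but my brother prefers pasta."

# A regex made only of literal characters.
pattern = re.compile(r"pizza")

# Where and how often does the pattern occur?
positions = [match.start() for match in pattern.finditer(text)]
count = len(positions)

# Replace every occurrence with different text.
updated = pattern.sub("burger", text)

print(count, positions)
print(updated)
```

Here finditer reports each match along with its position, and sub returns a new string with every match replaced, leaving the original text untouched.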

Regular Expressions have a variety of use cases including:

  • validating user input in HTML forms
  • verifying and parsing text in files, code and applications
  • examining test results
  • finding keywords in emails and web pages
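As a rough illustration of the first use case, the form checks described at the start of this article could be sketched in Python as follows. The patterns and function names here are assumptions chosen for readability, and they are deliberately loose; production validation is usually stricter.

```python
import re

# Hypothetical, deliberately loose patterns; not a complete validator.
# [^@\s] is a negated character set: any character except @ or whitespace.
EMAIL = re.compile(r"[^@\s]+@[^@\s]+")
# 10 digits, with or without parentheses and dashes.
PHONE = re.compile(r"\(?\d{3}\)?[-\s]?\d{3}-?\d{4}")


def is_email(value: str) -> bool:
    """True when the value looks like something@something."""
    return EMAIL.fullmatch(value) is not None


def is_phone(value: str) -> bool:
    """True for inputs like 5551234567, 555-123-4567 or (555) 123-4567."""
    return PHONE.fullmatch(value) is not None


def is_strong_password(value: str) -> bool:
    """Require a lower case letter, an upper case letter, a digit and a symbol."""
    # Note: an underscore counts as a word character here, not as a symbol.
    required = [r"[a-z]", r"[A-Z]", r"\d", r"[^\w\s]"]
    return all(re.search(rule, value) for rule in required)


print(is_email("ann@example.com"), is_phone("(555) 123-4567"), is_strong_password("Secret#42"))
```

Using fullmatch means the whole input has to fit the pattern, which is usually what a form field needs.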

While there are a variety of implementations of Regular Expressions across platforms, in this lesson you will learn the basics that apply everywhere. By the lesson’s end, you’ll be empowered to use them in your own projects and become a bahubali of regex!

Do you feel the need of regular expression superpowers running through your body?

Regular expressions are special sequences of characters that describe a pattern of text that is to be matched. To recap:

  • We can use literals to match the exact characters that we need. The regex 9 will match the 9 in the piece of text 139, and the regex 5 chocolates will completely match the text 5 chocolates!
  • Alternation allows us to match the text preceding or following the pipe symbol |. The regex burger|pizza will match burger in the text I love burger, but will also match pizza in the text I love pizza.
  • Character sets, denoted by a pair of brackets [], let us match one character from a series of characters. The regex [pasta] will match any one of the characters p, a, s or t, but not the whole text pasta.
  • Wildcards, represented by the period or dot ., will match any single character (letter, number, symbol or whitespace). The regex ..... will completely match pizza and pasta! Similarly, the regex I ate . pizzas will completely match both I ate 3 pizzas and I ate 8 pizzas!
  • Ranges allow us to specify a range of characters in which we can make a match. The regex I saw [2-9] [b-h]ats will match the text I saw 4 bats as well as I saw 8 cats and even I saw 5 hats.
  • Shorthand character classes like \w, \d and \s stand for word characters, digit characters, and whitespace characters, respectively. The regex \d\s\w\w\w\w\w\w will match a digit character, followed by a whitespace character, followed by 6 word characters. Thus the regex completely matches the text 3 pizzas.
  • Groupings, denoted with parentheses (), group parts of a regular expression together, and allow us to limit alternation to part of a regex. The regex I love (burger|pizza) will match the text I love and then match either burger or pizza, as the grouping limits the reach of the | to the text within the parentheses.
  • Fixed quantifiers, represented with curly braces {}, let us indicate the exact quantity or a range of quantities of a character we wish to match. The regex pizza{3}s will match the characters pizz followed by 3 as, and then the character s, such as in the text pizzaaas. The regex pizza{1,7}s will match the characters pizz followed by at least 1 a and at most 7 as, followed by an s, matching the strings pizzas, pizzaaaas and pizzaaaaaaas.
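  • Optional quantifiers, indicated by the question mark ?, allow us to indicate that a character or group in a regex is optional, that is, it can appear either 0 times or 1 time. The regex They ate a (fabulous )?pizza will completely match both They ate a fabulous pizza and They ate a pizza. Note that the trailing space sits inside the group, so no double space is required when fabulous is absent.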
  • The Kleene star, denoted with the asterisk *, is a quantifier that matches the preceding character 0 or more times. The regex meo*w will match the characters me, followed by 0 or more os, followed by a w. Thus the regex will match mew, meow, meooow, and meoooooooooooow.
  • The Kleene plus, denoted by the plus +, matches the preceding character 1 or more times. The regex meo+w will match the characters me, followed by 1 or more os, followed by a w. Thus the regex will match meow, meooow, and meoooooooooooow, but not match mew.
  • The anchor symbols hat ^ and dollar sign $ are used to match text at the start and end of a string, respectively. The regex ^The pizza is fabulous$ will completely match the text The pizza is fabulous but not match The pizza is fabulous at uncle’s home or The pizza is fabulous at the party. The ^ ensures that the matched text begins with The, and the $ ensures the matched text ends with fabulous.
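The claims in the list above are easy to check for yourself. The following is a small sketch in Python, assuming the built-in re module; the helper fully_matches and the cases table are names invented for this example, and the patterns are modelled on the ones in the list.

```python
import re


def fully_matches(pattern: str, text: str) -> bool:
    """Return True when the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# (pattern, text, expected result) rows modelled on the list above.
cases = [
    (r"5 chocolates", "5 chocolates", True),             # literals
    (r"burger|pizza", "pizza", True),                    # alternation
    (r"[pasta]", "p", True),                             # character set: one character
    (r"[pasta]", "pasta", False),                        # ...never the whole word
    (r".....", "pasta", True),                           # wildcards
    (r"I saw [2-9] [b-h]ats", "I saw 8 cats", True),     # ranges
    (r"\d\s\w\w\w\w\w\w", "3 pizzas", True),             # shorthand classes
    (r"I love (burger|pizza)", "I love burger", True),   # grouping
    (r"pizza{3}s", "pizzaaas", True),                    # fixed quantifier
    (r"pizza{1,7}s", "pizz" + "a" * 8 + "s", False),     # 8 a's exceeds {1,7}
    (r"a (fabulous )?pizza", "a pizza", True),           # optional group
    (r"meo*w", "mew", True),                             # Kleene star: zero o's allowed
    (r"meo+w", "mew", False),                            # Kleene plus: needs one o
]

for pattern, text, expected in cases:
    outcome = fully_matches(pattern, text)
    print(f"{pattern!r:28} {text!r:18} -> {outcome} (expected {expected})")

# Anchors matter with re.search, which otherwise accepts a match anywhere.
anchored = r"^The pizza is fabulous$"
print(re.search(anchored, "The pizza is fabulous") is not None)
print(re.search(anchored, "The pizza is fabulous at the party") is not None)
```

The table uses fullmatch because the list talks about a regex that "completely" matches a text; swapping in re.search would instead report a match found anywhere inside the text.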

Do you just want to scream ah+ really loud? Awesome! You are now ready to take these skills and use them out in the wild.😉😉