Primer on Regular Expressions
In this post, I will give you a practical overview of Regular Expressions: what they are, what they can be used for, and a quick intro to how you can use them.
What are Regular Expressions even?
Regular Expressions (Regexes for short) are Strings that work as a DSL (domain-specific language) to do some common tasks within other Strings. A DSL can also be described as “a programming language within a programming language”.
In the case of Regexes, the outer language can be any programming language that has a `String` type and supports Regexes. Nearly all popular programming languages do, which is what makes Regexes so useful to know. The inner language of Regexes is just a `String` in which some characters have a special meaning.
For example, in the String `.*@.*\.com` the `.` means "any character" and the `*` means "any amount (zero or more) of whatever precedes it", so together `.*` means "any amount of any character". Then we have a non-special character `@`, then again `.*`, followed by `\`, which means "escape the next character and treat it like a non-special character". So `\.` together reads like a normal `.` character without the special meaning "any character". Lastly, there's `com`, which is just a sequence of characters without any special meaning. Overall, this Regex is a simple matcher for any email address ending with `.com` and containing an `@`.
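To make the individual pieces more tangible, here is a minimal Ruby sketch that tries each special character on its own before combining them. The sample strings and the helper name `dot_com_email?` are my own assumptions for illustration; only the pattern itself comes from the text above. It is written as a Ruby Regex literal between slashes, where a single backslash escapes the dot.

```ruby
# Each pair is [pattern, text]; the comment says what the special character does.
examples = [
  [/a.c/,  "abc"],   # `.` stands in for the "b"
  [/a.c/,  "ac"],    # `.` needs exactly one character, so this does not match
  [/ab*c/, "ac"],    # `*` allows zero "b"s ...
  [/ab*c/, "abbbc"], # ... or several of them
  [/a\.c/, "a.c"],   # `\.` is a literal dot
  [/a\.c/, "abc"]    # ... so "b" no longer fits
]

examples.each do |pattern, text|
  puts "#{pattern.inspect} on #{text.inspect}: #{pattern.match?(text)}"
end

# Hypothetical helper wrapping the full pattern from the text.
def dot_com_email?(text)
  /.*@.*\.com/.match?(text)
end

puts dot_com_email?("email@example.com") # has an "@" and a later ".com"
puts dot_com_email?("email@example.org") # no ".com"
puts dot_com_email?("example.com")       # no "@"

# When the pattern is built from a plain String, the backslash itself
# has to be escaped inside the String literal, hence the doubled `\\`.
puts Regexp.new(".*@.*\\.com") == /.*@.*\.com/
```

Note that the last line is the reason you will often see the backslash doubled in languages where a Regex is written as an ordinary String literal.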
What can I do with the “Regular Expression DSL”?
NOTE: With Ruby installed (on Macs it’s preinstalled), you can type `irb` to start an interactive Ruby shell to play around with the samples below.
There are three main functions that any Regex string can be used with:
matches: Given a Regex and another String, this function checks whether the given String "matches" the Regex. If any part of the given String matches the specified Regex, it returns `true`; otherwise, it returns `false`. For example, in Ruby (where `?` is part of the function name):
/.*@.*\.com/.match?('email@example.com') # =>…