Basic Regex Character Matching
Lesson
A Simple Introduction
We can perform a basic search with just a string of letters and numbers, as shown in the following examples.
Example 1
Text to Search In
123 h3llo world
Regular Expression
123
Output (the matched text is 123)
123 h3llo world
Example 2
Text to Search In
123 h3llo world
Regular Expression
h3llo
Output (the matched text is h3llo)
123 h3llo world
We can also match on whitespace, on its own or combined with letters/numbers.
Example 3
Text to Search In
123 h3llo world
Regular Expression
lo wo
Output (the matched text is "lo wo", including the space)
123 h3llo world
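The three examples above can be reproduced in code. This is a minimal sketch assuming Python and its standard re module; the variable names are only for illustration.

```python
import re

text = "123 h3llo world"

# Each pattern is a plain literal, so it matches that exact run of characters.
spans = {}
for pattern in ["123", "h3llo", "lo wo"]:
    match = re.search(pattern, text)
    # span() is the (start, end) position of the match within the text
    spans[pattern] = match.span()
    print(f"{pattern!r} matched {match.group()!r} at {match.span()}")
```

The space in "lo wo" is matched like any other literal character.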
How Many to Match
Depending on the tool you use, the regular expression engine may stop at the first match or return every match. A tool like 'grep' shows all matching lines by default, whereas programming language implementations will often return only the first match (or just true) by default.
If your implementation is only showing a single match then there is likely a way to turn on global matching to get all matches – often denoted with a 'g' flag or similar. Here we will generally use global matching in the examples.
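As an illustration of first-match versus global matching, here is a small sketch assuming Python's re module. Python has no 'g' flag; instead, re.search returns the first match and re.findall returns all of them.

```python
import re

text = "123 h3llo world"

# re.search stops at the first match it finds
first = re.search("l", text)
first_position = first.start()

# re.findall collects every non-overlapping match, like a global search
all_matches = re.findall("l", text)

print(first.group(), first_position)
print(all_matches)
```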
Predefined Regex Classes
Often, we may want our regex to match on any alphanumeric character, any numerical digit or any whitespace character (including spaces, tabs or newlines). We use a backslash as an 'escape character' in regular expressions, which means that the character that follows it is treated specially.
\w Any alphanumeric character (letters and numbers) as well as underscores ( _ )
\d Any digit (0 to 9)
\s Any whitespace (space, tab, line break)
Example 4
Text to Search In
123 h3llo world
Regular Expression
\w
Output (every letter and digit matches; the two spaces do not)
123 h3llo world
Example 5
We can combine the classes above with literal characters to match a specific sequence.
Text to Search In
123 h3llo world
Regular Expression
\d\sh
Output (the matched text is "3 h")
123 h3llo world
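The predefined classes from Examples 4 and 5 can be tried out as follows. This is a sketch assuming Python's re module, with raw strings (r"...") so the backslashes reach the regex engine unchanged.

```python
import re

text = "123 h3llo world"

word_chars = re.findall(r"\w", text)   # letters, digits and underscores
digits = re.findall(r"\d", text)       # digits only
spaces = re.findall(r"\s", text)       # whitespace only
combined = re.findall(r"\d\sh", text)  # a digit, then whitespace, then a literal 'h'

print(len(word_chars), digits, spaces, combined)
```

Of the 15 characters in the text, \w matches the 13 that are not spaces.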
Not Alphanumeric, Not a Number or Not Whitespace
We can also ask to match based on not being a specific type of character. These should look familiar as the uppercase counterparts to the above expressions.
\W Not an alphanumeric character or underscore.
\D Not a digit.
\S Not whitespace.
Example 6
Text to Search In
123 h3llo world
Regular Expression
\D\s\D
Output (the matched text is "o w")
123 h3llo world
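The negated classes behave as the mirror image of their lowercase counterparts. A short sketch, again assuming Python's re module; the second sample string is made up to show how the underscore is treated.

```python
import re

text = "123 h3llo world"

non_word = re.findall(r"\W", text)       # here, only the two spaces
non_digit = re.findall(r"\D", text)      # everything except 1, 2, 3 and 3
example_6 = re.findall(r"\D\s\D", text)  # non-digit, whitespace, non-digit

# The underscore counts as a word character, so \W skips it.
underscore_check = re.findall(r"\W", "a_b-c")

print(non_word, len(non_digit), example_6, underscore_check)
```

Note that "3 h" is not matched by \D\s\D, because the 3 is a digit.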
Whitespace Metacharacters in Regular Expressions
There are various types of whitespace such as newline, carriage return, tab and more. Sometimes we want to match just a particular kind of whitespace instead of matching any whitespace using \s. There are several different metacharacters we can use in our regular expression:
Space (' ')
To match a space with a regular expression, just use a space ' ' with the spacebar!
Other Types of Whitespace
\n Newline, line-break or line-feed
[\b] Backspace (only inside a character class; in many engines a bare \b means a word boundary instead)
\r Carriage Return
\t Tab
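To see the individual whitespace metacharacters at work, here is a sketch assuming Python's re module and a made-up sample string. It also shows how this particular engine treats \b: as a word boundary on its own, and as a backspace only inside a character class.

```python
import re

# Sample text containing a tab, a carriage return, a newline and a space
text = "name\tvalue\r\nnext line"

tabs = re.findall(r"\t", text)
newlines = re.findall(r"\n", text)
carriage_returns = re.findall(r"\r", text)
literal_spaces = re.findall(r" ", text)
any_whitespace = re.findall(r"\s", text)  # matches all four of the above

# In Python's re, a bare \b is a word boundary...
word_boundary = re.findall(r"\bcat\b", "cat concat cat")
# ...and the backspace character is written inside a character class.
backspace = re.findall(r"[\b]", "a\bb")

print(len(any_whitespace), word_boundary, backspace)
```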
Any Character
Sometimes we want to match any character. This is done using a dot (.).
. Match on any character (except a newline)
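A brief sketch of the dot, assuming Python's re module. The re.DOTALL flag is Python's way of letting the dot match newlines as well; other engines may name this option differently.

```python
import re

text = "a1\nb2"

# By default the dot skips the newline
without_newline = re.findall(r".", text)

# With re.DOTALL the dot matches the newline too
with_newline = re.findall(r".", text, flags=re.DOTALL)

# The dot stands for exactly one character, so "hllo" does not match
one_char = re.findall(r"h.llo", "hello h3llo hllo")

print(without_newline, with_newline, one_char)
```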
Regex Escape Character
If the character we want to use has a special meaning, how can we search for it?
For example, what if I want to match something with a full stop (.) in it? The answer is that we escape that character with a backslash. For example, to look for a dot/period/full stop, we use \. and to look for a backslash we just use another backslash \\ to escape the escape!
\ Escape the following character
Example 7
Imagine you wanted to find all files with a date in January 2020.
Text to Search In
20150105.txt
20170203.txt
20200104.txt
20200117.txt
20200321.txt
Regular Expression
202001..\.txt
Output (the matches are 20200104.txt and 20200117.txt)
20150105.txt
20170203.txt
20200104.txt
20200117.txt
20200321.txt
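Example 7 can be checked in code. This is a sketch assuming Python's re module; the extra strings "20200104Xtxt" and the Windows-style path are made up to show why escaping matters.

```python
import re

filenames = [
    "20150105.txt",
    "20170203.txt",
    "20200104.txt",
    "20200117.txt",
    "20200321.txt",
]

# 202001, then any two characters (the day), then a literal dot and "txt"
pattern = re.compile(r"202001..\.txt")
january_2020 = [name for name in filenames if pattern.search(name)]

# Without the backslash, the last dot would match any character, not just "."
unescaped_hit = re.search(r"202001...txt", "20200104Xtxt") is not None
escaped_hit = pattern.search("20200104Xtxt") is not None

# A literal backslash is matched by escaping it with another backslash
backslashes = re.findall(r"\\", r"C:\temp\file")

print(january_2020, unescaped_hit, escaped_hit, len(backslashes))
```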