Regex Operators

by pixelatedcyberdust
Date: April 03, 2004

To put it simply, a regex unlocks the power to do complete string comparisons.  That is, it gives us full control over how we view and manipulate any string (variable) we have.  Regex is short for Regular Expression, and it takes even advanced programmers a while to understand these well enough to code them efficiently and accurately.

Regular expressions are one of the reasons Perl is such a powerful language; mastering them will give you full control over the data you're using in your scripts.  Before we begin, here is a simple regex for us to look at.

$variable =~ s/text/TEXT/gi;

The m// operator

The m// operator is how we deal with matching.  It works against the default variable $_ unless told otherwise, but matching against another variable is as easy as writing the variable name followed by =~ in front of it.  This matching operator works well when you need to know if a string contains a certain character, group of characters, word or group of words.  Writing if ($line eq "test") will not work if all we want to know is whether the word test exists somewhere in $line; for that we use m// instead.

The main difference between a simple eq or == and a m//, is one tests for equality and the other tests for the existence of the value inside the string.
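To make the difference concrete, here is a small sketch that runs both tests on the same string.  The variable names ($line, $is_equal, $contains) are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical sample string for the comparison.
my $line = "this is a test";

# eq is true only when the whole string is exactly "test".
my $is_equal = ($line eq "test") ? "yes" : "no";

# m// is true when "test" appears anywhere inside the string.
my $contains = ($line =~ m/test/) ? "yes" : "no";

print "equal: $is_equal, contains: $contains\n";   # equal: no, contains: yes
```

The equality test fails because $line holds more than just the word test, while the match succeeds because the word is in there somewhere.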

my $line;
while($line = <STDIN>)
{
      if($line =~ m/exit/) { exit; }
}

This example keeps looping until it matches what we're looking for.  It keeps asking for input, and unless a line contains the word exit (or the input runs out) it's not going to end for us.  From this you can see where our search gets used; the characters or words we want to match are placed inside the //.

my $text = "a blue cow ate the cheese";
if ($text =~ m/cow/)
{
     print "mooooooo";
}

We are taking a predefined variable $text and seeing if we can match the word cow anywhere in it.  As we can see, running this code gives us mooooooo back because it can find the word.

Remember, this matching operator doesn't test for equality, it checks for the existence.
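The examples so far always named a variable with =~.  As a rough sketch of the default-variable behavior mentioned earlier, the loop below leaves the variable out, so m// is tested against $_.  The list @animals and the array @cows are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical data to search through.
my @animals = ("blue cow", "red hen", "brown cow");
my @cows;

foreach (@animals) {
    # No variable and no =~ here, so m/cow/ is tested against $_.
    push @cows, $_ if m/cow/;
}

print join(", ", @cows), "\n";   # blue cow, brown cow
```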

The s/// operator

The second most used operator is the substitution operator.  This gives us the power and tools to manipulate our information in any way we wish.  We could scan an entire text file and change all the words "red" to "blue" if that's what we wanted.

This works hand-in-hand with the m// we just learned, in that our words either exist and we can do something with them, or they don't.  This is to say, we can't substitute any part of our text unless the text we want to change already exists.

my $line;
while($line = <STDIN>)
{
   chomp($line);
   $line =~ s/exit/go/;
   print "Did you say $line?\n";
}

We're doing a bit more work in this example because there is more to a substitution than to matching words or phrases.  This is nearly the same example we used before: if you type any phrase containing the word exit, something will happen.  In this case we are using s/exit/go/, which means that if it finds the word exit, it will be replaced with the word go.

The best way to learn is to do, so run this script a few times and run a few tests.  Type in words that don't contain exit and some that do so you get familiar with what's going on.

Unlike the match operator where we have m/word/, we have a new set: s/word/newword/.  The second set of slashes holds the replacement words/characters for what you asked for in the first set.

s/this/that/;      # change the word from this to that
s/apple/pear/;  # change the word apple to pear
s/I have a red car/I have a red bike/; # change the entire sentence if it matches

A few things to note before we move on: our s/// will only work once by default, and it is case-sensitive.  Put simply, if we tried to change the word this to that, by default it will only change the first occurrence of this and leave the rest untouched, and it will not match THIS.

my $text = "the rabbit jumped down the hole where the cow lived.";
$text =~ s/the/THE/;
print $text;

This example substitutes the lowercase word the to the uppercase THE.  By running this script you'll notice that only the first the that's found gets replaced giving us the result: THE rabbit jumped down the hole where the cow lived.

my $text = "the rabbit jumped down the hole where the cow lived.";
$text =~ s/the/THE/g;
print $text;

Using /g at the end of our substitution means to substitute globally: instead of just replacing the first instance of the word or phrase, we'll replace it each time it appears in our data.  Taking the same sentence we used before, simply adding the /g modifier to the end will replace every occurrence of the word the and end with the result: THE rabbit jumped down THE hole where THE cow lived.

my $text = "The rabbit jumped down the hole where the cow lived.";
$text =~ s/the/THE/gi;
print $text;

We made a small change to our sentence (we capitalized the T in The, the first word).  Our substitution would normally skip this word and replace only the lowercase the, because its match is case sensitive.  The /i modifier changes the default to a case-insensitive substitution.  This will s/// (short for substitute) the words The, THe, tHe and so forth with THE, and since we're still using the global modifier /g, it will change all instances of these words.

Sometimes we want to just remove certain words or phrases instead of replacing them with another word or phrase.  This can be done by leaving the second set of slashes empty.  Doing so tells Perl that you want to substitute the first set of words for nothing (an empty substitution), therefore removing the words completely.
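To see the modifiers side by side, here is a small sketch that runs the same substitution three ways on copies of one sentence.  The variable names ($original, $first_only, $global, $global_nocase) are made up for the comparison.

```perl
use strict;
use warnings;

my $original = "The rabbit jumped down the hole where the cow lived.";

# Copy the string, then substitute on the copy so $original is untouched.
(my $first_only    = $original) =~ s/the/THE/;     # first lowercase "the" only
(my $global        = $original) =~ s/the/THE/g;    # every lowercase "the"
(my $global_nocase = $original) =~ s/the/THE/gi;   # every "the", any case

print "$first_only\n";      # The rabbit jumped down THE hole where the cow lived.
print "$global\n";          # The rabbit jumped down THE hole where THE cow lived.
print "$global_nocase\n";   # THE rabbit jumped down THE hole where THE cow lived.
```

Notice that without /i the capitalized The at the start is never touched, with or without /g.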

my $text = "The rabbit jumped down the hole where the cow lived.";
$text =~ s/the//gi;
print $text;

In this last example, we're removing the word the in any case and as many times as it can be found in the string.  The spaces around each removed word stay behind, so this will produce the result:

 rabbit jumped down  hole where  cow lived.

The tr/// operator

The translation operator also works on $_ by default, and with it we can make a character-by-character translation.  The s/// worked on words, numbers and phrases.  This operator works solely on characters.

my $line;
while($line = <STDIN>)
{
   chomp($line);
   $line =~ tr/1/0/;
   print "Did you say $line?\n";
}

We are translating each occurrence of the character "1" to "0".  As with s///, the second set of slashes is what we're converting our data into if it matches.  For another simple example,

my $text = "bear";
$text =~ tr/b/t/;

Which gives us the result tear as we are replacing the character "b" with "t".
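Like m//, tr/// can also be used without naming a variable.  Here is a minimal sketch of that, using an invented list called @words; inside the loop, tr/b/t/ works on $_, and since $_ stands in for each element, the list itself is changed.

```perl
use strict;
use warnings;

# Hypothetical list of words to translate.
my @words = ("bear", "boat", "tub");

foreach (@words) {
    # No variable and no =~, so the translation is applied to $_.
    tr/b/t/;
}

print join(" ", @words), "\n";   # tear toat tut
```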

We can remove characters we want from our string instead of swapping it for another.  We do this using the /d (delete) modifier.  We create the character group we want to translate, leave the second set of slashes empty and append d.

my $text = "This is a line of text";
$text =~ tr/a//d;
print "results: $text";

Take note that the second set of slashes // is to be left empty if you want to delete the characters instead of swapping them with another.  In our example above, we removed all the "a"s from our text, which was just one, however.  A better example would have been to remove an "i" or an "e", but I'll leave that up to you to test.

We now have a fairly good understanding of swapping one character with another, and Perl allows us to swap more than one at a time.  This is to say, we can tr/// as few or as many characters at a time as we want.

my $text = "This is the line that never ends. Yes it goes on and on my friend. Some people started writing it, not knowing what it was. And they'll continue writing it forever just because...this is the line that never ends!";

$text =~ tr/th/ht/;
print "results: $text";

You will notice we are translating two different characters, the t and the h, and swapping each with the other.  You can swap as many or as few as you want, like we discussed earlier, but keep in mind it's in a set order.  The first character in the first set will swap with the first character in the second set (our "t" was swapped with "h"), and the second character in the first set will always swap with the second character in the second set (our "h" swapped with "t").
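The set order can be sketched on a couple of short strings.  The variables below ($letters, $swapped, $two_steps) are invented for the illustration, and the second half is only a comparison: it shows that two s///g calls in a row do not give a true swap, because the second call also changes the characters the first call just produced.

```perl
use strict;
use warnings;

# Position by position: a->x, b->y, c->z.
my $letters = "abc cab";
$letters =~ tr/abc/xyz/;
print "$letters\n";     # xyz zxy

# tr swaps t and h in a single pass over the string.
my $swapped = "that";
$swapped =~ tr/th/ht/;
print "$swapped\n";     # htah

# Two substitutions in a row are not a swap.
my $two_steps = "that";
$two_steps =~ s/t/h/g;  # "hhah"
$two_steps =~ s/h/t/g;  # every h, old or new, becomes t
print "$two_steps\n";   # ttat
```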

This example let us switch the h's and t's around, making funny text :)  Translations are case sensitive too: tr/A/x/ will not be the same as tr/a/x/, and tr/// has no case-insensitive modifier to remedy this.  So you'll need to list both cases, as in tr/Aa//d, if you want to catch every form of the same character.

For our last example, let's have a little fun and remove all the vowels from our text!  We would do that by adding each of the vowels to the first set of // and appending the delete modifier.
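Here is a quick sketch of the case issue, deleting the letter a from a made-up sentence.  The names $text, $lower_only and $both_cases are only for this illustration.

```perl
use strict;
use warnings;

my $text = "An apple a day";

# Only the lowercase a is listed, so the capital A survives.
(my $lower_only = $text) =~ tr/a//d;

# Listing both cases removes every a, upper or lower.
(my $both_cases = $text) =~ tr/Aa//d;

print "[$lower_only]\n";   # [An pple  dy]
print "[$both_cases]\n";   # [n pple  dy]
```

The double space in the output is where the lone word a used to be; tr/// removes the characters you list and nothing else.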

my $text = "This is the line that never ends. Yes it goes on and on my friend. Some people started writing it, not knowing what it was. And they'll continue writing it forever just because...this is the line that never ends!";

$text =~ tr/aeiou//d;
print "results: $text";

We get the results (LOL):

Ths s th ln tht nvr nds. Ys t gs n nd n my frnd. Sm ppl strtd wrtng t,
nt knwng wht t ws. And thy'll cntn wrtng t frvr jst bcs...ths s th ln tht nvr
nds!

Challenges

1) Of the three regex operators we learned, which one(s) does not alter the data in any way?
------------------------------------------------------------------------
The m// match operator only matches segments of a string; s/// and tr/// are used to change the data.
------------------------------------------------------------------------

2) We are trying to remove all the "a"s from our variable $sentence using s/// but it's not removing "A".  How can we remove all cases?
------------------------------------------------------------------------
We need to set up a case-insensitive substitution.  We do this using the case-insensitive modifier, /i.

$sentence =~ s/a//gi;
------------------------------------------------------------------------

3) What is the difference between substitution and translation?
------------------------------------------------------------------------
Substitution, or s///, replaces words, numbers or phrases in a string.  Translation, or tr///, only translates or swaps characters.

An example of s/// would be: s/word/this/gi, s/apple/pear/gi, s/moon is out/sun is out/gi.

An example of tr/// would be: tr/a/e/, tr/1/0/, tr/x/z/.
------------------------------------------------------------------------


