Jon Michael Galindo

~ writing, programming, art ~

2 January 2016

JavaScript: Regular Expressions Part 1


Nominally, this tutorial is about regular expressions. But to teach them, I'll show you how to build a code-coloring script like the one these pages use. So stick around for that, too. :-)


Why We Need Them

Regular expressions let you search within text for a particular pattern, rather than an exact string.

So, imagine this: you have a comma-separated list of license plates, and you need to find one made of 3 letters and 4 numbers. The first letter is a G, and the last three numbers are either a 4 or an 8, then a 7, then a 0. Moreover, the list was pulled in from different databases: some of the plates have capital letters, some lowercase, some have a space between the first three characters and the last four, and some have a dash.

But don't worry, smile! You don't have to write a search program for this; RegExp makes it downright easy. Be warned, though: it's meant to be useful and compact, not intuitive.

    /* Find that license plate! :
       G - (two letters) - (a number) - ((4 or 8), 7, 0) */
    var MatchingPlates = PlatesList.match(/,[Gg][A-Za-z]{2}[\s-]?\d(?:4|8)70,/g);
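
To see that pattern at work, here is a small sketch run against a made-up list. The sample plates and the names original, variant, strictMatches, and looseMatches are assumptions for illustration; the post does not show the real list. It also points out one thing worth knowing: because the pattern asks for a comma on both sides, a plate at the very start or end of the list is skipped. The variant is one possible way around that, using pieces of syntax not covered yet, so treat it as a preview rather than the method used here.

```javascript
// Hypothetical sample data: the post does not show the real list.
var PlatesList = "GAB 1470,gxy-2870,HJK 3470,GQR5871,gmn 9870";

// The pattern from the post: each match must have a comma on both sides.
var original = /,[Gg][A-Za-z]{2}[\s-]?\d(?:4|8)70,/g;

// Assumed variant: accept the start of the list in place of the leading
// comma, and only peek at the trailing comma (or the end of the list)
// with a lookahead, so the next match can still use that comma.
var variant = /(?:^|,)[Gg][A-Za-z]{2}[\s-]?\d(?:4|8)70(?=,|$)/g;

var strictMatches = PlatesList.match(original);
var looseMatches = PlatesList.match(variant);

console.log(strictMatches); // [ ',gxy-2870,' ]
console.log(looseMatches);  // [ 'GAB 1470', ',gxy-2870', ',gmn 9870' ]
```

Note that the matches still carry their commas, so a real script would probably trim those off before using the plates.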

Pretty useful, right? Let's learn.


A Readable Example


The above example uses RegExp for searching, but it can also replace one pattern with another.

Take "Everybody was fighting." and replace "fighting" with "kung-fu fighting".

    var song = "Everybody was fighting.";
    song = song.replace(/fighting/,"kung-fu fighting");
    //song now reads "Everybody was kung-fu fighting."

This RegExp makes a lot of sense. The only oddity is the / / around fighting; those are just used to define regular expressions the same way quotation marks define strings.

Of course, it also does nothing special. It doesn't match a pattern.
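
As a small sketch of the slash notation described above, the snippet below checks that /fighting/ really is a RegExp object and shows that JavaScript's RegExp constructor can build the same pattern from a string. The variable names are made up for illustration.

```javascript
// A regular expression literal: the slashes play the role that
// quotation marks play for a string.
var pattern = /fighting/;

// The RegExp constructor builds an equivalent pattern from a string.
var samePattern = new RegExp("fighting");

var song = "Everybody was fighting.";
var viaLiteral = song.replace(pattern, "kung-fu fighting");
var viaConstructor = song.replace(samePattern, "kung-fu fighting");

console.log(pattern instanceof RegExp);     // true
console.log(viaLiteral);                    // Everybody was kung-fu fighting.
console.log(viaLiteral === viaConstructor); // true
```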


[Brackets] and Flags


The first feature to look at is the brackets. A set of brackets matches any character inside them.

So, if you wanted to change "moo foo boo too" to "doo doo doo doo":

    var nonsense = "moo foo boo too";
    nonsense = nonsense.replace(/[mfbt]oo/,"doo");
    //nonsense now reads "doo foo boo too"

But wait! That's not what we wanted. The replace function only matched the very first [mfbt]oo it found. Enter the "Global Search" flag: /g

    var nonsense = "moo foo boo too";
    nonsense = nonsense.replace(/[mfbt]oo/g,"doo");
    //nonsense now reads "doo doo doo doo"

And that is perfect. (For nonsense, anyway).
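
One detail is easy to miss: a set of brackets stands for exactly one character, chosen from the set. The sketch below, with made-up sample strings, shows a word whose first letter is outside the set being left alone, and a word where only a single letter from the set is consumed.

```javascript
var set = /[mfbt]oo/g;

// "z" is not in the set, so "zoo" is left untouched.
var untouched = "zoo".replace(set, "doo");

// The brackets match one character only: the "b" pairs up with "oo",
// and the leading "m" stays where it was.
var oneLetter = "mboo".replace(set, "doo");

console.log(untouched); // zoo
console.log(oneLetter); // mdoo
```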

JavaScript has other flags as well, but the only other one we need is /i, which makes the search case-insensitive.
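
Here is a quick sketch of /i alongside /g, using a made-up string with mixed capitals. Flags can be combined by writing them one after another following the closing slash.

```javascript
var shout = "Moo FOO boo";

// Without /i, only the all-lowercase word matches.
var caseSensitive = shout.replace(/[mfbt]oo/g, "doo");

// With /i added, capitals match too.
var caseInsensitive = shout.replace(/[mfbt]oo/gi, "doo");

console.log(caseSensitive);   // Moo FOO doo
console.log(caseInsensitive); // doo doo doo
```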


Coloring Code

Now we know enough to start building our code-colorer. Let's try a straightforward way.

    <div id="code">
    var nonsense = "moo foo boo too";
    </div>
    <script type="text/javascript">
    var d = document.getElementById("code");
    d.innerHTML = d.innerHTML.replace(/var/g,"<span style='color:blue;'>var</span>");
    </script>

And, technically, that works. Now "var" in that code div will look blue. But we'd have to repeat that process for every single word we wanted to change, and then there are things like numbers and strings, where we really need a pattern that brackets alone can't give us.
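
To make the repetition concrete, here is a rough sketch of what "repeat that process for every word" might look like. The helper colorWords and its word list are hypothetical, not the script these pages use. It works on a plain string rather than the page, and it assumes every word is made of plain letters, since other characters could have special meaning inside a RegExp.

```javascript
// Hypothetical helper: wraps every occurrence of each listed word in a
// blue <span>, using the same replace call as the single-word version.
function colorWords(source, words) {
  var result = source;
  for (var i = 0; i < words.length; i++) {
    // Build /word/g from a string; assumes the word is plain letters.
    var pattern = new RegExp(words[i], "g");
    result = result.replace(
      pattern,
      "<span style='color:blue;'>" + words[i] + "</span>"
    );
  }
  return result;
}

var colored = colorWords("var a = 1; return a;", ["var", "return"]);
console.log(colored);
// <span style='color:blue;'>var</span> a = 1; <span style='color:blue;'>return</span> a;
```

Even with the loop, this only handles a fixed list of exact words. It would also color "var" inside a longer name, and it still has no answer for numbers or strings.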

We need a few more tools, and we'll get them next time.



© Jon Michael Galindo 2015