Notes on Perl programming



To split a string of input into an array

@array = split(/ /, $theline);
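Where the first argument / / is the pattern that separates the words.  / +/ is a better choice because it treats a run of spaces as a single separator, and /\s+/ is better still because it splits on any whitespace (spaces, tabs, newlines).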
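
A small sketch of the difference between the three patterns.  The sample strings and the array names are made up for illustration.

```perl
use strict;
use warnings;

my $theline = "alpha  beta   gamma";   # note the runs of spaces

my @single = split(/ /, $theline);     # every single space is a separator
my @plus   = split(/ +/, $theline);    # a run of spaces is one separator
my @any    = split(/\s+/, "alpha\tbeta \n gamma");   # any whitespace

print scalar(@single), "\n";    # 6, because empty fields appear between adjacent spaces
print join(",", @plus), "\n";   # alpha,beta,gamma
print join(",", @any), "\n";    # alpha,beta,gamma
```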



On matching and substitution

The syntax for a match is /query/
x - matches x
x* - matches 0 or more x's
x+ - matches 1 or more x's
[xX] - matches x or X
[^xX] - matches anything but x or X
[xX]+ - matches one or more x's or X's
[1-5] - matches 1 to 5 inclusive
[a-z] - matches all lowercase letters from a to z
[a-zA-Z0-9] - matches all letters both upper and lowercase from a to z and all digits
^xim - matches xim at the beginning of a string
xim$ - matches xim at the end of a string
^xim$ - matches the string xim
\bxim - matches xim at the start of a word
xim\b - matches xim at the end of a word
\bxim\b - matches the word xim
\Bxim - matches xim when it is not at the start of a word (e.g. maxim)
xim\B - matches xim when it is not at the end of a word (e.g. ximple)
\Bxim\B - matches xim on the inside of a word
\d - matches any digit [0-9]
\D - matches anything but a digit [^0-9]
\w - matches any word character [_0-9a-zA-Z]
\W - matches anything but a word character [^_0-9a-zA-Z]
\s - matches any white space [ \r\t\n\f]
\S - matches anything but white space [^ \r\t\n\f]
. - matches any character except a newline (any character at all with the s option)
xi{3}m - matches exactly three i's - xiiim
xi{1,3}m - matches one, two, or three i's - xim or xiim or xiiim
xi{3,}m - matches a minimum of three i's - xiiim or xiiiim and so on
xi{0,3}m - matches zero to three i's - xm, xim, xiim, xiiim
xim|mix - matches xim or mix
[a-z]+|[0-9]+ - matches one or more letters or one or more digits
([a-z])\1 - matches the same lowercase letter twice in a row, e.g. aa.  () saves the matched text for reference by \n, where n is an integer corresponding to the 1st, 2nd, or nth saved group.  E.g. ([a-z])([0-9])\1\2 matches a lowercase letter, then a digit, then the same lowercase letter, then the same digit.  Note that a backreference must match the exact text the group matched.  So if the digit saved by the second group is 4, then \2 must also match a 4.
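
A short sketch that tries a few of the patterns above against sample strings.  The test strings and the @hits array are invented for illustration.

```perl
use strict;
use warnings;

my @hits;
for my $s ("xim", "maxim", "ximple", "xiiim", "a1a1", "a1b2") {
    push @hits, "$s: whole string"      if $s =~ /^xim$/;    # anchored at both ends
    push @hits, "$s: ends a word"       if $s =~ /xim\b/;    # word boundary after m
    push @hits, "$s: not at word start" if $s =~ /\Bxim/;    # word character before x
    push @hits, "$s: three i's"         if $s =~ /xi{3}m/;   # exact repeat count
    push @hits, "$s: repeated pair"     if $s =~ /([a-z])([0-9])\1\2/;   # backreferences
}
print "$_\n" for @hits;
```

Here "ximple" and "a1b2" produce no lines: in "ximple" the xim is at the start of a word but not at its end, and in "a1b2" the second letter and digit differ from the saved ones.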

() - pattern memory
+ * ? {} - number of matches
^ $ \b \B - pattern anchors
| - alternatives

/ is the pattern delimiter: it starts and ends the pattern.  You can change the delimiter by writing m followed by the character to use as the new delimiter, which is handy when the pattern itself contains a /.
m#/hello#        - # is the delimiter
m!hi/tim!        - ! is the delimiter
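
A minimal sketch comparing delimiters on a pattern that contains slashes.  The $path value and the variable names are assumptions for the example.

```perl
use strict;
use warnings;

my $path = "/usr/local/bin";

# Default delimiter: every / inside the pattern has to be escaped.
my $escaped = ($path =~ /\/local\//) ? 1 : 0;

# Alternate delimiters: the slashes can be written as they are.
my $hash    = ($path =~ m#/local/#) ? 1 : 0;
my $braces  = ($path =~ m{/local/}) ? 1 : 0;   # paired delimiters also work

print "$escaped $hash $braces\n";   # 1 1 1
```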

/<match pattern>/<pattern matching options>
These options include
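g - match globally, i.e. find all matches rather than only the first
i - ignore case
m - treat string as multiple lines (^ and $ also match at the start and end of each line)
o - compile the pattern only once
s - treat string as a single line (. also matches a newline)
x - ignore white space in the pattern and allow # comments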
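
A sketch of how some of these options change a match.  The sample text and the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "Xim one\nxim two\nXIM three";

my @all        = $text =~ /xim/gi;    # g + i: every xim, any case
my @line_start = $text =~ /^xim/gim;  # m: ^ matches at the start of each line
my @str_start  = $text =~ /^xim/gi;   # without m: ^ matches only at the start of the string

my $dot_plain  = ("a\nb" =~ /a.b/)  ? 1 : 0;   # . does not match the newline
my $dot_single = ("a\nb" =~ /a.b/s) ? 1 : 0;   # s: . matches the newline too

# x: white space in the pattern is ignored and comments are allowed
my $spaced = ("xiiim" =~ /
    x       # a literal x
    i{3}    # exactly three i's
    m       # a literal m
/x) ? 1 : 0;

print scalar(@all), " ", scalar(@line_start), " ", scalar(@str_start), "\n";   # 3 3 1
print "$dot_plain $dot_single $spaced\n";                                       # 0 1 1
```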

SUBSTITUTION
s/<pattern>/<replacement>/

$string = "abc123def";
$string =~ s/123/456/;
# $string now contains abc456def

As with matching, you can change the delimiter here:
s#<pattern>#<replacement>#
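
A small sketch of substitution with and without the g option, and with an alternate delimiter.  The sample strings and the variable names are assumptions for the example.

```perl
use strict;
use warnings;

my $string = "abc123def123";

(my $once = $string) =~ s/123/456/;    # replaces only the first match
(my $all  = $string) =~ s/123/456/g;   # g replaces every match

my $path = "/usr/bin";
(my $moved = $path) =~ s#/usr#/opt#;   # no need to escape the slashes

# s/// returns the number of substitutions it made.
my $masked = $string;
my $count  = ($masked =~ s/[0-9]/*/g);

print "$once\n";            # abc456def123
print "$all\n";             # abc456def456
print "$moved\n";           # /opt/bin
print "$masked $count\n";   # abc***def*** 6
```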

TRANSLATION
tr/<string1>/<string2>/

$string = "abcdefabcdef";
$string =~ tr/abc/def/;

This replaces each a with d, each b with e, and each c with f, making the string defdefdefdef.
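
A minimal sketch of tr in use, including character ranges and counting.  The variable names are invented for illustration.

```perl
use strict;
use warnings;

my $string = "abcdefabcdef";

(my $shifted = $string) =~ tr/abc/def/;   # character-by-character mapping
(my $upper   = $string) =~ tr/a-z/A-Z/;   # ranges work as in a character class

# With an empty replacement list tr changes nothing and just
# returns how many characters it matched.
my $count = ($string =~ tr/a-c//);

print "$shifted\n";   # defdefdefdef
print "$upper\n";     # ABCDEFABCDEF
print "$count\n";     # 6
```

Unlike s///, tr does not use regular expressions: both sides are plain lists of characters.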