Chapter 17
System Variables
This chapter describes the built-in system variables that can
be referenced from every Perl program. These system variables
are divided into five groups:
- Global scalar variables
- Pattern system variables
- File system variables
- Array system variables
- Built-in file variables
The following sections describe these groups of system variables,
and also describe how to provide English-language equivalents
of their variable names.
The global scalar variables are built-in system variables
that behave just like the scalar variables you create in the main
body of your program. This means that these variables have the
following properties:
- Each built-in global scalar variable stores only one scalar
value.
- Only one copy of a global scalar variable is defined in a
program.
Other kinds of built-in scalar variables, which you will see later
in this lesson, do not behave in this way.
The following sections describe the global scalar variables your
Perl programs can use.
The most commonly used global scalar variable is the $_
variable. Many Perl functions and operators modify the contents
of $_ if you do not explicitly specify the scalar variable
on which they are to operate.
The following functions and operators work with the $_
variable by default:
- The pattern-matching operator
- The substitution operator
- The translation operator
- The <> operator, if it appears in a while
or for conditional expression
- The chop function
- The print function
- The study function
The Pattern-Matching Operator and $_
Normally, the pattern-matching operator examines the value stored
in the variable specified by a corresponding =~ or !~
operator. For example, the following statement prints hi
if the string abc is contained in the value stored in
$val:
print ("hi") if ($val =~ /abc/);
By default, the pattern-matching operator examines the value stored
in $_. This means that you can leave out the =~
operator if you are searching $_:
print ("hi") if ($_ =~ /abc/);
print ("hi") if (/abc/); # these two are the same
| NOTE |
If you want to use the !~ (true-if-pattern-not-matched) operator, you will always need to specify it explicitly, even if you are examining $_:
print ("hi") if ($_ !~ /abc/);
If the Perl interpreter sees just a pattern enclosed in / characters, it assumes the existence of a =~ operator.
|
$_ enables you to use pattern-sequence memory to extract
subpatterns from a string and assign them to an array variable:
$_ = "This string contains the number 25.11.";
@array = /-?(\d+)\.?(\d+)/;
In the second statement shown, each subpattern enclosed in parentheses
becomes an element of the list assigned to @array. As
a consequence, @array is assigned (25,11).
In Perl 5, a statement such as
@array = /-?(\d+)\.?(\d+)/;
also assigns the extracted subpatterns to the pattern-sequence
scalar variables $1, $2, and so on. This means
that the statement assigns 25 to $1 and 11
to $2. Perl 4 supports assignment of subpatterns to arrays,
but does not assign the subpatterns to the pattern-sequence variables.
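As a brief sketch of the Perl 5 behavior just described (the variable names @parts, $whole, and $fraction are chosen here for illustration), a single match against $_ fills both the list and the pattern-sequence variables:

```perl
use strict;
use warnings;

# Match against $_ in list context; each parenthesized subpattern
# becomes one element of the returned list.
$_ = "This string contains the number 25.11.";
my @parts = /-?(\d+)\.?(\d+)/;

# In Perl 5 the same match also sets $1 and $2.
my ($whole, $fraction) = ($1, $2);
print ("list result: @parts\n");
print ("\$1 is $whole and \$2 is $fraction\n");
```

Copying $1 and $2 into named variables right away is a common precaution, because the next successful match overwrites them.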
The Substitution Operator and $_
The substitution operator, like the pattern-matching operator,
normally modifies the contents of the variable specified by the
=~ or !~ operator. For example, the following
statement searches for abc in the value stored in $val
and replaces it with def:
$val =~ s/abc/def/;
The substitution operator uses the $_ variable if you
do not specify a variable using =~. For example, the
following statement replaces the first occurrence of abc
in $_ with def:
s/abc/def/;
Similarly, the following statement replaces all white space (spaces,
tabs, and newline characters) in $_ with a single space:
s/\s+/ /g;
When you substitute inside $_, the substitution operator
returns the number of substitutions performed:
$subcount = s/abc/def/g;
Here, $subcount contains the number of occurrences of
abc that have been replaced by def. If abc
is not contained in the value stored in $_, $subcount
is assigned 0.
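The following short sketch, using a made-up sample string, shows the count returned by a global substitution on $_:

```perl
use strict;
use warnings;

# Sample text invented for this illustration.
$_ = "abc one abc two abc";

# With the g option, s/// returns how many replacements were made.
my $subcount = s/abc/def/g;
print ("$subcount substitutions produced: $_\n");
```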
The Translation Operator and $_
The behavior of the translation operator is similar to that of
the pattern-matching and substitution operators: it normally operates
on the variable specified by =~, and it operates on $_
if no =~ operator is included. For example, the following
statement translates all lowercase letters in the value stored
in $_ to their uppercase equivalents:
tr/a-z/A-Z/;
Like the substitution operator, if the translation operator is
working with $_, it returns the number of operations
performed. For example:
$conversions = tr/a-z/A-Z/;
Here, $conversions contains the number of lowercase letters
converted to uppercase.
You can use this feature of tr to count the number of
occurrences of particular characters in a file. Listing 17.1 is
an example of a program that performs this operation.
Listing 17.1. A program that counts using tr.
1: #!/usr/local/bin/perl
2:
3: print ("Specify the nonblank characters you want to count:\n");
4: $countstring = <STDIN>;
5: chop ($countstring);
6: @chars = split (/\s*/, $countstring);
7: while ($input = <>) {
8: $_ = $input;
9: foreach $char (@chars) {
10: eval ("\$count = tr/$char/$char/;");
11: $count{$char} += $count;
12: }
13: }
14: foreach $char (sort (@chars)) {
15: print ("$char appears $count{$char} times\n");
16: }
$ program17_1 file1
Specify the nonblank characters you want to count:
abc
a appears 8 times
b appears 2 times
c appears 3 times
$

This program first asks the user for a line
of input containing the characters to be counted. These characters
can be separated by spaces or jammed into a single word.
Line 5 takes the line of input containing the characters to be
counted and removes the trailing newline character. Line 6 then
splits the line of input into separate characters, each of which
is stored in an element of the array @chars. The pattern
/\s*/ splits on zero or more occurrences of a whitespace
character; this splits on every nonblank character and skips over
the blank characters.
Line 7 reads a line of input from a file whose name is specified
on the command line. Line 8 takes this line and stores it in the
system variable $_. (In most cases, system variables
can be assigned to, just like other variables.)
Lines 9-12 count the number of occurrences of each character in
the input string read in line 4. Each character, in turn, is stored
in $char, and the value of $char is substituted
into the string in line 10. This string is then passed to eval,
which executes the translate operation contained in the string.
The translate operation doesn't actually do anything because it
is "translating" a character to itself. However, it
returns the number of translations performed, which means that
it returns the number of occurrences of the character. This count
is assigned to $count.
For example, suppose that the variable $char contains
the character e and that $_ contains Hi
there!. In this case, the string in line 10 becomes the following
because e is substituted for $char in the string:
$count = tr/e/e/;
The call to eval executes this statement, which counts
the number of e's in Hi there!. Because there
are two e's in Hi there!, $count is
assigned 2.
An associative array, %count, keeps track of the number
of occurrences of each of the characters being counted. Line 11
adds the count returned by line 10 to the associative array element
whose subscript is the character currently being counted. For
example, if the program is currently counting the number of e's,
this number is added to the element $count{"e"}.
After all input lines have been read and their characters counted,
lines 14-16 print the total number of occurrences of each character
by examining the elements of %count.
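Listing 17.1 needs eval because tr does not interpolate variables into its character lists. If you would rather avoid eval, one possible alternative (a sketch only, not the listing's method) is to count pattern matches instead. The sample input and the list of characters below are assumptions made for illustration:

```perl
use strict;
use warnings;

# Characters to count and a sample input line (both illustrative).
my @chars = ("e", "h", "!");
my %count;
$_ = "Hi there!";

foreach my $char (@chars) {
    # quotemeta protects characters that are special in patterns.
    my $pattern = quotemeta ($char);
    # A global match in list context returns every match; assigning
    # that list to () and then to a scalar yields the number of matches.
    my $found = () = /$pattern/g;
    $count{$char} += $found;
}
foreach my $char (sort (@chars)) {
    print ("$char appears $count{$char} times\n");
}
```

Note that the match is case-sensitive, so the uppercase H in the sample is not counted as an h.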
The <> Operator and $_
In Listing 17.1, which you've just seen, the program reads a line
of input into a scalar variable named $input and then
assigns it to $_. There is a quicker way to carry out
this task, however. You can replace
while ($input = <>) {
$_ = $input;
# more stuff here
}
with the following code:
while (<>) {
# more stuff here
}
If the <> operator appears in a conditional expression
that is part of a loop (an expression that is part of a conditional
statement such as while or for) and it is not
to the right of an assignment operator, the Perl interpreter automatically
assigns the resulting input line to the scalar variable $_.
For example, Listing 17.2 shows a simple way to print the first
character of every input line read from the standard input file.
Listing 17.2. A simple program that assigns to $_
using <STDIN>.
1: #!/usr/local/bin/perl
2:
3: while (<STDIN>) {
4: ($first) = split (//, $_);
5: print ("$first\n");
6: }
$ program17_2
This is a test.
T
Here is another line.
H
^D
$

Because <STDIN> is inside a
conditional expression and is not assigned to a scalar variable,
the Perl interpreter assigns the input line to $_. The
program then retrieves the first character by passing $_
to split.
| NOTE |
The <> operator assigns to $_ only if it is contained in a conditional expression in a loop. The statement
<STDIN>;
reads a line of input from the standard input file and throws it away without changing the contents of $_. Similarly, the following statement does not change the value of $_:
if (<>) {
print ("The input files are not all empty.\n");
}
|
The chop Function and $_
By default, the chop function operates on the value stored
in the $_ variable. For example:
while (<>) {
chop;
# you can do things with $_ here
}
Here, the call to chop removes the last character from
the value stored in $_. Because the conditional expression
in the while statement has just assigned a line of input
to $_, chop gets rid of the newline character
that terminates each input line.
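Here is a self-contained sketch of the same loop. Two things in it are assumptions rather than part of the example above: it reads from an in-memory file (a Perl 5.8 feature) so that no input file is needed, and it uses chomp, the Perl 5 function that removes only a trailing line separator, in place of chop:

```perl
use strict;
use warnings;

# An in-memory file stands in for a real input file here.
my $text = "first line\nsecond line\n";
open (my $in, "<", \$text) || die ("Can't open in-memory file: $!\n");

my @lines;
while (<$in>) {          # each input line is assigned to $_
    chomp;               # operates on $_ by default
    push (@lines, $_);
}
close ($in);
print ("read: $_\n") foreach (@lines);
```

Unlike chop, chomp leaves the last character alone when the final line of a file has no trailing newline.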
The print Function and $_
The print function also operates on $_ by default.
The following statement writes the contents of $_ to
the standard output file:
print;
Listing 17.3 is an example of a program that simply writes out
its input, which it assumes is stored in $_. This program
is an implementation of the UNIX cat command, which reads
input files and displays their contents.
Listing 17.3. A simple version of the cat
command using $_.
1: #!/usr/local/bin/perl
2:
3: print while (<>);
$ program17_3 file1
This is the only line in file "file1".
$

This program uses the <> operator
to read a line of input at a time and store it in $_.
If the line is nonempty, the print function is called;
because no variable is specified with print, it writes
out the contents of $_.
| NOTE |
You can use this default version of print only if you are writing to the default output file (which is usually STDOUT but can be changed using the select function). If you are specifying a file variable when you call
print, you also must specify the value you are printing.
For example, to send the contents of $_ to the output file MYFILE, use the following command:
print MYFILE ($_);
|
The study Function and $_
If you do not specify a variable when you call study,
this function uses $_ by default:
study;
The study function increases the efficiency of programs
that repeatedly search the same variable. It is described in Chapter
13, "Process, String, and Mathematical Functions."
Benefits of the $_ Variable
The default behavior of the functions listed previously is useful
to remember when you are writing one-line Perl programs for use
with the -e option. For example, the following command
is a quick way to display the contents of the files file1,
file2, and file3:
$ perl -e "print while <>;" file1 file2 file3
Similarly, the following command changes all occurrences of abc
in file1, file2, and file3 to def:
$ perl -pi -e "s/abc/def/g" file1 file2 file3
| TIP |
Although $_ is useful in cases such as the preceding one, don't overuse it. Many Perl programmers write programs that have references to $_ running like an invisible thread through their programs.
Programs that overuse $_ are hard to read and are easier to break than programs that explicitly reference scalar variables you have named yourself.
|
The $0 variable contains the name of the program you
are running. For example, if your program is named perl1,
the statement
print ("Now executing $0...\n");
displays the following on your screen:
Now executing perl1...
The $0 variable is useful if you are writing programs
that call other programs. If an error occurs, you can determine
which program detected the error:
die ("$0: can't open input file\n");
Here, including $0 in the string passed to die
enables you to specify the filename in your error message. (Of
course, you can always leave off the trailing newline, which tells
Perl to print the filename and the line number when printing the
error message. However, $0 enables you to print the filename
without the line number, if that's what you want.)
| NOTE |
You can change your program name while it is running by modifying the value stored in $0.
|
The $< and $> variables contain, respectively,
the real user ID and effective user ID for the program. The real
user ID is the ID under which the user of the program logged in.
The effective user ID is the ID associated with this particular
program (which is not always the same as the real user ID).
| NOTE |
If you are not running your Perl program on the UNIX operating system, the $< and $> variables might have no meaning. Consult your local documentation for more details.
|
Listing 17.4 uses the real user ID to determine the user name
of the person running the program.
Listing 17.4. A program that uses the $<
variable.
1: #!/usr/local/bin/perl
2:
3: ($username) = getpwuid($<);
4: print ("Hello, $username!\n");
$ program17_4
Hello, dave!
$

The $< variable contains the real
user ID, which is the login ID of the person running this program.
Line 3 passes this user ID to getpwuid, which retrieves
the password file entry corresponding to this user ID. The user
name is the first element in this password file entry, and it is stored
in the scalar variable $username. Line 4 then prints
this user name.
| NOTE |
On certain UNIX machines, you can assign $< to $> (set the effective user ID to be the real user ID) or vice versa. If you have superuser privileges, you can set $< or $> to any defined user ID.
|
The $( and $) variables define the real group
ID and the effective group ID for this program. The real group
ID is the group to which the real user ID (stored in the variable
$<) belongs; the effective group ID is the group to
which the effective user ID (stored in the variable $>)
belongs.
If your system enables users to be in more than one group at a
time, $( and $) contain a list of group IDs,
with each pair of group IDs being separated by spaces. You can
convert this into an array by calling split.
Normally, you can only assign $( to $), and
vice versa. If you are the superuser, you can set $(
or $) to any defined group ID.
| NOTE |
$( and $) might not have any useful meaning if you are running Perl on a machine running an operating system other than UNIX.
|
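As a small sketch of the split call mentioned above (assuming a UNIX-like system; the variable names are illustrative), the real group IDs can be turned into an array like this:

```perl
use strict;
use warnings;

# $( may hold several group IDs separated by spaces.
my $real_groups = $(;
my @groups = split (' ', $real_groups);

# The first number in the list is the real group ID itself.
my $primary = $groups[0];
print ("primary group ID: $primary\n");
print ("all group IDs: @groups\n");
```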
The $] system variable contains the current version number.
You can use this variable to ensure that the Perl on which you
are running this program is the right version of Perl (or is a
version that can run your program).
Normally, $] contains a character string similar to this:
$RCSfile: perl.c,v $$Revision: 4.0.1.8 $$Date: 1993/02/05 19:39:30 $
Patch level: 36
The useful parts of this string are the revision number and the
patch level. The first part of the revision number indicates that
this is version 4 of Perl. The version number and the patch level
are often combined; in this notation, this is version 4.036 of
Perl.
You can use the pattern-matching operator to extract the useful
information from $]. Listing 17.5 shows one way to do
it.
Listing 17.5. A program that extracts information from the
$] variable.
1: #!/usr/local/bin/perl
2:
3: $] =~ /Revision: ([0-9.]+)/;
4: $revision = $1;
5: $] =~ /Patch level: ([0-9]+)/;
6: $patchlevel = $1;
7: print ("revision $revision, patch level $patchlevel\n");
$ program17_5
revision 4.0.1.8, patch level 36
$

This program just extracts the revision and
patch level from $] using the pattern-matching operator.
The built-in system variable $1, described later in this chapter,
is defined when a pattern is matched. It contains the substring
that appears in the first subpattern enclosed in parentheses.
In line 3, the first subpattern enclosed in parentheses is [0-9.]+.
This subpattern matches one or more digits mixed with decimal
points, and so it matches 4.0.1.8. This means that 4.0.1.8
is assigned to $1 by line 3 and is assigned to $revision
by line 4.
Similarly, line 5 assigns 36 to $1 (because the subpattern
[0-9]+, which matches one or more digits, is the first
subpattern enclosed in parentheses). Line 6 then assigns 36 to
$patchlevel.
| NOTE |
On some machines, the value contained in $] might be completely different from the value used in this example. If you are not sure whether $] has a useful value, write a little program that just prints $]. If this program prints
something useful, you'll know that you can run programs that compare $] with an expected value.
|
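The string shown above is the form reported by Perl 4. In Perl 5, $] instead holds a plain number such as 5.008001, so a version check becomes a numeric comparison. The following is a minimal sketch; the minimum version used here is an arbitrary assumption:

```perl
use strict;
use warnings;

# Hypothetical minimum version, chosen only for illustration.
my $minimum = 5.006;

if ($] >= $minimum) {
    print ("Perl $] is new enough\n");
} else {
    die ("Perl $] is older than $minimum\n");
}
```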
When the Perl interpreter is told to read a line of input from
a file, it usually reads characters until it reads a newline character.
The newline character can be thought of as an input line separator;
it indicates the end of a particular line.
The system variable $/ contains the current input line
separator. To change the input line separator, change the value
of $/. The $/ variable can be more than one
character long to handle the case in which lines are separated
by more than one character. If you set $/ to the null (empty)
string, the Perl interpreter assumes that the input line separator
is two or more consecutive newline characters (in other words, a blank line).
Listing 17.6 shows how changing $/ can affect your program.
Listing 17.6. A program that changes the value of $/.
1: #!/usr/local/bin/perl
2:
3: $/ = ":";
4: $line = <STDIN>;
5: print ("$line\n");
$ program17_6
Here is some test input: here is the end.
Here is some test input:
$

Line 3 sets the value of $/ to a colon.
This means that when line 4 reads from the standard input file,
it reads until it sees a colon. As a consequence, $line
contains the following character string:
Here is some test input:
Note that the colon is included as part of the input line (just
as, in the normal case, the trailing newline character is included
as part of the line).
| NOTE |
The -0 (zero, not the letter O) switch sets the value of $/. If you change the value of $/ in your program, the value specified by -0 will be thrown away.
To temporarily change the value of $/ and then restore it to the value specified by -0, save the current value of $/ in another variable before changing it.
For more information on -0, refer to Chapter 16, "Command-Line Options."
|
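Instead of saving and restoring $/ by hand, you can change it with local inside a block; the previous value returns automatically when the block ends. The sketch below assumes an in-memory file (a Perl 5.8 feature) and sample data invented for illustration:

```perl
use strict;
use warnings;

# Colon-separated records held in an in-memory file.
my $text = "alpha:beta:gamma";
open (my $in, "<", \$text) || die ("Can't open in-memory file: $!\n");

my @records;
{
    # local restores the previous value of $/ when this block ends.
    local $/ = ":";
    while (my $record = <$in>) {
        chomp ($record);     # chomp removes the current value of $/
        push (@records, $record);
    }
}
close ($in);
print ("record: $_\n") foreach (@records);
```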
The system variable $\ contains the current output line
separator. This is a character or sequence of characters that
is automatically printed after every call to print.
By default, $\ is empty, which indicates
that no output line separator is to be printed. Listing 17.7 shows
how you can set an output line separator.
Listing 17.7. A program that uses the $\
variable.
1: #!/usr/local/bin/perl
2:
3: $\ = "\n";
4: print ("Here is one line.");
5: print ("Here is another line.");
$ program17_7
Here is one line.
Here is another line.
$

Line 3 sets the output line separator to the
newline character. This means that a list passed to a subsequent
print statement always appears on its own output line.
Lines 4 and 5 now no longer need to include a newline character
as the last character in the line.
| NOTE |
The -l option sets the value of $\. If you change $\ in your program without saving it first, the value supplied with -l will be lost. See Chapter 16 for more information on the -l option.
|
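One way to avoid losing an existing value of $\ is to change it with local inside a block. The following sketch writes to an in-memory file (an assumption made so that the result can be inspected) and shows that the separator applies only inside the block:

```perl
use strict;
use warnings;

my $captured = "";
open (my $out, ">", \$captured) || die ("Can't open in-memory file: $!\n");
{
    # The previous value of $\ returns when this block ends.
    local $\ = "\n";
    print $out ("Here is one line.");
    print $out ("Here is another line.");
}
print $out ("No separator follows this text.");
close ($out);
print ($captured, "\n");
```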
The $, variable contains the character or sequence of
characters to be printed between elements when print
is called. For example, in the following statement the Perl interpreter
first writes the contents of $a:
print ($a, $b);
It then writes the contents of $, and then finally, the
contents of $b.
Normally, the $, variable is empty,
which means that the elements of a print statement
are printed next to one another. Listing 17.8 is a program that
sets $, before calling print.
Listing 17.8. A program that uses the $,
variable.
1: #!/usr/local/bin/perl
2:
3: $a = "hello";
4: $b = "there";
5: $, = " ";
6: $\ = "\n";
7: print ($a, $b);
$ program17_8
hello there
$

Line 5 sets the value of $, to a space.
Consequently, line 7 prints a space after printing $a
and before printing $b.
Note that $\, the default output separator, is set to
the newline character. This setting ensures that the terminating
newline character immediately follows $b. By contrast,
the following statement prints a space before printing the trailing
newline character:
print ($a, $b, "\n");
| NOTE |
Here's another way to print the newline immediately after the final element that doesn't involve setting $\:
print ($a, $b . "\n");
Here, the trailing newline character is part of the second element being printed. Because $b and \n are part of the same element, no space is printed between them.
|
Normally, if an array is printed inside a string, the elements
of the array are separated by a single space. For example:
@array = ("This", "is", "a", "list");
print ("@array\n");
Here, the print statement prints
This is a list
A space is printed between each pair of array elements.
The built-in system variable that controls this situation is the
$" variable. By default, $" contains
a space. Listing 17.9 shows how you can control your array output
by changing the value of $".
Listing 17.9. A program that uses the $"
variable.
1: #!/usr/local/bin/perl
2:
3: $" = "::";
4: @array = ("This", "is", "a", "list");
5: print ("@array\n");
$ program17_9
This::is::a::list
$

Line 3 sets the array element separator to
:: (two colons). Array element separators, like other
separators you can define, can be more than one character long.
Line 5 prints the contents of @array. Each pair of elements
is separated by the value stored in $", which is
two colons.
| NOTE |
The $" variable affects only entire arrays printed inside strings. If you print two variables together in a string, as in
print ("$a$b\n");
the contents of the two variables are printed with nothing separating them regardless of the value of $".
To change how arrays are printed outside strings, use the $, variable, described earlier in this chapter.
|
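The following sketch contrasts an array printed inside a string with the same array passed directly to print. The separator values and the in-memory output file are assumptions chosen for illustration:

```perl
use strict;
use warnings;

my @array = ("This", "is", "a", "list");
my ($inside, $outside) = ("", "");
open (my $out, ">", \$outside) || die ("Can't open in-memory file: $!\n");
{
    local $" = "::";     # used when an array appears inside a string
    local $, = "+";      # used between the elements passed to print
    $inside = "@array";
    print $out (@array);
}
close ($out);
print ("inside a string: $inside\n");
print ("outside a string: $outside\n");
```

Each separator affects only its own case: $" has no effect on the list passed to print, and $, has no effect on the interpolated string.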
By default, when the print function prints a number,
it prints it as a 20-digit floating point number in compact format.
This means that the following statements are identical if the
value stored in $x is a number:
print ($x);
printf ("%.20g", $x);
To change the default format that print uses to print
numbers, change the value of the $# variable. For example,
to specify only 15 digits of precision, use this statement:
$# = "%.15g";
This value must be a floating-point field specifier, as used in
printf and sprintf.
| NOTE |
The $# variable does not affect values that are not numbers and has no effect on the printf, write, and sprintf functions.
|
For more information on the field specifiers you can use as the
default value in $#, see "Formatting Output Using
printf" in Chapter 11, "Formatting Your Output."
| NOTE |
The $# variable is deprecated in Perl 5. This means that although $# is supported, it is not recommended for use and might be removed from future versions of Perl.
|
If a statement executed by the eval function contains
an error, or an error occurs during the execution of the statement,
the error message is stored in the system variable $@.
The program that called eval can decide either to print
the error message or to perform some other action.
For example, the statement
eval ("This is not a perl statement");
assigns the following string to $@:
syntax error in file (eval) at line 1, next 2 tokens "This is"
The $@ variable also returns the error generated by a
call to die inside an eval. The following statement
assigns this string to $@:
eval ("die (\"nothing happened\")");
nothing happened at (eval) line 1.
| NOTE |
The $@ variable also returns error messages generated by the require function. See Chapter 19, "Object-Oriented Programming in Perl," for more information on require.
|
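As a sketch of how a program might act on $@, the following uses the block form of eval available in Perl 5 (rather than the string form shown above) to trap a run-time error. The division is only a convenient way to cause one:

```perl
use strict;
use warnings;

my $divisor = 0;

# The block form of eval traps the run-time error instead of
# letting it end the program; eval returns undef on failure.
my $quotient = eval { 10 / $divisor };

# Copy $@ promptly, because a later eval overwrites it.
my $error = $@;
print ("caught: $error") if ($error);
```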
The $? variable returns the error status generated by
calls to the system function or by calls to functions
enclosed in back quotes, as in the following:
$username = `hostname`;
The error status stored in $? consists of two parts:
- The exit value (return code) of the process called by system
or specified in back quotes
- A status field that indicates how the process was terminated,
if it terminated abnormally
The value stored in $? is a 16-bit integer. The upper
eight bits are the exit value, and the lower eight bits are the
status field. To retrieve the exit value, use the >>
operator to shift the eight bits to the right:
$retcode = $? >> 8;
For more information on the status field, refer to the online
manual page for the wait function or to the file /usr/include/sys/wait.h.
For more information on commands in back quotes, refer to Chapter
20, "Miscellaneous Features of Perl."
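The following sketch decodes both parts of $?. To stay self-contained it runs a second copy of the Perl interpreter (found through the $^X variable) that simply exits with the value 3; the child command is an assumption made for illustration:

```perl
use strict;
use warnings;

# Run a child Perl process that exits with the value 3.
# $^X holds the path of the running Perl interpreter.
system ($^X, "-e", "exit 3");

my $retcode = $? >> 8;     # upper eight bits: the exit value
my $signal  = $? & 127;    # low seven bits: the terminating signal, if any
print ("exit value $retcode, signal $signal\n");
```

A signal value of zero indicates that the process ended on its own rather than being terminated by a signal.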
Some Perl library functions call system library functions. If
a system library function generates an error, the error code generated
by the function is assigned to the $! variable. The Perl
library functions that call system library functions vary from
machine to machine.
| NOTE |
The $! variable in Perl is equivalent to the errno variable in the C programming language.
|
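In Perl 5, $! yields the error number when used as a number and the corresponding message text when used as a string. The sketch below tries to open a file that is assumed not to exist (the path is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on this machine.
my $file = "/nonexistent/directory/file1";
my ($code, $message);

if (!open (my $in, "<", $file)) {
    $code    = $! + 0;     # numeric context: the error number
    $message = "$!";       # string context: the message text
    print ("Can't open $file: $message (error $code)\n");
}
```

Check $! immediately after the call that failed, because later system calls can change it.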
The $. variable contains the line number of the last
line read from an input file. If more than one input file is being
read, $. contains the line number for the file that was read
most recently. Listing 17.10 shows how $. works.
Listing 17.10. A program that uses the $.
variable.
1: #!/usr/local/bin/perl
2:
3: open (FILE1, "file1") ||
4: die ("Can't open file1\n");
5: open (FILE2, "file2") ||
6: die ("Can't open file2\n");
7: $input = <FILE1>;
8: $input = <FILE1>;
9: print ("line number is $.\n");
10: $input = <FILE2>;
11: print ("line number is $.\n");
12: $input = <FILE1>;
13: print ("line number is $.\n");
$ program17_10
line number is 2
line number is 1
line number is 3
$

When line 9 is executed, the input file FILE1
has had two lines read from it. This means that $. contains
the value 2. Line 10 then reads from FILE2.
Because it reads the first line from this file, $. now
has the value 1. When line 12 reads a third line from FILE1,
$. is set to the value 3. The Perl interpreter remembers
that two lines have already been read from FILE1.
| NOTE |
If the program is reading using <>, which reads from the files listed on the command line, $. treats the input files as if they are one continuous file. The line number is not reset when a new input file is opened.
You can use eof to test whether a particular file has ended, and then reset $. yourself (by assigning zero to it) before reading from the next file.
|
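Another way to restart the numbering for each file is to close the ARGV file variable when eof reports the end of the current file, which resets $. as a side effect. The sketch below creates two temporary files with the File::Temp module so that it can run on its own; the file contents are invented for illustration:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create two small input files so the sketch is self-contained.
my @names;
foreach my $content ("a\nb\n", "c\n") {
    my ($fh, $name) = tempfile (UNLINK => 1);
    print $fh ($content);
    close ($fh);
    push (@names, $name);
}

@ARGV = @names;
my @seen;
while (<>) {
    push (@seen, $.);
    # Closing ARGV at the end of each file resets $. to zero.
    close (ARGV) if (eof);
}
print ("line numbers seen: @seen\n");
```

Here eof is written without parentheses of its own, so it tests only the file currently being read.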
Normally, the operators that match patterns (the pattern-matching
operator and the substitution operator) assume that the character
string being searched is a single line of text. If the character
string being searched consists of more than one line of text (in
other words, it contains newline characters), set the system variable
$* to 1.
| NOTE |
By default, $* is set to 0, which indicates that multiline pattern matches are not required.
|
| NOTE |
The $* variable is deprecated in Perl 5. If you are running Perl 5, use the m pattern-matching option when matching in a multiple-line string. See Chapter 7, "Pattern Matching," for more details on this option.
|
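Here is a short sketch of the m option in Perl 5, using a two-line string invented for illustration:

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# Without the m option, ^ matches only at the start of the string.
my $plain = ($text =~ /^second/) ? 1 : 0;

# With the m option, ^ also matches just after each embedded newline.
my $multi = ($text =~ /^second/m) ? 1 : 0;
print ("without m: $plain, with m: $multi\n");
```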
Normally, when a program references the first element of an array,
it does so by specifying the subscript 0. For example:
@myarray = ("Here", "is", "a", "list");
$here = $myarray[0];
The array element $myarray[0] contains the string Here,
which is assigned to $here.
If you are not comfortable with using 0 as the subscript for the
first element of an array, you can change this setting by changing
the value of the $[ variable. This variable indicates
which value is to be used as the subscript for the first array
element.
Here is the preceding example, modified to use 1 as the first
array element subscript:
$[ = 1;
@myarray = ("Here", "is", "a", "list");
$here = $myarray[1];
In this case, the subscript 1 now references the first array element.
This means that $here is assigned Here, as before.
| TIP |
Don't change the value of $[. It is too easy for a casual reader of your program to forget that the subscript 0 no longer references the first element of the array. Besides, using 0 as the subscript for the first element is standard practice in
many programming languages, including C and C++.
|
| NOTE |
$[ is deprecated in Perl 5.
|
So far, all the arrays you've seen have been one-dimensional arrays,
which are arrays in which each array element is referenced by
only one subscript. For example, the following statement uses
the subscript foo to access an element of the associative
array named %array:
$myvar = $array{"foo"};
Perl 4 does not support multidimensional arrays directly. The following
statement is not legal in Perl 4 (Perl 5 accepts it by way of references):
$myvar = $array{"foo"}{"bar"};
However, Perl enables you to simulate a multidimensional associative
array using the built-in system variable $;.
Here is an example of a statement that accesses a (simulated)
multidimensional array:
$myvar = $array{"foo","bar"};
When the Perl interpreter sees this statement, it converts it
to this:
$myvar = $array{"foo" . $; . "bar"};
The system variable $; serves as a subscript separator.
It automatically replaces any comma that is separating two array
subscripts.
Here is another example of two equivalent statements:
$myvar = $array{"s1", 4, "hi there"};
$myvar = $array{"s1".$;.4.$;."hi there"};
The second statement shows how the value of the $; variable
is inserted into the array subscript.
By default, the value of $; is \034 (the Ctrl+\
character). You can define $; to be any value you want.
Listing 17.11 is an example of a program that sets $;.
Listing 17.11. A program that uses the $;
variable.
1: #!/usr/local/bin/perl
2:
3: $; = "::";
4: $array{"hello","there"} = 46;
5: $test1 = $array{"hello","there"};
6: $test2 = $array{"hello::there"};
7: print ("$test1 $test2\n");
$ program17_11
46 46
$

Line 3 sets $; to the string ::.
As a consequence, the subscript "hello","there"
in lines 4 and 5 is really hello::there because the Perl
interpreter replaces the comma with the value of $;.
Line 7 shows that both "hello","there"
and hello::there refer to the same element of the associative
array.
| NOTE |
If you set $;, be careful not to set it to a character that you are actually using in a subscript. For example, if you set $; to ::, the following statements reference the same element of the array:
$array{"a::b", "c"} = 1;
$array{"a", "b::c"} = 2;
In each case, the Perl interpreter replaces the comma with ::, producing the subscript a::b::c.
|
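Because each simulated multidimensional subscript is stored as a single string, you can recover the individual subscripts by splitting a key on the value of $;. The following sketch leaves $; at its default value; the array contents and variable names are illustrative:

```perl
use strict;
use warnings;

my %array;
$array{"hello", "there"} = 46;
$array{"s1", 4} = 7;

# quotemeta guards against separator characters that are special in patterns.
my $separator = quotemeta ($;);

my @decoded;
foreach my $key (sort (keys (%array))) {
    # Each key is the original subscripts joined by $;.
    my @subscripts = split (/$separator/, $key);
    push (@decoded, join ("|", @subscripts) . "=" . $array{$key});
}
print ("$_\n") foreach (@decoded);
```

This works only if no subscript contains the separator itself, which is why the rarely used default character is usually a safer choice than a printable string.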
In Chapter 11, you learned how to format your output using print formats
and the write statement. Each print format contains one
or more value fields that specify how output is to appear on the
page.
If a value field in a print format begins with the ^
character, the Perl interpreter puts a word in the value field
only if there is room enough for the entire word. For example,
in the following program (a duplicate of Listing 11.9),
1: #!/usr/local/bin/perl
2:
3: $string = "Here\nis an unbalanced line of\ntext.\n";
4: $~ = "OUTLINE";
5: write;
6:
7: format OUTLINE =
8: ^<<<<<<<<<<<<<<<<<<<<<<<<<<<
9: $string
10: .
the call to write uses the OUTLINE print format
to write the following to the screen:
Here is an unbalanced line
Note that the word of is not printed because it cannot
fit into the OUTLINE value field.
To determine whether a word can fit in a value field, the Perl
interpreter counts the number of characters between the next character
to be formatted and the next word-break character. A word-break
character is one that denotes either the end of a word or
a place where a word can be split into two parts.
By default, the legal word-break characters in Perl are the space
character, the newline character, and the - (hyphen)
character. The acceptable word-break characters are stored in
the system variable $:.
To change the list of acceptable word-break characters, change
the value of $:. For example, to ensure that all hyphenated
words are in the same line of formatted output, define $:
as shown here:
$: = " \n";
Now only the space and newline characters are legal word-break
characters.
| NOTE |
Normally, the tab character is not a word-break character. To allow lines to be broken on tabs, add the tab character to the list specified by the $: variable:
$: = " \t\n-";
|
The $$ system variable contains the process ID for the
Perl interpreter itself. This is also the process ID for your
program.
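One traditional use of $$ is to build a name that differs between two copies of the same program running at once. The sketch below assumes a /tmp directory exists and uses a made-up file name; it only constructs the name and does not create the file.

```perl
use strict;
use warnings;

# Hypothetical scratch-file name; "/tmp" is an assumption about the system.
my $scratch = "/tmp/myprog.$$";
print("process $$ would use $scratch\n");
```

Names built this way are predictable, so for files that matter the File::Temp module supplied with Perl 5 is usually a safer choice.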
When you use the <> operator, the Perl interpreter
reads input from each file named on the command line. For example,
suppose that you are executing the program myprog as
shown here:
$ myprog test1 test2 test3
In myprog, the first occurrence of the <>
operator reads from test1. Subsequent occurrences of
<> continue reading from test1 until it
is exhausted; at this point, <> reads from test2.
This process continues until all the input files have been read.
In Chapter 6, "Reading from and Writing to Files," you learned
that the @ARGV array lists the elements of the command
line and that the first element of @ARGV is removed when
the <> operator opens the file it names. (@ARGV also
is discussed later in this chapter.)
When the <> operator reads from a file for the
first time, it assigns the name of the file to the $ARGV
system variable. This enables you to keep track of what file is
currently being read. Listing 17.12 shows how you can use $ARGV.
Listing 17.12. A simple file-searching program using $ARGV.
1: #!/usr/local/bin/perl
2:
3: print ("Enter the search pattern:\n");
4: $string = <STDIN>;
5: chop ($string);
6: while ($line = <>) {
7: if ($line =~ /$string/) {
8: print ("$ARGV:$line");
9: }
10: }
$ program17_12 file1 file2 file3
Enter the search pattern:
the
file1:This line contains the word "the".
$

This program reads each line of the input files
supplied on the command line. If a line contains the pattern specified
by $string, line 8 prints the name of the file and then
the line itself. Note that the pattern in $string can
contain special pattern characters.
| NOTE |
If <> is reading from the standard input file (which occurs when you have not specified any input files on the command line), $ARGV contains the string - (a single hyphen).
|
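$ARGV combines naturally with the $. line-number variable when a program reports where a match was found. The following sketch creates two small input files of its own (their names and contents are invented for the example) and then reads them through <> as though they had been named on the command line.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build two small hypothetical input files so the example is self-contained.
my $dir = tempdir(CLEANUP => 1);
my %content = ("a.txt" => "one\nthe cat\n", "b.txt" => "the dog\nfour\n");
foreach my $name (sort keys %content) {
    open(my $out, ">", "$dir/$name") || die("cannot write $name: $!");
    print $out ($content{$name});
    close($out);
}

# Pretend these names were given on the command line.
@ARGV = ("$dir/a.txt", "$dir/b.txt");
my @hits;
while (my $line = <>) {
    if ($line =~ /the/) {
        # $ARGV names the file being read; keep only its last component.
        my ($base) = $ARGV =~ m{([^/]+)$};
        push(@hits, "$base:$.");
    }
    close(ARGV) if eof;    # restart $. at 1 for the next file
}
print("@hits\n");
```

Closing ARGV at the end of each file is what makes $. count lines per file; without it, the numbering continues across all the input files.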
The $^A variable is used by write to store formatted
lines to be printed. The contents of $^A are erased after
the line is printed.
This variable is defined only in Perl 5.
The $^D variable contains the current internal debugging
value. This variable is defined only when the -D switch
has been specified and when your Perl interpreter has been compiled
with debugging included.
See your online Perl documentation for more details on debugging
Perl. (Unless you are using an experimental version of Perl, you
are not likely to need to debug it.)
The $^F variable controls whether files are to be treated
as system files. Its value is the largest UNIX file descriptor
that is treated as a system file.
Normally, only STDIN, STDOUT, and STDERR
are treated as system files, and the value assigned to $^F
is 2. Unless you are on a UNIX machine, are familiar
with file descriptors, and want to do something exotic with them,
you are not likely to need to use the $^F system variable.
The $^I variable is defined by the Perl interpreter when
you specify the -i option (which edits files in place as they
are read by the <> operator). Its value is the backup-file
extension supplied with -i, or the empty string if no
extension was given.
The following statement turns off the editing of files being read
by <>:
undef ($^I);
When $^I is undefined, the next input file is opened
for reading, and the standard output file is no longer changed.
 |
DO open the files for input and output yourself if your program wants to edit some of its input files and not others; this process is easier to follow.
DON'T use $^I if you are reading files using the -n or -p option unless you really know what you are doing, because you are not likely to get the behavior you expect. If -i has modified the default output file, undefining $^I does not automatically set the default output file to STDOUT.
|
The $^L variable contains the character or characters
written out whenever a print format wants to start a new page.
The default value is \f, the form-feed character.
The $^P variable is used by the Perl debugger. When this
variable is set to zero, debugging is turned off.
You normally won't need to use $^P yourself, unless you
want to specify that a certain chunk of code does not need to
be debugged.
The $^T variable contains the time at which your program
began running. This time is in the same format as is returned
by the time function: the number of seconds since January
1, 1970.
The following statement sets the file-access and -modification
times of the file test1 to the time stored in $^T:
utime ($^T, $^T, "test1");
For more information on the time and utime functions,
refer to Chapter 12, "Working with the File System."
| NOTE |
The file test operators -A, -C, and -M measure file ages in days relative to the time stored in $^T; a file changed after the program started has a negative age.
|
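The link between $^T, utime, and the -M file test can be seen in a few lines. This sketch creates a scratch file (the file is invented for the example), stamps it one day earlier than the program's start time, and then asks for its age.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);   # hypothetical scratch file
close($fh);

# Stamp the file as accessed and modified exactly one day before start-up.
my $one_day = 24 * 60 * 60;
utime($^T - $one_day, $^T - $one_day, $path) || die("utime failed: $!");

my $age     = -M $path;       # age in days, measured from $^T
my $elapsed = time() - $^T;   # seconds this program has been running
print("age: $age day(s), running for $elapsed second(s)\n");
```

Because -M counts from $^T rather than from the current time, the reported age stays the same however long the program runs.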
The $^W system variable controls whether warning messages
are to be displayed. Normally, $^W is set to a nonzero
value only when the -w option is specified.
You can set $^W to zero to turn off warnings inside your
program. This capability is useful if your program contains statements
that generate warnings you want to ignore (because you know that
your statements are correct). For example:
$^W = 0; # turn off warning messages
# code that generates warnings goes here
$^W = 1; # turn warning messages back on
 |
Some warnings are printed before program execution starts (for example, warnings of possible typos). You cannot turn off these warnings by setting $^W to zero.
|
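Rather than setting $^W to zero and back to one by hand, a program can use local inside a block so that the old value returns automatically. The following is a sketch of that approach; it collects warnings in an array (through the __WARN__ entry of %SIG) purely so that the effect can be counted, and it assumes the Perl 5 warnings pragma is not in use.

```perl
use strict;
# Deliberately no "use warnings": that pragma takes precedence over $^W.

my @caught;
$SIG{__WARN__} = sub { push(@caught, $_[0]); };   # collect warnings quietly

$^W = 1;              # as if the program had been started with -w
my $missing;          # never assigned, so using it in arithmetic warns
my $total;

{
    local $^W = 0;            # off until the closing brace, then restored
    $total = $missing + 1;    # no warning is issued here
}
my $inside = scalar(@caught);

$total = $missing + 1;        # $^W is 1 again, so this statement warns
my $after = scalar(@caught);
print("warnings inside block: $inside, after block: $after\n");
```

The advantage of local is that warnings cannot be left switched off by accident if the block is exited early.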
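The $^X variable contains the name under which the Perl interpreter
itself was started, which is the first word of the command line that
ran the interpreter. If you used the perl command to start this program,
$^X typically contains perl. If you started this program by entering
its name, $^X usually contains the interpreter path named in the
program's #! line; some systems report a full path in either case.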
The following statement checks to see whether you started this
program with the command perl:
if ($^X ne "perl") {
print ("You did not use the 'perl' command ");
print ("to start this program.\n");
}
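A common practical use of $^X is to run another Perl program with the same interpreter that is running the current one. The sketch below starts a one-line child program; the exit code 3 is arbitrary and chosen only so that the result is easy to recognize.

```perl
use strict;
use warnings;

# Run a one-line child program with the same interpreter as this one.
my $status = system($^X, "-e", "exit 3");

# The high byte of the value returned by system holds the exit code.
my $exit_code = $status >> 8;
print("child exited with code $exit_code\n");
```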
The system variables you have seen so far are all defined throughout
your program. The following system variables are defined only
in the current block of statements you are running. (A block
of statements is any group of statements enclosed in the brace
characters { and }.) These pattern system
variables are set by the pattern-matching operator and the
other operators that use patterns (such as the substitution
operator). Many of these pattern system variables were first introduced
in Chapter 7.
| TIP |
Even though the pattern system variables are defined only inside a particular block of statements, your programs should not take advantage of that fact. The safest way to use the pattern-matching variables is to assign any value that you might need to a scalar variable of your own.
|
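The advice in the preceding tip can be followed in two ways, both sketched here with a made-up input string. In a list context, the pattern-matching operator returns the text of the parenthesized subpatterns directly; otherwise, the values can be copied as soon as the match is known to have succeeded.

```perl
use strict;
use warnings;

my $reading = "temp=26.147";    # hypothetical input

# List context: the match returns the captured text directly.
my ($int_part, $frac_part) = $reading =~ /(-?\d+)\.(\d+)/;

# Scalar context: test the match, then copy $1 and $2 at once.
my ($saved1, $saved2);
if ($reading =~ /(-?\d+)\.(\d+)/) {
    ($saved1, $saved2) = ($1, $2);
}
print("$int_part.$frac_part $saved1.$saved2\n");
```

Either way, later pattern matches can no longer disturb the saved values.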
When you specify a pattern for the pattern-matching or substitution
operator, you can enclose parts of the pattern in parentheses.
For example, the following pattern encloses the subpattern \d+
in parentheses. (The parentheses themselves are not part of the
pattern.)
/(\d+)\./
This subpattern matches one or more digits.
After a pattern has been matched, the system variables $1,
$2, and so on contain the text matched by the subpatterns
enclosed in parentheses.
For example, suppose that the following pattern is successfully
matched:
/(\d+)([a-z]+)/
In this case, the match found must consist of one or more digits
followed by one or more lowercase letters. After the match has
been found, $1 contains the sequence of one or more digits,
and $2 contains the sequence of one or more lowercase
letters.
Listing 17.13 is an example of a program that uses $1,
$2, and $3 to match subpatterns.
Listing 17.13. A program that uses variables containing matched
subpatterns.
1: #!/usr/local/bin/perl
2:
3: while (<>) {
4: while (/(-?\d+)\.(\d+)([eE][+-]?\d+)?/g) {
5: print ("integer part $1, decimal part $2");
6: if ($3 ne "") {
7: print (", exponent $3");
8: }
9: print ("\n");
10: }
11: }
$ program17_13 file1
integer part 26, decimal part 147, exponent e-02
integer part -8, decimal part 997
$

This program reads each input line and searches
for floating-point numbers. Line 4 matches if a floating-point
number is found. (Line 4 is a while statement, not an
if, to enable the program to detect lines containing
more than one floating-point number. The loop starting in line
4 iterates until no more matches are found on the line.)
When a match is found, the first set of parentheses matches the
digits before the decimal point; these digits are copied into
$1. The second set of parentheses matches the digits
after the decimal point; these matched digits are stored in $2.
The third set of parentheses matches an optional exponent; if
the exponent exists, it is stored in $3.
Line 5 prints the values of $1 and $2 for each
match. If $3 is defined, its value is printed by line
7.
 |
DO use $1, not $0, to retrieve the first matched subpattern. $0 contains the name of the program you are running.
DON'T confuse $1 with \1. \1, \2, and so on are defined only inside a pattern. See Chapter 7 for more information on \1.
|
In patterns, parentheses are counted starting from the left. This
rule tells the Perl interpreter how to handle nested parentheses:
/(\d+(\.)?\d+)/
This pattern matches one or more digits optionally containing
a decimal point. When this pattern is matched, the outer set of
parentheses is considered to be the first set of parentheses;
these parentheses contain the entire matched number, which is
stored in $1.
The inner set of parentheses is treated as the second set of parentheses
because it includes the second left parenthesis seen by the pattern
matcher. The variable $2, which contains the subpattern
matched by the second set of parentheses, contains .
(a period) if a decimal point is matched; if it is not,
$2 is undefined, which behaves like the empty string.
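The numbering of nested parentheses can be checked with a short test. This sketch applies the pattern above to two sample strings (chosen for illustration) and records what the outer and inner sets of parentheses captured.

```perl
use strict;
use warnings;

my @seen;
foreach my $text ("3.14", "42") {
    if ($text =~ /(\d+(\.)?\d+)/) {
        my $whole = $1;                       # outer parentheses
        # The inner parentheses capture nothing when no period is present.
        my $point = defined($2) ? $2 : "";
        push(@seen, "[$whole][$point]");
    }
}
print("@seen\n");
```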
When a pattern is matched successfully, the matched text string
is stored in the system variable $&. This is the most
direct way to retrieve the matched text, because the pattern matcher
itself returns only a true or false value indicating whether the
match succeeded. (You could instead enclose the entire pattern in
parentheses and then check the value of $1; however,
$& is easier to use in this case.) Listing 17.14 is a
program that uses $& to count all the digits in a set
of input files.
Listing 17.14. A program that uses $&.
1: #!/usr/local/bin/perl
2:
3: while ($line = <>) {
4: while ($line =~ /\d/g) {
5: $digitcount[$&]++;
6: }
7: }
8: print ("Totals for each digit:\n");
9: for ($i = 0; $i <= 9; $i++) {
10: print ("$i: $digitcount[$i]\n");
11: }
$ program17_14 file1
Totals for each digit:
0: 11
1: 6
2: 3
3: 1
4: 2
5:
6: 1
7:
8:
9: 1
$

This program reads one line at a time from
the files specified on the command line. Line 4 matches each digit
in the input line in turn; the matched digit is stored in $&.
Line 5 takes the value of $& and uses it as the subscript
for the array @digitcount. This array keeps a count of
the number of occurrences of each digit.
When the input files have all been read, lines 9-11 print the
totals for each digit.
| NOTE |
If you need the value of $&, be sure to get it before exiting the while loop or other statement block in which the pattern is matched. (A statement block is exited when the Perl interpreter sees a } character.)
For example, the pattern matched in line 4 cannot be accessed outside of lines 4-6 because this copy of $& is defined only in these lines. (This rule also holds true for all the other pattern system variables defined in this chapter.)
The best rule to follow is to either use or assign a pattern system variable immediately following the statement that matches the pattern.
|
When a pattern is matched, the text of the match is stored in
the system variable $&. The rest of the string is
stored in two other system variables:
- The unmatched text preceding the match is stored in the $`
variable.
- The unmatched text following the match is stored in the $'
variable.
For example, if the Perl interpreter searches for the /\d+/
pattern in the string qwerty1234uiop, it matches 1234,
which is stored in $&. The substring qwerty,
which precedes the match, is stored in $`. The rest of
the string, uiop, is stored in $'.
If the beginning of a text string is matched, $` is set
to the empty string. Similarly, if the last character in the string
is part of the match, $' is set to the empty string.
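The three variables can be copied together immediately after a match. This short sketch uses the same string as the example above.

```perl
use strict;
use warnings;

my $text = "qwerty1234uiop";
my ($before, $matched, $after);
if ($text =~ /\d+/) {
    # Text before the match, the match itself, and text after it.
    ($before, $matched, $after) = ($`, $&, $');
}
print("$before|$matched|$after\n");
```

Joining the three values back together always reproduces the original string.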
The $+ variable contains the text matched by the last subpattern
enclosed in parentheses. For example, when the following pattern is matched,
$+ contains the digits after the decimal point:
/(\d+)\.(\d+)/
This variable is useful when the last part of a pattern is the
only part you really need to look at.
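$+ is also convenient when a pattern offers alternatives and only one set of parentheses takes part in the match. The sketch below, which uses two invented header lines, picks up whichever value was captured without checking $1 and $2 separately.

```perl
use strict;
use warnings;

my @found;
foreach my $header ("Version: 1.2", "Revision: 7") {
    # Only one of the two sets of parentheses can match a given line.
    if ($header =~ /Version: (.*)|Revision: (.*)/) {
        push(@found, $+);
    }
}
print("@found\n");
```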
Several system variables are associated with file variables.
One copy of each file system variable is defined for each file
that is referenced in your Perl program. Many of these system
variables were first introduced in Chapter 11. The variables mentioned
there are described again here for your convenience.
When the write statement sends formatted output to a
file, it uses the value of the $~ system variable for
that file to determine the print format to use.
When a program starts running, the default value of $~
for each file is the same as the name of the file variable that
represents the file. For example, when you write to the file represented
by the file variable MYFILE, the default value of $~
is MYFILE. This means that write normally uses
the MYFILE print format. (For the standard output file,
this default print format is named STDOUT.)
If you want to specify a different print format, change the value
of $~ before calling the write function. For
example, to use the print format MYFORMAT when writing
to the standard output file, use the following code:
select (STDOUT); # making sure you are writing to STDOUT
$~ = "MYFORMAT";
write;
This call to write uses MYFORMAT to format its
output.
 |
Remember that one copy of $~ is defined for each file variable. Therefore, the following code is incorrect:
$~ = "MYFORMAT";
select (MYFILE);
write;
In this example, the assignment to $~ changes the default print format for whatever the current output file happens to be. This assignment does not affect the default print format for MYFILE because MYFILE is selected after $~ is assigned. To change the default print format for MYFILE, select it first:
select (MYFILE);
$~ = "MYFORMAT";
write;
This call to write now uses MYFORMAT to write to MYFILE.
|
The $= variable defines the page length (number of lines
per page) for a particular output file. $= is normally
initialized to 60, which is the value that the Perl interpreter
assumes is the page length for every output file. This page length
includes the lines left for page headers, and it is the length
that works for most printers.
If you are directing a particular output file to a printer with
a nonstandard page length, change the value of $= for
this file before writing to it:
select ("WEIRDLENGTH");
$= = 72;
This code sets the page length for the WEIRDLENGTH file
to 72.
 |
$= is set to 60 by default only if a page header format is defined for the page. If no page header is defined, $= is set to 9999999 because Perl assumes that you want your output to be a continuous stream.
If you want paged output without a page header, define an empty page header for the output file.
|
The $- variable associated with a particular file variable
contains the number of lines left on the current page of that file.
Each call to write subtracts the number of lines printed
from $-. If write is called when $-
is zero, a new page is started. (If $- is greater than
zero, but write is printing more lines than the value
of $-, write starts a new page in the middle
of its printing operation.)
When a new page is started, the initial value of $- is
the value stored in $=, which is the number of lines
on the page.
The program in Listing 17.15 displays the value of $-.
Listing 17.15. A program that displays $-.
1: #!/usr/local/bin/perl
2:
3: open (OUTFILE, ">outfile");
4: select ("OUTFILE");
5: write;
6: print STDOUT ("lines to go before write: $-\n");
7: write;
8: print STDOUT ("lines to go after write: $-\n");
9: format OUTFILE =
10: This is a test.
11: .
12: format OUTFILE_TOP =
13: This is a test.
14: .
$ program17_15
lines to go before write: 58
lines to go after write: 57
$

Line 3 opens the output file outfile
and associates the file variable OUTFILE with this file.
Line 4 then calls select, which sets the default output
file to OUTFILE.
Line 5 calls write, which starts a new page. Line 6 then
sends the value of $- to the standard output file, STDOUT,
by specifying STDOUT in the call to print. Note
that the copy of $- printed is the copy associated with
OUTFILE, not STDOUT, because OUTFILE
is currently the default output file.
Line 7 calls write, which sends a line of output to OUTFILE
and decreases the value of $- by one. Line 8 prints this
new value of $-.
| NOTE |
If you want to force your next output to appear at the beginning of a new page, you can set $- to zero yourself before calling write.
When a file is opened, the copy of $- for this file is given the initial value of zero. This ensures that the first call to write always starts a page (and generates the header for the page).
|
When write starts a new page, you can specify the page
header that is to appear on the page. To do this, define a page
header print format for the output file to which the page is to
be sent.
The system variable $^ contains the name of the print
format to be used for printing page headers. If this format is
defined, page headers are printed; if it does not exist, no page
headers are printed.
By default, the copy of $^ for a particular file is set
equal to the name of the file variable plus the string _TOP.
For example, for the file represented by the file variable MYFILE,
$^ is given an initial value of MYFILE_TOP.
To change the page header print format for a particular file,
set the default output file by calling select, and then
set $^ to the print format you want to use. For example:
select (MYFILE);
$^ = "MYHEADER";
This code changes the default output file to MYFILE and
then changes the page header print format for MYFILE
to MYHEADER. As always, you must remember to select
the file before changing $^ because each file has its
own copy of $^.
When you send output to a file using print or write,
the operating system might not write it right away. Some systems
first send the output to a special array known as a buffer;
when the buffer becomes full, it is written all at once. This
process of output buffering is usually a more efficient way to
write data.
In some circumstances, you might want to send output straight
to your output file without using an intervening buffer. (For
example, two processes might be sending output to the standard
output file at the same time.)
The $| system variable indicates whether a particular
file is buffered. By default, the Perl interpreter defines a buffer
for each output file, and $| is set to 0. To eliminate
buffering for a particular file, select the file and then set
the $| variable to a nonzero value. For example, the
following code eliminates buffering for the MYFILE output
file:
select ("MYFILE");
$| = 1;
These statements set MYFILE as the default output file
and then turn off buffering for it.
 |
If you want to eliminate buffering for a particular file, you must set $| before writing to the file for the first time because the operating system creates the buffer when it performs the first write operation.
|
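Because select changes the default output file for everything that follows, programs often save the value it returns and restore it once $| has been set. The following is a sketch of that idiom; the scratch file stands in for whatever file MYFILE would really represent.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);   # hypothetical log file
close($tmp);
open(MYFILE, ">$path") || die("cannot open $path: $!");

my $previous = select(MYFILE);   # select returns the file that was current
$| = 1;                          # turn off buffering for MYFILE only
select($previous);               # put the old default output file back

print MYFILE ("written at once\n");
my $size_while_open = -s $path;  # the line has already reached the file
close(MYFILE);
print("$size_while_open bytes were on disk before close\n");
```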
Each output file opened by a Perl program has a copy of the $%
variable associated with it. This variable stores the current
page number. When write starts a new page, it adds one
to the value of $%. Each copy of $% is initialized
to 0, which ensures that $% is set to 1 when the first
page is printed. $% often is displayed by page header
print formats.
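The file system variables described in this section can be seen working together in a small report. This is a sketch only: the file variable REPORT, the two format names, and the deliberately tiny page length of 4 are all choices made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

our $item;                                   # a variable the formats can see
my ($tmp, $path) = tempfile(UNLINK => 1);    # hypothetical report file
close($tmp);
open(REPORT, ">$path") || die("cannot open $path: $!");

my $previous = select(REPORT);   # the variables below all belong to REPORT
$^ = "REPORT_HEADER";            # page header print format
$~ = "REPORT_LINE";              # print format for each line of the body
$= = 4;                          # tiny page: 1 header line plus 3 body lines

foreach my $value ("a" .. "e") {
    $item = $value;
    write;                       # writes to REPORT, the selected file
}
my $pages = $%;                  # page number reached for REPORT
select($previous);
close(REPORT);
print("the report used $pages pages\n");

format REPORT_HEADER =
Page @<
$%
.
format REPORT_LINE =
item @<
$item
.
```

REPORT stays selected while the report is written because $% always refers to the current default output file.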
The system variables you've seen so far have all been scalar
variables. The following sections describe the array variables
that are automatically defined for use in Perl programs. All of
these variables, except for the @_ variable, are global
variables: their value is the same throughout a program.
The @_ variable, which is defined inside each subroutine,
is a list of all the arguments passed to the subroutine.
For example, suppose that the subroutine my_sub is called
as shown here:
&my_sub("hello", 46, $var);
The values hello and 46, plus the value stored
in $var, are combined into a three-element list. Inside
my_sub, this list is stored in @_.
In a subroutine, the @_ array can be referenced or modified,
just as with any other array variable. Most subroutines, however,
assign @_ to locally defined scalar variables using the
local function:
sub my_sub {
local ($arg1, $arg2, $arg3) = @_;
# more stuff goes here
}
Here, the local statement defines three local variables,
$arg1, $arg2, and $arg3. $arg1
is assigned the first element of the list stored in @_,
$arg2 is assigned the second, and $arg3 is assigned
the third.
For more information on subroutines, refer to Chapter 9, "Using
Subroutines."
| NOTE |
If the shift function is called inside a subroutine with no argument specified, the @_ variable is assumed, and its first element is removed.
|
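In Perl 5, the my function is generally preferred over local for subroutine arguments, and it can be combined with shift. The following sketch shows both ways of unpacking @_; the subroutine name and its arguments are invented for the example.

```perl
use strict;
use warnings;

sub describe {
    my $label = shift;          # with no argument, shift removes $_[0]
    my ($count, $unit) = @_;    # the arguments that remain
    return "$label: $count $unit";
}

my $text = describe("width", 46, "cm");
print("$text\n");
```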
When you run a Perl program, you can specify values that are to
be passed to the program by including them on the command line.
For example, the following command calls the Perl program myprog
and passes it the values hello and 46:
$ myprog "hello" 46
Inside the Perl program, these values are stored in a special
built-in array named @ARGV. In this example, @ARGV
contains the list ("hello", 46).
Here is a simple statement that prints the values passed on the
command line:
print ("@ARGV\n");
The @ARGV array also is associated with the <>
operator. This operator treats the elements in @ARGV
as filenames; each file named in @ARGV is opened and
read in turn. Refer to Chapter 6 for a description of the <>
operator.
| NOTE |
If the shift function is called in the main body of a program (outside a subroutine) and no arguments are passed with it, the Perl interpreter assumes that the @ARGV array is to have its first element removed.
The following loop assigns each element of @ARGV, in turn, to the variable $var:
while ($var = shift) {
# stuff
}
|
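A loop that tests the value returned by shift stops early if an argument is 0 or the empty string, because both count as false. A sketch of one way around this, using defined, follows; the argument list is assigned directly so that the example does not depend on a real command line.

```perl
use strict;
use warnings;

@ARGV = ("alpha", "0", "beta");   # stand-in for real command-line arguments

my @seen;
# A bare shift in the main body of a program would also operate on @ARGV;
# the array is named here so the loop behaves the same inside a subroutine.
while (defined(my $var = shift(@ARGV))) {
    push(@seen, $var);
}
print("@seen\n");
```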
In Perl, if you specify the -n or -p option,
you can also supply the -a option. This option tells
the Perl interpreter to break each input line into individual
words (throwing away all tabs and spaces). These words are stored
in the built-in array variable @F. After an input line
has been (automatically) read, the @F array variable
behaves like any other array variable.
For more information on the -a, -n, or -p
options, refer to Chapter 16, "Command-Line Options."
| NOTE |
When the -a option is specified and an input line is broken into words, the original input line can still be accessed because it is stored in the $_ system variable.
|
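The effect of -a can be imitated in an ordinary program by calling split with a single space as the pattern. This sketch does so for two invented input lines; the array is declared with my here, whereas the real @F created by -a is a global variable.

```perl
use strict;
use warnings;

my @lines = ("  alpha\tbeta  gamma\n", "one two\n");   # hypothetical input
my @counts;
foreach (@lines) {
    # What -a does for each line read under -n or -p.
    my @F = split(' ', $_);
    push(@counts, scalar(@F) . ":$F[0]");
}
print("@counts\n");
```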
The @INC array variable contains a list of directories
to be searched for files requested by the require function.
This list consists of the following items, in order from first
to last:
- The directories specified by the -I option
- The Perl library directory, which is normally /usr/local/lib/perl
- The current working directory (represented by the .
character)
Like any array variable, @INC can be added to or modified.
For more information on the require function, refer to
Chapter 19.
The built-in associative array %INC lists the files requested
by the require function that have already been found.
When require finds a file, the associative array element
$INC{file} is defined, in which file is the
name of the file. The value of this associative array element
is the location of the actual file.
When require requests a file, the Perl interpreter first
looks to see whether an associative array element has already
been created for this file. This action ensures that the interpreter
does not try to include the same code twice.
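Both variables can be examined directly. In this sketch, the directory added to @INC is a made-up path, and File::Basename is used only because it is a module normally supplied with Perl 5.

```perl
use strict;
use warnings;

unshift(@INC, "/home/me/perllib");   # hypothetical private library directory

require File::Basename;              # a module shipped with Perl 5
my $where = $INC{"File/Basename.pm"};
print("File::Basename was loaded from $where\n");
```

Adding a directory to the front of @INC means it is searched before the standard locations.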
The %ENV associative array lists the environment variables
defined for this program and their values. The environment variables
are the array subscripts, and the values of the variables are
the values of the array elements.
For example, the following statement assigns the value of the
environment variable TERM to the scalar variable $term:
$term = $ENV{"TERM"};
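Elements of %ENV can also be assigned, and the new values are passed on to any programs that your program starts. The sketch below supplies a default when TERM is not set and then shows a child process seeing a changed value; MYPROG_MODE is an invented variable name.

```perl
use strict;
use warnings;

# Fall back to a default when the variable is not set at all.
my $term = defined($ENV{"TERM"}) ? $ENV{"TERM"} : "unknown";

# MYPROG_MODE is a made-up name used only for this illustration.
$ENV{"MYPROG_MODE"} = "test";

# A child process inherits the modified environment.
open(my $child, "-|", $^X, "-e", 'print $ENV{"MYPROG_MODE"}')
    || die("cannot start child: $!");
my $seen_by_child = <$child>;
close($child);
print("TERM is $term; child saw $seen_by_child\n");
```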
In the UNIX environment, processes can send signals to other processes.
These signals can, for example, interrupt a running program, trigger
an alarm in the program, or kill off the program.
You can control how your program responds to signals it receives.
To do this, modify the %SIG associative array. This array
contains one element for each available signal, with the signal
name serving as the subscript for the element. For example, the
INT (interrupt) signal is represented by the $SIG{"INT"}
element.
The value of a particular element of %SIG is the action
that is to be performed when the signal is received. By default,
the value of an array element is DEFAULT, which tells
the program to do what it normally does when it receives this
signal.
You can override the default action for some of the signals in
two ways: you can tell the program to ignore the signal, or you
can define your own signal handler. (Some signals, such as KILL,
cannot be overridden.)
To tell the program to ignore a particular type of signal, set
the value of the associative array element for this signal to
IGNORE. For example, the following statement indicates
that the program is to ignore any INT signals it receives:
$SIG{"INT"} = "IGNORE";
If you assign any value other than DEFAULT or IGNORE
to a signal array element, this value is assumed to be the name
of a function that is to be executed when this signal is received.
For example, the following statement tells the program to jump
to the subroutine named interrupt when it receives an
INT signal:
$SIG{"INT"} = "interrupt";
Subroutines that can be jumped to when a signal is received are
called interrupt handlers, because signals interrupt normal
program execution. Listing 17.16 is an example of a program that
defines an interrupt handler.
Listing 17.16. A program containing an interrupt handler.
1: #!/usr/local/bin/perl
2:
3: $SIG{"INT"} = "wakeup";
4: sleep();
5:
6: sub wakeup {
7: print ("I have woken up!\n");
8: exit();
9: }
$ program17_16
I have woken up!
$

Line 3 tells the Perl interpreter that the
program is to jump to the wakeup subroutine when it receives
the INT signal. Line 4 tells the program to go to sleep.
Because no argument is passed to sleep, the program will
sleep until a signal wakes it up.
To wake up the process, get the process ID using the ps
command, and then send an INT signal to the process using
the kill command. (See the manual page for kill,
and the related documentation for signal handling, to see how
to perform this task in your environment.)
When the program receives the INT signal, it executes
the wakeup subroutine. This subroutine prints the following
message and then exits:
I have woken up!
If desired, you can use the same subroutine to handle more than
one signal. The signal actually sent is passed as an argument
to the called subroutine, which ensures that your subroutine can
determine which signal triggered it:
sub interrupt {
local ($signal) = @_;
print ("Interrupted by the $signal signal.\n");
}
If a subroutine exits normally, the program returns to where it
was executing when it was interrupted. If a subroutine calls exit
or die, the program execution is terminated.
| NOTE |
When a program continues executing after being interrupted, the element of %SIG corresponding to the received signal is reset to DEFAULT. To ensure that repeated signals are trapped by your interrupt handler, redefine the appropriate element of %SIG.
|
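A handler can reinstall itself each time it runs, which covers the case described in the preceding note. The following sketch assumes a UNIX system with a USR1 signal and has the program send the signal to itself; it also uses a Perl 5 reference to the subroutine instead of its name.

```perl
use strict;
use warnings;

my $count = 0;
sub on_usr1 {
    my ($signal) = @_;            # the signal name, here "USR1"
    $SIG{$signal} = \&on_usr1;    # reinstall, in case the system reset it
    $count++;
}
$SIG{"USR1"} = \&on_usr1;

kill("USR1", $$);    # send the signal to this very process
kill("USR1", $$);    # and once more, to show the handler is still in place
print("handler ran $count times\n");
```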
Perl provides several built-in file variables, most of which
you have previously seen. The only file variables that have not
yet been discussed are DATA and _ (underscore).
The others are briefly described here for the sake of completeness.
The file variable STDIN is, by default, associated with
the standard input file. Using STDIN with the <>
operator, as in <STDIN>, normally reads data from
your keyboard. If your shell has used < or some equivalent
redirection operator to specify input from a file, <STDIN>
reads from that file.
The file variable STDOUT normally writes to the standard
output file, which is usually directed to your screen. If your
shell has used > or the equivalent to redirect standard
output to a file, writing to STDOUT sends output to that
file.
STDERR represents the standard error file, which is almost
always directed to your screen. Writing to STDERR ensures
that you see error messages even when you have redirected the
standard output file.
You can associate STDIN, STDOUT, or STDERR
with some other file using open:
open (STDIN, "myinputfile");
open (STDOUT, ">myoutputfile");
open (STDERR, ">myerrorfile");
Opening a file and associating it with STDIN overrides
the default value of STDIN, which means that you can
no longer read from the standard input file. Similarly, opening
a file and associating it with STDOUT or STDERR
means that writing to that particular file variable no longer
sends output to the screen.
To associate a file variable with the standard input file after
you have redirected STDIN, specify a filename of -:
open (MYSTDIN, "-");
To associate a file variable with the standard output file, specify
a filename of >-:
open (MYSTDOUT, ">-");
You can, of course, specify STDIN with - or
STDOUT with >- to restore the original values
of these file variables.
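One widely used way to redirect STDOUT for a while and then get the original back is to keep a duplicate of it, using the >& form of open. This is a sketch under that assumption; SAVEOUT and CHECK are invented file variable names, and a scratch file stands in for the real output file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);   # hypothetical capture file
close($tmp);

open(SAVEOUT, ">&STDOUT") || die("cannot duplicate STDOUT: $!");
open(STDOUT, ">$path")    || die("cannot redirect STDOUT: $!");
print("captured line\n");                   # goes to the file
close(STDOUT);
open(STDOUT, ">&SAVEOUT") || die("cannot restore STDOUT: $!");
close(SAVEOUT);

open(CHECK, $path) || die("cannot read $path: $!");
my $captured = <CHECK>;
close(CHECK);
print("the file received: $captured");
```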
ARGV is a special file variable that is associated with
the current input file being read by the <> operator.
For example, consider the following statement:
$line = <>;
This statement reads from the current input file. Because ARGV
represents the current input file, the preceding statement is
equivalent to this:
$line = <ARGV>;
You normally will not need to access ARGV yourself except
via the <> operator.
The DATA file variable is used with the __END__
special value, which can be used to indicate the end of a program.
Reading from DATA reads the lines that follow __END__,
which enables you to include a program and its data in the same
file.
Listing 17.17 is an example of a program that reads from DATA.
Listing 17.17. An example of the DATA
file variable.
1: #!/usr/local/bin/perl
2:
3: $line = <DATA>;
4: print ("$line");
5: __END__
6: This is my line of data.
$ program17_17
This is my line of data.
$

The __END__ value in line 5 indicates
the end of the program. When line 3 reads from the DATA
file variable, the first line after __END__ is read in
and is assigned to $line. (Subsequent requests for input
from DATA read successive lines, if any exist.) Line
4 then prints this input line.
| NOTE |
For more information on __END__ and methods of indicating the end of the program, refer to Chapter 20, "Miscellaneous Features of Perl."
|
The _ (underscore) file variable represents the file
specified by the last call to either the stat function
or a file test operator. For example:
$readable = -r "/u/jqpublic/myfile";
$writeable = -w _;
Here, the _ file variable used in the second statement
refers to /u/jqpublic/myfile because this is the filename
that was passed to -r.
The _ file variable holds only the file-status information saved by
that last call; it does not refer to an open file, so it is useful only
with stat and the file test operators:
if (-T $myoutfile && -w _) {
print ("$myoutfile is a writable text file\n");
}
Here, the file whose name is stored in $myoutfile is
associated with _ because this name was passed to -T
(which tests whether the file is a text file). The -w test
then reuses the saved information for that file.
The main benefit of _ is that it saves time when you
are using several file-test operators at once:
if (-r "myfile" || -w _ || -x _) {
print ("I can read, write, or execute myfile.\n");
}
Using _ rather than myfile saves time because
file test operators normally call the UNIX system function stat.
If you specify _, the Perl interpreter reuses
the results of the preceding call to the UNIX stat function
instead of calling it again.
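The sketch below shows several file tests sharing a single stat call. It assumes a scratch file created with the standard File::Temp module so that it does not depend on any existing path; the @props array is a hypothetical name.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so the sketch does not depend on existing paths.
my ($fh, $name) = tempfile(UNLINK => 1);
print $fh "some text\n";
close($fh);

# The first test calls stat; the tests on _ reuse that result.
my @props;
if (-e $name) {
    push @props, "readable" if -r _;
    push @props, "writable" if -w _;
    push @props, "size " . (-s _);
}
print "$name: @props\n";
```

Only the -e test names the file; every later test refers to _.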
As you have seen, the system variables defined by Perl normally
consist of a $, @ or % followed by
a single non-alphanumeric character. This ensures that you cannot
define a variable whose name is identical to that of a Perl system
variable.
If you find Perl system variable names difficult to remember or
type, Perl 5 provides an alternative for most of them. If you
add the statement
use English;
at the top of your program, Perl defines alternative variable
names that more closely resemble English words. This makes it
easier to understand what your program is doing. Table 17.1 lists
these alternative variable names.
Table 17.1. Alternative names for Perl system variables.
| Variable | Alternative name(s) |
| $_ | $ARG |
| $0 | $PROGRAM_NAME |
| $< | $REAL_USER_ID or $UID |
| $> | $EFFECTIVE_USER_ID or $EUID |
| $( | $REAL_GROUP_ID or $GID |
| $) | $EFFECTIVE_GROUP_ID or $EGID |
| $] | $PERL_VERSION |
| $/ | $INPUT_RECORD_SEPARATOR or $RS |
| $\ | $OUTPUT_RECORD_SEPARATOR or $ORS |
| $, | $OUTPUT_FIELD_SEPARATOR or $OFS |
| $" | $LIST_SEPARATOR |
| $# | $OFMT |
| $@ | $EVAL_ERROR |
| $? | $CHILD_ERROR |
| $! | $OS_ERROR or $ERRNO |
| $. | $INPUT_LINE_NUMBER or $NR |
| $* | $MULTILINE_MATCHING |
| $[ | none (deprecated in Perl 5) |
| $; | $SUBSCRIPT_SEPARATOR or $SUBSEP |
| $: | $FORMAT_LINE_BREAK_CHARACTERS |
| $$ | $PROCESS_ID or $PID |
| $^A | $ACCUMULATOR |
| $^D | $DEBUGGING |
| $^F | $SYSTEM_FD_MAX |
| $^I | $INPLACE_EDIT |
| $^L | $FORMAT_FORMFEED |
| $^P | $PERLDB |
| $^T | $BASETIME |
| $^W | $WARNING |
| $^X | $EXECUTABLE_NAME |
| $& | $MATCH |
| $` | $PREMATCH |
| $' | $POSTMATCH |
| $+ | $LAST_PAREN_MATCH |
| $~ | $FORMAT_NAME |
| $= | $FORMAT_LINES_PER_PAGE |
| $- | $FORMAT_LINES_LEFT |
| $^ | $FORMAT_TOP_NAME |
| $| | $OUTPUT_AUTOFLUSH |
| $% | $FORMAT_PAGE_NUMBER |
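The following short sketch uses a few of the alternative names from Table 17.1. The @fields array and the $joined variable are hypothetical names chosen for this example.

```perl
use strict;
use warnings;
use English;

# Each long name is an alias for the corresponding punctuation variable.
my @fields = ("a", "b", "c");
my $joined;
{
    local $LIST_SEPARATOR = "-";      # same variable as $"
    $joined = "@fields";
}
print "joined: $joined\n";
print "process ID: $PROCESS_ID\n";   # same value as $$
print "program: $PROGRAM_NAME\n";    # same value as $0
```

Because the long names are aliases, assigning to $LIST_SEPARATOR changes $" as well, and the reverse is also true.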
In this chapter you learned about the built-in system variables available
within every Perl program. These system variables are divided
into five groups:
- Global scalar variables, which are defined everywhere in the
program and contain a single scalar value
- Pattern system variables, which are defined immediately after
a pattern-matching or substitution operation has been performed
- File system variables, which are defined for each input or
output file accessible from the program
- Array system variables, each of which contains a list
- Built-in file variables, which are associated with files that
are automatically open or automatically available
You also learned how to specify English-language equivalents for
Perl system variables.
| Q: | Why do some system variables use special characters rather than letters in their names?
|
| A: | To distinguish them from variables that you define and to ensure that the reset function (described in the next chapter) cannot affect them.
|
| Q: | Why do some functions use $_ as the default, whereas others do not?
|
| A: | The functions that use $_ as the default are those that are likely to appear in Perl programs specified on the command line using the -e option.
|
| Q: | What is the current line number when $. is used with the <> operator?
|
| A: | Effectively, the <> operator treats its input files as if they are a single file. This means that $. contains the total number of lines seen, not the line number of the current input file. (If you want $. to contain the line number of the current file, set $. to zero each time eof returns true.)
|
| Q: | Are pattern system variables local or global?
|
| A: | Each pattern system variable is defined only in the current subroutine or block of statements.
|
| Q: | Why does Perl define both the $" and the $, system variables?
|
| A: | Some programs like to treat the following statements differently:
print ("@array");
print (@array);
(In fact, by default, the first statement puts a space between each pair of elements in the array, and the second does not.) The $" and $, variables handle these two separate cases.
|
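To make the difference between these two variables visible, the sketch below gives each one a distinct value. It assumes an in-memory file (a file variable opened on a scalar reference, available in Perl 5.8 and later) so that the output of print can be inspected; the names $interpolated, $listed, and $out are hypothetical.

```perl
use strict;
use warnings;

my @array = ("x", "y", "z");

# Capture the output of print in $listed by opening an in-memory file.
my $interpolated;
my $listed = "";
open(my $out, ">", \$listed) or die "cannot open in-memory file: $!";
{
    local $" = ":";    # used when an array is interpolated into a string
    local $, = "+";    # used between the items of a print list
    $interpolated = "@array";
    print $out @array;
}
close($out);

print "interpolated: $interpolated\n";
print "listed:       $listed\n";
```

The interpolated string is built with $" and the printed list is joined with $, so the two results differ.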
The Workshop provides quiz questions to help you solidify your
understanding of the material covered, and exercises to provide
you with experience in using what you've learned.
- List the functions and operators that use $_ by default.
- What do the following variables contain?
a. $=
b. $/
c. $?
d. $!
e. @_
- Explain the differences between ARGV, $ARGV,
and @ARGV.
- Explain the difference between @INC and %INC.
- Explain the difference between $0 and $1.
- Write a program that reads lines of input, replaces multiple
blanks and tabs with a single space, converts all uppercase letters
to lowercase, and prints the resulting lines. Use no explicit
variable names in this program.
- Write a program that uses $' and $_ to remove
all extra spaces from input lines.
- Write a program that prints the directories in your PATH
environment variable, one per line.
- Write a program that prints numbers, starting with 1 and continuing
until interrupted by an INT signal.
- Write a program whose data consists of one or more numbers
per input line. Put the input lines in the program file itself.
Add the numbers and print their total.
- BUG BUSTER: What is wrong with the following statement?
if ($line =~ /abc/) {
$' =~ s/ +/ /;
}

|