Chapter 6
Reading from and Writing to Files
So far, you've learned to read input from the standard input file,
which stores data that is entered from the keyboard. You've also
learned how to write to the standard output file, which sends
data to your screen. In this lesson, you'll learn the following:
- How to open a file
- How to read from and write to an opened file
- How to redirect standard input and standard output and how
to use the standard error file
- How to close a file
- About file-test operators, which determine the status of a
file
- How to read from multiple files
- How to use command-line arguments
- How to open pipes
Before you can read from or write to a file, you must first open
the file. This operation asks the operating system to give your
program access to the file. Opening a file does not, by itself,
prevent other programs from changing the file while you are
working with it. To open a file, call the library function open.
The syntax for the open library function is
open (filevar, filename);
When you call open, you must supply two arguments:
- filevar represents the name you want to use in your
Perl program to refer to the file.
- filename represents the location of the file on your
machine.
The first argument passed to open is the name that the
Perl interpreter uses to refer to the file. This name is also
known as the file variable (or the file handle).
A file-variable name can be any sequence of letters, digits, and
underscores, as long as the first character is a letter.
The following are legal file-variable names:
filename
MY_NAME
NAME2
A_REALLY_LONG_FILE_VARIABLE_NAME
The following are not legal file-variable names:
1NAME
A.FILE.NAME
_ANOTHERNAME
if
if is not a valid file-variable name because it has another
meaning: as you've seen, it indicates the start of an if
statement. Words such as if that have special meanings
in Perl are known as reserved words and cannot be used
as names.
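The naming rule above can be sketched as a small check. The helper name is_legal_filevar and the short reserved-word list are assumptions made for illustration. The list is only a sample, not Perl's complete set of reserved words.

```perl
use strict;
use warnings;

# Illustrative subset of reserved words (an assumption, not the full list).
my %reserved = map { $_ => 1 } qw(if unless while until for foreach else elsif);

# Hypothetical helper: returns 1 if $name follows the rule described above
# (letters, digits, and underscores, starting with a letter, not reserved).
sub is_legal_filevar {
    my ($name) = @_;
    return 0 if $reserved{$name};
    return $name =~ /\A[A-Za-z][A-Za-z0-9_]*\z/ ? 1 : 0;
}

foreach my $name ("MY_NAME", "NAME2", "1NAME", "A.FILE.NAME", "_ANOTHERNAME", "if") {
    printf("%-14s %s\n", $name, is_legal_filevar($name) ? "legal" : "not legal");
}
```

The loop runs the check on several of the names listed above.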
| Tip |
It's a good idea to use all uppercase letters for your file-variable names. This makes it easier to distinguish file-variable names from other variable names and from reserved words.
|
The second item passed to open is the name of the file
you want to open. For example, if you are running Perl on a UNIX
file system, and your current working directory contains a file
named file1 that you would like to open, you can open
it as follows:
open(FILE1, "file1");
This statement tells Perl that you want to open the file file1
and associate it with the file variable FILE1.
If you want to open a file in a different directory, you can specify
the complete pathname, as follows:
open(FILE1, "/u/jqpublic/file1");
This opens the file /u/jqpublic/file1 and associates
it with the file variable FILE1.
| NOTE |
If you are running Perl on a file system other than UNIX, use the filename and directory syntax that is appropriate for your system. The Perl interpreter running on that system will be able to figure out where your file is located.
|
When you open a file, you must decide how you want to access the
file. There are three different file-access modes (or,
simply, file modes) available in Perl:
| read mode | Enables the program to read the existing contents of the file but does not enable it to write into the file |
| write mode | Destroys the current contents of the file and overwrites them with the output supplied by the program |
| append mode | Appends output supplied by the program to the existing contents of the file |
By default, open assumes that a file is to be opened
in read mode. To specify write mode, put a > character
in front of the filename that you pass to open, as follows:
open (OUTFILE, ">/u/jqpublic/outfile");
This opens the file /u/jqpublic/outfile for writing and
associates it with the file variable OUTFILE.
To specify append mode, put two > characters in front
of the filename, as follows:
open (APPENDFILE, ">>/u/jqpublic/appendfile");
This opens the file /u/jqpublic/appendfile in append
mode and associates it with the file variable APPENDFILE.
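Here is a minimal sketch that uses all three modes on one file, one after the other. It works in a temporary directory created with the standard File::Temp module so that no real file is overwritten. The filename modes_demo is an assumption chosen for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work in a throwaway directory so that nothing valuable is destroyed.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/modes_demo";    # hypothetical filename

# Write mode: creates the file, or destroys its previous contents.
open(OUTFILE, ">$file") || die ("cannot open $file for writing\n");
print OUTFILE ("first line\n");
close(OUTFILE);

# Append mode: keeps the existing contents and adds to the end.
open(APPENDFILE, ">>$file") || die ("cannot open $file for appending\n");
print APPENDFILE ("second line\n");
close(APPENDFILE);

# Read mode (the default): no prefix in front of the filename.
open(INFILE, $file) || die ("cannot open $file for reading\n");
my @lines = <INFILE>;
close(INFILE);

print(@lines);
```

Because the second open uses append mode, the line written in write mode is still in the file when it is read back.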
| NOTE |
Here are a few things to remember when opening files:
- When you open a file for writing, any existing contents are destroyed.
- With the three modes described here, you cannot read from and write to the same file at the same time.
- When you open a file in append mode, the existing contents are not destroyed, but you cannot read the file while writing to it.
|
Before you can use a file opened by the open function,
you should first check whether the open function actually
is giving you access to the file. The open function enables
you to do this by returning a value indicating whether the file-opening
operation succeeded:
- If open returns a true (nonzero) value, the file has been
opened successfully.
- If open returns a false value, an error has occurred.
As you can see, the values returned by open correspond
to the values for true and false in conditional expressions. This
means that you can use open in if and unless
statements. The following is an example:
if (open(MYFILE, "/u/jqpublic/myfile")) {
# here's what to do if the file opened
}
The code inside the if statement is executed only if
the file has been successfully opened. This ensures that your
programs read or write only to files that you can access.
| NOTE |
If open returns false, you can find out what went wrong by using the file-test operators, which you'll learn about later in this chapter.
|
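As a short sketch of testing the value returned by open, the following program tries to open a file that is assumed not to exist. The path is hypothetical. When open fails, Perl's built-in variable $! holds the operating system's explanation, which you can print along with your own message.

```perl
use strict;
use warnings;

# Hypothetical filename, chosen so that the open is expected to fail.
my $name   = "no_such_directory/no_such_file";
my $opened = open(MYFILE, $name);

if ($opened) {
    print("opened $name\n");
    close(MYFILE);
} else {
    # $! describes the most recent operating system error.
    print("could not open $name: $!\n");
}
```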
Once you have opened a file and determined that the file is available
for use, you can read information from it.
To read from a file, enclose the file variable associated with
the file in angle brackets (< and >),
as follows:
$line = <MYFILE>;
This statement reads a line of input from the file specified by
the file variable MYFILE and stores the line of input
in the scalar variable $line.
Listing 6.1 is a simple program that reads input from a file and
writes it to the standard output file.
Listing 6.1. A program that reads lines from a file and prints
them.
1: #!/usr/local/bin/perl
2:
3: if (open(MYFILE, "file1")) {
4:     $line = <MYFILE>;
5:     while ($line ne "") {
6:         print ($line);
7:         $line = <MYFILE>;
8:     }
9: }
$ program6_1
Here is a line of input.
Here is another line of input.
Here is the last line of input.
$

Line 3 opens the file file1 in read
mode, which means that the file is to be made available for reading.
file1 is assumed to be in the current working directory.
The file variable MYFILE is associated with the file
file1.
If the call to open returns a nonzero value, the conditional
expression
open(MYFILE, "file1")
is assumed to be true, and the code inside the if statement
is executed.
Lines 4-8 print the contents of file1. The sample output
shown here assumes that file1 contains the following
three lines:
Here is a line of input.
Here is another line of input.
Here is the last line of input.
Line 4 reads the first line of input from the file specified by
the file variable MYFILE, which is file1. This
line of input is stored in the scalar variable $line.
Line 5 tests whether the end of the file specified by MYFILE
has been reached. When there are no more lines left in MYFILE,
the read returns the undefined value, which compares equal to the
empty string, so the loop ends.
Line 6 prints the text stored in $line, which is the
line of input read from MYFILE.
Line 7 reads the next line of MYFILE, preparing for the
loop to start again.
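Many Perl programs write the same read loop more compactly. At the end of the file, reading from the file variable yields the undefined value, so the loop can test for that directly with the built-in function defined. The sketch below creates its own three-line input file in a temporary directory. The file's name and location are assumptions made so that the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a small input file in a throwaway directory (hypothetical name).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/file1";
open(OUTFILE, ">$file") || die ("cannot create $file\n");
print OUTFILE ("Here is a line of input.\n");
print OUTFILE ("Here is another line of input.\n");
print OUTFILE ("Here is the last line of input.\n");
close(OUTFILE);

open(MYFILE, $file) || die ("cannot open $file\n");
my $count = 0;
# defined() is true for every line actually read and false at end of file.
while (defined(my $line = <MYFILE>)) {
    $count++;
    print($line);
}
close(MYFILE);
print("read $count lines\n");
```

This form reads and tests in one place, so the read statement does not have to appear twice.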
Now that you have seen how Perl programs read input from files
in read mode, take another look at a statement that reads a line
of input from the standard input file.
$line = <STDIN>;
Here's what is actually happening: The Perl program is referencing
the file variable STDIN, which represents the standard
input file. The < and > on either side
of STDIN tell the Perl interpreter to read a line of
input from the standard input file, just as the <
and > on either side of MYFILE in
$line = <MYFILE>;
tell the Perl interpreter to read a line of input from MYFILE.
STDIN is a file variable that behaves like any other
file variable representing a file in read mode. The only difference
is that STDIN does not need to be opened by the open
function because the Perl interpreter does that for you.
In Listing 6.1, you saw that the return value from open
can be tested to see whether the program actually has access to
the file. The code that operates on the opened file is contained
in an if statement.
If you are writing a large program, you might not want to put
all of the code that affects a file inside an if statement,
because the distance between the beginning of the if
statement and the closing brace (}) could get very large.
For example:
if (open(MYFILE, "file1")) {
# this could be many pages of statements!
}
Besides, after a while, you'll probably get tired of typing the
spaces or tabs you use to indent the code inside the if
statement. Perl provides a way around this using the library function
die.
The syntax for the die library function is
die (message);
When the Perl interpreter executes the die function,
the program terminates immediately and prints the message passed
to die.
For example, the statement
die ("Stop this now!\n");
prints the following on your screen and terminates the program:
Stop this now!
Listing 6.2 shows how you can use die to smoothly test
whether a file has been opened correctly.
Listing 6.2. A program that uses die
when testing for a successful file open operation.
1:  #!/usr/local/bin/perl
2:
3:  unless (open(MYFILE, "file1")) {
4:      die ("cannot open input file file1\n");
5:  }
6:
7:  # if the program gets this far, the file was
8:  # opened successfully
9:  $line = <MYFILE>;
10: while ($line ne "") {
11:     print ($line);
12:     $line = <MYFILE>;
13: }
$ program6_2
Here is a line of input.
Here is another line of input.
Here is the last line of input.
$

This program behaves the same way as the one
in Listing 6.1, except that it prints out an error message when
it can't open the file.
Line 3 opens the file and tests whether the file opened successfully.
Because this is an unless statement, the code inside
the braces ({ and }) is executed unless the
file opened successfully.
Line 4 is the call to die that is executed if the file
does not open successfully. This statement prints the following
message on the screen and exits:
cannot open input file file1
Because line 4 terminates program execution when the file is not
open, the program can make it past line 5 only if the file has
been opened successfully.
The loop in lines 9-13 is identical to the loop you saw in Listing
6.1. The only difference is that this loop is no longer inside
an if statement.
| NOTE |
Here is another way to write lines 3-5:
open (MYFILE, "file1") || die ("Could not open file");
Recall that the logical OR operator only evaluates the expression on its right if the expression on its left is false. This means that die is called only if open returns false (if the open operation fails).
|
Printing Error Information Using die
If you like, you can have die print the name of the Perl
program and the line number of the statement containing the call
to die. To do this, leave off the trailing newline character
in the character string, as follows:
die ("Missing input file");
If the Perl program containing this statement is called myprog,
and this statement is line 14 of myprog, this call to
die prints the following and exits:
Missing input file at myprog line 14.
Compare this with
die ("Missing input file\n");
which simply prints the following before exiting:
Missing input file
Specifying the program name and line number is useful in two cases:
- If the program contains many similar error messages, you can
use die to specify the line number of the message that
actually appeared.
- If the program is called from within another program, you
can use die to indicate that this program generated the
error.
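To compare the two forms of message without ending the program, the following sketch wraps each call to die in eval. This is used here only as a demonstration device: it traps the die and leaves the message in the built-in variable $@. The script name that appears in the second message depends on how you run the program.

```perl
use strict;
use warnings;

# eval { ... } traps the die so that this demonstration can keep running.
# The message that die would have printed is left in $@.
eval { die ("Missing input file\n"); };
my $with_newline = $@;

eval { die ("Missing input file"); };
my $without_newline = $@;

print("with newline:    $with_newline");
print("without newline: $without_newline");
```

The second message ends with the program name and line number because its string has no trailing newline.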
Perl enables you to read an entire file into a single array variable.
To do this, assign the file variable, enclosed in angle brackets, to
the array variable, as follows:
@array = <MYFILE>;
This reads the entire file represented by MYFILE into
the array variable @array. Each line of the file becomes
an element of the list that is stored in @array.
Listing 6.3 is a simple program that reads an entire file into
an array.
Listing 6.3. A program that reads an entire input file into
an array.
1: #!/usr/local/bin/perl
2:
3: unless (open(MYFILE, "file1")) {
4:     die ("cannot open input file file1\n");
5: }
6: @input = <MYFILE>;
7: print (@input);
$ program6_3
Here is a line of input.
Here is another line of input.
Here is the last line of input.
$

Lines 3-5 open the file, test whether the file
has been opened successfully, and terminate the program if the
file cannot be opened.
Line 6 reads the entire contents of the file represented by MYFILE
into the array variable @input. @input now contains
a list consisting of the following three elements:
("Here is a line of input.\n",
"Here is another line of input.\n",
"Here is the last line of input.\n")
Note that a newline character is included as the last character
of each line.
Line 7 uses the print function to print the entire file.
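Once the file is in an array, you can treat its lines like any other list elements. The following sketch counts the lines and picks out the last one. It first creates its own copy of the three-line file in a temporary directory, an assumption made so that the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create the sample input file in a throwaway directory.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/file1";
open(OUTFILE, ">$file") || die ("cannot create $file\n");
print OUTFILE ("Here is a line of input.\n",
               "Here is another line of input.\n",
               "Here is the last line of input.\n");
close(OUTFILE);

# Read the whole file at once: one list element per line.
open(MYFILE, $file) || die ("cannot open input file $file\n");
my @input = <MYFILE>;
close(MYFILE);

my $count = scalar(@input);    # number of elements, one per line
print("The file has $count lines.\n");
print("The last line is: $input[$count - 1]");
```

Each element still ends with its newline character, so the final print needs no "\n" of its own.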
After you have opened a file in write or append mode, you can
write to the file you have opened by specifying the file variable
with the print function. For example, if you have opened
a file for writing using the statement
open(OUTFILE, ">outfile");
the following statement:
print OUTFILE ("Here is an output line.\n");
writes the following line to the file specified by OUTFILE,
which is the file called outfile:
Here is an output line.
Listing 6.4 is a simple program that reads from one file and writes
to another.
Listing 6.4. A program that opens two files and copies one
into another.
1:  #!/usr/local/bin/perl
2:
3:  unless (open(INFILE, "file1")) {
4:      die ("cannot open input file file1\n");
5:  }
6:  unless (open(OUTFILE, ">outfile")) {
7:      die ("cannot open output file outfile\n");
8:  }
9:  $line = <INFILE>;
10: while ($line ne "") {
11:     print OUTFILE ($line);
12:     $line = <INFILE>;
13: }
This program writes nothing to the screen because all output is
directed to the file called outfile.

Lines 3-5 open file1 for reading.
If the file cannot be opened, line 4 is executed, which prints
the following message on the screen and terminates the program:
cannot open input file file1
Lines 6-8 open outfile for writing; the >
in >outfile indicates that the file is to be opened
in write mode. If outfile cannot be opened, line 7 prints
the message
cannot open output file outfile
on the screen and terminates the program.
The only other line in the program that you have not seen in other
listings in this lesson is line 11, which writes the contents
of the scalar variable $line on the file specified by
OUTFILE.
Once this program has completed, the contents of file1
are copied into outfile.
Here is a line of input.
Here is another line of input.
Here is the last line of input.
| WARNING |
Make sure that files you open in write mode contain nothing valuable. When the open function opens a file in write mode, any existing contents are destroyed.
|
If you want, your program can reference the standard output file
by referring to the file variable associated with the output file.
This file variable is named STDOUT.
By default, the print statement sends output to the standard
output file, which means that it sends the output to the file
associated with STDOUT. As a consequence, the following
statements are equivalent:
print ("Here is a line of output.\n");
print STDOUT ("Here is a line of output.\n");
| NOTE |
You do not need to open STDOUT because Perl automatically opens it for you.
|
In Perl, you can open as many files as you like, provided you
define a different file variable for each one. (Actually, there
is an upper limit on the number of files you can open, but it's
fairly large and also system-dependent.) For an example of a program
that has multiple files open at one time, take a look at Listing
6.5. This program merges two files by creating an output file
consisting of one line from the first file, one line from the
second file, another line from the first file, and so on. For
example, if an input file named merge1 contains the lines
a1
a2
a3
and another file, merge2, contains the lines
b1
b2
b3
then the resulting output file consists of
a1
b1
a2
b2
a3
b3
Listing 6.5. A program that merges two files.
1:  #!/usr/local/bin/perl
2:
3:  open (INFILE1, "merge1") ||
4:          die ("Cannot open input file merge1\n");
5:  open (INFILE2, "merge2") ||
6:          die ("Cannot open input file merge2\n");
7:  $line1 = <INFILE1>;
8:  $line2 = <INFILE2>;
9:  while ($line1 ne "" || $line2 ne "") {
10:     if ($line1 ne "") {
11:         print ($line1);
12:         $line1 = <INFILE1>;
13:     }
14:     if ($line2 ne "") {
15:         print ($line2);
16:         $line2 = <INFILE2>;
17:     }
18: }
$ program6_5
a1
b1
a2
b2
a3
b3
$

Lines 3 and 4 show another way to write a statement
that either opens a file or calls die if the open fails.
Recall that the || operator first evaluates its left
operand; if the left operand evaluates to true (a nonzero value),
the right operand is not evaluated because the result of the expression
is true.
Because of this, the right operand, the call to die,
is evaluated only when the left operand is false, which happens
only when the call to open fails and the file merge1
cannot be opened.
Lines 5 and 6 repeat the preceding process for the file merge2.
Again, either the file is opened successfully or the program aborts
by calling die.
The program then loops repeatedly, reading a line of input from
each file each time. The loop terminates only when both files
have been exhausted. If one file has been exhausted but the other
has not, the program just copies the remaining lines from the other
file to the standard output file.
Note that the output from this program is printed on the screen.
If you decide that you want to send this output to a file, you
can do one of two things:
- You can modify the program to write its output to a different
file. To do this, open the file in write mode and associate it
with a file variable. Then, change the print statements
to refer to this file variable.
- You can redirect the standard output file on the command line.
For a discussion of the second method, see the following section.
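The first method, changing the program itself, might look like the following sketch. It creates its own two input files in a temporary directory and writes the merged lines to a third file named merged. All of these locations are assumptions made so that the example is self-contained. It tests for the end of each file with the built-in function defined, and the second input file is deliberately one line shorter than the first.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Create the two input files (merge2 is one line shorter than merge1).
open(OUT1, ">$dir/merge1") || die ("Cannot create merge1\n");
print OUT1 ("a1\n", "a2\n", "a3\n");
close(OUT1);
open(OUT2, ">$dir/merge2") || die ("Cannot create merge2\n");
print OUT2 ("b1\n", "b2\n");
close(OUT2);

open(INFILE1, "$dir/merge1") || die ("Cannot open input file merge1\n");
open(INFILE2, "$dir/merge2") || die ("Cannot open input file merge2\n");
# The output file is opened in write mode and has its own file variable.
open(OUTFILE, ">$dir/merged") || die ("Cannot open output file merged\n");

my $line1 = <INFILE1>;
my $line2 = <INFILE2>;
while (defined($line1) || defined($line2)) {
    if (defined($line1)) {
        print OUTFILE ($line1);    # was: print ($line1);
        $line1 = <INFILE1>;
    }
    if (defined($line2)) {
        print OUTFILE ($line2);    # was: print ($line2);
        $line2 = <INFILE2>;
    }
}
close(INFILE1);
close(INFILE2);
close(OUTFILE);

# Read the result back to show what was written.
open(CHECK, "$dir/merged") || die ("Cannot open merged\n");
my @merged = <CHECK>;
close(CHECK);
print(@merged);
```

Only the two print statements and the extra call to open differ from the merging logic of a program that writes to the screen.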
When you run programs on UNIX, you can redirect input and output
using < and >, respectively, as follows:
myprog <input >output
Here, when you run the program called myprog, the input
for the program is taken from the file specified by input
instead of from the keyboard, and the output for the program is
sent to the file specified by output instead of to the
screen.
When you run a Perl program and redirect input using <,
the standard input file variable STDIN now represents
the file specified with <. For example, consider the
following simple program:
#!/usr/local/bin/perl
$line = <STDIN>;
print ($line);
Suppose this program is named myperlprog and is called
with the command
myperlprog <file1
In this case, the statement
$line = <STDIN>;
reads a line of input from file1 because the file variable
STDIN represents file1.
Similarly, specifying > on the command line redirects
the standard output file from the screen to the specified file.
For example, consider this command:
myperlprog <file1 >outfile
It redirects output from the standard output file to the file
called outfile. Now, the following statement writes a
line of data to outfile:
print ($line);
Besides the standard input file and the standard output file,
Perl also defines a third built-in file variable, STDERR,
which represents the standard error file. By default, text sent
to this file is written to the screen. This enables the program
to send messages to the screen even when the standard output file
has been redirected to write to a file. As with STDIN
and STDOUT, you do not need to open STDERR because
it automatically is opened for you.
Listing 6.6 provides a simple example of the use of STDERR.
The output shown in the input-output example assumes that the
standard input file and standard output file have been redirected
to files using < and >, as in
myprog <infile >outfile
Therefore, the only output you see is what is written to STDERR.
Listing 6.6. A program that writes to the standard error file.
1:  #!/usr/local/bin/perl
2:
3:  open(MYFILE, "file1") ||
4:          die ("Unable to open input file file1\n");
5:  print STDERR ("File file1 opened successfully.\n");
6:  $line = <MYFILE>;
7:  while ($line ne "") {
8:      chop ($line);
9:      print ("\U$line\E\n");
10:     $line = <MYFILE>;
11: }
$ program6_6
File file1 opened successfully.
$

This program converts the contents of a file
into uppercase and sends the converted contents to the standard
output file.
Line 3 tries to open file1. If the file cannot be opened,
line 4 is executed. This calls die, which prints the
following message and terminates:
Unable to open input file file1
| NOTE |
The function die sends its messages to the standard error file, not the standard output file. This means that when a program terminates, the message printed by die still appears on your screen, even when you have redirected the standard output file to a file.
|
If the file is opened successfully, line 5 writes a message to
the standard error file, which indicates that the file has been
opened. As you can see, the standard error file is not reserved
solely for errors. You can write anything you want to STDERR
at any time.
Lines 6-11 read one line of file1 at a time and write
it out in uppercase (using the escape characters \U and
\E, which you learned about in Chapter 3, "Understanding
Scalar Values").
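The division of labor between the two output files can be seen in a sketch that needs no input file at all. The text being converted is an assumption made for illustration.

```perl
use strict;
use warnings;

# Status messages go to the standard error file.
print STDERR ("Starting conversion.\n");

# The program's real output goes to the standard output file.
my $line  = "here is a line of input.";
my $upper = "\U$line\E";
print STDOUT ("$upper\n");

print STDERR ("Conversion finished.\n");
```

If you run this program with its standard output redirected to a file, the two status messages still appear on your screen, and only the uppercase line is written to the file.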
When you are finished reading from or writing to a file, you can
tell the Perl interpreter that you are finished by calling the
library function close.
The syntax for the close library function is
close (filevar);
close requires one argument: the file variable representing
the file you want to close. Once you have closed the file, you
cannot read from it or write to it without invoking open
again.
Note that you do not have to call close when you are
finished with a file: Perl automatically closes the file when
the program terminates or when you open another file using a previously
defined file variable. For example, consider the following statements:
open (MYFILE, ">file1");
print MYFILE ("Here is a line of output.\n");
open (MYFILE, ">file2");
print MYFILE ("Here is another line of output.\n");
Here, when file2 is opened for writing, file1
automatically is closed. The file variable MYFILE is
now associated with file2. This means that the second
print statement sends the following to file2:
Here is another line of output.
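Even though close is optional, calling it explicitly makes it clear when you are finished with a file, and its return value tells you whether the file was closed successfully. The following sketch writes a file, closes it, and then opens the same file again for reading. The temporary directory and the filename are assumptions made so that the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/file1";    # hypothetical filename

open(OUTFILE, ">$file") || die ("cannot open $file for writing\n");
print OUTFILE ("Here is a line of output.\n");
# close returns true if the file was closed successfully.
close(OUTFILE) || die ("cannot close $file\n");

# Now that the file is closed, it can be opened again, this time for reading.
open(INFILE, $file) || die ("cannot open $file for reading\n");
my $line = <INFILE>;
close(INFILE);

print($line);
```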
 |
DO use the <> operator, which is an easy way to read input from several files in succession. See the section titled "Reading from a Sequence of Files," later in this lesson, for more information on the <> operator.
DON'T use the same file variable to represent multiple files unless it is absolutely necessary. It is too easy to lose track of which file variable belongs to which file, especially if your program is large or has many nested conditional statements.
|
Many of the example programs in this lesson call open
and test the returned result to see whether the file has been
opened successfully. If open fails, it might be useful
to find out exactly why the file could not be opened. To do this,
use one of the file-test operators.
Listing 6.7 provides an example of the use of a file-test operator.
This program is a slight modification of Listing 6.6, which is
an uppercase conversion program.
Listing 6.7. A program that checks whether an unopened file
actually exists.
1:  #!/usr/local/bin/perl
2:
3:  unless (open(MYFILE, "file1")) {
4:      if (-e "file1") {
5:          die ("File file1 exists, but cannot be opened.\n");
6:      } else {
7:          die ("File file1 does not exist.\n");
8:      }
9:  }
10: $line = <MYFILE>;
11: while ($line ne "") {
12:     chop ($line);
13:     print ("\U$line\E\n");
14:     $line = <MYFILE>;
15: }
$ program6_7
File file1 does not exist.
$

Line 3 attempts to open the file file1
for reading. If file1 cannot be opened, the program executes
the if statement starting in line 4.
Line 4 is an example of a file-test operator. This file-test operator,
-e, tests whether its operand, a file, actually exists.
If the file file1 exists, the expression -e "file1"
returns true, the message File file1 exists, but cannot be
opened. is displayed, and the program exits. If file1
does not exist, -e "file1" is false, and the
library function die prints the following message before
exiting:
File file1 does not exist.
All file-test operators have the same syntax as the -e
operator used in Listing 6.7.
The syntax for the file-test operators is
-x expr
Here, x is an alphabetic character and expr
is any expression. The value of expr is assumed to be
a string that contains the name of the file to be tested.
Because the operand for a file-test operator can be any expression,
you can use scalar variables and string operators in the expression
if you like. For example:
$var = "file1";
if (-e $var) {
print STDERR ("File file1 exists.\n");
}
if (-e $var . "a") {
print STDERR ("File file1a exists.\n");
}
In the first use of -e, the contents of $var,
file1, are assumed to be the name of a file, and this
file is tested for existence. In the second case, a is
appended to the contents of file1, producing the string
file1a. The -e operator then tests whether a
file named file1a exists.
| NOTE |
The Perl interpreter does not get confused by the expression
-e $var . "a"
because the . operator has higher precedence than the -e operator. This means that the string concatenation is performed first.
The file-test operators have higher precedence than the comparison operators but lower precedence than the shift operators. To see a complete list of the Perl operators and their precedences, refer to Chapter 4, "More Operators."
|
The string can be a complete path name, if you like. The following
is an example:
if (-e "/u/jqpublic/file1") {
print ("The file exists.\n");
}
This if statement tests for the existence of the file
/u/jqpublic/file1.
Table 6.1 provides a complete list of the file-test operators
available in Perl. In this table, name is a placeholder
for the name of the operand being tested.
Table 6.1. The file-test operators.
| Operator | Description |
| -b | Is name a block device? |
| -c | Is name a character device? |
| -d | Is name a directory? |
| -e | Does name exist? |
| -f | Is name an ordinary file? |
| -g | Does name have its setgid bit set? |
| -k | Does name have its "sticky bit" set? |
| -l | Is name a symbolic link? |
| -o | Is name owned by the user? |
| -p | Is name a named pipe? |
| -r | Is name a readable file? |
| -s | Is name a non-empty file? |
| -t | Does name represent a terminal? |
| -u | Does name have its setuid bit set? |
| -w | Is name a writable file? |
| -x | Is name an executable file? |
| -z | Is name an empty file? |
| -A | How long since name accessed? |
| -B | Is name a binary file? |
| -C | How long since name's inode changed? |
| -M | How long since name modified? |
| -O | Is name owned by the "real user" only?* |
| -R | Is name readable by the "real user" only?* |
| -S | Is name a socket? |
| -T | Is name a text file? |
| -W | Is name writable by the "real user" only?* |
| -X | Is name executable by the "real user" only?* |
| * In this case, the "real user" is the user ID specified at login, as opposed to the effective user ID, which is the user ID under which you currently are working. (On some systems, a command such as /user/local/etc/suid enables you to change your effective user ID.) |
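To get a feel for a few of these operators, the following sketch applies them to files it creates itself in a temporary directory. The filenames are assumptions made for illustration. Note that -s does more than answer yes or no: when the file is not empty, the value it returns is the file's size in bytes.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir   = tempdir(CLEANUP => 1);
my $empty = "$dir/empty_file";    # hypothetical filenames
my $full  = "$dir/full_file";

open(EMPTY, ">$empty") || die ("cannot create $empty\n");
close(EMPTY);
open(FULL, ">$full") || die ("cannot create $full\n");
print FULL ("some text\n");
close(FULL);

my $dir_is_directory = (-d $dir)           ? 1 : 0;
my $full_is_ordinary = (-f $full)          ? 1 : 0;
my $empty_is_empty   = (-z $empty)         ? 1 : 0;
my $missing_exists   = (-e "$dir/missing") ? 1 : 0;
my $size             = -s $full;    # size in bytes, because the file is non-empty

print("-d on the directory:     $dir_is_directory\n");
print("-f on full_file:         $full_is_ordinary\n");
print("-z on empty_file:        $empty_is_empty\n");
print("-e on a missing file:    $missing_exists\n");
print("-s on full_file (bytes): $size\n");
```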
The following sections describe some of the more common file-test
operators and show you how they can be useful. (You'll also learn
about more of these operators in Chapter 12, "Working with the
File System.")
When a Perl program opens a file for writing, it destroys anything
that already exists in the file. This might not be what you want.
Therefore, you might want to make sure that your program opens
a file only if the file does not already exist.
You can use the -e file-test operator to test whether
or not to open a file for writing. Listing 6.8 is an example of
a program that does this.
Listing 6.8. A program that tests whether a file exists before
opening it for writing.
1:  #!/usr/local/bin/perl
2:
3:  unless (open(INFILE, "infile")) {
4:      die ("Input file infile cannot be opened.\n");
5:  }
6:  if (-e "outfile") {
7:      die ("Output file outfile already exists.\n");
8:  }
9:  unless (open(OUTFILE, ">outfile")) {
10:     die ("Output file outfile cannot be opened.\n");
11: }
12: $line = <INFILE>;
13: while ($line ne "") {
14:     chop ($line);
15:     print OUTFILE ("\U$line\E\n");
16:     $line = <INFILE>;
17: }
$ program6_8
Output file outfile already exists.
$

This program is the uppercase conversion program
again; most of it should be familiar to you.
The only difference is lines 6-8, which use the -e file-test
operator to check whether the output file outfile exists.
If outfile exists, the program aborts, which ensures
that the existing contents of outfile are not lost.
If outfile does not exist, the following expression is false:
-e "outfile"
and the program knows that it is safe to open outfile
because it does not already exist.
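The same test can be packaged so that it is easy to reuse. In the following sketch, the subroutine name write_if_new is an assumption made for illustration. It writes to a file only when -e reports that the file does not yet exist, and it is called twice to show that the second call leaves the file alone. The sketch ignores the possibility that another program creates the file in the moment between the test and the call to open.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir     = tempdir(CLEANUP => 1);
my $outfile = "$dir/outfile";

# Hypothetical helper: writes $text to $name only if $name does not exist yet.
# Returns 1 if the file was written and 0 if it was left untouched.
sub write_if_new {
    my ($name, $text) = @_;
    if (-e $name) {
        print STDERR ("Output file $name already exists.\n");
        return 0;
    }
    open(NEWFILE, ">$name") || die ("Output file $name cannot be opened.\n");
    print NEWFILE ($text);
    close(NEWFILE);
    return 1;
}

my $first  = write_if_new($outfile, "original contents\n");
my $second = write_if_new($outfile, "replacement contents\n");

open(CHECK, $outfile) || die ("cannot open $outfile\n");
my $contents = <CHECK>;
close(CHECK);

print("first call wrote the file:  $first\n");
print("second call wrote the file: $second\n");
print("the file contains: $contents");
```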
Using File-Test Operators in Expressions
If you don't need to know exactly why your program is failing,
you can combine all of the tests in Listing 6.8 into a single
statement, as follows:
open(INFILE, "infile") && !(-e "outfile") &&
open(OUTFILE, ">outfile") || die("Cannot open files\n");
Can you see how this works? Here's what is happening: The &&
operator, logical AND, is true only if both of its operands are
true. In this case, the two && operators indicate
that the subexpression up to, but not including, the ||
is true only if all three of the following are true:
open(INFILE, "infile")
!(-e "outfile")
open(OUTFILE, ">outfile")
All three are true only when the following conditions are met:
- The input file infile can be opened.
- The output file outfile does not already exist.
- The output file outfile can be opened.
If any of these subexpressions is false, the entire expression
up to the || is false. This means that the subexpression
after the || (the call to die) is executed,
and the program aborts.
Note that each of the three subexpressions associated with the
&& operators is evaluated in turn. This means
that the subexpression
!(-e "outfile")
is evaluated only if
open(INFILE, "infile")
is true, and that the subexpression
open(OUTFILE, ">outfile")
is evaluated only if
!(-e "outfile")
is true. This is exactly the same logic that Listing 6.8 uses.
If any of the subexpressions is false, the Perl interpreter doesn't
evaluate the rest of them because it knows that the final result
of
open(INFILE, "infile") && !(-e "outfile") &&
open(OUTFILE, ">outfile")
is going to be false. Instead, it goes on to evaluate the subexpression
to the right of the ||, which is the call to die.
This program logic is somewhat complicated, and you shouldn't
use it unless you feel really comfortable with it. The if
statements in Listing 6.8 do the same thing and are easier to
understand; however, it's useful to know how complicated statements
such as the following one work because many Perl programmers like
to write code that works in this way:
open(INFILE, "infile") && !(-e "outfile") &&
open(OUTFILE, ">outfile") || die("Cannot open files\n");
In the next few chapters, you'll see several more examples of code
that exploits how expressions work in Perl. "Perl hackers" (experienced
Perl programmers) often enjoy compressing multiple statements into
shorter ones, and they delight in complexity. Be warned.
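If you want to watch the short-circuit evaluation happen, you can replace the three real tests with stand-ins that record when they are called. In the following sketch, the subroutine check and its arguments are assumptions made for illustration. The second test is made to fail, as if outfile already existed, and an assignment takes the place of the call to die so that the program keeps running.

```perl
use strict;
use warnings;

my @evaluated;    # records which tests actually ran

# Hypothetical stand-in for a real test: records its name and
# returns the true or false value it is given.
sub check {
    my ($name, $result) = @_;
    push(@evaluated, $name);
    return $result;
}

my $message = "";
check("open infile", 1) && check("outfile absent", 0) &&
    check("open outfile", 1) || ($message = "Cannot open files");

print("evaluated: ", join(", ", @evaluated), "\n");
print("message:   $message\n");
```

The third test never runs, because the second one has already made the result of the && operators false. The expression to the right of the || is evaluated instead.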
Before you can open a file for reading, you must have permission
to read the file. The -r file-test operator tests whether
you have permission to read a file.
Listing 6.9 checks whether the person running the program has
permission to access a particular file.
Listing 6.9. A program that tests for read permission on a
file.
1:  #!/usr/local/bin/perl
2:
3:  unless (open(MYFILE, "file1")) {
4:      if (!(-e "file1")) {
5:          die ("File file1 does not exist.\n");
6:      } elsif (!(-r "file1")) {
7:          die ("You are not allowed to read file1.\n");
8:      } else {
9:          die ("File1 cannot be opened\n");
10:     }
11: }
$ program6_9
You are not allowed to read file1.
$

Line 3 of this program tries to open file1.
If the call to open fails, the program tries to find
out why.
First, line 4 tests whether the file actually exists. If the file
exists, the Perl interpreter executes line 6, which tests whether
the file has the proper read permission. If it does not, die
is called; it then prints the following message and exits:
You are not allowed to read file1.
| NOTE |
You do not need to use the -e file-test operator before using the -r file-test operator. If the file does not exist, -r returns false because you can't read a file that isn't there.
The only reason to use both -e and -r is to enable your program to determine exactly what is wrong.
|
You can use file-test operators to test for other permissions
as well. To check whether you have write permission on a file,
use the -w file-test operator.
if (-w "file1") {
print STDERR ("I can write to file1.\n");
} else {
print STDERR ("I can't write to file1.\n");
}
The -x file-test operator checks whether you have execute
permission on the file (in other words, whether the system thinks
this is an executable program, and whether you have permission
to run it if it is), as illustrated here:
if (-x "file1") {
print STDERR ("I can run file1.\n");
} else {
print STDERR ("I can't run file1.\n");
}
| NOTE |
If you are the system administrator (for example, you are running as user ID root) and have permission to access any file, the -r and -w file-test operators always return true if the file exists. Also, the -x test operator always returns true if the file is an executable program.
|
The -z file-test operator tests whether a file is empty.
This provides a more refined test for whether or not to open a
file for writing: if the file exists but is empty, no information
is lost if you overwrite the existing file.
Listing 6.10 shows how to use -z.
Listing 6.10. A program that tests whether the file is empty
before opening it for writing.
1: #!/usr/local/bin/perl
2:
3: if (-e "outfile") {
4: if (!(-w "outfile")) {
5: die ("Missing write permission for outfile.\n");
6: }
7: if (!(-z "outfile")) {
8: die ("File outfile is non-empty.\n");
9: }
10: }
11: # at this point, the file is either empty or doesn't exist,
12: # and we have permission to write to it if it exists
$ program6_10
File outfile is non-empty.
$

Line 3 checks whether the file outfile
exists using -e. If it exists, it can only be opened
if the program has permission to write to the file; line 4 checks
for this using -w.
Line 7 uses -z to test whether the file is empty. If
it is not, line 8 calls die to terminate program execution.
The opposite of -z is the -s file-test operator,
which returns a nonzero value if the file is not empty.
$size = -s "outfile";
if ($size == 0) {
print ("The file is empty.\n");
} else {
print ("The file is $size bytes long.\n");
}
The -s file-test operator actually returns the size of
the file in bytes. It can still be used in conditional expressions,
though, because any nonzero value (indicating that the file is
not empty) is treated as true.
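As a rough sketch of this use, the following program tests -s directly in an if statement. The file name scratch.tmp is a hypothetical scratch file that the sketch creates and removes itself; unlink, which deletes a file, is used here only for cleanup.

```perl
#!/usr/local/bin/perl

# "scratch.tmp" is a hypothetical scratch file used only for this sketch.
$file = "scratch.tmp";
open (OUT, ">$file") || die ("Cannot create $file\n");
print OUT ("hello\n");
close (OUT);       # close first so that the data is written to the file

if (-s $file) {    # true when the file exists and is not empty
    $size = -s $file;
    print ("$file holds $size bytes.\n");
} else {
    print ("$file is empty or does not exist.\n");
}
unlink ($file);    # remove the scratch file
```

Note that the false branch cannot distinguish an empty file from a missing one; use -e as well if your program needs to tell them apart.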
Listing 6.11 uses -s to print the size of a file whose
name is read from the standard input file.
Listing 6.11. A program that prints the size of a file in bytes.
1: #!/usr/local/bin/perl
2:
3: print ("Enter the name of the file:\n");
4: $filename = <STDIN>;
5: chop ($filename);
6: if (!(-e $filename)) {
7: print ("File $filename does not exist.\n");
8: } else {
9: $size = -s $filename;
10: print ("File $filename contains $size bytes.\n");
11: }
$ program6_11
Enter the name of the file:
file1
File file1 contains 128 bytes.
$

Lines 3-5 obtain the name of the file and remove
the trailing newline character.
Line 6 tests whether the file exists. If the file doesn't exist,
the program indicates this.
Line 9 stores the size of the file in the scalar variable $size.
The size is measured in bytes (one byte is equivalent to one character
in a character string).
Line 10 prints out the number of bytes in the file.
You can use file-test operators on file variables as well as character
strings. In the following example the file-test operator -z
tests the file represented by the file variable MYFILE:
if (-z MYFILE) {
print ("This file is empty!\n");
}
As before, this file-test operator returns true if the file is
empty and false if it is not.
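Here is a small self-contained sketch of the same test. It assumes a hypothetical scratch file named empty.tmp, which it creates with nothing in it, opens for reading as MYFILE, tests, and then removes.

```perl
#!/usr/local/bin/perl

# "empty.tmp" is a hypothetical scratch file created just for this sketch.
$file = "empty.tmp";
open (OUT, ">$file") || die ("Cannot create $file\n");
close (OUT);                 # nothing was written, so the file is empty

open (MYFILE, $file) || die ("Cannot open $file\n");
$isempty = 0;
if (-z MYFILE) {             # tests the opened file variable, not a name
    $isempty = 1;
    print ("This file is empty!\n");
}
close (MYFILE);
unlink ($file);              # remove the scratch file
```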
 |
Remember that file variables can be used only after you open the file. If you need to test a particular condition before opening the file (such as whether the file is non-empty), test it using the name of the file.
|
Many UNIX utility programs are invoked using the following command
syntax:
programname file1 file2 file3 ...
A program that uses this command syntax operates on all of the
files specified on the command line in order, starting with file1.
When file1 has been processed, the program then proceeds
on to file2, and so on until all of the files have been
exhausted.
In Perl, it's easy to write programs that process an arbitrary
number of files because there is a special operator, the <>
operator, that does all of the file-handling work for you.
To understand how the <> operator works, recall
what happens when you put < and > around
a file variable:
$list = <MYFILE>;
This statement reads a line of input from the file represented
by the file variable MYFILE and stores it in the scalar
variable $list. Similarly, the statement
$list = <>;
reads a line of input and stores it in the scalar variable $list;
however, the file from which it reads is named on the command
line. Suppose, for example, that a program named myprog contains
the statement
$list = <>;
and is started with the command
$ myprog file1 file2 file3
In this case, the first occurrence of the <> operator
reads the first line of input from file1. Successive
occurrences of <> read more lines from file1.
When file1 is exhausted, <> reads the
first line from file2, and so on. When the last file,
file3, is exhausted, <> returns the undefined
value, which behaves like an empty string and indicates that all
the input has been read.
| NOTE |
If a program containing a <> operator is called with no command-line arguments, the <> operator reads input from the standard input file. In this case, the <> operator is equivalent to <STDIN>.
If a file named in a command-line argument does not exist, the Perl interpreter writes the following message to the standard error file:
Can't open name: No such file or directory
Here, name is a placeholder for the name of the file that the Perl interpreter cannot find. In this case, the Perl interpreter ignores name and continues on with the next file in the command line.
|
To see how the <> operator works, look at Listing
6.12, which displays the contents of the files specified on the
command line. (If you are familiar with UNIX, you will recognize
this as the behavior of the UNIX utility cat.) The output
from Listing 6.12 assumes that files file1 and file2
are specified on the command line and that each file contains
one line.
Listing 6.12. A program that displays the contents of one or
more files.
1: #!/usr/local/bin/perl
2:
3: while ($inputline = <>) {
4: print ($inputline);
5: }
$ program6_12 file1 file2
This is a line from file1.
This is a line from file2.
$

Once again, you can see how powerful and useful
Perl is. This entire program consists of only five lines, including
the header comment and a blank line.
Line 3 both reads a line from a file and tests to see whether
the line is the empty string. Because the assignment operator
= returns the value assigned, the expression
$inputline = <>
has the value "" (the null string) if and only
if <> returns the null string, which happens only
when there are no more lines to read from any of the input files.
This is exactly the point at which the program wants to stop looping.
(Recall that a "blank line" in a file is not the same
as the null string because the blank line contains the newline
character.) Because the null string is equivalent to false in
a conditional expression, there is no need to use a conditional
operator such as ne.
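The difference between a blank line and the null string can be checked directly. The sketch below uses hypothetical variable names and plain strings rather than a real file. (In Perl 5, the value returned at the end of input is the undefined value, which is false in a condition just as the null string is.)

```perl
#!/usr/local/bin/perl

$blankline = "\n";     # what a read returns for an empty line in a file
$endofinput = "";      # the null string, which is false in a condition

$blank_is_true = $blankline ? 1 : 0;
$end_is_true = $endofinput ? 1 : 0;
print ("blank line is true: $blank_is_true\n");
print ("null string is true: $end_is_true\n");
```

The first line printed ends in 1 and the second in 0, so a loop such as the one in line 3 keeps going when it reads a blank line and stops only when there is nothing left to read.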
When line 3 is executed for the first time, the first line in
the first input file, file1, is read and stored in the
scalar variable $inputline. Because file1 contains
only one line, the second pass through the loop, and the second
execution of line 3, reads the first line of the second input
file, file2.
After this, there are no more lines in either file1 or
file2, so line 3 assigns the null string to $inputline,
which terminates the loop.
 |
When it reaches the end of the last file on the command line, the <> operator returns the empty string. However, if you use the <> operator after it has returned the empty string, the Perl interpreter assumes that you want to start reading input from the standard input file. (Recall that <> reads from the standard input file if there are no files on the command line.)
This means that you have to be a little more careful when you use <> than when you are reading using <MYFILE> (where MYFILE is a file variable). If MYFILE has been exhausted, repeated attempts to read using <MYFILE> continue to return the null string because there isn't anything left to read.
|
As you have seen, if you read from a file using <STDIN>
or <MYFILE> in an assignment to an array variable,
the Perl interpreter reads the entire contents of the file into
the array, as follows:
@array = <MYFILE>;
This works also with <>. For example, the statement
@array = <>;
reads the contents of all of the files on the command line into
the array variable @array.
As always, be careful when you use this because you might end
up with a very large array.
As you've seen, the <> operator assumes that its
command-line arguments are files. For example, if you start up
the program shown in Listing 6.12 with the command
$ program6_12 myfile1 myfile2
the Perl interpreter assumes that the command-line arguments myfile1
and myfile2 are files and displays their contents.
Perl enables you to use the command-line arguments any way you
want by defining a special array variable called @ARGV.
When a Perl program starts up, this variable contains a list consisting
of the command-line arguments. For example, the command
$ program6_12 myfile1 myfile2
sets @ARGV to the list
("myfile1", "myfile2")
| NOTE |
The shell you are running (sh, csh, or whatever you are using) is responsible for turning a command line such as
program6_12 myfile1 myfile2
into arguments. Normally, any spaces or tab characters are assumed to be separators that indicate where one command-line argument stops and the next begins. For example, the following are identical:
program6_12 myfile1 myfile2
program6_12     myfile1        myfile2
In each case, the command-line arguments are myfile1 and myfile2.
See your shell documentation for details on how to put blank spaces or tab characters into your command-line arguments.
|
As with all other array variables, you can access individual elements
of @ARGV. For example, the statement
$var = $ARGV[0];
assigns the first element of @ARGV to the scalar variable
$var.
You even can assign to some or all of @ARGV if you like.
For example:
$ARGV[0] = 43;
If you assign to any or all of @ARGV, you overwrite what
was already there, which means that any command-line arguments
overwritten are lost.
To determine the number of command-line arguments, assign the
array variable to a scalar variable, as follows:
$numargs = @ARGV;
As with all array variables, using an array variable in a place
where the Perl interpreter expects a scalar value means that
the length of the array is used. In this case, $numargs
is assigned the number of command-line arguments.
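As a brief sketch of how the count might be used, the following program reports its first and last arguments, or prints a message on the standard error file when none were supplied. The messages themselves are illustrative, not part of any listing.

```perl
#!/usr/local/bin/perl

$numargs = @ARGV;      # an array used as a scalar yields its length
if ($numargs == 0) {
    print STDERR ("No command-line arguments were given.\n");
} else {
    print ("First of $numargs arguments: $ARGV[0]\n");
    # the last element is at index $numargs-1 because indexes start at 0
    print ("Last argument: $ARGV[$numargs-1]\n");
}
```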
 |
C programmers should take note that the first element of @ARGV, unlike argv[0] in C, does not contain the name of the program. In Perl, the first element of @ARGV is the first command-line argument.
To get the name of the program, use the system variable $0, which is discussed in Chapter 17, "System Variables."
|
To see how you can use @ARGV in a program, examine Listing
6.13. This program assumes that its first argument is a word to
look for. The remaining arguments are assumed to be files in which
to look for the word. The program prints out the searched-for
word, the number of occurrences in each file, and the total number
of occurrences.
This example assumes that the files file1 and file2
exist and that each file contains the single line
This file contains a single line of input.
This example is then run with the command
$ programname single file1 file2
where programname is a placeholder for the name of the
program. (If you are running the program yourself, you can name
the program anything you like.)
Listing 6.13. A word-search and counting program.
1: #!/usr/local/bin/perl
2:
3: print ("Word to search for: $ARGV[0]\n");
4: $filecount = 1;
5: $totalwordcount = 0;
6: while ($filecount <= @ARGV-1) {
7: unless (open (INFILE, $ARGV[$filecount])) {
8: die ("Can't open input file $ARGV[$filecount]\n");
9: }
10: $wordcount = 0;
11: while ($line = <INFILE>) {
12: chop ($line);
13: @words = split(/ /, $line);
14: $w = 1;
15: while ($w <= @words) {
16: if ($words[$w-1] eq $ARGV[0]) {
17: $wordcount += 1;
18: }
19: $w++;
20: }
21: }
22: print ("occurrences in file $ARGV[$filecount]: ");
23: print ("$wordcount\n");
24: $filecount++;
25: $totalwordcount += $wordcount;
26: }
27: print ("total number of occurrences: $totalwordcount\n");
$ program6_13 single file1 file2
Word to search for: single
occurrences in file file1: 1
occurrences in file file2: 1
total number of occurrences: 2
$

Line 3 prints the word to search for. The program
assumes that this word is the first argument in the command line
and, therefore, is the first element of the array @ARGV.
Lines 7-9 open a file named on the command line. The first time
line 7 is executed, the variable $filecount has the value
1, and the file whose name is in $ARGV[1] is opened.
The next time through, $filecount is 2 and the
file named in $ARGV[2] is opened, and so on. If a file
cannot be opened, the program terminates.
Line 11 reads a line from a file. As before, the conditional expression
$line = <INFILE>
reads a line from the file represented by the file variable INFILE
and assigns it to $line. If no lines remain in the file, $line
is assigned the null string, the conditional expression is false,
and the loop in lines 11-21 is terminated.
Line 13 splits the line into words, and lines 15-20 compare each
word with the search word. If the word matches, the word count
for this file is incremented. This word count is reset when a
new file is opened.
In Perl, the <> operator actually contains a hidden
reference to the array @ARGV. Here's how it works:
- When the Perl interpreter sees the <> for the
first time, it opens the file whose name is stored in $ARGV[0].
- After opening the file, the Perl interpreter executes the
following library function:
shift(@ARGV);
This library function gets rid of the first element of @ARGV
and moves every other element over one. This means that element
x of @ARGV becomes element x-1.
- The <> operator then reads all of the lines
of the file opened in step 1.
- When the <> operator exhausts an input file,
the Perl interpreter goes back to step 1 and repeats the cycle
again.
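The steps above can be approximated by hand with shift and open. The following sketch only imitates the cycle; it is not how the Perl interpreter implements <>, and it leaves out the fallback to the standard input file. The scratch files part1.tmp and part2.tmp, the file variable INFILE, and the array @seen are hypothetical names used to keep the example self-contained.

```perl
#!/usr/local/bin/perl

# Create two hypothetical scratch files so that the sketch can run anywhere.
@names = ("part1.tmp", "part2.tmp");
foreach $name (@names) {
    open (OUT, ">$name") || die ("Cannot create $name\n");
    print OUT ("a line from $name\n");
    close (OUT);
}

@ARGV = @names;                      # pretend these came from the command line
@seen = ();
while (@ARGV > 0) {
    $current = shift (@ARGV);        # remove the next file name from the list
    unless (open (INFILE, $current)) {
        print STDERR ("Can't open $current\n");
        next;                        # move on to the next file, as <> does
    }
    while ($line = <INFILE>) {       # read every line of the current file
        push (@seen, $line);
        print ($line);
    }
    close (INFILE);
}
unlink (@names);                     # remove the scratch files
```

When the outer loop finishes, @ARGV is empty, which matches the state the array is in after <> has worked through every file.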
If you like, you can modify your program to retrieve a value from
the command line and then fix @ARGV so that the <>
operator can work properly. If you modify Listing 6.13 to do this,
the result is Listing 6.14.
Listing 6.14. A word-search and counting program that uses
<>.
1: #!/usr/local/bin/perl
2:
3: $searchword = $ARGV[0];
4: print ("Word to search for: $searchword\n");
5: shift (@ARGV);
6: $totalwordcount = $wordcount = 0;
7: $filename = $ARGV[0];
8: while ($line = <>) {
9: chop ($line);
10: @words = split(/ /, $line);
11: $w = 1;
12: while ($w <= @words) {
13: if ($words[$w-1] eq $searchword) {
14: $wordcount += 1;
15: }
16: $w++;
17: }
18: if (eof) {
19: print ("occurrences in file $filename: ");
20: print ("$wordcount\n");
21: $totalwordcount += $wordcount;
22: $wordcount = 0;
23: $filename = $ARGV[0];
24: }
25: }
26: print ("total number of occurrences: $totalwordcount\n");
$ program6_14 single file1 file2
Word to search for: single
occurrences in file file1: 1
occurrences in file file2: 1
total number of occurrences: 2
$

Line 3 assigns the first command-line argument,
the search word, to the scalar variable $searchword.
This is necessary because the call to shift in line 5
destroys the initial value of $ARGV[0].
Line 5 adjusts the array @ARGV so that the <>
operator can use it. To do this, it calls the library function
shift. This function "shifts" the elements
of the list stored in @ARGV. The element in $ARGV[1]
is moved to $ARGV[0], the element in $ARGV[2]
is moved to $ARGV[1], and so on. After shift
is called, @ARGV contains the files to be searched, which
is exactly what the <> operator is looking for.
Line 7 assigns the current value of $ARGV[0] to the scalar
variable $filename. Because the <> operator
in line 8 calls shift, the value of $ARGV[0]
is lost unless the program does this.
Line 8 uses the <> operator to open the file named
in $ARGV[0] and to read a line from the file. The array
variable @ARGV is shifted at this point.
Lines 9-16 behave as in Listing 6.13. The only difference is that
the search word is now in $searchword, not in $ARGV[0].
Line 18 introduces the library function eof. This function
indicates whether the program has reached the end of the file
being read by <>. If eof returns true,
the next use of <> opens a new file and shifts
@ARGV again.
Lines 19-23 prepare for the opening of a new file. The number
of occurrences of the search word is printed, the current word
count is added to the total word count, and the word count is
reset to 0. Because the new filename to be opened is in $ARGV[0],
line 23 preserves this filename by assigning it to $filename.
| NOTE |
You can use the <> operator to open and read any file you like by setting the value of @ARGV yourself. For example:
@ARGV = ("myfile1", "myfile2");
while ($line = <>) {
...
}
Here, when the statement containing the <> is executed for the first time, the file myfile1 is opened and its first line is read. Subsequent executions of <> each read another line of input from myfile1.
When myfile1 is exhausted, myfile2 is opened and read one line at a time.
|
On machines running the UNIX operating system, two commands can
be linked using a pipe. In this case, the standard output
from the first command is linked, or piped, to the standard input
to the second command.
Perl enables you to establish a pipe that links a Perl output
file to the standard input file of another command. To do this,
associate the file with the command by calling open,
as follows:
open (MYPIPE, "| cat >hello");
The | character tells the Perl interpreter to establish
a pipe. When MYPIPE is opened, output sent to MYPIPE
becomes input to the command
cat >hello
Because the cat command copies the standard input file to the
standard output file when called with no arguments, and >hello
redirects the standard output file to the file hello,
the open statement given here has the same effect as the statement
open (MYPIPE, ">hello");
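The following sketch sends one line through such a pipe and then reads the resulting file back to confirm that the line arrived. It assumes a UNIX system on which the cat command is available; the file name piped.tmp and the file variable CHECK are hypothetical names chosen for the illustration.

```perl
#!/usr/local/bin/perl

# "piped.tmp" is a hypothetical scratch file used only for this sketch.
$target = "piped.tmp";
open (MYPIPE, "| cat >$target") || die ("Cannot start cat\n");
print MYPIPE ("sent through a pipe\n");
close (MYPIPE);    # closing a pipe waits for the command to finish

open (CHECK, $target) || die ("Cannot open $target\n");
$copy = <CHECK>;   # read back the line that cat wrote
close (CHECK);
print ("cat wrote: $copy");
unlink ($target);  # remove the scratch file
```

Closing the pipe before reading the file matters here: until close is called, the output might still be waiting to be delivered to cat.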
You can use a pipe to send mail from within a Perl program. For
example:
open (MESSAGE, "| mail dave");
print MESSAGE ("Hi, Dave! Your Perl program sent this!\n");
close (MESSAGE);
The call to open establishes a pipe to the command mail
dave. The file variable MESSAGE is now associated
with this pipe. The call to print adds the line
Hi, Dave! Your Perl program sent this!
to the message to be sent to user ID dave.
The call to close closes the pipe referenced by MESSAGE,
which tells the system that the message is complete and can be
sent. As you can see, the call to close is useful here
because you can control exactly when the message is to be sent.
(If you do not call close, MESSAGE is closed, and
the message is sent, when the program terminates.)
Perl accesses files by means of file variables. File variables
are associated with files by the open statement.
Files can be opened in any of three modes: read mode, write mode,
and append mode. A file opened in read mode cannot be written
to; a file opened in either of the other modes cannot be read.
Opening a file in write mode destroys the existing contents of
the file.
To read from an opened file, reference it using <name>,
where name is a placeholder for the name of the file
variable associated with the file. To write to a file, specify
its file variable when calling print.
Perl defines three built-in file variables:
- STDIN, which represents the standard input file
- STDOUT, which represents the standard output file
- STDERR, which represents the standard error file
You can redirect STDIN and STDOUT by specifying
< and >, respectively, on the command
line. Messages sent to STDERR appear on the screen even
if STDOUT is redirected to a file.
The close function closes the file associated with a
particular file variable. close never needs to be called
unless you want to control exactly when a file is to be made inaccessible.
The file-test operators provide a way of retrieving information
on a particular file. The most common file-test operators are
- -e, which tests whether a file exists
- -r, -w, and -x, which test whether
a file has read, write, and execute permission, respectively
- -z, which tests whether a file is empty
- -s, which returns the size of a file
You can use -w and -z to ensure that you do
not overwrite a non-empty file.
The <> operator enables you to read data from files
specified on the command line. This operator uses the built-in
array variable @ARGV, whose elements consist of the items
specified on the command line.
Perl enables you to open pipes. A pipe links the output from your
Perl program to the input to another program.
| Q: | How many files can I have open at one time?
|
| A: | Basically, as many as you like. The actual limit depends on the limitations of your operating system.
|
| Q: | Why does adding a closing newline character to the text string affect how die behaves?
|
| A: | Perl enables you to choose whether you want the filename and line number of the error message to appear. If you add a closing newline character to the string, the Perl interpreter assumes that you want to control how your error message is to appear.
|
| Q: | Which is better: to use <>, or to use @ARGV and shift when appropriate?
|
| A: | As is often the case, the answer is "It depends." If your program treats almost all of the command-line arguments as files, it is better to use <> because the mechanics of opening and closing files are taken care of for you. If you are doing a lot of unusual things with @ARGV, it is better not to manipulate it to use <>, because things can get complicated and confusing.
|
| Q: | Can I open more than one pipe at a time?
|
| A: | Yes. Your operating system keeps all of the various commands and processes organized and keeps track of which output goes with which input.
|
| Q: | Can I redirect STDERR?
|
| A: | Yes, but there is (normally) no reason why you should. STDERR's job is to report extraordinary conditions, and you usually want to see these, not have them buried in a file somewhere.
|
| Q: | How many command-line arguments can I specify?
|
| A: | Basically, as many as your command-line shell can handle.
|
| Q: | Can I write to a file and then read from it later?
|
| A: | Yes, but you can't do both at the same time. To read from a file you have written to, close the file by calling close and then open the file in read mode.
|
The Workshop provides quiz questions to help you solidify your
understanding of the material covered and exercises to give you
experience in using what you've learned. Try to understand the
quiz and exercise answers before you go on to tomorrow's lesson.
- Define the following terms:
a. file variable
b. reserved word
c. file mode
d. append mode
e. pipe
- From where does the <> operator read its data?
- What do the following file-test operators do?
a. -r
b. -x
c. -s
- What are the contents of the array @ARGV when a Perl
program is started with the following command?
$ myprog file1 file2 file3
- How do you indicate that a file is to be opened:
a. In write mode?
b. In append mode?
c. In read mode?
d. As a pipe?
- What is the relationship between @ARGV and the <>
operator?
- Write a program that takes the values on the command line,
adds them together, and prints the result.
- Write a program that takes a list of files from the command
line and examines their size. If a file is bigger than 10,000
bytes, print
File name is a big file!
where name is a placeholder for the name of the
big file.
- Write a program that copies a file named file1 to
file2, and then appends another copy of file1
to file2.
- Write a program that counts the total number of words in the
files specified on the command line. When it has counted the words,
it sends a message to user ID dave indicating the total
number of words.
- Write a program that takes a list of files and indicates,
for each file, whether the user has read, write, or execute permission.
- BUG BUSTER: What is wrong with the following program?
#!/usr/local/bin/perl
open (OUTFILE, "outfile");
print OUTFILE ("This is my message\n");

|