Chapter 14
Scalar-Conversion and List-Manipulation Functions
In this chapter, you learn about the built-in Perl functions that convert
scalar values from one form to another, and the Perl functions
that deal with variables that have not had values defined for
them.
You also learn about the built-in Perl functions that manipulate
lists and array variables. These functions are divided into two
groups:
- The functions that manipulate standard array variables and
their lists
- The functions that manipulate associative arrays
 |
Many of the functions described in this chapter use features of the UNIX operating system. If you are using Perl on a machine that is not running UNIX, some of these functions might not be defined or might behave differently.
Check the documentation supplied with your version of Perl for details on which functions are supported or emulated on your machine.
|
The chop function was first discussed in Chapter 3, "Understanding
Scalar Values." It removes the last character from a scalar
value.
The syntax for the chop function is
chop (var);
var can be either a scalar variable or a list of variables, as described
in the following paragraphs.
For example:
$mystring = "This is a string";
chop ($mystring);
# $mystring now contains "This is a strin";
chop is used most frequently to remove the trailing newline
character from an input line, as follows:
$input = <STDIN>;
chop ($input);
The argument passed to chop can also be a list. In this
case, chop removes the last character from every element
of the list. For example, to read an entire input file into an
array variable and remove all of the trailing newline characters,
use the following statements:
@input = <STDIN>;
chop (@input);
chop returns the character chopped. For example:
$input = "12345";
$lastchar = chop ($input);
This call to chop assigns 5 to the scalar variable
$lastchar.
If chop is passed a list, the last character from the
last element of the list is returned:
@array = ("ab", "cd", "ef");
$lastchar = chop(@array);
This assigns f, the last character of the last element
of @array, to $lastchar.
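The two return-value cases above can be tried side by side. The following is a minimal sketch; the variable names are chosen only for illustration.

```perl
use strict;
use warnings;

# chop on a scalar: removes and returns the last character.
my $word = "12345";
my $last = chop($word);
print "$word $last\n";            # prints "1234 5"

# chop on a list: every element loses its last character,
# but only the character removed from the final element is returned.
my @pairs = ("ab", "cd", "ef");
my $last_of_list = chop(@pairs);
print "@pairs $last_of_list\n";   # prints "a c e f"
```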
The chomp function, defined only in Perl 5, checks whether
the last characters of a string or list of strings match the input
line separator defined by the $/ system variable. If
they do, chomp removes them.
The syntax for the chomp function is
result = chomp(var)
As in the chop function, var can be either a
scalar variable or a list. If var is a list, each element
of the list is checked for the input end-of-line string. result
is the total number of characters removed by chomp.
Listing 14.1 shows how chomp works.
Listing 14.1. A program that uses the chomp
function.
1: #!/usr/local/bin/perl
2:
3: $/ = "::"; # set input line separator
4: $scalar = "testing::";
5: $num = chomp($scalar);
6: print ("$scalar $num\n");
7: @list = ("test1::", "test2", "test3::");
8: $num = chomp(@list);
9: print ("@list $num\n");
$ program14_1
testing 2
test1 test2 test3 4
$

This program uses chomp to remove
the input line separator from both a scalar variable and an array
variable. The call to chomp in line 5 converts the value
of $scalar from testing:: to testing.
The number of characters removed, 2, is returned by chomp
and assigned to $num.
The call to chomp in line 8 checks each element of @list.
The first element is converted from test1:: to test1,
and the last element is converted from test3:: to test3.
(The second element is ignored, because it is not terminated by
the end-of-line specifier.) The total number of characters removed,
4 (two from the first element and two from the last), is returned
by chomp and assigned to $num.
| NOTE |
For more information on the $/ system variable, refer to Chapter 17, "System Variables."
|
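The practical difference between chomp and chop shows up when a line has no trailing separator, such as the last line of a file that does not end in a newline. The following sketch assumes the default value of $/ (a newline); the variable names are illustrative.

```perl
use strict;
use warnings;

# With the default separator ("\n"), chomp removes only a trailing newline.
my $with_newline    = "last line\n";
my $without_newline = "last line";

my $removed_a = chomp($with_newline);     # removes "\n", returns 1
my $removed_b = chomp($without_newline);  # nothing to remove, returns 0
print "$with_newline ($removed_a), $without_newline ($removed_b)\n";

# chop, by contrast, always removes one character, newline or not.
my $chopped = "last line";
chop($chopped);
print "$chopped\n";                       # prints "last lin"
```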
The crypt function encrypts a string using the NBS Data
Encryption Standard (DES) algorithm.
The syntax for the crypt function is
result = crypt (original, salt);
original is the string to be encrypted, and salt
is a character string of two characters that defines how to change
the DES algorithm (to make it more difficult to decode). These
two characters can be any letter or digit, or one of the .
and / characters. After the algorithm is changed, the
string is encrypted using the resulting key.
result is the encrypted string. The first two characters
of result are the two characters specified in salt.
You can use crypt to set up a password checker similar
to those used by the UNIX login. Listing 14.2 is an example of
a program that prompts the user for a password and compares it
with a password stored in a special file.
Listing 14.2. A program that asks for and compares a password.
1: #!/usr/local/bin/perl
2:
3: open (PASSWD, "/u/jqpublic/passwd") ||
4: die ("Can't open password file");
5: $passwd = <PASSWD>;
6: chop ($passwd);
7: close (PASSWD);
8: print ("Enter the password for this program:\n");
9: system ("stty -echo");
10: $mypasswd = <STDIN>;
11: system ("stty echo");
12: chop ($mypasswd);
13: if (crypt ($mypasswd, substr($passwd, 0, 2)) eq $passwd) {
14: print ("Correct! Carry on!\n");
15: } else {
16: die ("Incorrect password: goodbye!\n");
17: }
$ program14_2
Enter the password for this program:
bluejays
Correct! Carry on!
$

Note that the password you type is not displayed
on the screen.
Lines 3-7 retrieve the correct password from the file /u/jqpublic/passwd.
This password can be created by another call to crypt.
For example, if the correct password is sludge, the call
that creates the string now stored in $passwd could be
the following, where $salt contains some two-character
string:
$retval = crypt ("sludge", $salt);
After the correct password has been retrieved, the next step is
line 8, which asks the user to type a password. By default, anything
typed in at the keyboard is immediately displayed on the screen;
this behavior is called input echoing. Input echoing is
not desirable if a password is being typed in, because someone
looking over the user's shoulder can read the password and break
into the program.
To make the password-checking process more secure, line 9 calls
the UNIX command stty -echo, which turns off input echoing;
now the password is not displayed on the screen when the user
types it. After the password has been entered, line 11 calls the
UNIX command stty echo, which turns input echoing back
on.
Line 13 calls crypt to check the password the user has
entered. Because the first two characters of the actual encrypted
password contain the two-character salt used in encryption, substr
is used to retrieve these two characters and use them as the salt
when encrypting the user's password. If the value returned by
crypt is identical to the encrypted password, the user's
password is correct; otherwise, the user has gotten it wrong,
and die terminates the program. (A gentler password-checking
program usually gives the user two or three chances to type a
password before terminating the program.)
This password checker is more secure than one that embeds the password,
because the actual password does not appear in the program in
unencrypted form. (In fact, because the password is in a separate
file, it does not appear in the program at all.) Someone who examines
the program or the password file sees only the encrypted string,
not the password itself.
| NOTE |
The behavior of crypt is identical to that of the UNIX library function crypt. See the crypt(3) manual page for more information on DES encryption.
|
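The creation and checking steps described above can be sketched together as follows. The helper names make_salt and password_matches are hypothetical, and the sketch assumes a system whose crypt library still supports the traditional two-character DES salt; where it does not, crypt may return an undefined value or be unavailable.

```perl
use strict;
use warnings;

# Hypothetical helper: pick two random characters from the set that
# crypt accepts in a salt (letters, digits, "." and "/").
sub make_salt {
    my @chars = ('a' .. 'z', 'A' .. 'Z', '0' .. '9', '.', '/');
    return join('', map { $chars[int(rand(@chars))] } 1 .. 2);
}

# Hypothetical helper: re-encrypt the typed password with the salt stored
# in the first two characters of the encrypted password, then compare.
sub password_matches {
    my ($typed, $encrypted) = @_;
    my $result = crypt($typed, substr($encrypted, 0, 2));
    return defined($result) && $result eq $encrypted;
}

# Assumption: "sludge" stands in for the real password, as in the text.
my $encrypted = crypt("sludge", make_salt());
if (defined $encrypted) {
    print "stored form: $encrypted\n";
    print password_matches("sludge", $encrypted) ? "match\n" : "no match\n";
    print password_matches("sludgy", $encrypted) ? "match\n" : "no match\n";
}
```

Only the encrypted string needs to be written to the password file; the salt travels with it as its first two characters.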
The hex function assumes that a character string is a
number written in hexadecimal format, and it converts it into
a decimal number (a number in standard base-10 format).
The syntax for the hex function is
decnum = hex (hexnum);
hexnum is the hexadecimal character string, and decnum
is the resulting decimal number.
The following is an example:
$myhexstring = "1ff";
$num = hex ($myhexstring);
This call to hex assigns the decimal equivalent of 1ff
to $num, which means that the value of $num
is now 511. The value stored in $myhexstring
is not changed.
The string passed to hex can contain either uppercase or
lowercase letters (provided the letters are between a
and f, inclusive). This value can be the result of an
expression, as follows:
$num = hex ("f" x 2);
Here, the expression "f" x 2 is equivalent
to ff, which is converted to 255 by hex.
| NOTE |
To convert a decimal number into a hexadecimal string, use sprintf and specify either %x (hexadecimal integer) or %lx (long hexadecimal integer).
|
 |
In the Perl releases described here, hex does not handle hexadecimal strings that start with the characters 0x or 0X (later Perl 5 releases accept the prefix). To handle these strings on any release, either get rid of these characters using a statement such as
$myhexstring =~ s/^0[xX]//;
or call the oct function, which is described later in this chapter.
|
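The prefix-stripping approach and the sprintf conversion can be combined in a short sketch. The helper name hex_to_decimal is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: accept "1ff", "0x1ff", or "0X1FF" alike.
sub hex_to_decimal {
    my ($text) = @_;
    $text =~ s/^0[xX]//;      # strip an optional 0x or 0X prefix
    return hex($text);
}

print hex_to_decimal("1ff"), "\n";      # prints 511
print hex_to_decimal("0X1FF"), "\n";    # prints 511

# The reverse direction uses sprintf with %x (or %X for uppercase digits).
my $hexstring = sprintf("%x", 511);
print "$hexstring\n";                   # prints 1ff
```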
The int function turns a floating-point number into an
integer by getting rid of everything after the decimal point.
The syntax for the int function is
intnum = int (floatnum);
floatnum is the floating-point number, and intnum
is the resulting integer.
The following is an example:
$floatnum = 45.6;
$intnum = int ($floatnum);
This call to int converts 45.6 to 45
and assigns it to $intnum. The value stored in $floatnum
is not changed.
int can be used in expressions as well; for example:
$intval = int (68.3 / $divisor) + 1;
 |
int does not round when you convert from floating point to integer; it simply discards the fractional part. To round a nonnegative value to the nearest integer when you use int, add 0.5 first, as follows:
$intval = int ($mynum + 0.5);
Even then, you still might need to watch out for round-off errors. For example, if a value you expect to be 4.5 is actually stored in the machine as, say, 4.499999999, adding 0.5 still results in a number less than 5, which means that int will truncate it to 4.
|
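The truncation behavior, including what happens with negative numbers, can be checked with a short sketch. The helper round_nearest is hypothetical; it rounds halves away from zero, which is one of several possible conventions.

```perl
use strict;
use warnings;

# int discards the fractional part, so it truncates toward zero.
print int(45.6), "\n";     # prints 45
print int(-45.6), "\n";    # prints -45

# Hypothetical helper: round to the nearest integer by adding or
# subtracting 0.5 (depending on the sign) before truncating.
sub round_nearest {
    my ($value) = @_;
    return $value >= 0 ? int($value + 0.5) : int($value - 0.5);
}

print round_nearest(45.6), "\n";    # prints 46
print round_nearest(-45.6), "\n";   # prints -46
print round_nearest(45.4), "\n";    # prints 45
```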
The oct function assumes that a character string is a
number written in octal format, and it converts it into a decimal
number (a number in standard base-10 format).
The syntax for the oct function is
decnum = oct (octnum);
octnum is the octal character string, and decnum
is the resulting decimal number.
The following is an example:
$myoctstring = "177";
$num = oct ($myoctstring);
This call to oct assigns the decimal equivalent of 177
to $num, which means that the value of $num
is now 127. The value stored in $myoctstring is not changed.
The value passed to oct can be the result of an expression,
as shown in the following example:
$num = oct ("07" x 2);
Here, the expression "07" x 2 is equivalent
to 0707, which is converted to 455 by oct.
| NOTE |
To convert a decimal number into an octal string, use sprintf and specify either %o (octal integer) or %lo (long octal integer).
|
The oct function also handles hexadecimal integers that
start with 0x or 0X:
$num = oct ("0xff");
This call treats 0xff as the hexadecimal number ff
and converts it to 255. This feature of oct can be used
to convert any non-standard Perl integer constant.
Listing 14.3 is a program that reads a line of input and checks
whether it is a valid Perl integer constant. If it is, it converts
it into a standard (base-10) integer.
Listing 14.3. A program that reads any kind of integer.
1: #!/usr/local/bin/perl
2:
3: $integer = <STDIN>;
4: chop ($integer);
5: if ($integer !~ /^[0-9]+$|^0[xX][0-9a-fA-F]+$/) {
6: die ("$integer is not a legal integer\n");
7: }
8: if ($integer =~ /^0/) {
9: $integer = oct ($integer);
10: }
11: print ("$integer\n");
$ program14_3
077
63
$

The pattern in line 5 matches one of the following:
- One or more digits
- A string consisting of 0x or 0X followed
by one or more digits or by uppercase or lowercase letters between
a and f, inclusive
The first case matches any standard base-10 integer or octal integer
(because octal integers start with 0 and consist of the numbers
0 to 7). The second case matches any legal hexadecimal integer.
In both cases, the pattern matches only if there are no extraneous
characters (blank spaces, or other words or numbers) on the line.
Of course, it is easy to use the substitution operator to get
rid of these first, if you like.
Line 8 tests whether the integer is either an octal or hexadecimal
integer by searching for the pattern /^0/. If this pattern
is found, oct converts the integer to decimal, placing
the converted integer back in $integer. Note that line
8 does not need to determine which type of integer is contained
in $integer because oct processes both octal
and hexadecimal integers.
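The same validate-then-convert logic can be packaged as a subroutine so that it works on any string, not just a line of input. This is a sketch only; the name to_decimal is hypothetical, and the subroutine returns the undefined value for strings that do not match the pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a decimal, octal (leading 0), or
# hexadecimal (leading 0x or 0X) integer string to a base-10 number.
sub to_decimal {
    my ($text) = @_;
    return unless $text =~ /^[0-9]+$|^0[xX][0-9a-fA-F]+$/;
    # A leading 0 means octal or hexadecimal; oct handles both.
    return $text =~ /^0/ ? oct($text) : $text + 0;
}

for my $input ("077", "0xff", "42", "12abc") {
    my $value = to_decimal($input);
    print "$input -> ", (defined $value ? $value : "not an integer"), "\n";
}
```

Run as written, the loop reports 63, 255, and 42 for the first three strings and rejects the fourth.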
The ord and chr functions are similar to the
Pascal functions of the same names. ord converts a single
character to its numeric ASCII equivalent, and chr converts
a number to its ASCII character equivalent.
The syntax for the ord function is
asciival = ord (char);
char is the string whose first character is to be converted,
and asciival is the resulting ASCII value.
For example, the following statement assigns the ASCII value for
the / character, 47, to $ASCIIval:
$ASCIIval = ord("/");
If the value passed to ord is a character string that
is longer than one character in length, ord converts
the first character in the string:
$mystring = "/ignore the rest of this string";
$charval = ord ($mystring);
Here, the first character stored in $mystring, /,
is converted and assigned to $charval.
The syntax for the chr function is
charval = chr (asciival);
asciival is the value to be converted, and charval
is the one-character string representing the character equivalent
of asciival in the ASCII character set.
For example, the following statement assigns / to $slash,
because 47 is the numeric equivalent of / in the ASCII
character set:
$slash = chr(47);
| NOTE |
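An eight-bit character can hold 256 different values, 0 through 255 (the ASCII character set proper defines only the first 128). In the early Perl 5 releases described here, if the value passed to chr is greater than 255, only the bottom eight bits of the value are used.
This means, for example, that the following statements are equivalent:
$slash = chr(47);
$slash = chr(303);
$slash = chr(559);
In each case, the value of $slash is /. (Later releases with Unicode support, Perl 5.6 onward, instead return the character whose code point is the value passed, so there only chr(47) yields /.)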
|
 |
The chr function is defined only in Perl 5. If you are using Perl 4, you will need to call sprintf to convert a number to a character:
$slash = sprintf("%c", 47);
This assigns / to $slash.
|
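Because ord and chr are inverses of each other, a string can be taken apart into character codes and put back together again. The following is a minimal sketch using ASCII characters only; the variable names are illustrative.

```perl
use strict;
use warnings;

my $code = ord("/");     # 47
my $char = chr(47);      # "/"
print "$code $char\n";   # prints "47 /"

# Convert every character of a string to its numeric value...
my @codes = map { ord($_) } split(//, "Perl");
print "@codes\n";        # prints "80 101 114 108"

# ...and rebuild the string from the numbers.
my $rebuilt = join('', map { chr($_) } @codes);
print "$rebuilt\n";      # prints "Perl"
```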
In Perl, some functions or expressions behave differently when
their results are assigned to arrays than they do when assigned
to scalar variables. For example, the assignment
@var = @array;
copies the list stored in @array to the array variable
@var, and the assignment
$var = @array;
determines the number of elements in the list stored in @array
and assigns that number to the scalar variable $var.
As you can see, @array has two different meanings: an
"array meaning" and a "scalar meaning." The
Perl interpreter determines which meaning to use by examining
the rest of the statement in which @array occurs. In
the first case, the array meaning is intended, because the statement
is assigning to an array variable. Statements in which the array
meaning is intended are called array contexts.
In the second case, the scalar meaning of @array is intended,
because the statement is assigning to a scalar variable. Statements
in which the scalar meaning is intended are called scalar contexts.
The scalar function enables you to specify the scalar meaning
in an array context.
The syntax for the scalar function is
value = scalar (list);
list is the list to be used in a scalar context, and
value is the scalar meaning of the list.
For example, to create a list consisting of the length of an array,
you can use the following statement:
@array = ("a", "b", "c");
@lengtharray = scalar (@array);
Here, the number of elements of @array, 3, is converted
into a one-element list and assigned to @lengtharray.
Another useful place to use scalar is in conjunction
with the <> operator. Recall that the statement
$myline = <MYFILE>;
reads one line from the input file MYFILE, and
@mylines = <MYFILE>;
reads all of MYFILE into the array variable @mylines.
To read one line into the array variable @mylines (as
a one-element list), use the following:
@mylines = scalar (<MYFILE>);
Specifying scalar with <MYFILE> ensures
that only one line is read from MYFILE.
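Both uses of scalar can be seen in a short sketch. It assumes a Perl new enough to open a filehandle on an in-memory string, which stands in for MYFILE so that no external file is needed; the variable names are illustrative.

```perl
use strict;
use warnings;

# Force the scalar meaning of @array inside a list assignment.
my @array = ("a", "b", "c");
my @lengtharray = scalar(@array);
print "@lengtharray\n";               # prints 3

# Assumption: an in-memory filehandle takes the place of MYFILE.
my $text = "first\nsecond\nthird\n";
open(my $myfile, '<', \$text) or die "Can't open in-memory file: $!";
my @one_line = scalar(<$myfile>);     # reads only "first\n"
my @the_rest = <$myfile>;             # reads the remaining two lines
close($myfile);
print scalar(@one_line), " ", scalar(@the_rest), "\n";   # prints "1 2"
```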
The pack function enables you to take a list or the contents
of an array variable and convert (pack) it into a scalar value
in a format that can be stored in actual machine memory or used
in programming languages such as C.
The syntax for the pack function is
formatstr = pack(packformat, list);
Here, list is a list of values; this list of values can,
as always, be the contents of an array variable. formatstr
is the resulting string, which is in the format specified by packformat.
packformat consists of one or more pack-format characters;
these characters determine how the list is to be packed. These
pack formats are listed in Table 14.1.
Table 14.1. Format characters for the pack
function.
| Character | Description |
| a | ASCII character string padded with null characters |
| A | ASCII character string padded with spaces |
| b | String of bits, lowest first |
| B | String of bits, highest first |
| c | A signed character (range usually -128 to 127) |
| C | An unsigned character (usually 8 bits) |
| d | A double-precision floating-point number |
| f | A single-precision floating-point number |
| h | Hexadecimal string, lowest digit first |
| H | Hexadecimal string, highest digit first |
| i | A signed integer |
| I | An unsigned integer |
| l | A signed long integer |
| L | An unsigned long integer |
| n | A short integer in network order |
| N | A long integer in network order |
| p | A pointer to a string |
| s | A signed short integer |
| S | An unsigned short integer |
| u | Convert to uuencode format |
| v | A short integer in VAX (little-endian) order |
| V | A long integer in VAX order |
| x | A null byte |
| X | Indicates "go back one byte" |
| @ | Fill with nulls (ASCII 0) |
One pack-format character must be supplied for each element in
the list. If you like, you can use spaces or tabs to separate
pack-format characters, because pack ignores white space.
The following is a simple example that uses pack:
$integer = pack("i", 171);
This statement takes the number 171, converts it into
the format used to store integers on your machine, and returns
the converted integer in $integer. This converted integer
can now be written out to a file or passed to a program using
the system or exec functions.
To repeat a pack-format character multiple times, specify a positive
integer after the character. The following is an example:
$twoints = pack("i2", 103, 241);
Here, the pack format i2 is equivalent to ii.
To use the same pack-format character for all of the remaining
elements in the list, use * in place of an integer, as
follows:
$manyints = pack("i*", 14, 26, 11, 83);
Specifying integers or * to repeat pack-format characters
works for all formats except a, A, and @.
With the a and A formats, the integer is assumed
to be the length of the string to create.
$mystring = pack("a6", "test");
This creates a string of six characters (the four that are supplied,
plus two null characters).
| NOTE |
The a and A formats always use exactly one element of the list, regardless of whether a positive integer is included following the character. For example:
$mystring = pack("a6", "test1", "test2");
Here, test1 is packed into a six-character string and assigned to $mystring. test2 is ignored.
To get around this problem, use the x operator to create multiple copies of the a pack-format character, as follows:
$strings = pack ("a6" x 2, "test1", "test2");
This packs test1 and test2 into two six-character strings (joined together)
|
The @ format is a special case. It is used only when
a following integer is specified. This integer indicates the number
of bytes the string must contain at this point; if the string
is smaller, null characters are added. For example:
$output = pack("a* @6 a*", "test", "test2");
Here, the a* format packs all of the string test (a by itself
would pack only its first character). Because
this string is only four characters long, and the pack format
@6 specifies that the packed scalar value must be six
characters long at this point, two null characters are added to
the string before test2 is packed.
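The effect of the count on the a, A, and @ formats is easier to see when the null characters are made visible. In the sketch below, the helper show is hypothetical; it prints each null character as \0 and brackets the result.

```perl
use strict;
use warnings;

# Hypothetical helper: make null characters visible as \0.
sub show {
    my ($packed) = @_;
    $packed =~ s/\0/\\0/g;
    return "[$packed]";
}

print show(pack("a6", "test")), "\n";   # prints [test\0\0]
print show(pack("A6", "test")), "\n";   # prints [test  ]
print show(pack("a2", "test")), "\n";   # prints [te]
print show(pack("a",  "test")), "\n";   # prints [t]  (the count defaults to 1)

# @6 pads the packed value with nulls up to byte position 6.
my $positioned = pack("a4 @6 a5", "test", "test2");
print show($positioned), "\n";          # prints [test\0\0test2]
print length($positioned), "\n";        # prints 11
```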
The most frequent use of pack is to create data that
can be used by C programs. For example, to create a string terminated
by a null character, use the following call to pack:
$Cstring = pack ("a*x", $mystring);
Here, the a* pack format converts all of $mystring
into an ASCII string (a without a count would pack only the
first character), and the x character appends a null
character to the end of the string. This format (a string followed
by a null character) is how C stores strings.
Table 14.2 shows the pack-format characters that have equivalent
data types in C.
Table 14.2. Pack-format characters and their C equivalents.
| Character | C equivalent |
| c | signed char |
| C | unsigned char |
| d | double |
| f | float |
| i | int |
| I | unsigned int (or unsigned) |
| l | long |
| L | unsigned long |
| s | short |
| S | unsigned short |
In each case, pack stores the value in your local machine's
internal format.
| TIP |
You usually won't need to use pack unless you are preparing data for use in other programs.
|
The unpack function reverses the operation performed
by pack. It takes a value stored in machine format and
converts it to a list of values understood by Perl.
The syntax for the unpack function is
list = unpack (packformat, formatstr);
Here, formatstr is the value in machine format, and list
is the created list of values.
As in pack, packformat is a set of one or more
pack format characters. These characters are basically the same
as those understood by pack. Table 14.3 lists these characters.
Table 14.3. The pack-format characters, as used by
unpack.
| Character | Description |
| a | ASCII character string, unstripped |
| A | ASCII character string with trailing nulls and spaces stripped |
| b | String of bits, lowest first |
| B | String of bits, highest first |
| c | A signed character (range usually -128 to 127) |
| C | An unsigned character (usually 8 bits) |
| d | A double-precision floating-point number |
| f | A single-precision floating-point number |
| h | Hexadecimal string, lowest digit first |
| H | Hexadecimal string, highest digit first |
| i | A signed integer |
| I | An unsigned integer |
| l | A signed long integer |
| L | An unsigned long integer |
| n | A short integer in network order |
| N | A long integer in network order |
| p | A pointer to a string |
| s | A signed short integer |
| S | An unsigned short integer |
| u | Convert (uudecode) a uuencoded string |
| v | A short integer in VAX (little-endian) order |
| V | A long integer in VAX order |
| x | Skip forward a byte |
| X | Indicates "go back one byte" |
| @ | Go to specified position |
In almost all cases, a call to unpack undoes the effects
of an equivalent call to pack. For example, consider
Listing 14.4, which packs and unpacks a list of integers.
Listing 14.4. A program that demonstrates the relationship
between pack and unpack.
1: #!/usr/local/bin/perl
2:
3: @list_of_integers = (11, 26, 43);
4: $mystring = pack("i*", @list_of_integers);
5: @list_of_integers = unpack("i*", $mystring);
6: print ("@list_of_integers\n");
$ program14_4
11 26 43
$

Line 4 calls pack, which takes all
of the elements stored in @list_of_integers, converts
them to the machine's integer format, and stores them in $mystring.
Line 5 calls unpack, which assumes that the string stored
in $mystring is a list of values stored in the machine's
integer format; it takes this string, converts each integer in
the string to a Perl value, and stores the resulting list of values
in @list_of_integers.
The only unpack operations that do not exactly mirror
pack operations are those specified by the a and A
formats. The a format converts a machine-format string
into a Perl value as is, whereas the A format converts
a machine-format string into a Perl value and strips any trailing
blanks or null characters.
The A format is useful if you want to convert a C string
into the string format understood by Perl. The following is an
example:
$perlstring = unpack("A*", $Cstring);
Here, $Cstring is assumed to contain a character string
stored in the format used by the C programming language (a sequence
of bytes terminated by a null character). The * tells unpack
to process the whole string rather than only its first character.
unpack strips
the trailing null character from the string stored in $Cstring,
and stores the resulting string in $perlstring.
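A round trip between a Perl string and a C-style string can be sketched as follows. The templates use a * count so that the whole string is processed; the variable names are illustrative.

```perl
use strict;
use warnings;

my $original = "hello";

# Pack the whole string and append one null byte, as C expects.
my $cstring = pack("a*x", $original);
print length($cstring), "\n";            # prints 6

# A* unpacks the whole value and strips the trailing null.
my $perlstring = unpack("A*", $cstring);
print "$perlstring\n";                   # prints hello

# a* keeps the null, so the unpacked value is still six bytes long.
my $unstripped = unpack("a*", $cstring);
print length($unstripped), "\n";         # prints 6
```

Note that A also strips trailing spaces, so it is not suitable when trailing spaces are part of the data.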
The @ pack-format character tells unpack to
skip to the position specified with the @. For example,
the following statement skips four bytes in $packstring,
and then unpacks a signed integer and stores it in $skipnum.
$skipnum = unpack("@4i", $packstring);
| NOTE |
If unpack is unpacking a single item, it can be stored in either an array variable or a scalar variable. If an array variable is used to store the result of the unpack operation, the resulting list consists of a single element.
|
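The skip-then-unpack pattern can be tried with a value whose layout is known in advance. This sketch uses the N format (a four-byte integer in network order) instead of i so that the byte offsets are the same on every machine; that substitution is an assumption made for the example.

```perl
use strict;
use warnings;

# Two four-byte integers, packed one after the other.
my $packstring = pack("N2", 1000, 2000);

# Skip to byte position 4 and unpack the integer found there.
my $skipnum = unpack("@4 N", $packstring);
print "$skipnum\n";                      # prints 2000

# The same call assigned to an array yields a one-element list.
my @single = unpack("@4 N", $packstring);
print scalar(@single), "\n";             # prints 1
```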
If an * character appears after the @ pack-format
character, unpack skips to the end of the value being
unpacked. This can be used in conjunction with the X
pack-format character to unpack the right end of the packed value.
For example, the following statement treats the last four bytes
of a packed value as a long unsigned integer and unpacks them:
$longrightint = unpack("@* X4 L", $packstring);
In this example, the @* pack format specifier skips to
the end of the value stored in $packstring. Then, the
X4 specifier backs up four bytes. Finally, the L
specifier treats the last four bytes as a long unsigned integer,
which is unpacked and stored in $longrightint.
 |
The number of bytes unpacked by the s, S, i, I, l, and L formats depends on your machine. Many UNIX machines store short integers in two bytes of memory, and integer and long integer values in four bytes. However, other machines might behave differently. In general, you cannot assume that programs that use pack and unpack will behave in the same way on different machines.
|
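When packed data must be exchanged between machines, the n, N, v, and V formats from the tables above are a safer choice, because they fix both the size and the byte order. The following sketch prints the bytes produced; the helper hex_bytes is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: show each byte of a packed value in hexadecimal.
sub hex_bytes {
    my ($packed) = @_;
    return join(" ", map { sprintf("%02x", $_) } unpack("C*", $packed));
}

print hex_bytes(pack("N", 1)), "\n";   # prints "00 00 00 01" (network order)
print hex_bytes(pack("V", 1)), "\n";   # prints "01 00 00 00" (VAX order)
print hex_bytes(pack("n", 1)), "\n";   # prints "00 01"
print hex_bytes(pack("v", 1)), "\n";   # prints "01 00"

# The size and byte order of the i format vary from machine to machine.
print length(pack("i", 1)), " bytes in a native integer\n";
```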
The unpack function enables you to decode files that
have been encoded by the uuencode encoding program. To
do this, use the u pack-format specifier.
| NOTE |
uuencode, a coding mechanism available on most UNIX systems, converts all characters (including unprintable characters) into printable ASCII characters. This ensures that you can safely transmit files across remote networks.
|
Listing 14.5 is an example of a program that uses unpack
to decode a uuencoded file.
Listing 14.5. A program that decodes a uuencoded
file.
1: #!/usr/local/bin/perl
2:
3: open (CODEDFILE, "/u/janedoe/codefile") ||
4: die ("Can't open input file");
5: open (OUTFILE, ">outfile") ||
6: die ("Can't open output file");
7: while ($line = <CODEDFILE>) {
8: $decoded = unpack("u", $line);
9: print OUTFILE ($decoded);
10: }
11: close (OUTFILE);
12: close (CODEDFILE);

The file variable CODEDFILE represents
the file that was previously encoded by uuencode. Lines
3 and 4 open the file (or die trying). Lines 5 and 6 open the
output file, which is represented by the file variable OUTFILE.
Lines 7-10 read and write one line at a time. Line 7 starts off
by reading a line of encoded input into the scalar variable $line.
As with any other input file, the null string is returned if CODEDFILE
is exhausted.
Line 8 calls unpack to decode the line. If the line is
a special line created by uuencode (for example, the
first line, which lists the filename and the size, or the last
line, which marks the end of the file), unpack detects
it and converts it into the null string. This means that the program
does not need to contain special code to handle these lines.
Line 9 writes the decoded line to the output file represented
by OUTFILE.
| NOTE |
You can use pack to uuencode a string, as in the following:
$encoded = pack ("u", $decoded);
Here, the string in $decoded is encoded and stored in the scalar variable $encoded. (pack returns a single string, and unpack expects a single string as its second argument, so scalar variables are used here.) The string in $encoded can then be decoded using unpack, as follows:
$decoded = unpack ("u", $encoded);
Although pack uses the same uuencode algorithm as the UNIX uuencode utility, you cannot use the UNIX uudecode program on data encoded using pack because pack does not supply the header and footer (beginning and ending) lines expected by uudecode.
If you really need to use uudecode with a file created by writing out the output from pack, you'll need to write out the header and footer lines as well. (See the UNIX manual page for uuencode for more details.)
|
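A round trip through the u format shows that the encoded form is printable and that decoding restores the original bytes. This is a minimal sketch; the sample string is arbitrary and deliberately includes unprintable characters.

```perl
use strict;
use warnings;

# Sample data containing null and control characters.
my $original = "Any bytes at all: \0\1\2 and text.\n";

my $encoded = pack("u", $original);     # printable ASCII, ends in a newline
my $decoded = unpack("u", $encoded);    # back to the original bytes

print $encoded;
print $decoded eq $original ? "round trip ok\n" : "mismatch\n";
```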
The vec function enables you to treat a scalar value
as a collection of chunks, with each chunk consisting of a specified
number of bits; this collection is known as a vector. Each
call to vec accesses a particular chunk of bits in the
vector (known as a bit vector).
The syntax for the vec function is
retval = vec (vector, index, bits);
vector is the scalar value that is to be treated as a
vector. It can be any scalar value, including the value of an
expression.
index behaves like an array subscript. It indicates which
chunk of bits to retrieve. An index of 0 retrieves the first chunk,
1 retrieves the second, and so on. Note that, within each byte of
the vector, retrieval is from right to left. The first chunk of bits
retrieved when the index 0 is specified is the chunk of bits at the
right end of the first byte.
bits specifies the number of bits in each chunk; it can
be 1, 2, 4, 8, 16, or 32.
retval is the value of the chunk of bits. This value
is an ordinary Perl scalar value, and it can be used anywhere
scalar values can be used.
Listing 14.6 shows how you can use vec to retrieve the
value of a particular chunk of bits.
Listing 14.6. A program that illustrates the use of vec.
1: #!/usr/local/bin/perl
2:
3: $vector = pack ("B*", "11010011");
4: $val1 = vec ($vector, 0, 4);
5: $val2 = vec ($vector, 1, 4);
6: print ("high-to-low order values: $val1 and $val2\n");
7: $vector = pack ("b*", "11010011");
8: $val1 = vec ($vector, 0, 4);
9: $val2 = vec ($vector, 1, 4);
10: print ("low-to-high order values: $val1 and $val2\n");
$ program14_6
high-to-low order values: 3 and 13
low-to-high order values: 11 and 12
$

The call to pack in line 3 assumes
that each character in the string 11010011 is a bit to
be packed. The bits are packed in high-to-low order (with the
highest bit first), which means that the vector stored in $vector
consists of the bits 11010011 (from left to right). Grouping
these bits into chunks of four produces 1101 0011, which
are the binary representations of 13 and 3, respectively.
Line 4 retrieves the first chunk of four bits from $vector
and assigns it to $val1. This is the chunk 0011,
because vec is retrieving the chunk of bits at the right
end of the bit vector. Similarly, line 5 retrieves 1101,
because the index 1 specifies the second chunk of bits from the
right; this chunk is assigned to $val2. (One way to think
of the index is as "the number of chunks to skip." The
index 1 indicates that one chunk of bits is to be skipped.)
Line 7 is similar to line 3, but the bits are now stored in low-to-high
order, not high-to-low. This means that the string 11010011
is stored as the following (which is 11010011 reversed):
11001011
When this bit vector is grouped into chunks of 4 bits, you get
the following, which are the binary representations of 12 and
11, respectively:
1100 1011
Lines 8 and 9, like lines 4 and 5, retrieve the first and second
chunk of bits from $vector. This means that $val1
is assigned 11 (the first chunk), and $val2 is assigned
12 (the second chunk).
| NOTE |
You can use vec to assign to a chunk of bits by placing the call to vec to the left of an assignment operator. For example:
vec ($vector, 0, 4) = 11;
This statement assigns 11 to the first chunk of bits in $vector. Because the binary representation of 11 is 1011, the last four bits of $vector become 1011.
|
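Assignment through vec can also build a vector from nothing. The sketch below starts with an empty string, stores two four-bit chunks, and then uses one-bit chunks as a set of flags; the variable names are illustrative.

```perl
use strict;
use warnings;

# Build one byte out of two four-bit chunks.
my $vector = "";
vec($vector, 0, 4) = 11;    # low four bits of the first byte: 1011
vec($vector, 1, 4) = 12;    # high four bits of the first byte: 1100
print unpack("B*", $vector), "\n";                         # prints 11001011
print vec($vector, 0, 4), " ", vec($vector, 1, 4), "\n";   # prints "11 12"

# One-bit chunks behave like an array of on/off flags.
my $flags = "";
vec($flags, 3, 1) = 1;
vec($flags, 5, 1) = 1;
print unpack("b*", $flags), "\n";                          # prints 00010100
```

The b* format lists the bits lowest first, so the flag positions read from left to right in the same order as the vec index.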
By default, all scalar variables and elements of array variables
that have not been assigned to are assumed to contain the null
string. This ensures that Perl programs don't crash when using
uninitialized scalar variables.
In some cases, a program might need to know whether a particular
scalar variable or array element has been assigned to or not.
The built-in function defined enables you to check for
this.
The syntax for the defined function is
retval = defined (expr);
Here, expr is anything that can appear on the left of
an assignment statement, such as a scalar variable, array element,
or an entire array. (An array is assumed to be defined if at least
one of its elements is defined.) retval is true (a nonzero
value) if expr is defined, and false (0) if it is not.
Listing 14.7 is a simple example of a program that uses defined.
Listing 14.7. A program that illustrates the use of defined.
1: #!/usr/local/bin/perl
2:
3: $array[2] = 14;
4: $array[4] = "hello";
5: for ($i = 0; $i <= 5; $i++) {
6: if (defined ($array[$i])) {
7: print ("element ", $i+1, " is defined\n");
8: }
9: }
$ program14_7
element 3 is defined
element 5 is defined
$

This program assigns values to two elements
of the array variable @array: the element with subscript
2 (the third element), and the element with subscript
4 (the fifth element).
The loop in lines 5-9 checks each element of @array to
see whether it is defined. Because the third and fifth elements ($array[2]
and $array[4], respectively) are defined, defined
returns true when $i is 2 and when $i
is 4.
| NOTE |
Many functions that return the null string actually return a special "undefined" value that is treated as if it is the null string. If this undefined value is passed to defined, defined returns false.
Functions that return undefined include the read function (discussed in Chapter 12, "Working with the File System") and fork (introduced in Chapter 13, "Process, String, and
Mathematical Functions"). Many functions discussed in this chapter and in Chapter 15, "System Functions," also return the special undefined value when an error occurs.
The general rule is: A function that returns the null string when an error or exceptional condition occurs is usually really returning the undefined value.
|
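The difference between the undefined value and a genuine null string can be seen in a small sketch. It uses pop (described later in this chapter), which returns the undefined value when its list is empty; the variable names are chosen here only for illustration.

```perl
use strict;
use warnings;

my @empty  = ();
my $popped = pop(@empty);        # nothing to pop: the undefined value

my @one         = ("");
my $null_string = pop(@one);     # a real null string, which is defined

my $popped_status = defined($popped)      ? "defined" : "undefined";
my $null_status   = defined($null_string) ? "defined" : "undefined";

print("value popped from the empty list is $popped_status\n");
print("null string popped from the list is $null_status\n");
```

Both values would print as nothing at all, so defined is the only reliable way to tell them apart.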
The undef function undefines a scalar variable, array
element, or an entire array.
The syntax of the undef function is
retval = undef (expr);
As in calls to defined, expr can be anything
that can appear to the left of a Perl assignment statement. retval
is always the special undefined value discussed in the previous
section, "The defined Function"; this undefined
value is equivalent to the null string.
The following are some examples of undef:
undef ($myvar);
undef ($array[3]);
undef (@array);
In the first case, the scalar variable $myvar becomes
undefined. The Perl interpreter now treats $myvar as
if it has never been assigned to. Needless to say, any value previously
stored in $myvar is now lost.
In the second example, the fourth element of @array is
marked as undefined. Its value, if any, is lost. Other elements
of @array are unaffected.
In the third and final example, the entire array @array becomes
undefined: the list it stored is discarded, and the array is left
empty. This lets the Perl interpreter free up
any memory used to store the values of @array, which
might be useful if your program is working with large arrays.
For example, if you have used an array to read in an entire file,
as in the following:
@bigarray = <STDIN>;
you can use the following statement to tell the Perl interpreter
that you don't need the contents of the input file and that the
interpreter can throw them away:
undef (@bigarray);
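The three forms of undef can be compared in a short sketch like this one. The sample values and the names $count_before, $fourth_status, and $count_after are assumptions made for the example.

```perl
use strict;
use warnings;

my $myvar = "some value";
my @array = ("a", "b", "c", "d", "e");

undef($myvar);          # the scalar is now undefined
undef($array[3]);       # only the fourth element is undefined

my $count_before  = scalar(@array);    # the list still has five elements
my $fourth_status = defined($array[3]) ? "defined" : "undefined";

undef(@array);          # the whole list is thrown away
my $count_after = scalar(@array);      # the array is now empty

print("before: $count_before elements, fourth is $fourth_status\n");
print("after: $count_after elements\n");
```

Undefining a single element leaves its slot in the list, whereas undefining the array removes every element.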
Calls to undef can omit expr. In this case,
undef does nothing and just returns the undefined value.
Listing 14.8 shows how this can be useful.
Listing 14.8. A program that illustrates the use of undef
to represent an unusual condition.
1: #!/usr/local/bin/perl
2:
3: print ("Enter the number to divide:\n");
4: $value1 = <STDIN>;
5: chop ($value1);
6: print ("Enter the number to divide by:\n");
7: $value2 = <STDIN>;
8: chop ($value2);
9: $result = &safe_division($value1, $value2);
10: if (defined($result)) {
11: print ("The result is $result.\n");
12: } else {
13: print ("Can't divide by zero.\n");
14: }
15:
16: sub safe_division {
17: local ($dividend, $divisor) = @_;
18: local ($result);
19:
20: $result = ($divisor == 0) ? undef :
21: $dividend / $divisor;
22: }
$ program14_8
Enter the number to divide:
26
Enter the number to divide by:
0
Can't divide by zero.
$

Lines 20 and 21 illustrate how you can use
undef. If $divisor is 0, the program is attempting
to divide by 0. In this case, the subroutine safe_division
calls undef, which returns the special undefined value.
This value is assigned to $result and passed back to
the main part of the program.
Line 10 tests whether safe_division has returned the
undefined value by calling the defined function. If defined
returns false, $result contains the undefined value,
and an attempted division by 0 has been detected.
| NOTE |
You can use undef to undefine an entire subroutine, if you like. The following example:
undef (&mysub);
frees the memory used to store mysub; after this, mysub can no longer be called.
You are not likely to need to use this feature of undef, but it might prove useful in programs that consume a lot of memory.
|
The following functions manipulate standard array variables and
the lists that they store:
- grep
- splice
- shift
- unshift
- push
- pop
- split
- sort
- reverse
- map
- wantarray
The grep function provides a convenient way of extracting
the elements of a list that match a specified pattern. (It is
named after the UNIX search utility of the same name.)
The syntax for the grep function is
foundlist = grep (pattern, searchlist);
pattern is the pattern to search for. searchlist
is the list of elements to search in. foundlist is the
list of elements matched.
Here is an example:
@list = ("This", "is", "a", "test");
@foundlist = grep(/^[tT]/, @list);
Here, grep examines all the elements of the list stored
in @list. If a list element begins with the letter t
(in either uppercase or lowercase), the element is included as
part of @foundlist. As a result, @foundlist
consists of two elements: This and test.
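The effect of the ^ anchor in the pattern can be checked with a sketch such as the following. The extra element state and the two result arrays are added here purely for illustration.

```perl
use strict;
use warnings;

my @list = ("This", "is", "a", "test", "state");

my @starts_with_t = grep(/^[tT]/, @list);   # t must be the first character
my @contains_t    = grep(/[tT]/, @list);    # t may appear anywhere

print("begins with t: @starts_with_t\n");
print("contains t:    @contains_t\n");
```

The word state is selected only by the second pattern, because its t is not at the start of the element.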
Listing 14.9 is an example of a program that uses grep.
It searches for all integers on an input line and adds them together.
Listing 14.9. A program that demonstrates the use of grep.
1: #!/usr/local/bin/perl
2:
3: $total = 0;
4: $line = <STDIN>;
5: @words = split(/\s+/, $line);
6: @numbers = grep(/^\d+[.,;:]?$/, @words);
7: foreach $number (@numbers) {
8: $total += $number;
9: }
10: print ("The total is $total.\n");
$ program14_9
This line of input contains 8, 11 and 26.
The total is 45.
$

Line 5 splits the input line into words, using
the standard pattern /\s+/, which matches one or more
tabs or blanks. Some of these words are actually numbers, and
some are not.
Line 6 uses grep to match the words that are actually
numbers. The pattern /^\d+[.,;:]?$/ matches if a word
consists of one or more digits followed by an optional punctuation
character. The words that match this pattern are returned by grep
and stored in @numbers. After line 6 has been executed,
@numbers contains the following list:
("8,", "11", "26.")
Lines 7-9 use a foreach loop to total the numbers. Note
that the totaling operation works properly even if a number being
added contains a closing punctuation character: when the Perl
interpreter converts a string to an integer, it reads from left
to right until it sees a character that is not a digit. This means
that the final word, 26., is converted to 26,
which is the expected number.
Because split and grep each return a list and
foreach expects a list, you can combine lines 5-9 into
a single loop if you want to get fancy.
foreach $number (grep (/^\d+[.,;:]?$/, split(/\s+/, $line))) {
$total += $number;
}
As always, there is a trade-off of brevity versus readability: this
code is more concise, but the code in Listing 14.9 is more readable.
Using grep with the File-Test Operators
A useful feature of grep is that it can be used to search
for any expression, not just patterns. For example, grep
can be used in conjunction with readdir and the file-test
operators to search a directory.
Listing 14.10 is an example of a program that searches all the
readable files of the current directory for a particular word
(which is supplied on the command line). Files whose names begin
with a period are ignored.
Listing 14.10. A program that uses grep
with the file-test operators.
1: #!/usr/local/bin/perl
2:
3: opendir(CURRDIR, ".") ||
4: die("Can't open current directory");
5: @filelist = grep (!/^\./, grep(-r, readdir(CURRDIR)));
6: closedir(CURRDIR);
7: foreach $file (@filelist) {
8: open (CURRFILE, $file) ||
9: die ("Can't open input file $file");
10: while ($line = <CURRFILE>) {
11: if ($line =~ /$ARGV[0]/) {
12: print ("$file:$line");
13: }
14: }
15: close (CURRFILE);
16: }
$ program14_10 pattern
file1:This line of this file contains the word "pattern".
myfile:This file also contains abcpatterndef.
$

Line 3 of this program opens the current directory.
If it cannot be opened, line 4 calls die, which terminates
the program.
Line 5 is actually three function calls in one, as follows:
- readdir retrieves a list of all of the files in the
directory.
- This list of files is passed to grep, which uses
the -r file test operator to search for all files that
the user has permission to read.
- This list of readable files is passed to another call to grep,
which uses the expression !/^\./ to match all the files
whose names do not begin with a period.
The resulting list (all the files in the current directory that
are readable and whose names do not start with a period) is assigned
to @filelist.
The rest of the program contains nothing new. Line 6 closes the
open directory, and lines 7-16 read each file in turn, searching
for the word specified on the command line. (Recall that the
built-in array @ARGV lists all the arguments supplied on the
command line and that the first word specified on the command
line is stored in $ARGV[0].) Line 11 tests whether a line contains
the word to search for, and line 12 prints each matching line, using
the format employed by the UNIX grep command (the filename,
followed by :, followed by the line itself).
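To see grep used with an expression that is not a pattern, without depending on the contents of a directory, consider this sketch. The list of numbers is invented for the example; inside the expression, $_ stands for the element being examined.

```perl
use strict;
use warnings;

my @values = (3, 14, 8, 27, 10);

# Any expression can serve as the test.
my @large = grep($_ > 9, @values);

# When assigned to a scalar variable, grep yields the number of matches.
my $even_count = grep($_ % 2 == 0, @values);

print("large values: @large\n");
print("number of even values: $even_count\n");
```

The file-test operators in Listing 14.10 work the same way: -r with no operand tests the file named by $_.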
The splice function enables you to modify the list stored
in an array variable. By passing the appropriate arguments to
splice, you can add elements to the middle of a list,
delete a portion of a list, or replace a portion of a list.
The syntax for the splice function is
retval = splice (array, skipelements, length, newlist);
array is the array variable containing the list to be
spliced. skipelements is the number of elements to skip
before splicing. length is the number of elements to
be replaced. newlist is the list to be spliced in; this
list can be stored in an array variable or specified explicitly.
If length is greater than 0, retval is the list
of elements replaced by splice.
The following sections provide examples of what you can do with
splice.
Replacing List Elements
You can use splice to replace a sublist (a set of elements
in a list) with another sublist. The following is an example:
@array = ("1", "2", "3", "4");
splice (@array, 1, 2, ("two", "three"));
This call to splice takes the list stored in @array,
skips over the first element, and replaces the next two elements
with the list ("two", "three"). The
new value of @array is the list
("1", "two", "three", "4")
If the replacement list is longer than the sublist it replaces, the
elements to the right of the replaced sublist are pushed to the right.
For example:
@array = ("1", "2", "3", "4");
splice (@array, 1, 2, ("two", "2.5", "three"));
After this call, the new value of @array is the following:
("1", "two", "2.5", "three", "4")
Similarly, if the replacement list is shorter than the sublist
it replaces, the elements to the right of that sublist are moved
left to fill the resulting gap. For example:
@array = ("1", "2", "3", "4");
splice (@array, 1, 2, "twothree");
After this call to splice, @array contains the
following list:
("1", "twothree", "4")
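The replacement examples above do not show the return value of splice. A minimal sketch that captures it (the array name @replaced is an assumption for this example) might look like this:

```perl
use strict;
use warnings;

my @array = ("1", "2", "3", "4");

# Skip one element, then replace the next two with a three-element list.
my @replaced = splice(@array, 1, 2, "two", "2.5", "three");

print("array is now: @array\n");
print("replaced elements: @replaced\n");
```

After the call, @replaced holds the two elements that were removed, and @array has grown by one element.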
| NOTE |
You do not need to put parentheses around the list you pass to splice. For example, the following two statements are equivalent:
splice (@array, 1, 2, ("two", "three"));
splice (@array, 1, 2, "two", "three");
When the Perl interpreter sees the second form of splice, it assumes that the fourth and subsequent arguments are the replacement list.
|
Listing 14.11 is an example of a program that uses splice
to replace list elements. It reads a file containing a form letter,
and replaces the string <name> with a name read
from the standard input file. It then writes out the new letter.
The output shown assumes that the file form contains
Hello <name>!
This is your lucky Chapter, <name>!
Listing 14.11. A program that uses splice
to replace list elements.
1: #!/usr/local/bin/perl
2:
3: open (FORM, "form") || die ("Can't open form letter");
4: @form = <FORM>;
5: close (FORM);
6: $name = <STDIN>;
7: @nameparts = split(/\s+/, $name);
8: foreach $line (@form) {
9: @words = split(/\s+/, $line);
10: $i = 0;
11: while (1) {
12: last if (!defined($words[$i]));
13: if ($words[$i] eq "<name>") {
14: splice (@words, $i, 1, @nameparts);
15: $i += @nameparts;
16: } elsif ($words[$i] =~ /^<name>/) {
17: $punc = $words[$i];
18: $punc =~ s/<name>//;
19: @temp = @nameparts;
20: $temp[@temp-1] .= $punc;
21: splice (@words, $i, 1, @temp);
22: $i += @temp;
23: } else {
24: $i++;
25: }
26: }
27: $line = join (" ", @words);
28: }
29: $i = 0;
30: while (1) {
31: if (!defined ($form[$i])) {
32: $~ = "FLUSH";
33: write;
34: last;
35: }
36: if ($form[$i] =~ /^\s*$/) {
37: $~ = "FLUSH";
38: write;
39: $~ = "BLANK";
40: write;
41: $i++;
42: next;
43: }
44: if ($writeline ne "" &&
45: $writeline !~ / $/) {
46: $writeline .= " ";
47: }
48: $writeline .= $form[$i];
49: if (length ($writeline) < 60) {
50: $i++;
51: next;
52: }
53: $~ = "WRITELINE";
54: write;
55: $i++;
56: }
57: format WRITELINE =
58: ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<~
59: $writeline
60: .
61: format FLUSH =
62: ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<~~
63: $writeline
64: .
65: format BLANK =
66:
67: .
$ program14_11
Fred
Hello Fred! This is your lucky Chapter, Fred!
$

This program starts off by reading the entire
form letter from the file named form into the array variable
@form. This makes it possible to format the form letter
output later on.
Lines 6 and 7 read the name from the standard input file and break
it into individual words. This list of words is stored in the array
variable @nameparts.
The loop in lines 8-28 reads each line in the form letter and
looks for occurrences of the string <name>. First,
line 9 breaks the line into individual words. This list of words
is stored in the array variable @words.
The while loop starting in line 11 then examines each
word of @words in turn. Line 12 checks whether the loop
has reached the end of the list by calling defined; if
the loop is past the end of the list, defined will return
false, indicating that the array element is not defined.
Lines 13-15 check whether a word consists entirely of the string
<name>. If it does, line 14 calls splice;
this call replaces the word <name> with the words
in the name list @nameparts.
If a word is not equal to the string <name>, it
might still contain <name> followed by a punctuation
character. To test for this, line 16 tries to match the pattern
/^<name>/. If it matches, lines 17 and 18 isolate
the punctuation in a single word. This punctuation is stored in
the scalar variable $punc.
Lines 19 and 20 create a copy of the name array @nameparts
and append the punctuation to the last element of the array. This
ensures that the punctuation will appear in the form letter where
it is supposed to: right after the last character of the substituted
name. Line 21 then calls splice as in line 14.
After the words in @words have been searched and the
name substituted for <name>, line 27 joins the
words back into a single line. As an additional benefit, the multiple
spaces and tabs in the original line have now been replaced by
a single space, which will make the eventual formatted output
look nicer.
Lines 30-56 write out the output. The string to be written is
stored in the scalar variable $writeline. The program
ensures that the form-letter output is formatted by doing the
following:
- First, the print format WRITELINE is defined to use
the ^<<<< value-field format. This format
fits as much of the contents of $writeline into the line
as possible and then deletes the part of $writeline that
has been written out.
- Lines 36-43 enable you to add paragraphs to your form letter.
Line 36 tests whether an input line is blank. If it is, the FLUSH
print format is used to write out any output from previous lines
that has not yet been printed. (Because the output line specified
by FLUSH contains ~~, the line is printed
only if it is not blank; in other words, only if $writeline
actually contains some leftover text.) Then, the BLANK
print format writes a blank line.
- Lines 44-47 check whether a space needs to be placed between
the end of one input line and the beginning of the next when formatting.
- Lines 49-52 ensure that $writeline is always long
enough to fill the value field specified by WRITELINE.
This guarantees that there will be no unnecessary space in any
of the output lines.
- When @form has been completely read, lines 32-34
ensure that all of the output from previous lines has been written
by using the FLUSH print format.
(For more information on the print formats used in this example,
refer to Chapter 11, "Formatting Your Output.")
| NOTE |
You can use splice to splice the contents of a scalar variable into an array. For example:
splice (@array, 8, 1, $name);
This creates a one-element list consisting of the contents of $name and uses it to replace the ninth element of the list stored in @array.
|
Appending List Elements
You can use splice to add a sublist anywhere in a list.
To do this, specify a length field of 0. For example:
splice (@array, 5, 0, "Hello", "there");
This call to splice adds the list ("Hello",
"there") to the list stored in @array.
Hello becomes the new sixth element of @array,
and there becomes the new seventh element; the existing
sixth and seventh elements, if they exist, become the new eighth
and ninth elements, and every other element is also pushed to
the right.
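To add a new element to the end of an existing array, specify
a skipelements value equal to the number of elements in the
array, as shown in the following:
splice (@array, @array, 0, "Hello");
This adds Hello as the last element of the list stored
in @array. (A skipelements value of -1 refers to the last
element of the list, so combining -1 with a length of 0 inserts
the new element just before the last element, not after it.)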
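Because the position of an inserted element depends entirely on skipelements, it is worth trying a few values. The following sketch uses a four-element sample list and records the list after each call; the names $after_insert, $after_append, and $after_negative are introduced only for this example. Note where an offset of -1 places the new element.

```perl
use strict;
use warnings;

my @array = ("a", "b", "c", "d");

# Skip two elements and insert without replacing anything.
splice(@array, 2, 0, "Hello", "there");
my $after_insert = "@array";

# Skip every element: the new element lands at the end.
splice(@array, scalar(@array), 0, "end");
my $after_append = "@array";

# An offset of -1 is the position of the last element, so the
# new element is placed in front of it.
splice(@array, -1, 0, "penultimate");
my $after_negative = "@array";

print("$after_insert\n$after_append\n$after_negative\n");
```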
Listing 14.12 is an example of a program that uses splice
to insert an element into a list. This program inserts a word
count after every tenth word in a file.
Listing 14.12. A program that uses splice
to insert array elements.
1: #!/usr/local/bin/perl
2:
3: $count = 0;
4: while ($line = <STDIN>) {
5: chop ($line);
6: @words = split(/\s+/, $line);
7: $added = 0;
8: for ($i = 0; $i+$added < @words; $i++) {
9: if ($count + $i > 0 && ($count + $i) % 10 == 0) {
10: splice (@words, $i+$added, 0,
11: $count + $i);
12: $added += 1;
13: }
14: }
15: $count += @words - $added;
16: $line = join (" ", @words);
17: print ("$line\n");
18: }
$ program14_12
Here is a line with some words on it.
Here are some more test words to count.
A B C D E F G H I J K L M N O P
^D
Here is a line with some words on it.
Here 10 are some more test words to count.
A B C 20 D E F G H I J K L M 30 N O P
$

This program, like many of the others you have
seen, reads one line at a time and breaks the line into words;
the array variable @words contains the list of words
for a particular line.
The scalar variable $count contains the number of words
in the lines previously read. Lines 8 through 14 read each word
in the current input line in turn; at any given point, the counting
variable $i lists the number of words read in the line,
and the sum of $count and $i lists the total
number of words read in all input lines.
Line 9 adds the value stored in $count to the value stored
in $i; if this value, the current word number, is a multiple
of ten, lines 10 and 11 call splice and insert the current
word number into the list. As a result, every tenth word is followed
by its word number.
The scalar variable $added counts the number of elements
added to the list; this ensures that the word numbers added by
lines 10 and 11 are not included as part of the word count.
After the word numbers have been inserted into the list, line
16 rebuilds the input line by joining the elements of @words;
this new input line includes the word numbers. Line 17 then prints
the rebuilt line.
Deleting List Elements
You can use splice to delete list elements without replacing
them. To do this, call splice and omit the newlist
argument. For example:
@deleted = splice (@array, 8, 2);
This call to splice deletes the ninth and tenth elements
of the list stored in @array. If @array contains
subsequent elements, these elements are shifted left to fill the
gap. The list of deleted elements is returned and stored in @deleted.
Listing 14.13 reads an input file, uses splice to delete
all words longer than five characters, and writes out the
result.
Listing 14.13. A program that uses splice
to delete words.
1: #!/usr/local/bin/perl
2:
3: while ($line = <STDIN>) {
4: @words = split(/\s+/, $line);
5: $i = 0;
6: while (defined($words[$i])) {
7: if (length($words[$i]) > 5) {
8: splice(@words, $i, 1);
9: } else {
10: $i++;
11: }
12: }
13: $line = join (" ", @words);
14: print ("$line\n");
15: }
$ program14_13
this is a test of the program which removes long words
^D
this is a test of the which long words
$

This program reads one line of input at a time
and breaks each input line into words. Line 7 calls length
to determine the length of a particular word. If the word is greater
than five characters in length, line 8 calls splice to
remove the word from the list.
| NOTE |
You also can omit the length argument when you call splice. If you do, splice deletes every element from the position specified by skipelements through the end of the list:
splice (@array, 7);
This skips the first seven elements and deletes the eighth and all subsequent elements of the list stored in @array.
To delete the last element of a list, specify -1 as the skipelements argument.
splice (@array, -1);
In all cases, splice returns the list of deleted elements.
|
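The deletion forms of splice can be compared side by side. This sketch assumes a sample list of the numbers 1 through 10 and stores each list of deleted elements in its own array (@middle, @tail, and @last are names chosen for the example).

```perl
use strict;
use warnings;

my @array = (1 .. 10);

my @middle = splice(@array, 3, 2);   # delete the fourth and fifth elements
my @tail   = splice(@array, 5);      # keep five elements, delete the rest
my @last   = splice(@array, -1);     # delete the last remaining element

print("deleted: (@middle) (@tail) (@last)\n");
print("left over: @array\n");
```

Each call operates on the list as the previous call left it, which is why the second call deletes only three elements.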
One list operation that is frequently needed in a program is to
remove an element from the front of a list. Because this operation
is often performed, Perl provides a special function, shift,
that handles it.
shift removes the first element of the list and moves
(or "shifts") every remaining element of the list to
the left to cover the gap. shift then returns the removed
element.
The syntax for the shift function is
element = shift (arrayvar);
shift is passed one argument: an array variable that
contains a list. element is the returned element.
| NOTE |
shift returns the undefined value (equivalent to the null string) if the list is empty.
|
Here is a simple example using shift:
@mylist = ("1", "2", "3");
$firstval = shift(@mylist);
This call to shift removes the first element, 1,
from the list stored in @mylist. This element is assigned
to $firstval. @mylist now contains the list
("2", "3").
If you do not specify an array variable when you call shift
outside a subroutine, the Perl interpreter assumes that shift
is to remove the first element from the system array variable @ARGV.
This variable lists the arguments supplied on the command line
when the program is started up. For example, if you call a Perl
program named foo with the following command:
foo arg1 arg2 arg3
@ARGV contains the list ("arg1", "arg2",
"arg3").
This default feature of shift makes it handy for processing
command-line arguments. Listing 14.14 is a simple program that
prints out its arguments.
Listing 14.14. A program that uses shift
to process the command-line arguments.
1: #!/usr/local/bin/perl
2:
3: while (1) {
4: $currarg = shift;
5: last if (!defined($currarg));
6: print ("$currarg\n");
7: }
$ program14_14 arg1 arg2 arg3
arg1
arg2
arg3
$

When this program is called, the array variable
@ARGV contains a list of the values supplied as arguments
to the program. Line 4 calls shift to remove the first
argument from the list and assign it to $currarg.
If there are no elements (or none remaining), shift returns
the undefined value, and the call to defined in line
5 returns false. This ensures that the loop terminates when there
are no more arguments to read.
| NOTE |
The shift function is equivalent to the following call to splice:
splice (@array, 0, 1)
|
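The following sketch pulls together the behavior of shift described in this section. It also shows one point not covered above: inside a subroutine, shift with no argument takes elements from @_, the subroutine's own argument list, rather than from @ARGV. The subroutine first_argument is hypothetical and exists only for this example.

```perl
use strict;
use warnings;

my @mylist   = ("1", "2", "3");
my $firstval = shift(@mylist);           # removes "1"

# The equivalent splice call removes the next element, "2".
my $secondval = splice(@mylist, 0, 1);

# Inside a subroutine, shift with no argument works on @_.
sub first_argument {
    my $first = shift;
    return $first;
}
my $argument = first_argument("arg1", "arg2");

my @empty   = ();
my $nothing = shift(@empty);             # undefined: the list is empty

print("$firstval $secondval $argument (@mylist)\n");
```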
To undo the effect of a shift function, call unshift.
The syntax for the unshift function is
count = unshift (arrayvar, elements);
arrayvar is the list (usually stored in an array variable)
to add to, and elements is the element or list of elements
to add. count is the number of elements in the resulting
list.
The following is an example of a call to unshift:
unshift (@array, "newitem");
This adds the element newitem to the front of the list
stored in @array. The other elements of the list are
moved to the right to accommodate the new item.
You can use unshift to add more than one element to the
front of an array. For example:
unshift (@array, @sublist1, "newitem", @sublist2);
This adds a list consisting of the list stored in @sublist1,
the element newitem, and the list stored in @sublist2
to the front of the list stored in @array.
unshift returns the number of elements in the new list,
as shown in the following:
@array = (1, 2, 3);
$num = unshift (@array, "newitem");
This assigns 4 to $num.
| NOTE |
The unshift function is equivalent to calling splice with a skipelements value of 0 and a length value of 0. For example, the following statements are equivalent:
unshift (@array, "item1", "item2");
splice (@array, 0, 0, "item1", "item2")
|
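As a quick check of the return value and of the equivalence with splice, consider the following sketch. The contents of @sublist1 and the second array, @copy, are assumptions made for the example.

```perl
use strict;
use warnings;

my @array    = (1, 2, 3);
my @sublist1 = ("a", "b");

# Add a list and a single element to the front in one call.
my $num = unshift(@array, @sublist1, "newitem");

# The same insertion expressed with splice, on a separate copy.
my @copy = (1, 2, 3);
splice(@copy, 0, 0, @sublist1, "newitem");

print("$num elements: @array\n");
print("using splice: @copy\n");
```

The added elements keep their original order at the front of the list.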
As you have seen, the unshift function adds an element
to the front of a list. To add an element to the end of a list,
call the push function.
The syntax for the push function is
push (arrayvar, elements);
arrayvar is the list (usually stored in an array variable)
to add to, and elements is the element or list of elements
to add.
The following is an example that uses push:
push (@array, "newitem");
This adds the element newitem to the end of the list.
The end of the list is always taken to be the position after the
element with the highest subscript, even if earlier elements are undefined.
For example, consider the following statements:
@array = ("one", "two");
$array[3] = "four";
push (@array, "five");
Here, the first statement creates a two-element list and assigns
it to @array. The second statement assigns four
to the fourth element of @array. Because the fourth element
is now the last element of @array, the call to push
creates a fifth element, even though the third element is undefined.
@array now contains the list
("one", "two", "", "four", "five")
The undefined third element is, as always, equivalent to the null
string.
As with unshift, you can use push to add multiple
elements to the end of a list, as in this example:
push (@array, @sublist1, "newitem", @sublist2);
Here, the list consisting of the contents of @sublist1,
the element newitem, and the contents of @sublist2
is added to the end of the list stored in @array.
| NOTE |
push is equivalent to a call to splice with the skipelements argument set to the length of the array. This means that the following statements are equivalent:
push (@array, "newitem");
splice (@array, @array, 0, "newitem")
|
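The example with the undefined third element can be run as a small sketch. It relies on one fact not stated above: push returns the number of elements in the list after the new elements are added. The array @shown is introduced here so that the undefined element is visible when the list is printed.

```perl
use strict;
use warnings;

my @array = ("one", "two");
$array[3] = "four";                   # leaves the third element undefined

my $count = push(@array, "five");     # number of elements after the push

# Append once more with the equivalent splice call.
splice(@array, scalar(@array), 0, "six");

# Show the undefined element explicitly instead of as a null string.
my @shown = map { defined($_) ? $_ : "(undefined)" } @array;

print("push returned $count\n");
print("list is now: @shown\n");
```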
The pop function undoes the effect of push.
It removes the last element from the end of a list. The removed
element is returned.
The syntax for the pop function is
element = pop (arrayvar);
arrayvar is the array variable from which an element is
to be removed. element is the returned element.
For example, the following statement removes the last element
from the list stored in @array and assigns it to the
scalar variable $popped:
$popped = pop (@array);
If the list passed to pop is empty, pop returns
the undefined value.
| NOTE |
pop is equivalent to a call to splice with a skipelements value of -1 (indicating the last element of the array). This means that the following statements behave in the same way:
$popped = pop (@array);
$popped = splice (@array, -1)
|
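A short sketch shows pop, its splice equivalent, and the empty-list case together. The sample list and the names $spliced, $nothing, and $status are assumptions for the example.

```perl
use strict;
use warnings;

my @array = ("a", "b", "c");

my $popped  = pop(@array);            # removes "c"
my $spliced = splice(@array, -1);     # equivalent call: removes "b"

my @empty   = ();
my $nothing = pop(@empty);            # undefined: nothing to remove

my $status = defined($nothing) ? "defined" : "undefined";
print("$popped $spliced (@array) $status\n");
```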
The functions you have just seen are handy for constructing two
commonly used data structures: stacks and queues. The following
sections provide examples that use a stack and a queue.
Creating a Stack
A stack is a data structure that behaves like a stack of
plates in a cupboard: the last item added to the stack is always
the first item removed. Data items that are added to the stack
are said to be pushed onto the stack; items which are removed
from the stack are popped off the stack.
As you might have guessed, the functions push and pop
enable you to create a stack in a Perl program. Listing 14.15
is an example of a program that uses a stack to perform arithmetic
operations. It works as follows:
- Two numbers are pushed onto the stack.
- The program reads an arithmetic operator, such as +
or -. The two numbers are popped off the stack, and the
operation is performed.
- The result of the operation is pushed onto the stack, enabling
it to be used in further arithmetic operations.
After all the arithmetic operations have been performed, the stack
should consist of a single element, which is the final result.
The numbers and operators are read from the standard input file.
Note that Listing 14.15 is the "inverse" of Listing
9.12. In the latter program, the arithmetic operators appear first,
followed by the values.
Listing 14.15. A program that uses a stack to perform arithmetic.
1: #!/usr/local/bin/perl
2:
3: while (defined ($value = &read_value)) {
4: if ($value =~ /^\d+$/) {
5: push (@stack, $value);
6: } else {
7: $firstpop = pop (@stack);
8: $secondpop = pop (@stack);
9: push (@stack,
10: &do_math ($firstpop, $secondpop, $value));
11: }
12: }
13: $result = pop (@stack);
14: if (defined ($result)) {
15: print ("The result is $result.\n");
16: } else {
17: die ("Stack empty when printing result.\n");
18: }
19:
20: sub read_value {
21: local ($retval);
22: $input =~ s/^\s+//;
23: while ($input eq "") {
24: $input = <STDIN>;
25: return if ($input eq "");
26: $input =~ s/^\s+//;
27: }
28: $input =~ s/^\S+//;
29: $retval = $&;
30: }
31:
32: sub do_math {
33: local ($val2, $val1, $operator) = @_;
34: local ($result);
35:
36: if (!defined($val1) || !defined($val2)) {
37: die ("Missing operand");
38: }
39: if ($operator =~ m.^[-+*/]$. ) {
40: eval ("\$result = \$val2 $operator \$val1");
41: } else {
42: die ("$operator is not an operator");
43: }
44: $result; # ensure the proper return value
45: }
$ program14_15
11 4 + 26 -
^D
The result is 11.
$

Before going into details, let's first take
a look at how the program produces the final result, which is
11:
- The program starts off by reading the numbers 11 and 4 and
pushing them onto the stack. If the stack is listed from the top
down, it now looks like this:
4
11
Another way to look at the stack is this: At present,
the list stored in @stack is (11, 4).
- The program then reads the + operator, pops the 4
and 11 off the stack, and performs the addition, pushing
the result onto the stack. The stack now contains a single value:
15
- The next value, 26, is pushed onto the stack, which
now looks like this:
26
15
- The program then reads the - operator, pops 15
and 26 off the stack, and subtracts 15 from
26. The result, 11, is pushed onto the stack.
- Because there are no more operations to perform, 11
becomes the final result.
This program delegates to the subroutine read_value the
task of reading values and operators. This subroutine reads a
line of the standard input file and extracts the non-blank items
on the line. Each call to read_value extracts one item
from an input line; when an input line is exhausted, read_value
reads the next one. When the input file is exhausted and there
are no more items to return, $input becomes the undefined
value, which is equivalent to the null string. Line 25 then
returns the undefined value to the main program, and the call to
defined in line 3 detects it and ends the loop.
If an item returned by read_value is a number, line 5
calls push, which pushes the number onto the stack. If
an item is not a number, the program assumes it is an operator.
At this point, pop is called twice to remove the last
two numbers from the stack, and do_math is called to
perform the arithmetic operation.
The do_math subroutine uses a couple of tricks. First,
defined is called to see whether there are, in fact,
two numbers to operate on. If one or both of the numbers does not exist,
the program terminates.
Next, the subroutine uses the pattern m.^[-+*/]$. to
check whether the character string stored in $operator
is, in fact, a legal arithmetic operator. (Recall that you can
use a pattern delimiter other than / by specifying m
followed by the character you want to use as the delimiter. In
this case, the period character is the pattern delimiter. The
hyphen appears first inside the brackets so that it is treated as
an ordinary character and not as part of a range.)
Finally, the subroutine calls eval to perform the arithmetic
operation. eval replaces the name $operator
with its current value, and then treats the resulting character
string as an executable statement; this performs the arithmetic
operation specified by $operator. Using eval
here saves space; the alternative is a longer
if-elsif structure that tests for each operator in turn.
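The two steps just described, validating the operator and then handing a string to eval, can be sketched on their own. The subroutine name apply_operator and its variable names are assumptions made for this illustration. The pattern here uses braces as delimiters and places the hyphen first in the character class so that it is taken literally rather than as a range.

```perl
use strict;
use warnings;

# apply_operator is a hypothetical helper, not part of Listing 14.15.
sub apply_operator {
    my ($left, $operator, $right) = @_;

    # Accept exactly one of + - * / and nothing else.
    if ($operator !~ m{^[-+*/]$}) {
        die("$operator is not an operator\n");
    }
    # $operator is substituted now; \$left and \$right are left for eval.
    my $result = eval("\$left $operator \$right");
    die("Could not evaluate: $@") if ($@);
    return $result;
}

print("26 - 15 is ", apply_operator(26, "-", 15), "\n");
print("6 * 7 is ", apply_operator(6, "*", 7), "\n");
```

Checking the operator before calling eval matters, because eval executes whatever string it is given.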
The result of the operation is returned in $result. Lines
9 and 10 then pass this value to push, which pushes the
result onto the stack. This enables you to use the result in subsequent
operations.
When the last arithmetic operation has been performed, the final
result is stored as the top element of the stack. Line 13 pops
this element, and line 15 prints it.
Note that this program always assumes that the last element pushed
onto the stack is to be on the left of the arithmetic operation.
To reverse this, all you need to do is change the order of $val1
and $val2 in line 33. (Some programs that manipulate
stacks also provide an operation which reverses the order of the
top two elements of a stack.)
 |
The pop function returns the undefined value if the stack is empty. Because the undefined value is equivalent to the null string, and the null string is treated as 0 in arithmetic operations, your program will not complain if you try to pop a
number from an empty stack.
To get the result you want, always call defined after you call pop to confirm that a value has actually been popped from the stack.
|
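As a small illustration of this check, the sketch below pops twice from a one-element stack. The variable names are chosen for the example only.

```perl
use strict;
use warnings;

my @stack = (7);

my $first  = pop(@stack);    # 7, the only element
my $second = pop(@stack);    # the stack is already empty, so this is undefined

my $status;
if (defined($second)) {
    $status = "popped $second";
} else {
    $status = "stack was empty";
}
print("$status\n");
```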
Creating a Queue
A queue is a data structure that processes data in the
order in which it is entered; such data structures are known as
first-in, first-out (or FIFO) structures. (A stack, on the other
hand, is an example of a last-in, first-out, or LIFO, structure.)
To create a queue, use the function push to add items
to the queue, and call shift to remove elements from
it. Because push adds to the right of the list and shift
removes from the left, elements are processed in the order in
which they appear.
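The first-in, first-out behavior can be seen in a few lines. This is a minimal sketch; the item names and the @served array are assumptions made for the illustration.

```perl
use strict;
use warnings;

my @queue;
push(@queue, "first", "second");    # items join at the right
push(@queue, "third");

my @served;
while (@queue) {
    push(@served, shift(@queue));   # items leave from the left
}
print("@served\n");
```

Replacing shift with pop in the loop would serve the items in the opposite order, which is the stack behavior described earlier.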
Listing 14.16 is an example of a program that uses a queue to
add a set of numbers retrieved via a pipe. Each input line can
consist of more than one number, and the numbers are added in
the order listed.
The input/output example shown for this listing assumes that the
numbers retrieved via the pipe are 11, 12, and 13.
Listing 14.16. A program that illustrates the use of a queue.
1: #!/usr/local/bin/perl
2:
3: open (PIPE, "numbers|") ||
4: die ("Can't open pipe");
5: $result = 0;
6: while (defined ($value = &readnum)) {
7: $result += $value;
8: }
9: print ("The result is $result.\n");
10:
11: sub readnum {
12: local ($line, @numbers, $retval);
13: while ($queue[0] eq "") {
14: $line = <PIPE>;
15: last if ($line eq "");
16: $line =~ s/^\s+//;
17: @numbers = split (/\s+/, $line);
18: push (@queue, @numbers);
19: }
20: $retval = shift(@queue);
21: }
$ program14_16
The result is 36.
$

This program assumes that a program named numbers
exists, and that its output is a stream of numbers. Multiple
numbers can appear on a single line of this output. Lines 3 and
4 associate the file variable PIPE with the output from
the numbers command.
Lines 6-8 call the subroutine readnum to obtain a number
and then add it to the result stored in $result. This
subroutine reads input from the pipe, breaks it into individual
numbers, and then calls push to add the numbers to the
queue stored in @queue. Line 20 then calls shift
to retrieve the first element in the queue, which is returned
to the main program.
If an input line is blank, the call to split in line
17 produces the empty list, which means that nothing is added
to @queue. This ensures that input is read from the pipe
until a non-blank line is read or until the input is exhausted.
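One way to try the queue logic without a numbers program is to read from a string instead of a pipe. The sketch below assumes Perl 5.8 or later, which allows open to read from a reference to a scalar. The sample text, the name read_number, and the use of my variables are illustrative choices rather than part of Listing 14.16.

```perl
use strict;
use warnings;

my $text = "11 12\n\n  13\n";    # sample input, including a blank line
open(my $input, "<", \$text) || die("Can't open in-memory file");

my @queue;

sub read_number {
    until (@queue) {
        my $line = <$input>;
        return unless defined($line);         # input is exhausted
        $line =~ s/^\s+//;                    # drop leading white space
        push(@queue, split(/\s+/, $line));    # a blank line adds nothing
    }
    return shift(@queue);
}

my $result = 0;
while (defined(my $value = read_number())) {
    $result += $value;
}
print("The result is $result.\n");
```

The blank second line contributes no elements, so the loop inside read_number simply reads the following line.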
The split function was first discussed in Chapter 5, "Lists
and Array Variables." It splits a character string into a
list of elements.
The usual syntax for the split function is
list = split (pattern, value);
Here, value is the character string to be split. pattern
is a pattern to be searched for. A new element is started every
time pattern is matched. (pattern is not included
as part of any element.) The resulting list of elements is returned
in list.
For example, the following statement breaks the character string
stored in $line into elements, which are stored in @list:
@list = split (/:/, $line);
A new element is started every time the pattern /:/ is
matched. If $line contains This:is:a:string,
the resulting list is ("This", "is", "a",
"string").
If you like, you can specify the maximum number of elements of
the list produced by split by specifying the maximum
as the third argument. For example:
$line = "This:is:a:string";
@list = split (/:/, $line, 3);
As before, this breaks the string stored in $line into
elements. After three elements have been created, no more new
elements are created. Any subsequent matches of the pattern are
ignored. In this case, the list assigned to @list is
("This", "is", "a:string").
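A limit is handy when only the first separator is significant. The following sketch splits a hypothetical setting of the form name=value, where the value may itself contain the separator; the sample string and variable names are assumptions.

```perl
use strict;
use warnings;

my $setting = "message=a=b";

# With a limit of 2, everything after the first = stays together.
my ($name, $value) = split(/=/, $setting, 2);

# Without a limit, every = starts a new element.
my @pieces = split(/=/, $setting);

print("$name is $value\n");
print(scalar(@pieces), " pieces without a limit\n");
```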
| TIP |
If you use split with a limit, you can assign to several scalar variables at once:
$line = "11 12 13 14 15";
($var1, $var2, $line) = split (/\s+/, $line, 3);
This splits $line into the list ("11", "12", "13 14 15"). $var1 is assigned 11, $var2 is assigned 12, and $line is assigned "13 14 15".
This enables you to assign the "leftovers" to a single variable, which can then be split again at a later time.
|
The sort function sorts a list in alphabetical order,
as follows:
@sorted = sort (@list);
The sorted list is returned.
The reverse function reverses the order of a list:
@reversed = reverse (@list);
For more information on the sort and reverse
functions, see Chapter 5. For information on how you can specify the
sort order that sort is to use, see Chapter 9, "Using
Subroutines."
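Because the default order is alphabetical, numbers do not always sort the way you might expect. The sketch below compares the default order with a numeric order; it uses the Perl 5 block form of sort as an assumption for brevity, and the sample values are invented for the example.

```perl
use strict;
use warnings;

my @list = (10, 9, 100);

my @alphabetical = sort(@list);                # compared as strings
my @numeric      = sort { $a <=> $b } @list;   # compared as numbers
my @descending   = reverse(@numeric);

print("@alphabetical\n");
print("@numeric\n");
print("@descending\n");
```

Compared as strings, 100 sorts before 9 because the character 1 precedes the character 9.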
The map function, defined only in Perl 5, enables you
to use each of the elements of a list, in turn, as an operand
in an expression.
The syntax for the map function is
resultlist = map(expr, list);
list is the list of elements to be used as operands or
arguments; this list is copied by map, but is not itself
changed. expr is the expression to be repeated. The results
of the repeated evaluation of the expression are stored in a list,
which is returned in resultlist.
expr assumes that the system variable $_ contains
the element of the list currently being used as an operand. For
example:
@list = (100, 200, 300);
@results = map($_+1, @list);
This evaluates the expression $_+1 for each of 100,
200, and 300 in turn. The results, 101,
201, and 301, respectively, are formed into
the list (101, 201, 301). This list is then assigned
to @results.
To use map with a subroutine, just pass $_ to
the subroutine, as in the following:
@results = map(&mysub($_), @list);
This calls the subroutine mysub once for each element
of the list stored in @list. The values returned by mysub
are stored in a list, which is assigned to @results.
This also works with built-in functions:
@results = map(chr($_), @list);
@results = map(chr, @list); # same as above, since $_ is the default argument for chr
This converts each element of the list in @list to its
ASCII character equivalent. The resulting list of characters is
stored in @results.
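As a concrete case, the following sketch converts a list of ASCII codes into characters and joins them into a word. The sample codes and variable names are assumptions chosen for the illustration.

```perl
use strict;
use warnings;

my @codes = (80, 101, 114, 108);

my @chars       = map(chr, @codes);       # one character per code
my $word        = join("", @chars);
my @incremented = map($_ + 1, @codes);    # @codes itself is not changed

print("$word\n");
print("@incremented\n");
print("@codes\n");
```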
| NOTE |
For more information on the $_ system variable, refer to Chapter 17.
|
In Perl, the behavior of some built-in functions depends on whether
they are dealing with scalar values or lists. For example, the
chop function either chops the last character of a single
string or chops the last character of every element of a list:
chop($scalar); # chop a single string
chop(@array); # chop every element of an array
Perl 5 enables you to define similar two-way behavior for your
subroutines using the wantarray function. (This function
is not defined in Perl 4.)
The syntax for the wantarray function is
result = wantarray();
result is a non-zero value if the subroutine is expected
to return a list, and is zero if the subroutine is expected to
return a scalar value.
Listing 14.17 illustrates how wantarray works.
Listing 14.17. A program that uses the wantarray
function.
1: #!/usr/local/bin/perl
2:
3: @array = &mysub();
4: $scalar = &mysub();
5:
6: sub mysub {
7: if (wantarray()) {
8: print ("true\n");
9: } else {
10: print ("false\n");
11: }
12: }
$ program14_17
true
false
$

When mysub is first called in line
3, the return value is expected to be a list, which means that
wantarray returns a non-zero (true) value in line 7.
The second call to mysub in line 4 expects a scalar return
value, which means that wantarray returns zero (false).
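A subroutine can use this test to return a different kind of value in each case. In the sketch below, the subroutine name date_parts and its sample values are hypothetical; it returns a list when a list is expected and a single joined string otherwise.

```perl
use strict;
use warnings;

sub date_parts {
    my @parts = (1997, 3, 14);    # hypothetical sample values
    if (wantarray()) {
        return @parts;            # the caller expects a list
    }
    return join("-", @parts);     # the caller expects a scalar
}

my @as_list   = date_parts();
my $as_scalar = date_parts();

print(scalar(@as_list), " elements\n");
print("$as_scalar\n");
```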
Perl provides a variety of functions that operate on associative
arrays. Most of these functions are described in detail in Chapter
10, "Associative Arrays"; a brief description of each
function is presented here.
The keys function returns a list of the subscripts of
the elements of an associative array.
The syntax for keys is straightforward:
list = keys (assoc_array);
assoc_array is the associative array from which subscripts
are to be extracted, and list is the returned list of
subscripts.
For example:
%array = ("foo", 26, "bar", 17);
@list = keys(%array);
This call to keys assigns ("foo", "bar")
to @list. (The elements of the list might be in a different
order. To specify a particular order, sort the list using the
sort function.)
keys often is used with foreach, as in the following
example:
foreach $subscript (keys (%array)) {
# stuff goes here
}
This loops once for each subscript of the array.
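Combining keys with sort gives a loop that visits the subscripts in a predictable order. This is a minimal sketch; the @lines array is an assumption used to collect the output.

```perl
use strict;
use warnings;

my %array = ("foo", 26, "bar", 17);
my @lines;

foreach my $subscript (sort(keys(%array))) {
    push(@lines, "$subscript => $array{$subscript}");
}
print("$_\n") foreach (@lines);
```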
The values function returns a list consisting of all
the values in an associative array.
The syntax for the values function is
list = values (assoc_array);
assoc_array is the associative array from which values
are to be extracted, and list is the returned list of
values.
The following is an example that uses values:
%array = ("foo", 26, "bar", 17);
@list = values(%array);
This assigns the list (26, 17) to @list (not
necessarily in this order).
The each function returns an associative array element
as a two-element list. The list consists of the associative array
subscript and its associated value. Successive calls to each
return another associative array element.
The syntax for the each function is
pair = each (assoc_array);
assoc_array is the associative array from which pairs
are to be returned, and pair is the subscript-element
pair returned.
The following is an example:
%array = ("foo", 26, "bar", 17);
@list = each(%array);
The first call to each assigns either ("foo",
26) or ("bar", 17) to @list.
A subsequent call returns the other element, and a third call
returns an empty list. (The order in which the elements are returned
depends on how the list is stored; no particular order is guaranteed.)
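Because each returns an empty list when no elements remain, it fits naturally into a while loop. The sketch below totals the values rather than printing them, since the order of the pairs is not guaranteed; the variable names are illustrative.

```perl
use strict;
use warnings;

my %array = ("foo", 26, "bar", 17);
my $count = 0;
my $total = 0;

# The loop ends when each returns the empty list.
while (my ($subscript, $value) = each(%array)) {
    $count++;
    $total += $value;
}
print("$count elements, total $total\n");
```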
The delete function deletes an associative array element.
The syntax for the delete function is
element = delete (assoc_array_item);
assoc_array_item is the associative array element to
be deleted, and element is the value of the deleted element.
The following is an example:
%array = ("foo", 26, "bar", 17);
$retval = delete ($array{"foo"});
After delete is called, the associative array %array
contains only one element: the element with the subscript bar.
$retval is assigned the value of the deleted element
foo, which in this case is 26.
The exists function, defined only in Perl 5, enables
you to determine whether a particular element of an associative
array exists.
The syntax for the exists function is
result = exists(element);
element is the element of the associative array that
is being tested for existence. result is non-zero if
the element exists, and zero if it does not.
The following is an example:
$result = exists($myarray{$mykey});
$result is nonzero if $myarray{$mykey} exists.
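The sketch below shows exists alongside delete and defined. An element whose value is undefined still exists until it is deleted; the sample array and variable names are assumptions made for the example.

```perl
use strict;
use warnings;

my %array = ("foo", 26, "bar", undef);

my $bar_exists  = exists($array{"bar"}) ? 1 : 0;    # the element is present
my $bar_defined = defined($array{"bar"}) ? 1 : 0;   # but its value is undefined

my $removed     = delete($array{"foo"});            # returns the deleted value
my $foo_exists  = exists($array{"foo"}) ? 1 : 0;

print("bar exists: $bar_exists, bar defined: $bar_defined\n");
print("removed $removed, foo exists: $foo_exists\n");
```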
Today, you learned about functions that manipulate scalar values
and convert them from one form to another, and about functions
that manipulate lists.
The chop function removes the last character from a scalar
value or from each element of a list.
The crypt function encrypts a scalar value, using the
same method that the UNIX password encryptor uses.
The int function takes a floating-point number and gets
rid of everything after the decimal point.
The defined function checks whether a scalar variable,
array element, or array has been assigned to. The undef
function enables you to treat a previously defined scalar variable,
array element, or array as if it is undefined. scalar
enables you to treat an array or list as if it is a scalar value.
The other functions described in today's lesson convert values
from one form into another. The hex and oct
functions read hexadecimal and octal constants and convert them
into decimal form. The ord function converts a character
into its ASCII decimal equivalent. pack and unpack
convert a scalar value into a format that can be stored in machine
memory, and vice versa. vec enables you to treat a value
as an array of numeric values, each of which is a certain number
of bits long.
The grep function enables you to extract the elements
of a list that match a particular pattern. This function can be
used in conjunction with the file-test operators.
The splice function enables you to extract a portion
of a list or insert a sublist into a list. The shift
and pop functions remove an element from the left and
right ends of a list, and the unshift and push
functions add one or more elements to the left and right ends
of a list. You can use push, pop, and shift
to create stacks and queues.
The split function enables you to break a character string
into list elements. You can impose an upper limit on the number
of list elements to be created.
The sort function sorts a list in a specified order.
The reverse function reverses the order of the elements
in a list.
The map function copies a list and then performs an operation
on every element of the list.
The wantarray function enables you to determine whether
the statement that called a subroutine is expecting a scalar return
value or a list.
Five functions are defined that manipulate associative arrays:
- keys, which returns a list of the array subscripts
- values, which returns a list of the array values
- each, which returns a two-element list consisting
of an array subscript and its value
- delete, which deletes an element
- exists, which checks whether a particular element
exists
| Q: | Why is the undefined value equivalent to the null string?
|
| A: | Basically, to keep Perl programs from blowing up if they try to access a variable that has not yet been assigned to.
|
| Q: | Why does oct handle hexadecimal constants that start with 0x or 0X?
|
| A: | There is no particular reason, except that it's a little more convenient. If you find that it bothers you to use oct to convert a hexadecimal constant, get rid of the leading 0x or
0X (using the substitute operator) and call hex instead.
|
| Q: | I want to put a password check in my program. How can I ensure that it is secure?
|
| A: | Do two things:
- Don't include the unencrypted text of your password in your program source. People can then find out the password just by reading the file.
- Use a password that is not a real English-language word or proper name. Include at least one digit. This makes your password harder to "crack."
|
| Q: | Why does int truncate instead of rounding?
|
| A: | Some programs might find it useful to just retrieve the integer part of a floating-point number. (For example, in earlier chapters, you have seen int used in conjunction with rand to return a
random integer.)
For a positive number, you can add 0.5 before calling int, which rounds the value to the nearest integer.
|
| Q: | When I pack integers using the s or i pack-format characters, the bits don't appear in the order I was expecting. What is happening?
|
| A: | Most machines enable you to store integers that are more than one byte long (two- and four-byte integers usually are supported). However, each machine does not store a multibyte integer in the same way. Some
machines store the most significant byte of a word at a lower address; these machines are called big-endian machines because the big end of a word is first. Other machines, called little-endian machines, store the least significant byte of a
word at a lower byte address.
If you are not getting the result you expect, you might be expecting big-endian and getting little-endian, or vice versa.
|
| Q: | The splice function works by shifting elements to the right or left to make room or fill gaps. Is this inefficient?
|
| A: | No. The Perl interpreter actually stores a list as a sequence of pointers (memory addresses). All splice has to do is rearrange the pointers. This holds true also for sort and
reverse.
|
| Q: | Can I use each to work through an associative array in a specified order?
|
| A: | No. If you need to access the elements of an associative array in a specified order, use keys and sort to sort the subscripts, and then retrieve the value associated with each element.
|
| Q: | If I am using values with foreach, can I retrieve the subscript associated with a particular value if I need it?
|
| A: | No. If you are likely to need the subscripts as well as their values, use each or keys.
|
The Workshop provides quiz questions to help you solidify your
understanding of the material covered and exercises to give you
experience in using what you've learned. Try to understand the
quiz and exercise answers before you go on to tomorrow's lesson.
- What format does each of the following pack-format characters
specify?
a. a
b. A
c. d
d. p
e. @
- What do these unpack-format specifiers do?
a. "a"
b. "@4A10i*"
c. "@*X4C*"
d. "ix4iX8i"
e. "b*X*B*"
- What value is stored in $value by the following?
a. The statements
$vector = pack ("b*", "10110110");
$value = vec ($vector, 3, 1);
b. The statements
$vector = pack ("b*", "10110110");
$value = vec ($vector, 1, 2);
- What's the difference between defined and undef?
- Assume @list contains ("1", "2",
"3", "4", "5"). What are the
contents of @list after the following statement?
a. splice (@list, 0, 1, "new");
b. splice (@list, 2, 0, "test1", "test2");
c. splice (@list, -1, 1, "test1", "test2");
d. splice (@list, 2, 1);
e. splice (@list, 3);
- What do the following statements return?
a. grep (!/^!/, @array);
b. grep (/\b\d+\b/, @array);
c. grep (/./, @array);
d. grep (//, @array);
- What is the difference between shift and unshift?
- What arguments to splice are equivalent to the following
function calls?
a. shift (@array);
b. pop (@array);
c. push (@array, @sublist);
d. unshift (@array, @sublist);
- How can you create a stack using shift, pop,
push, or unshift?
- How can you create a queue using shift, pop,
push, or unshift?
- Write a program that reads two binary strings of any
length, adds them together, and writes out the binary output.
(Hint: This is a really nasty problem. To get this to work, you
will need to ensure that your bit strings are a multiple of eight
bits by adding zeros at the front.)
- Write a program that reads two hexadecimal strings of any
length, adds them together, and writes out the hexadecimal output.
(Hint: This is a straightforward modification of the previous exercise.)
- Write a program that uses int to round a value to
two decimal places. (Hint: This is trickier than it seems.)
- Write a program that encrypts a password and then asks the
user to guess it. Give the user three chances to get it right.
- BUG BUSTER: What is wrong with the following program?
#!/usr/local/bin/perl
$bitstring = "00000011";
$packed = pack("b*", $bitstring);
$highbit = vec($packed, 0, 1);
print ("The high-order bit is $highbit\n");
- Write a program that uses splice to sort a list in
numeric order.
- Write a program that "flips" an associative array;
that is, the subscripts of the old array become the values of
the new, and vice versa. Print an error message if the old array
has two subscripts with identical values.
- Write a program that reads a file from standard input, breaks
each line into words, uses grep to get rid of all words
longer than five characters, and prints the file.
- Write a program that reads an input line and uses split
to read and print one word of the line at a time.
- BUG BUSTER: What is wrong with the following subroutine?
sub retrieve_first_element {
local ($retval);
$retval = unshift(@array);
}

|