Chapter 18
References in Perl 5
Today's lesson describes the use of Perl references and the concept
of pointers. It also shows you how to use references
to create complex data structures, pass pointers around, and work
with subroutines. You learn the following topics:
- Hard and symbolic references
- Using references to arrays and scalars
- Passing arrays to subroutines by reference
- References to subroutines
A reference is simply a pointer to something, such as a
Perl variable, array, hash (also known as an associative array),
or even a subroutine. The concept of a reference is probably familiar
to Pascal or C programmers. A reference is simply an address to
a value. How you use that value is up to you as the programmer
and what the language lets you get away with. In Perl, you can
refer to a pointer as a reference; in fact, you can use the terms
pointer and reference interchangeably without any loss of meaning.
References are useful in creating complex data structures in Perl.
In fact, you cannot really define any complicated structures in
Perl without using references.
The two types of references in Perl 5 are hard and symbolic.
A symbolic reference contains the name of a variable. Symbolic
references are useful for creating variable names and addressing
them at runtime. Basically, a symbolic reference is like the name of
a file or a soft link on a UNIX system. Hard references are more
like hard links in the file system (that is, merely another path
to the same underlying item).
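The contrast between the two kinds of reference can be sketched in a few lines. This is only an illustration: the names $color, $hard, and $name are made up for the example, and the symbolic lookup is wrapped in no strict 'refs' because strict mode otherwise forbids it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $color = "red";          # package variable, so it has a symbol-table entry

my $hard = \$color;          # hard reference: refers to the variable itself
my $name = "color";          # ordinary string that happens to hold a variable name

my $via_hard = $$hard;       # follows the hard reference directly

my $via_symbolic;
{
    no strict 'refs';        # symbolic references are rejected under strict
    $via_symbolic = $$name;  # looks up the package variable named "color"
}

print "hard: $via_hard, symbolic: $via_symbolic\n";
```

Both lookups reach the same value, but only the hard reference keeps working if the variable is renamed or goes out of scope.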
Perl 4 permits only symbolic references, which are difficult to
use. For example, in Perl 4, you have to use names to index to
an associative array called _main{} of symbol names for
a package. Perl 5 now lets you have hard references to data.
Hard references keep track of reference counts. When the reference
count becomes zero, Perl automatically frees the item referred
to. If that item happens to be a Perl object, the object is destroyed and
its memory is returned to the memory pool. Perl 5 also supports
object-oriented programming, and packages and modules make it
much easier to use objects in Perl.
Hard references are easy to use in Perl as long as you use them
as scalars. To use hard references as anything but scalars, you
have to explicitly de-reference the variable and tell it how you
want it to behave. If this sounds confusing, don't worry; references
come up again in Chapter 19, "Object-Oriented Programming in Perl,"
which should help make this concept clearer.
In today's lesson, a scalar value refers to a variable such as
$pointer. The variable $pointer contains one
data item; whether the item is a number, string, or an address
is determined by how you use it.
Any scalar can hold a hard reference, and because arrays and hashes
do contain scalars, it follows that you can now easily build complex
data structures of different combinations of arrays of arrays,
arrays of hashes, hashes of functions, and so on. As long as you
understand that you are working only with scalars, you should
be able to navigate through the most complex structures with proper
dereferencing.
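As a rough sketch of such a structure, the following hash holds a plain string, a reference to an array, and a reference to another hash. The record and all of its values are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Every value in the hash is a scalar; two of them happen to be references.
my %record = (
    name   => "Kamran",
    phones => [ "555-1234", "555-4321" ],          # reference to an anonymous array
    office => { city => "Houston", floor => 3 },   # reference to an anonymous hash
);

my $second_phone = $record{phones}->[1];           # index into the inner array
my $city         = $record{office}->{city};        # look up a key in the inner hash
my $phone_count  = scalar @{ $record{phones} };    # de-reference the whole inner array

print "$record{name}: $second_phone, $city, $phone_count phones\n";
```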
Let's cover some of the basics first before we get too deep into
the chapter.
To use the value of $pointer as the pointer to an array,
you reference the items in the array as @$pointer. This
notation of "@$pointer" roughly translates
to "take the address in $pointer and then use it
as an array." Similarly for hashes, you would use %$pointer
to treat the address in $pointer as a hash.
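A minimal sketch of both notations follows. The arrays, hashes, and variable names are assumptions made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @colors = ("red", "green", "blue");
my %ages   = (tom => 30, ann => 25);

my $aref = \@colors;                  # reference to the array
my $href = \%ages;                    # reference to the hash

my $count = scalar @$aref;            # the whole array, through the reference
my $first = $$aref[0];                # one element of that array
my @names = sort keys %$href;         # the whole hash, through the reference
my $ann   = $$href{ann};              # one value of that hash

print "$count colors, first is $first; names: @names; ann is $ann\n";
```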
Because there are several ways to construct references, you can
have references to just about anything, such as arrays, scalar
variables, subroutines, file handles, and (to the delight
of C programmers) even other references. Perl gives you the power
to write enough complicated code to hang yourself.
Now look at some of the ways that you can create and use references
in Perl.
Using the backslash operator is analogous to using the ampersand
(&) operator in C to take the address of a variable.
Usually, you use the backslash operator to create a second, new
reference to a variable. The following code shows how to create
a reference to a scalar variable:
$variable = 22;
$pointer = \$variable;
$ice = "jello";
$iceptr = \$ice;
$pointer points to the location that contains the value
of $variable. The pointer $iceptr points to
"jello". Even if the original reference $variable
gets destroyed, you can still access the value from the $pointer
reference. There is a hard reference at work here, so you will
have to get rid of both $pointer and $variable
for the space in which 22 is allocated to be freed back
to the memory pool.
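The following sketch shows the value outliving the variable that first held it. It assumes a bare block is used to end the scope of $variable; $kept is a name introduced only for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $pointer;
{
    my $variable = 22;       # exists only inside this block
    $pointer = \$variable;   # a second way to reach the same value
}

# $variable is out of scope here, yet the value is still reachable.
my $kept = $$pointer;
print "Still reachable: $kept\n";

undef $pointer;              # the last reference is gone, so Perl can free the value
```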
In the preceding code, the variable $pointer contains
the address of $variable, not the value itself. To get
the value, you have to de-reference $pointer with two
dollar signs ($$pointer). The following sample script shows how this works:
#!/usr/bin/perl
$value = 10;
$pointer = \$value;
printf "\n Pointer Address $pointer of $value \n";
printf "\n What Pointer *($pointer) points to $$pointer\n";
The $value in the script is set to 10. The $pointer
is set to point to the address of $value. The two printf
statements show how the value of the variable is referenced. If
you run the script shown, you see something very close to the
following output:
Pointer Address SCALAR(0x806c520) of 10
What Pointer *(SCALAR(0x806c520)) points to 10
The address in the output from your script will probably be different
from what's shown. However, you can see that $pointer
gave the address and $$pointer gave the value of the
scalar that $pointer points to.
Pay attention to how the address is shown in the pointer variable.
The word SCALAR is followed by a long hexadecimal number.
The word SCALAR tells you that the address points to
a scalar variable. The number following SCALAR is the
address where the actual value of the scalar variable is kept.
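If you want the type word without the address, Perl's built-in ref() function returns it as a string. A small sketch, with variable names chosen only for the example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $value = 10;
my @list  = (1, 2, 3);
my %table = (one => 1);

my $scalar_type = ref(\$value);     # "SCALAR"
my $array_type  = ref(\@list);      # "ARRAY"
my $hash_type   = ref(\%table);     # "HASH"
my $code_type   = ref(sub { 1 });   # "CODE"
my $plain_type  = ref($value);      # empty string: $value is not a reference

print "$scalar_type $array_type $hash_type $code_type [$plain_type]\n";
```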
| NOTE |
A pointer is an address. The data at that address is referred to by a pointer. If the pointer does not refer to valid data, you can get bad results. Generally, Perl will simply return an undefined value (undef, Perl's counterpart of NULL), but you should not rely on this; initialize all your pointers to refer to valid data items.
|
Perhaps the most important point you must remember about Perl
is that all Perl @ARRAYs and %HASHes are always
one-dimensional. As such, the arrays and hashes hold scalar
values only and do not directly contain other arrays or complex
data structures. A member of an array is either a number, a string, or a
reference.
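A quick way to see the one-dimensional rule at work is to put one array inside another, first directly and then by reference. The array names here are invented for the sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @inner = (2, 3);

my @flat   = (1, @inner, 4);     # @inner is flattened into the list: 4 elements
my @nested = (1, \@inner, 4);    # the reference is a single scalar: 3 elements

my $flat_count   = scalar @flat;
my $nested_count = scalar @nested;
my $inner_first  = $nested[1]->[0];   # reach into the inner array via the reference

print "flat: $flat_count, nested: $nested_count, inner first: $inner_first\n";
```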
You can use the backslash operator on arrays and hashes just as
you would for scalar variables. You would use something like Listing
18.1 for arrays.
Listing 18.1. Using the backslash operator on arrays.
1 #!/usr/bin/perl
2 #
3 # Using Array references
4 #
5 $pointer = \@ARGV;
6 printf "\n Pointer Address of ARGV = $pointer\n";
7 $i = scalar(@$pointer);
8 printf "\n Number of arguments : $i \n";
9 $i = 0;
10 foreach (@$pointer) {
11 printf "$i : $$pointer[$i++]; \n";
12 }
$ test 1 2 3 4
Pointer Address of ARGV = ARRAY(0x806c378)
Number of arguments : 4
0 : 1;
1 : 2;
2 : 3;
3 : 4;
Examine the lines that pertain to references in the Perl script
shown, which prints the contents of the input argument array @ARGV.
Line 5 is where the reference $pointer is set to point
to the array @ARGV. Line 6 simply prints the address
of ARGV. You probably will never have to use the address
of ARGV, but had you been using another array, this is
a quick way to see the type and address of whatever the reference
points to.
| NOTE |
Pointers are referred to as references, and vice versa
|
The $pointer holds the address of an array as a whole.
In Listing 18.1, the array happened to be @ARGV.
A pointer to an array should sound familiar to C programmers, although
the analogy is loose: in C, a pointer to a one-dimensional array is simply
a pointer to its first element, whereas a Perl reference refers to the
entire array, including its length.
Line 7 calls the function scalar() (not to be
confused with the type of variable scalar) to get the
count of the number of elements in an array. The parameter passed
in could be @ARGV, but with the pointer $pointer,
you must specify the type of parameter that is expected by the
scalar() function. Therefore, you specify the type of
parameter as an array by using @$pointer.
The type of $pointer in this case is a pointer
to the array whose number of elements you must return from the
scalar() function. The call to the function has @$pointer
as the passed parameter. The $pointer gives the address
of the array, and the @ sign de-references that address so that
the array itself is passed to scalar().
Line 10 contains the same reference to the array that line 7 contains.
Line 11 lists all the elements of the array using the $$pointer[$i++]
item. How do you interpret this? The $pointer points
to the array. The subscript [$i++] selects the element at index
$i, and the leading $ returns that element as a scalar.
Because ++ is used here as a postfix operator, the current value of
$i is used as the index first, and $i is incremented
only afterward.
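A short check of how $$pointer[$i++] behaves: the element at the current index is fetched, and $i goes up by one afterward. The array contents and the name $picked are assumptions for this sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @items   = ("a", "b", "c");
my $pointer = \@items;
my $i       = 0;

my $picked = $$pointer[$i++];   # uses index 0, then increments $i to 1

print "picked $picked, i is now $i\n";
```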
You can also use the backslash operator with associative arrays.
The idea is the same-you are substituting the $pointer
for all references to the name of the associative array. The number
following the word ARRAY in the pointer address of ARGV
in the previous example is the address of ARGV. The address
itself won't do you any good, because most programs do not need
this information, but just realize that references to arrays and
scalars are displayed with the type that they happen to be pointing
to.
For pointers to functions, the address is printed with the word
CODE, and for a hash, it is printed as HASH.
See Listing 18.2 for an example of how to print out an address
to a hash.
Listing 18.2. Using references to a hash.
#!/usr/bin/perl
1 #
2 # Using Associative Array references
3 #
4 %month = (
5 '01', 'Jan',
6 '02', 'Feb',
7 '03', 'Mar',
8 '04', 'Apr',
9 '05', 'May',
10 '06', 'Jun',
11 '07', 'Jul',
12 '08', 'Aug',
13 '09', 'Sep',
14 '10', 'Oct',
15 '11', 'Nov',
16 '12', 'Dec',
17 );
18
19 $pointer = \%month;
20
21 printf "\n Address of hash = $pointer\n ";
22
23 #
24 # The following lines would be used to print out the
25 # contents of the associative array if %month was used.
26 #
27 # foreach $i (sort keys %month) {
28 # printf "\n $i $month{$i} ";
29 # }
30
31 #
32 # The reference to the associative array via $pointer
33 #
34 foreach $i (sort keys %$pointer) {
35 printf "$i is $$pointer{$i} \n";
36 }
$ mth
Address of hash = HASH(0x806c52c)
01 is Jan
02 is Feb
03 is Mar
04 is Apr
05 is May
06 is Jun
07 is Jul
08 is Aug
09 is Sep
10 is Oct
11 is Nov
12 is Dec

The reference to the associative array is made
with the code in line 19, $pointer = \%month;. As with
ordinary arrays, the references to the elements of the array are
made with the $$pointer{$index} construct. Of course,
because the array is really a hash, the $index is the
key into the hash and not a number. See lines 34 and 35 to see
how elements in the array are being referenced.
You don't have to construct associative arrays using the comma
operator. You can use the => operator instead. In
the later Perl module and sample code in this chapter, you will
see the => operator, which is the same as the comma
operator. Using => makes the code a bit easier to
read. See Listing 18.3 for a sample usage of the => operator.
Listing 18.3. Using the => operator.
1 #!/usr/bin/perl
2 #
3 # Using Array references
4 #
5 %weekday = (
6 '01' => 'Mon',
7 '02' => 'Tue',
8 '03' => 'Wed',
9 '04' => 'Thu',
10 '05' => 'Fri',
11 '06' => 'Sat',
12 '07' => 'Sun',
13 );
14 $pointer = \%weekday;
15 $i = '05';
16 printf "\n ================== start test ================= \n";
17 #
18 # These next two lines should show an output
19 #
20 printf '$$pointer{$i} is ';
21 printf "$$pointer{$i} \n";
22 printf '${$pointer}{$i} is ';
23 printf "${$pointer}{$i} \n";
24 printf '$pointer->{$i} is ';
25
26 printf "$pointer->{$i}\n";
27 #
28 # These next two lines should not show anything
29 #
30 printf '${$pointer{$i}} is ';
31 printf "${$pointer{$i}} \n";
32 printf '${$pointer->{$i}} is ';
33 printf "${$pointer->{$i}}";
34 printf "\n ================== end of test ================= \n";
35
================== start test =================
$$pointer{$i} is Fri
${$pointer}{$i} is Fri
$pointer->{$i} is Fri
${$pointer{$i}} is
${$pointer->{$i}} is
================== end of test =================

As you can see, the first three lines provided
the expected output. The first reference is used in the same way
as references to regular arrays. The second line uses ${$pointer}
and then indexes using {$i}; the braces around $pointer simply
make explicit which reference is de-referenced before the
indexing. The third line uses the -> operator to do the same thing.
See lines 20 through 26.
| NOTE |
When in doubt, print it out. Always use the print statements in Perl to print out values of suspect code. This way you can be sure of how Perl is interpreting your code. Print statements are a cheap tool to use for learning how the Perl interpreter works.
|
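Then, two lines of the output didn't work as expected. In the
fourth line, $pointer{$i} looks up the key $i in a hash named
%pointer, which is a different variable from the scalar $pointer
and has no elements. Because there is no valid value to de-reference,
nothing is printed. In the fifth line, $pointer->{$i} correctly yields
the string Fri, but the outer ${ } then treats that string as a reference
to another scalar, which does not exist, so again nothing is printed.
See lines 30 through 33.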
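Silent empty output of this kind is easier to diagnose with use strict turned on, because Perl then refuses to treat a plain string as a reference. The following sketch uses its own small hash and wraps the faulty expression in eval so that the error message can be printed instead of ending the program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %weekday = ('05' => 'Fri');
my $pointer = \%weekday;
my $i       = '05';

my $ok = $pointer->{$i};      # correct: yields 'Fri'

# The extra ${ } treats the string 'Fri' as if it were a reference.
# Under strict this raises a run-time error, which eval catches.
my $result = eval { my $v = ${ $pointer->{$i} }; 1 };
my $error  = $result ? '' : $@;

print "value: $ok\n";
print "error: $error";
```

The captured message mentions "strict refs", which points directly at the stray de-reference.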
You create an array through the statement @array
= list. You use square brackets to create a reference to
an anonymous array, which can in turn contain other references. Consider the following statement, which
sets the parameters for a three-dimensional drawing program:
$line = ['solid', 'black', ['1','2','3'] , ['4', '5', '6']];
The preceding statement constructs an array of four elements.
The array is referred to by the scalar $line. The first
two elements are scalars, indicating the type and color of the
line to draw. The next two elements are references to anonymous
arrays and contain the starting and ending points of the line.
To get to the elements of the inner array elements, you can use
the following multidimensional syntax:
| $arrayReference->[$index] | single-dimensional array |
| $arrayReference->[$index1][$index2] | two-dimensional array |
| $arrayReference->[$index1][$index2][$index3] | three-dimensional array |
You can create as complex a structure as your sanity, design practices,
and computer memory allow. Be kind to the person who might have
to manage your code: please keep it as simple as possible. On the
other hand, if you are just trying to impress someone with your
coding ability, Perl gives you a lot of opportunity to mystify
yourself and improve your social life.
| TIP |
When you have more than three dimensions for any array, consider using a different data structure to simplify the code.
|
Let's see how creating arrays within arrays works in practice.
See Listing 18.4 to see how to print out the information pointed
at by the $line reference.
Listing 18.4. Using multi-dimensional array references.
#!/usr/bin/perl
#
# Using Multi-dimensional Array references
#
$line = ['solid', 'black', ['1','2','3'] , ['4', '5', '6']];
print "\$line->[0] = $line->[0] \n";
print "\$line->[1] = $line->[1] \n";
print "\$line->[2][0] = $line->[2][0] \n";
print "\$line->[2][1] = $line->[2][1] \n";
print "\$line->[2][2] = $line->[2][2] \n";
print "\$line->[3][0] = $line->[3][0] \n";
print "\$line->[3][1] = $line->[3][1] \n";
print "\$line->[3][2] = $line->[3][2] \n";
print "\n"; # The obligatory output beautifier.
$line->[0] = solid
$line->[1] = black
$line->[2][0] = 1
$line->[2][1] = 2
$line->[2][2] = 3
$line->[3][0] = 4
$line->[3][1] = 5
$line->[3][2] = 6
What about the third dimension for an array? Look at a modified
version of the same program but add a new twist to the list just
created. See Listing 18.5.
Listing 18.5. Using multi-dimensional array references again.
#!/usr/bin/perl
#
# Using Multi-dimensional Array references again
#
$line = ['solid', 'black', ['1','2','3', ['4', '5', '6']]];
print "\$line->[0] = $line->[0] \n";
print "\$line->[1] = $line->[1] \n";
print "\$line->[2][0] = $line->[2][0] \n";
print "\$line->[2][1] = $line->[2][1] \n";
print "\$line->[2][2] = $line->[2][2] \n";
print "\$line->[2][3][0] = $line->[2][3][0] \n";
print "\$line->[2][3][1] = $line->[2][3][1] \n";
print "\$line->[2][3][2] = $line->[2][3][2] \n";
print "\n";
There is no output for this listing.

In this example of an array that's three deep,
you must use a reference such as $line->[2][3][0].
For a C programmer, this is akin to the statement Array_pointer[2][3][0],
where the pointer is pointing to what's declared as an array with
three indices.
Can you see how easy it is to set up complex structures of arrays
within arrays? The examples shown thus far have used only hard-coded
numbers as the indices. There is nothing preventing you from using
variables instead.
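As a sketch of variable indices, the loop below walks the two inner arrays of a $line structure like the one used earlier. The loop variables $row and $col and the @seen array are names introduced for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = ['solid', 'black', ['1', '2', '3'], ['4', '5', '6']];

my @seen;
for my $row (2 .. 3) {                             # the two inner arrays
    for my $col (0 .. $#{ $line->[$row] }) {       # last index of each inner array
        push @seen, "$row,$col=$line->[$row][$col]";
    }
}

print join(" ", @seen), "\n";
```

The expression $#{ $line->[$row] } gives the last valid index of the inner array, so the loop adapts if the inner arrays change length.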
As with array constructors, you can mix and match hashes and arrays
to create as complex a structure as you want.
Let's see how hashes and arrays can be combined. Listing
18.6 uses the point numbers and coordinates to define a cube.
Listing 18.6. Defining a cube.
#!/usr/bin/perl
#
# Using Multi-dimensional Array and Hash references
#
%cube = (
    '0', ['0', '0', '0'],
    '1', ['0', '0', '1'],
    '2', ['0', '1', '0'],
    '3', ['0', '1', '1'],
    '4', ['1', '0', '0'],
    '5', ['1', '0', '1'],
    '6', ['1', '1', '0'],
    '7', ['1', '1', '1']
);
$pointer = \%cube;
print "\n Da Cube \n";
foreach $i (sort keys %$pointer) {
    $list = $$pointer{$i};
    $x = $list->[0];
    $y = $list->[1];
    $z = $list->[2];
    printf " Point $i = $x,$y,$z \n";
}
There is no output for this listing.

In Listing 18.6, %cube contains point
numbers and coordinates in a hash. Each coordinate itself is an
array of three numbers. The $list variable is used to
get a reference to each coordinate definition with the following
statement:
$list = $$pointer{$i};
After you get the list, you can reference off of it to get to
each element in the list with the following statements:
$x = $list->[0];
$y = $list->[1];
$z = $list->[2];
The same result (assigning values to $x, $y,
and $z) could be achieved with the following single line
of code:
($x,$y,$z) = @$list;
This works because you are de-referencing what $list
points to and using it as an array, which in turn is assigned
to the list ($x,$y,$z).
When you're working with hashes or arrays, de-referencing by ->
is similar to de-referencing by $. When you are accessing
individual array elements, you are often faced with writing statements
such as the following:
$$names[0] = "Kamran";
$names->[0] = "Kamran";
Both lines are equivalent. The extra leading $ in the first line
has been replaced with the -> operator in the second
line. In the case of hashes, the two statements that do the same
type of referencing are listed as shown in the following code:
$$lastnames{"Kamran"} = "Husain";
$lastnames->{"Kamran"} = "Husain";
Array references are created automatically when they are first
used on the left side of an assignment. Using a reference
such as $array[$i] creates an array into which
you can index with $i. Scalars and even multidimensional
arrays are created the same way. The following statement creates
the contours array if it did not already exist:
$contours[$x][$y][$z] = &xlate($mouseX,$mouseY);
Arrays in Perl can be created and grown on demand. Referencing
them for the first time creates the array. Referencing them again
at different indices creates the referenced elements for you.
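This on-demand creation can be observed directly. In the sketch below, a single assignment builds all three levels of a @contours array; the indices and the stored string are arbitrary choices for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @contours;                        # starts out empty
$contours[1][2][3] = "peak";         # one assignment creates every level needed

my $outer_size = scalar @contours;             # indices 0 and 1 now exist
my $level_type = ref $contours[1];             # Perl created an array reference here
my $inner_size = scalar @{ $contours[1][2] };  # indices 0 through 3

print "outer: $outer_size, type: $level_type, inner: $inner_size\n";
```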
In the same way you reference individual items such as arrays
and scalar variables, you can also point to subroutines. This
is similar to pointing to a function in C. To construct such a
reference, you use the following type of statement:
$pointer_to_sub = sub { ... declaration of sub ... } ;
Notice the use of the semicolon at the end of the sub
declaration. The subroutine pointed to by $pointer_to_sub
points to the same function reference even if this statement
is placed in a loop. This feature of Perl enables you to declare
anonymous sub() functions in a loop without worrying
about whether you are chewing up memory by declaring the same
function over and over.
To call a subroutine by reference, you must use the following
type of reference:
&$pointer_to_sub( parameters );
This code works because you are de-referencing the $pointer_to_sub
and using it with the ampersand (&) as a pointer
to a function. The parameters portion might or might
not be empty depending on how your function is defined.
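A minimal sketch of calling through a subroutine reference follows. It shows the &$ form described above, the equivalent arrow form, and a reference taken to a named subroutine with \&; the subroutine names and arguments are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reference to an anonymous subroutine.
my $add = sub { my ($x, $y) = @_; return $x + $y; };

# Reference to a named subroutine, taken with the backslash operator.
sub triple { return 3 * $_[0]; }
my $triple_ref = \&triple;

my $sum  = &$add(2, 3);          # de-reference with the ampersand
my $same = $add->(2, 3);         # the arrow form does the same thing
my $nine = $triple_ref->(3);

print "$sum $same $nine\n";
```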
The code within a sub is simply a definition.
It is not executed at the point where the definition appears;
it is compiled once and runs each time the subroutine is called.
Consider Listing 18.7.
Listing 18.7. References to subroutines.
#!/usr/bin/perl
sub print_coor {
    my ($x,$y,$z) = @_;
    print "$x $y $z \n";
    return $x;
}
$k = 1;
$j = 2;
$m = 3;
$this = print_coor($k,$j,$m);
$that = print_coor(4,5,6);
$ test
1 2 3
4 5 6

This output reflects that the assignment of
$x, $y, and $z was done each time
print_coor was called, not when it was declared.
In Listing 18.7, $this and $that each hold the value returned
by a separate call to the subroutine (1 and 4, respectively), the arguments to which were
passed at run time.
Subroutines are not limited to returning data types only; they
can also return references to other subroutines. The returned
subroutines run in the context of the calling routine but are
set up in the original call that created them. This behavior is
due to the way closure is handled in Perl. Closure means
that if you define a function in one context, it runs in that
particular context where it was first defined. (See a guide on
object-oriented programming to get more information on closure.)
For an example of how closure works, Listing 18.8 shows code that
you could use to set up different types of error messages. Such
subroutines are useful in creating templates of all error messages.
Listing 18.8. Using closures.
#!/usr/bin/perl
sub errorMsg {
    my $lvl = shift;
    #
    # define the subroutine to run when called.
    #
    return sub {
        my $msg = shift;                  # Define the error type now.
        print "Err Level $lvl:$msg\n";    # print later.
    };
}
$severe = errorMsg("Severe");
$fatal = errorMsg("Fatal");
$annoy = errorMsg("Annoying");
&$severe("Divide by zero");
&$fatal("Did you forget to use a semi-colon?");
&$annoy("Uninitialized variable in use");

The subroutine errorMsg declared here
uses a local variable called $lvl. After this declaration,
errorMsg uses $lvl in the subroutine it returns
to the caller. The value of $lvl is therefore set in
the context when the subroutine errorMsg is first called,
even though the keyword my is used. The three calls that
follow set up three different $lvl variable values, each
in their own context:
$severe = errorMsg("Severe");
$fatal = errorMsg("Fatal");
$annoy = errorMsg("Annoying");
When the subroutine, errorMsg, returns, the value of
$lvl is retained for each context in which $lvl
was declared. The $msg value from the referenced call
is used, but the value of $lvl remains what was first
set in the actual creation of the function.
Sounds confusing? It is. This is primarily the reason you do not
see such code in most Perl programs.
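A smaller closure may make the idea easier to follow. In this sketch, make_counter is a hypothetical helper that returns a subroutine which remembers its own private $count between calls.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub make_counter {
    my $count = shift;               # private to this call of make_counter
    return sub { return $count++; }; # the returned sub keeps using that $count
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

# Each counter advances independently of the other.
my @seen = ($from_one->(), $from_one->(), $from_ten->(), $from_one->());

print "@seen\n";
```

Each call to make_counter creates a fresh $count, so the two counters never interfere with each other.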
Using arrays is great for collecting relevant information in one
place. Now let's see how we can work with multiple arrays through
subroutines. You pass one or more arrays into Perl subroutines
by reference. However, you have to keep in mind a few subtle things
about using the @_ symbol when you process these arrays
in the subroutine. Look at Listing 18.9, which is an example of
a subroutine that expects a list of names and a list of phone
numbers.
Listing 18.9. Passing multiple arrays.
#!/usr/bin/perl
@names = ('mickey', 'goofy', 'daffy');
@phones = (5551234, 5554321, 666);
$i = 0;
sub listem {
    my (@a,@b) = @_;
    foreach (@a) {
        print "a[$i] = ". $a[$i] . " " . "\tb[$i] = " . $b[$i] ."\n";
        $i++;
    }
}
&listem(@names, @phones);
a[0] = mickey b[0] =
a[1] = goofy b[1] =
a[2] = daffy b[2] =
a[3] = 5551234 b[3] =
a[4] = 5554321 b[4] =
a[5] = 666 b[5] =

Whoa! What happened to the @b array,
and why is the rest of @a just like the array @b?
This result occurs because the array @_ of parameters
in a subroutine is one (I repeat, only one) long list of parameters.
If you pass in fifty arrays, the @_ is one array of all
the elements of the fifty arrays concatenated together.
In the subroutine in Listing 18.9, the assignment my (@a,
@b) = @_ gets loosely interpreted by your Perl interpreter
as, "Let's see, @a is an array, so assign everything
in @_ to @a and then assign whatever is left
to @b." Because an array on the left side of a list assignment
takes all the remaining values, the whole of @_ is assigned to @a,
leaving nothing to assign to @b.
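Counting the arguments that actually arrive makes the flattening visible. The helper count_args below is a hypothetical subroutine written only for this sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub count_args { return scalar @_; }   # how many items ended up in @_

my @names  = ('mickey', 'goofy', 'daffy');
my @phones = (5551234, 5554321, 666);

my $flattened = count_args(@names, @phones);     # both arrays merged into one list
my $by_ref    = count_args(\@names, \@phones);   # just two references

print "flattened: $flattened, by reference: $by_ref\n";
```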
To illustrate this point, let's change the script to how it appears
in Listing 18.10.
Listing 18.10. Passing a scalar and an array.
#!/usr/bin/perl
@names = ('mickey', 'goofy', 'daffy');
@phones = (5551234, 5554321, 666);
$i = 0;
sub listem {
    my ($a,@b) = @_;
    print " \$a is " . $a . "\n";
    foreach (@b) {
        print "b[$i] = $b[$i] \n";
        $i++;
    }
    # --------------------------------------------------
    # Actually, you could write the for loop as
    # foreach (@b) {
    #     print $_ . "\n" ;
    # }
    # This is your secret answer to Quiz question 18.4.
    # ----------------------------------------------------
}
&listem(@names, @phones);
$ testArray
$a is mickey
b[0] = goofy
b[1] = daffy
b[2] = 5551234
b[3] = 5554321
b[4] = 666

Do you see how $a was assigned the
first value and then @b was assigned the rest of the
values? In order to get around this @_ interpretation feature
and pass arrays into subroutines, you have to pass arrays in by
reference, which you do by modifying the script to look like the
following:
#!/usr/bin/perl
@names = ('mickey', 'goofy', 'daffy');
@phones = (5551234, 5554321, 666);
$i = 0;
sub listem {
    my ($a,$b) = @_;
    foreach (@$a) {
        print "a[$i] = " . $$a[$i] . " " . "\tb[$i] = " . $$b[$i] ."\n";
        $i++;
    }
}
&listem(\@names, \@phones);
The following major changes were necessary to bring the original
script to this point:
- The local variables for the sub listem are now scalars
that hold array references, not arrays. As a result, $a is the first item
on the @_ list, and $b is the second item.
- The local parameters ($a and $b) are used
as array references: @$a de-references the whole array in the foreach
loop, and $$a[$i] and $$b[$i] pick out individual elements.
- The call to the subroutine passes the references to the arrays
with the backslash, \@names and \@phones, thus
passing only two items to the subroutine.
The following output matches what we expected:
$ testArray2
a[0] = mickey b[0] = 5551234
a[1] = goofy b[1] = 5554321
a[2] = daffy b[2] = 666
 |
DO pass by reference whenever possible.
DO pass arrays by reference when you are passing more than one array to a subroutine.
DON'T use (@variable)=@_ in a subroutine unless you want to concatenate all the passed parameters into one long array
|
When used in a subroutine argument list, scalar variables are
always passed by reference. You do not have a choice here. You
can, however, modify the values of these variables if you really
want to. To access these variables, you can use the @_
array and index each individual element in it using $_[$index],
where $index counts from zero up.
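The difference between working on $_[0] directly and working on a copy can be sketched as follows. Both subroutines are hypothetical examples, not part of the chapter's listings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Works on $_[0] itself, which is an alias for the caller's variable.
sub double_in_place { $_[0] *= 2; return; }

# Copies the argument first, so the caller's variable is untouched.
sub double_copy { my ($v) = @_; $v *= 2; return $v; }

my $n = 21;
double_in_place($n);             # $n itself is changed

my $m = 21;
my $r = double_copy($m);         # $m keeps its value; $r gets the result

print "n=$n m=$m r=$r\n";
```

Copying @_ into my variables at the top of a subroutine is the usual way to avoid changing the caller's data by accident.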
Arrays and hashes are different beasts altogether. You can either
pass them as references once or pass references to each element
in the array. For long arrays, the choice should be fairly obvious-pass
the reference to the array only. In either case, you can use the
references to modify what you want in the original array.
The @_ mechanism concatenates all the input arrays in
a subroutine into one long array. This feature is nice if you
do want to process the incoming arrays as one long array. Usually,
you want to keep the arrays separate when you process them in
a subroutine, and passing by reference is the best way to do that.
Hold that thought: Don't use globals.
In short, pass by reference and respect the value of any global
variable unless there is a strong compelling reason not to.
Sometimes, you have to write the same output to different output
files. For example, an application programmer might want the output
to go to the screen in one instance, the printer in another, and
a file in another-or even all three at the same time. Rather than
make separate statements for each handle, it would be nice to
write something like the following:
spitOut(\*STDOUT);
spitOut(\*LPHANDLE);
spitOut(\*LOGHANDLE);
Notice that the file handle reference is sent with the \*FILEHANDLE
syntax because you refer to the symbol table in the current
package. In the subroutine that handles the output to the file
handle, you would have code that looks something like the following:
sub spitOut {
my $fh = shift;
print $fh "Gee Wilbur, I like this lettuce\n";
}
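A runnable sketch of the same idea follows. It assumes a Perl recent enough to open a file handle on an in-memory string (used here in place of a printer or log file, which the example cannot assume exist); $log and $buffer are names invented for the sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub spitOut {
    my $fh = shift;                                # the handle arrives as a scalar
    print $fh "Gee Wilbur, I like this lettuce\n";
}

# Stand-in for a log file: a handle that writes into the string $buffer.
my $buffer = '';
open(my $log, '>', \$buffer) or die "Cannot open in-memory handle: $!";

spitOut($log);          # a lexical file handle can be passed directly
spitOut(\*STDOUT);      # a named handle is passed as a typeglob reference

close($log);
```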
In UNIX (and other operating systems), the asterisk is a sort
of wildcard operator. In Perl, you can refer to other variables
and so on by using the asterisk operator:
*iceCream;
When used in this manner, the asterisk is also known as a typeglob.
The asterisk at the beginning of a term can be thought of as a
wildcard match for all the mangled names generated internally
by Perl.
You can use a typeglob in the same way you use a reference because
the de-reference syntax always indicates the kind of reference
you want. ${*iceCream} and ${\$iceCream} both
indicate the same scalar variable. Basically, *iceCream
refers to the entry in the internal symbol table hash
for the current package. In Perl 5, the symbol table for the main
package is the hash %main::, so *kamran
really translates to $main::{'kamran'} if you are in the
main package context. If you are in another package,
the %packageName:: hash is used. (Perl 4 called these hashes _main and _packageName.)
When evaluated, a typeglob produces a scalar value that represents
all the objects of that name. This includes file handles, format
specifiers, and subroutines.
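One common use of a typeglob is to make one name an alias for another. The sketch below assigns an array reference to a typeglob, which redirects only the array part of that name; @original and @alias are package variables invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our @original = (1, 2, 3);
our @alias;

*alias = \@original;      # the array slot of *alias now refers to @original

push @alias, 4;           # this really modifies @original

my $size = scalar @original;
print "original now has $size elements: @original\n";
```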
Using curly braces around variable names makes constructing strings easier:
$road = ($w) ? "free":"high";
print "${road}way";
The preceding line prints highway or freeway
depending on the value of $w. This syntax will be familiar
to you if you write make files or shell scripts. In fact, you
can use this ${variable} construct outside of double
quotes, as in the following example:
print ${road};
print ${road} . "way";
print ${ road } . "way";
You can also use reserved words in the ${ } brackets.
Check out the following lines:
$if = "road";
print "\n ${if} way \n";
Using reserved words for anything other than their intended purpose,
however, is playing with fire. Be imaginative and make up your
own variable names. Inside ${ }, a simple word such as if is treated as
a variable name, not as a reserved word; to have it interpreted as a
reserved word or function call, you must add something
that makes it more than a simple identifier. It's generally not a good
idea to use a variable called ${while}, because it is
confusing to read.
When you work with hashes, remember that hash keys are always strings.
In other words, you cannot usefully write something like this:
$clients { \$credit } = "despicable" ;
The \$credit reference will be converted to a string and
can no longer be de-referenced when you read the key back. If you need
the reference itself, keep it in a scalar and store it as a value as
well, as in this two-step procedure:
$chist = \@credit;
$x{ $chist } = $chist;
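What happens to a reference used as a hash key can be checked with ref(). In this sketch the credit figures are invented, and the reference is stored as the value as well as the key so that it stays usable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @credit = (700, 650);
my $chist  = \@credit;

my %x;
$x{$chist} = $chist;                  # key becomes a string; value stays a reference

my ($key) = keys %x;
my $key_is_ref = ref($key) ? 1 : 0;   # 0: the key is only text such as ARRAY(0x...)
my $first      = $x{$key}->[0];       # the stored value can still be de-referenced

print "key: $key, is a reference: $key_is_ref, first entry: $first\n";
```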
The preceding section brings up an interesting point about curly
braces for a use other than indexing into hashes. In Perl, curly
braces are usually reserved for delimiting blocks of code. Assume
you were returning the passed list by sorting it in reverse order.
The passed list is in @_ of the called subroutine, so
the following two statements are equivalent:
sub backward {
{ reverse sort @_ ; }
};
sub backward {
reverse sort @_ ;
};
When preceded by the @ operator, curly braces enable
you to set up small blocks of evaluated code.
#!/usr/bin/perl
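sub average {
    my ($p, $q, $r) = @_;              # copy the three arguments
    my $sum   = $p + $q + $r;          # sum of the values
    my $sumsq = $p*$p + $q*$q + $r*$r; # sum of the squares
    return ($sum/3, $sumsq/3);
}
$x = 1;
$y = 34;
$z = 47;
print "The midpt is @{[&average($x,$y,$z)]} \n";
This script prints 27.3333333333333 and 1122, the mean of the three values and the mean of their squares. In the last line of code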
with the @{} in the double-quoted string, the contents
of the @{} are evaluated as a block of code. The block
creates a reference to an anonymous array that contains the results
of the call to the subroutine average($x,$y,$z). The
array is constructed because of the brackets around the call.
As a result, the [] construct returns a reference to
an array, which @{} in turn dereferences; the array's elements are
then inserted into the double-quoted string, separated by spaces.
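The same idiom works for any expression, not only subroutine calls. Here is a small sketch with an illustrative @words array (the names are assumptions, not part of the script above):

```perl
#!/usr/bin/perl
# Interpolate the result of an expression into a string:
# [ ... ] builds an anonymous array and returns a reference to it,
# @{ ... } dereferences that reference back into a list.
@words = ("gamma", "alpha", "beta");
$count = "There are @{[ scalar(@words) ]} words";
$order = "Sorted: @{[ sort @words ]}";
print "$count\n$order\n";
```

The first string ends up as "There are 3 words" and the second as "Sorted: alpha beta gamma", because list elements are joined with a space when interpolated.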
By now, you should be able to see the difference between hard
and symbolic references. Let's look at some of the minor details of
the two types of references and how they are handled in Perl.
When you use a symbolic reference that does not exist, Perl creates
the variable for you and uses it. For variables that already exist,
the value of the variable is substituted for the $variable
string. This substitution is a powerful feature of Perl because
you can construct variable names from variable names.
Consider the following example:
1 $lang = "java";
2 $java = "coffee";
3 print "${lang}\n";
4 print "hot${lang}\n";
5 print "$$lang \n";
Look at line 5. Perl first evaluates $lang, which yields the
string "java". Because of the extra $, that string is taken to
be the name of a variable, so $$lang is reduced to $java
and the value of $java ("coffee") is used.
The following is the output from this example:
java
hotjava
coffee
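Because the name is just a string, it can be assembled at run time. The following sketch builds three variable names in a loop; the slot prefix is a hypothetical name used only for this illustration. Perl creates each variable the first time it is assigned through the symbolic reference.

```perl
#!/usr/bin/perl
# Build variable names at run time; Perl creates each
# variable on first use through the symbolic reference.
foreach $n (1, 2, 3) {
    $name  = "slot$n";
    $$name = $n * 10;        # creates $slot1, $slot2, $slot3
}
print "$slot1 $slot2 $slot3\n";
```

The three variables hold 10, 20, and 30. A hash keyed by name usually does the same job more safely, which is one reason symbolic references are often disabled.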
The difference between an ordinary variable access ($lang) and a
symbolic reference ($$lang) is how the variable name
is derived. With $lang, you are referring to a variable's
value directly. Either the variable exists in the symbol table
for the package you are in, or the variable does not exist.
With a symbolic reference,
you are using another level of indirection by constructing or
deriving a symbol name from the value of an existing variable.
A hard reference, by contrast, holds the address of the item
itself (for example, \$java) rather than its name.
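To make the contrast concrete, here is a short sketch that reaches the same variable both ways. The variable names $symbolic, $hard, $via_name, and $via_addr are illustrative only.

```perl
#!/usr/bin/perl
$java = "coffee";

$symbolic = "java";       # holds the NAME of a variable
$hard     = \$java;       # holds the ADDRESS of a variable

$via_name = $$symbolic;   # looks up $java by name in the symbol table
$via_addr = $$hard;       # follows the address directly

print "$via_name $via_addr\n";
print "symbolic is a ", (ref($symbolic) ? "reference" : "string"), "\n";
print "hard is a ", ref($hard), " reference\n";
```

Both lookups yield "coffee", but only $hard is a real reference: ref reports SCALAR for it and an empty string for $symbolic.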
To force only hard references in a program and protect yourself
from accidentally creating symbolic references, you can use the
pragma called strict. With the 'refs' argument, it makes Perl
report a run-time error whenever a string is used as a reference.
To use it, place the following statement
at the top of your Perl script:
use strict 'refs';
From this point on, only hard references are allowed for the rest
of the script. You can also place this use strict ... statement
within curly braces to limit the checking to the code block
within the braces. For example, in the following code, the
checking would be limited to the code in the subroutine java():
sub java {
use strict "refs";
#
# symbolic references are not allowed here.
}
...
# symbolic references are allowed here.
To turn off the check for symbolic references at any point within a code
block, use this statement:
no strict 'refs';
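The effect of the two statements can be seen in a short sketch. It assumes the $java variable from the earlier example and uses eval only to trap the run-time error so that the script can continue.

```perl
#!/usr/bin/perl
use strict 'refs';          # symbolic references are now fatal

$java = "coffee";
$name = "java";

# eval traps the run-time error so the script can report it.
$value = eval { $$name };
$error = $@;
print "blocked: $error";

{
    no strict 'refs';       # relax the check inside this block only
    $value = $$name;
}
print "allowed: $value\n";
```

The first attempt fails with a message that mentions "strict refs"; the second succeeds because the check is switched off for the enclosing block.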
One last point: Symbolic references cannot be used on variables
declared with the my construct because these variables
are not kept in any symbol table. Variables declared with the
my construct are valid only for the block in which they
are created. Variables declared with the local word, by contrast,
remain in a symbol table, so they are visible to the rest of the
block and to any subroutines called from it.
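A brief sketch of both points follows. All names in it ($drink, $mood, show, visit) are hypothetical and chosen only for this illustration.

```perl
#!/usr/bin/perl
$drink    = "tea";        # package variable, kept in the symbol table
my $drink = "coffee";     # lexical variable, not in any symbol table

$name  = "drink";
$found = $$name;          # a symbolic lookup finds only the package variable
print "direct: $drink, symbolic: $found\n";

# local gives a package variable a temporary value that called
# subroutines can see; the old value returns when the block ends.
$mood = "calm";
sub show  { return $mood; }
sub visit { local $mood = "excited"; return show(); }

$inside = visit();
print "inside: $inside, afterward: $mood\n";
```

The symbolic lookup returns "tea" even though $drink now names the lexical holding "coffee", and show() sees "excited" only while visit() is running.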
In addition to consulting the obvious documents such as the Perl
man pages, look at the Perl source code for more information.
The 't/op' directory in the Perl source tree has some
regression test routines that should definitely get you thinking.
A lot of documents and references are available at the Web sites
www.perl.com and www.metronet.com.
The two types of references in Perl 5 are hard and symbolic. Hard
references work like hard links in UNIX file systems. You can have
more than one hard reference to the same item; Perl keeps a reference
count for you. This reference count is incremented or decremented
as references to the item are created or destroyed. When the count
goes to zero, the item is destroyed and its memory is freed.
Symbolic references, which use the value of one variable as the
name of another (as in $$name or ${$name}), are useful in providing
multiple stages of indirection.
You can have references to scalars, arrays, hashes, subroutines,
and even other references. References themselves are scalars and
have to be dereferenced in the appropriate context before being used. Use
$$pointer for a scalar, @$pointer for an array, %$pointer for a hash,
and &$pointer for a subroutine.
Multidimensional arrays are possible using references in arrays
and hashes.
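The dereferencing forms can be summarized in one short sketch. The data and the names $aref, $href, $cref, and @grid are made up for illustration.

```perl
#!/usr/bin/perl
$aref = [10, 20, 30];
$href = { one => 1, two => 2 };
$cref = sub { return "called with @_"; };

@copy   = @$aref;              # whole array
$second = $$aref[1];           # one element; same as $aref->[1]
@names  = sort keys %$href;    # whole hash
$answer = &$cref("x", "y");    # call the subroutine

# A two-dimensional array is an array of references to arrays.
@grid = ([1, 2, 3], [4, 5, 6]);
$cell = $grid[1][2];           # short for $grid[1]->[2]

print "@copy | $second | @names | $answer | $cell\n";
```

Here $cell is 6, the third element of the second row; the arrow between adjacent subscripts is optional.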
Parameters are passed into a subroutine as a single flat list. The
@_ array is really all the passed parameters concatenated
in one long array, so the boundaries between separate arrays are lost.
To send separate arrays, pass references to the individual arrays.
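The following sketch contrasts the two ways of passing arrays. The subroutine sizes is a hypothetical helper written for this illustration; it simply reports how many elements each array holds.

```perl
#!/usr/bin/perl
# Hypothetical helper: returns the element counts of two arrays.
sub sizes {
    my ($first, $second) = @_;      # two array references
    return (scalar(@$first), scalar(@$second));
}

@one = (1, 2, 3);
@two = ("a", "b");

@flat  = (@one, @two);              # what sizes(@one, @two) would receive
@sizes = sizes(\@one, \@two);       # references keep the arrays apart

print "flattened: ", scalar(@flat), " items\n";
print "by reference: @sizes\n";
```

Passed directly, the two arrays merge into five scalars with no boundary between them; passed by reference, the subroutine can still tell that one has three elements and the other two.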
Tomorrow's lesson covers Perl objects and references to objects.
We have deliberately not covered Perl objects in this chapter
because they require some knowledge of references. References are
used to create and refer to objects, constructors, and packages.
| Q: | How do I know what type of address a pointer is pointing to?
|
| A: | The address printed out with the print statement on a reference has a qualifier word in front of it. For example, a reference to a hash has the word HASH followed by an address value, a reference to an array has the word ARRAY, and so on.
|
| Q: | How are multidimensional arrays possible using Perl?
|
| A: | Arrays and hashes in Perl can hold only scalars, but a reference is itself a scalar. Arrays can therefore contain references to other arrays, hashes, and so on. The way to create multidimensional arrays in Perl is by using arrays of references to arrays.
|
| Q: | What's the best way to pass more than one array into a subroutine?
|
| A: | Pass references to the arrays, using \@arrayname for each array passed, as in the following call:
mysub(\@one, \@two);
Within the subroutine, take each reference off one at a time.
my ($a, $b) = @_;
Now use @$a and @$b to get to the arrays passed into the subroutine.
|
| Q: | Why is *moo more efficient to use than $_main{'moo'}? Is there a difference in usage?
|
| A: | Both *moo and $_main{'moo'} mean the same variable (as long as you aren't using a package). *moo is more efficient because the reference is looked up once at compile time, whereas $_main{'moo'} is looked up at runtime each time the statement is executed.
|
The Workshop provides quiz questions to help you solidify your
understanding of the material covered and exercises to give you
experience in using what you've learned. Try to understand the
quiz and exercise answers before you go on to tomorrow's lesson.
- Given that $pointer is a pointer to a hash, what's
wrong with the following line of code?
$x= ${$pointer->{$i}};
- Why is $b not being set in the following line of
code? What do you have to do to make it okay?
sub xxx {
my ($a, $b) = @_;
}
- What's the difference between these two lines of code?
printf "$i : $$pointer[$i++]; ";
printf " and $i : $pointer->[$i++];\n";
- What do the following lines of code print out?
$HelpHelpHelp = \\\"Help";
print $$$$HelpHelpHelp;
- What's the use of the ${variable} construct? How
could the following three lines of code be rewritten?
$name = ${$scalarref};
draw(@{$coordinates}, $display);
${$months}[0] = "March";
- Write a Perl script to print out address types of different
variables and complex structures.
- Write a Perl code fragment that constructs an array of pointers
to functions. How would you use it?
Strong Hint:
$foo = sub { print "foo\n"; };
$bar = sub { print "bar\n"; };
$yuk = sub { print "yuk\n"; };
$huh = sub { print "huh\n"; };
@list = ($foo, $bar, $yuk, $huh);
- Explain the difference between hard and symbolic references.
- Write a Perl subroutine that takes two arrays as arguments
and returns the reverse-sorted copy of each array.
- Modify the following script to print the value of $this
and $that. Are they the same? If not, why not?
#!/usr/bin/perl
sub print_coor{
my ($x,$y,$z) = @_;
print "$x $y $z \n";
return $x;};
$k = 1;
$j = 2;
$m = 4;
$this = print_coor($k,$j,$m);
$that = print_coor(4,5,6);
