Friday, March 21, 2003

Scalars (cont'd)

Variables

In perl, the type of a variable is determined by its very first character. For scalars, the very first character is always a dollar sign ($). Other data types use characters such as @ and % as their prefix (more on these later). We can assign scalars to scalar variables and we can display them. For example:


#!/usr/bin/perl -w

use strict;

my $str = "Hello, world!";
my $num = 12345.678;

print "\$str is \"$str\" and \$num is $num\n";

We assign two scalars to two variables. Note the use of the my keyword. Normally, perl does not require that you declare your variables before you use them. Unfortunately, this can lead to errors that are very difficult to debug, because a misspelled variable name silently becomes a new variable. The use strict; line forces us to declare all of our variables before use. To declare a variable, we precede its first occurrence with the word my. As in C and C++, we can initialize the variable when we declare it, but we are not obligated to do so.

We could also use parentheses to combine the two assignments into a single statement as follows:

my ($str, $num) = ("Hello, world!", 12345.678);

The argument to the print is enclosed in double quotes. As a result the \n code sequence at the end will be treated as a newline. We also specify the two variables inside the double quoted string as well. When perl displays this string, the variable names will be substituted with their actual value -- this process is called interpolation. The following output is the result:

$str is "Hello, world!" and $num is 12345.678

In order to literally display a dollar sign and double quote inside this string, we must escape these characters by placing a backslash before them.

Note that interpolation does not happen inside single quoted strings. Therefore, if we had used single quotes instead of double quotes around the argument to the print function, the output would have been:

\$str is \"$str\" and \$num is $num\n

This is quite different from the output given above. There is no newline at the end of the string; the last two characters displayed are a backslash and an n. The backslashes that were used to escape special characters (the $ and the ") inside the double quoted string are displayed literally inside the single quoted string.
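Returning to the earlier point about use strict;, the following is a minimal sketch of the kind of typo it catches. The variable names ($total, $totl) are made up for illustration, and the string form of eval is used here only as a device so that the compile-time error can be captured and printed instead of stopping the script.

```perl
#!/usr/bin/perl -w

use strict;

my $total = 10;

# The code inside q{ ... } is compiled when eval runs. Under strict, the
# misspelled name $totl is rejected because it was never declared with my.
my $ok = eval q{
	use strict;
	$totl = $total + 1;
	1;
};
my $error = $@;

if ($ok) {
	print "the code compiled\n";
} else {
	print "strict rejected the code: $error";
}
```

Without use strict; the misspelled assignment would be accepted and $total would quietly keep its old value.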

Conditionals and Looping

As expected, perl supports the if control structure which is similar to C and C++:


#!/usr/bin/perl -w

use strict;

my $num = 123;
my $str = 5;

if ($num gt $str) {
	print "$num gt $str\n";
} else {
	print "$num le $str\n";
}

if ($num > $str) {
	print "$num > $str\n";
} else {
	print "$num <= $str\n";
}

The braces around each of the blocks of code are required by perl even though they each contain only one statement (this is different from C and C++). Also note that there are separate string and numeric comparison operators. When comparing numeric values, we use the traditional < > <= >= == relational operators. When comparing strings, we use the analogous lt gt le ge eq operators. Accidentally using the numeric comparison operator when comparing two strings is a common novice mistake in perl.

The output from the above program is:

123 le 5
123 > 5

The string 123 is alphabetically less than (or equal to) 5 (since the character 1 is less than the character 5). However, the number 123 is obviously numerically greater than 5.
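The novice mistake mentioned above can be sketched as follows. The two strings are arbitrary examples. Because neither looks like a number, the numeric == operator converts both to 0 and reports them as equal (the -w option prints an "isn't numeric" warning for each one).

```perl
#!/usr/bin/perl -w

use strict;

my ($first, $second) = ("apple", "banana");

# String comparison looks at the characters themselves.
my $same_as_strings = "no";
if ($first eq $second) {
	$same_as_strings = "yes";
}

# Numeric comparison converts both strings to numbers first (0 and 0 here).
my $same_as_numbers = "no";
if ($first == $second) {
	$same_as_numbers = "yes";
}

print "eq says equal: $same_as_strings\n";
print "== says equal: $same_as_numbers\n";
```

This prints "eq says equal: no" followed by "== says equal: yes".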

When a value is used in a conditional, the conditional is treated as false if the value is 0, the empty string, or the special value undef. The string "0" is also treated as false. All other values are true.
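The rules for true and false can be checked with a short script such as the sketch below. The variable names are chosen only for illustration. Note the last two cases: the strings "0.0" and " " are not among the false values, so they count as true.

```perl
#!/usr/bin/perl -w

use strict;

my $zero     = 0;
my $empty    = "";
my $zero_str = "0";
my $undefined;			# never assigned, so it holds undef
my $decimal  = "0.0";		# a string that is not exactly "0"
my $space    = " ";

# The four false values.
if ($zero)      { print "0 is true\n"; }     else { print "0 is false\n"; }
if ($empty)     { print "\"\" is true\n"; }  else { print "\"\" is false\n"; }
if ($zero_str)  { print "\"0\" is true\n"; } else { print "\"0\" is false\n"; }
if ($undefined) { print "undef is true\n"; } else { print "undef is false\n"; }

# Everything else is true, including strings that merely resemble zero.
if ($decimal)   { print "\"0.0\" is true\n"; } else { print "\"0.0\" is false\n"; }
if ($space)     { print "\" \" is true\n"; }   else { print "\" \" is false\n"; }
```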

`while` loops and input

Perl also supports iteration via the while control structure:


#!/usr/bin/perl -w

use strict;

print "Input a number: ";
chomp(my $num = <STDIN>);
my $sum;

while ($num) {
	$sum += $num--;
	print "Running total is $sum\n";
}

The special notation <STDIN> means read a line from input. STDIN is analogous to stdin in C and cin in C++. It represents the standard input stream, typically the keyboard. As with C and C++, you can redirect standard input to a perl script from a file by using the < redirection symbol on the command line, although as we'll see later, this is rarely done with perl scripts.

The line of input assigned to the $num variable has a newline at the end of it. To get rid of the newline we use the chomp function. This function, when applied to a scalar variable, removes a newline character (if one exists) from the end of the variable's value. If no newline character is at the end, the function does nothing.
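The behavior of chomp can be seen without typing any input. In the sketch below, the string "42\n" stands in for a line read from <STDIN>. The example also relies on the fact that chomp returns the number of characters it removed, which the script above does not use.

```perl
#!/usr/bin/perl -w

use strict;

my $line = "42\n";		# stands in for a line read from <STDIN>

my $first  = chomp($line);	# removes the trailing newline
my $second = chomp($line);	# no newline is left, so nothing happens

print "line is \"$line\"\n";
print "first chomp removed $first, second chomp removed $second\n";
```

The first chomp removes 1 character and the second removes 0, leaving $line holding just 42.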

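As with C and C++, perl supports post/pre increment/decrement operators that behave in much the same way. Unlike C and C++, we can apply the increment operator to strings as well as numbers (the decrement operator has no such string behavior and simply treats its operand as a number):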


#!/usr/bin/perl -w

use strict;

my $str = "aa";

while ($str ne "zz") {
	print $str++, "\n";
}
print "$str\n";

The above program outputs:

aa
ab
ac
... 670 lines deleted ...
zx
zy
zz

Note that when printing the string we specify a comma-separated list of arguments to the print function. If we had written print "$str++\n";, then ++ would have been treated as part of the string instead of as an operator. The $str scalar would therefore never be updated, and the loop would continue forever, displaying the string "aa++".

The loop ends as soon as $str becomes zz, before that value is printed, so the final print is needed to display the last string, zz.
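The string increment also carries from one character to the next, much like adding 1 to a number. The sketch below shows a few cases; the starting strings are arbitrary examples.

```perl
#!/usr/bin/perl -w

use strict;

my $lower = "az";
my $mixed = "Az";
my $digit = "a9";
my $wrap  = "zz";

$lower++;	# the z wraps to a and the carry moves a to b: "ba"
$mixed++;	# the case of each character is preserved: "Ba"
$digit++;	# digits wrap from 9 to 0: "b0"
$wrap++;	# a carry out of the first character adds a new one: "aaa"

print "$lower $mixed $digit $wrap\n";
```

This prints ba Ba b0 aaa.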

`undef` and `defined`

When a scalar is created in perl, it is given the initial value of undef (which means that it does not have a meaningful value). We can test for this value by using the defined function. For example, consider the following script:


#!/usr/bin/perl -w

use strict;

my $str;
printf "\$str is %s\n", $str;
printf "\$str is %s\n", defined $str ? $str : "<undefined>";

$str = "Hello";
printf "\$str is %s\n", defined $str ? $str : "<undefined>";

$str = undef;
printf "\$str is %s\n", defined $str ? $str : "<undefined>";

$str++;
printf "\$str is %s\n", defined $str ? $str : "<undefined>";

We can use the defined function to determine if a variable has been given a value and we can use the undef operator to return a variable to an "uninitialized" state.

A variable which is uninitialized has a 0 value when used in a numeric context. When used in a string context, its value is the empty string. Therefore, in the above code, after we explicitly set $str to undef and then increment it, its new value will be 1. (This value is then converted to a string during the subsequent printf operation.)
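Both contexts can be seen side by side in the following sketch. It uses ++ for the numeric context and the .= (append) operator for the string context; .= has not been covered in these notes and is used here only for illustration. Neither operator triggers a warning when applied to an undef variable.

```perl
#!/usr/bin/perl -w

use strict;

my $count;		# starts out as undef
my $text;		# starts out as undef

$count++;		# numeric context: undef acts as 0, so $count is now 1
$text .= "abc";		# string context: undef acts as "", so $text is now "abc"

print "\$count is $count and \$text is \"$text\"\n";
```

The output is: $count is 1 and $text is "abc"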

Attempting to display an uninitialized (or undef'ed) variable will trigger the Use of uninitialized value warning when the -w option is used to start perl.

Note that perl supports the ternary operator ? : that functions the same way as it does in C and C++. Perl also has a printf function (like C). Because printf is a more expensive function than print, you should use print where possible and use printf only when formatted output is required.
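As a rough illustration of when printf earns its cost, the sketch below prints the same two values both ways. The values and the format string are only examples: %-10s left-justifies a string in a field 10 characters wide, and %10.2f right-justifies a number in 10 characters with two digits after the decimal point.

```perl
#!/usr/bin/perl -w

use strict;

my $item  = "widget";
my $price = 12345.678;

# Plain interpolation is enough when no particular layout is needed.
print "$item costs $price\n";

# printf is needed once we want fixed widths or rounding.
printf "%-10s costs %10.2f\n", $item, $price;
```

The first line shows the price as 12345.678; the second rounds it to 12345.68 and pads both fields to their fixed widths.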

Last modified: Mon Mar 24 11:59:59 2003