ECE 161: Programming Languages

Fall 2005

Monday 5:00 - 6:00 Rm 643E, Wednesday 3:00 - 5:00 Rm 621E

Topic #12: Perl

Perl is the only interpreted language that we have covered in our lectures. As soon as you finish typing or changing your source code, you can run it using a Perl interpreter, which, on most Unix systems, is just called "perl". (Strictly speaking, the interpreter first compiles the whole program into an internal form and then executes it, but there is no separate compile step for you to perform.) Also unlike the other languages we have considered, there is no specific starting point (e.g., a "main" function); the code just runs from top to bottom, skipping function definitions, if any, unless the functions are called. Some people consider Perl more of a scripting language than a full-fledged programming language, but that is really just a matter of terminology. Of the languages we have seen, Perl has the simplest "Hello World!" program, consisting of just a single statement that calls the "print" function. (Function calls in Perl are typically not required to use parentheses around arguments unless you are calling a user-defined function that has not been previously declared.) Although interpreted languages tend to be slower than compiled languages, Perl has been optimized for certain tasks, text processing in particular, so it is often used for programs that process a lot of text. Perl is also a good choice for quickly writing scripts when performance is not particularly important.
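As a minimal sketch of the points above, the following script prints "Hello World!" with a single statement and shows that a function definition is skipped until it is called. The function name "greeting" is a hypothetical name chosen only for this illustration.

```perl
use strict;
use warnings;

# Execution starts at the first statement; there is no "main" function.
# Built-in functions such as print do not need parentheses.
print "Hello World!\n";

# This definition is skipped as the script runs from top to bottom;
# its body executes only when the function is called.
# (greeting is a hypothetical name used only for this example.)
sub greeting {
    my ($name) = @_;
    return "Hello, $name!";
}

my $message = greeting("class");
print "$message\n";
```

Saved in a file, the script would be run directly with the interpreter (e.g., "perl hello.pl", where the file name is an assumption), with no separate compile step.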

Perl has three types of variables: scalars, arrays (a.k.a. lists), and associative arrays (a.k.a. hashes). Variables do not need to be declared; the leading symbol on a variable's name tells the interpreter which of the three types it is, and whether a scalar is treated as a number or a string is determined by context. If you want to be forced to declare variables (to help catch errors such as misspelled names), you can include "use strict;" at the top of your program. Variables are global by default, but you can make them local to a scope (e.g., a function) by declaring them with the "my" keyword.
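The following sketch illustrates the three kinds of variables, the effect of "use strict;", and scoping with "my". All variable names and the function name "shadow_demo" are hypothetical; the room numbers are simply reused from the schedule at the top of this page.

```perl
use strict;      # every variable must now be declared, e.g. with "my"
use warnings;

my $count  = 3;                                        # scalar holding a number
my $course = "ECE 161";                                # scalar holding a string
my @topics = ("C", "C++", "Perl");                     # array
my %rooms  = (Monday => "643E", Wednesday => "621E");  # associative array (hash)

# The same scalar is treated as a number or a string depending on the operator.
my $sum   = $count + 1;           # numeric context: 4
my $label = $count . " items";    # string context: "3 items"

# "my" limits a variable to the enclosing block, so the $count declared
# inside the function is separate from the $count declared above.
sub shadow_demo {
    my $count = 100;
    return $count;
}

my $inner = shadow_demo();
print "$course: sum=$sum, label=$label, inner=$inner, outer=$count\n";
```

With "use strict;" in place, misspelling a name (e.g., writing $cuont) should be reported as an error before the script runs instead of silently creating a new global variable.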

All scalar variables (including integers, floating point numbers, and strings) start with "$"; all arrays start with "@"; and all associative arrays start with "%". When referring to a single element of an array or associative array, you use a "$", since the value is a scalar; array elements are indexed with square brackets and hash elements with curly braces. In addition to the loops we have seen in other languages, Perl includes a "foreach" loop to loop through the scalar values in an array. Perl also provides the functions "keys" and "values" to get all of the keys or all of the values from an associative array (returned as a regular list, in no particular order). The built-in "sort" function returns the elements of an array in sorted order, and it can be used in conjunction with "keys" to loop through the elements of a hash in key order. If an array is assigned to a scalar, the scalar is assigned the length of the array (i.e., the number of elements in the array). Perl performs garbage collection automatically, so you don't have to worry about losing references to memory, e.g., if you reassign an array or hash.
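A short sketch of element access, "foreach", "sort" with "keys", and assigning an array to a scalar follows. The data and variable names are made up purely for illustration.

```perl
use strict;
use warnings;

my @langs = ("Perl", "C", "Java");
my %count = (the => 5, perl => 2, a => 3);   # hypothetical sample data

# Single elements are scalars, so they are written with "$".
my $first     = $langs[0];       # array element, square brackets: "Perl"
my $the_count = $count{the};     # hash element, curly braces: 5

# Assigning an array to a scalar yields the number of elements.
my $n = @langs;                  # 3

# "keys" returns the hash keys in no particular order, so sort them first.
my @report;
foreach my $word (sort keys %count) {
    push @report, "$word: $count{$word}";
}

print "$n languages, first is $first\n";
print "$_\n" foreach @report;
```

By default "sort" compares elements as strings, which is why the keys above come out in alphabetical order.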

Perl provides STDIN as the standard input stream, and a line can be read from a stream by enclosing the name of the stream within angle brackets, e.g., "<STDIN>". The newline character is part of the line that is read, but it can be removed with "chomp" or "chop" (the former removes the last character of a string only if it is a newline character, but the latter removes the last character of a string no matter what). You can use the "open" function to open streams referring to files for reading or writing. If you open a file for reading, its stream can be treated just like STDIN.
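The sketch below writes two lines to a scratch file, reads them back, and contrasts "chomp" with "chop". It assumes a temporary file created with the standard File::Temp module so that it does not depend on any existing path; the lexical handles ($out, $in) and the three-argument form of "open" are stylistic choices for this example, not something these notes prescribe.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so the example does not depend on any existing path.
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

# Open the file for writing (">") and print two lines to it.
open(my $out, '>', $path) or die "Cannot write $path: $!";
print $out "first line\n";
print $out "second line\n";
close $out;

# Open the file for reading ("<"); the handle is then used just like STDIN.
open(my $in, '<', $path) or die "Cannot read $path: $!";
my $line = <$in>;     # "first line\n", newline included
chomp $line;          # removes only a trailing newline: "first line"
my $rest = <$in>;     # "second line\n"
chop $rest;           # removes the last character (here the newline)
chop $rest;           # removes the last character again (here the "e")
close $in;

print "[$line] [$rest]\n";
```

Calling "chomp" a second time on $line would leave it unchanged, whereas each call to "chop" keeps shortening the string, which is why "chomp" is usually the safer choice for stripping newlines.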

There are certain situations in Perl in which variables are set up for you automatically. For example, functions in Perl do not declare their return type or parameters; if parameters are passed to a function, they are stored in an array automatically named @_. Similarly, when a line read from a stream is the only thing in the condition of a "while" loop, it is automatically placed in a variable named $_. This is useful for looping through the lines read from a stream.
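Both automatic variables can be seen in the following sketch. The function name "add_all" is hypothetical, and an in-memory string stands in for STDIN so that the example is self-contained; with real input the loop condition would simply be written with STDIN in the angle brackets.

```perl
use strict;
use warnings;

# Arguments arrive in @_; no parameter list or return type is declared.
# (add_all is a hypothetical name used only for this example.)
sub add_all {
    my $total = 0;
    foreach my $value (@_) {
        $total += $value;
    }
    return $total;
}

# An in-memory "file" stands in for STDIN here (an assumption of this sketch).
my $text = "alpha\nbeta\ngamma\n";
open(my $in, '<', \$text) or die "Cannot open in-memory stream: $!";

# When the read is the whole loop condition, each line is placed in $_,
# and chomp with no argument operates on $_.
my @lines;
while (<$in>) {
    chomp;
    push @lines, $_;
}
close $in;

my $total = add_all(1, 2, 3);
my $n     = @lines;
print "total=$total, $n lines: @lines\n";
```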

One of the things that makes Perl a very useful language is its built-in support for regular expressions. Perl understands many symbols in regular expressions, including "*" (zero or more instances of the preceding item), "." (any character other than a newline), etc. In Perl, a string can be "matched" against a regular expression. Pieces of a string that match parenthesized sections of the regular expression can be captured, meaning that they are automatically placed in variables named $1, $2, etc. Regular expressions can also be used for several other operations in Perl, e.g., search and replace. We have seen that these capabilities enabled us to write a word counting program in Perl that is much shorter than the versions in C or C++. In the next version of Perl, regular expressions will become even more useful!
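The sketch below shows a match with captures, a search and replace, and a small word count. It is only an illustration under assumed sample strings and is not the word counting program written for this course; the shorthand classes \d (digit), \s (whitespace), and \w (word character) are standard Perl but are not discussed above.

```perl
use strict;
use warnings;

my $line = "Topic #12: Perl";

# Parentheses capture pieces of the match into $1, $2, ...
my ($number, $name);
if ($line =~ /#(\d+): (.*)/) {
    ($number, $name) = ($1, $2);     # "12" and "Perl"
}

# Search and replace with s///; the "g" flag replaces every occurrence.
my $messy = "too    many   spaces";
$messy =~ s/\s+/ /g;                 # "too many spaces"

# Count words: each successful match of /(\w+)/g captures the next word.
my $text = "the cat and the hat and the bat";
my %count;
while ($text =~ /(\w+)/g) {
    $count{$1}++;
}

print "$number $name | $messy | the=$count{the} and=$count{and}\n";
```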