EID 151: Programming Languages

Spring 2004

Monday 3:00 - 5:00 Rm 521, Thursday 4:00 - 5:00 Rm 621


Topic #12: Perl

Perl is the only interpreted language that we have covered in our lectures (if you can consider it covered after only one lecture). There is no separate compile step: as soon as you finish typing or changing your source code, you can run it using a Perl interpreter, which, on most Unix systems, is just called "perl". Also unlike other languages we have considered, there is no specific starting point (e.g. a "main" function); the code just runs from top to bottom, skipping function definitions, if any, until they are called. Some people consider Perl more of a scripting language than a full-fledged programming language, but that is really just a matter of semantics. Of the languages we have seen, Perl has the simplest "Hello World!" program, consisting of just a single "print" statement. Although interpreted languages tend to be slower than compiled languages, Perl has been optimized for certain tasks, text processing in particular, so it is often used for programs that perform a lot of text processing. Perl is also a good choice for quickly writing scripts when performance is not particularly important.
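As a small illustration of the points above, here is a minimal sketch of a complete Perl program. The function name "greeting" is made up for this example, and the file name in the usage note is also just an assumption.

```perl
#!/usr/bin/perl
# Execution starts at the first statement; there is no "main" function.
print "Hello World!\n";

# This definition is skipped during top-to-bottom execution
# and only runs when something calls it.
sub greeting {
    return "Hello again!";
}

# The call below is what actually runs the function body.
print greeting(), "\n";
```

If this were saved as hello.pl (a hypothetical file name), typing "perl hello.pl" would run it directly, with no separate compile step.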

Perl has three types of variables: scalars, arrays, and associative arrays (a.k.a. hashes). Variables do not need to be declared. Perl identifies the kind of variable from its prefix character, and it treats a scalar's value as a number or a string depending on context. If you want to be forced to declare variables (to help catch errors such as misspelled names), you can include "use strict;" at the top of your program. Variables are global by default, but you can make them local to a scope (e.g. a function or a block) by declaring them with the "my" keyword.
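The sketch below shows the three kinds of variables and the effect of "my" under "use strict;". The variable names, the values, and the function "bump" are invented for illustration only.

```perl
use strict;      # every variable must now be declared, e.g. with "my"
use warnings;

my $count = 3;                                  # scalar
my @langs = ("C", "Java", "Perl");              # array
my %room  = (Monday => 521, Thursday => 621);   # associative array (hash)

sub bump {
    my $count = 100;     # a separate variable, visible only inside bump
    return $count + 1;
}

my $result = bump();

# The outer $count is untouched by the assignment inside bump.
print "inside: $result, outside: $count\n";
```

Without the "my" inside bump, the function would have assigned to the outer $count instead of creating its own variable.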

All scalar variables (including integers, floating point numbers, and strings) start with "$", all arrays start with "@", and all associative arrays start with "%". When referring to a single element of an array or associative array, you use a "$", since the value is a scalar; array elements are indexed with square brackets and hash elements with curly braces. In addition to the loops we have seen in other languages, Perl includes a "foreach" loop to loop through the scalar values in an array. Perl also provides the functions "keys" and "values" to get all of the indices or all of the values from an associative array (either one can be placed into a regular array). The provided "sort" function returns the elements of an array in sorted order (it does not change the original array), and it can be used in conjunction with "keys" to loop through the elements of a hash in order. If an array is assigned to a scalar, the scalar is actually assigned the length of the array (i.e. the number of elements). Perl reclaims memory automatically, so you don't have to worry about losing references to memory, e.g. if you reassign an array or hash.
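The following sketch puts element access, "foreach", "sort", "keys", and the array-to-scalar assignment together. The data is made up for the example.

```perl
use strict;
use warnings;

my @langs = ("Perl", "C", "Java");
my %room  = (Monday => 521, Thursday => 621);

# A single element is a scalar, so it is written with "$".
print "first language: $langs[0]\n";
print "Monday room: $room{Monday}\n";

# foreach visits each scalar value in the array in turn.
foreach my $lang (@langs) {
    print "language: $lang\n";
}

# keys returns the hash indices; sort returns them in order.
my @days = sort keys %room;
foreach my $day (@days) {
    print "$day is in room $room{$day}\n";
}

# Assigning an array to a scalar gives the number of elements.
my $n = @langs;
print "number of languages: $n\n";
```

Note that the hash itself has no defined order; sorting the keys is what makes the second loop predictable.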

Perl provides STDIN as the standard input stream, and a line can be read from a stream by enclosing the name of the stream within angle brackets "<>", e.g. <STDIN>. The newline character is part of the line that is read, but it can be removed with chomp or chop (the former only removes the last character of a string if it is the newline character; the latter removes the last character of a string no matter what). You can use the "open" function to open streams referring to files for reading or writing. If you open a file for reading, it can be treated just like STDIN.
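Here is a minimal sketch of writing a file, reading a line back, and comparing chomp with chop. It uses the standard File::Temp module to create a scratch file so that it does not depend on any existing file; that choice, and all of the variable names, are assumptions made for the example. Reading from the keyboard would look the same with <STDIN> in place of <$in>.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file and write two lines to it.
my ($out, $path) = tempfile(UNLINK => 1);
print $out "first line\nsecond line\n";
close($out);

# Open the same file for reading; it now behaves just like STDIN.
open(my $in, '<', $path) or die "Cannot open $path: $!";
my $line = <$in>;      # the string read includes the trailing newline
close($in);

chomp($line);          # removes the newline, and only the newline
my $copy = $line;
chop($copy);           # removes the last character, whatever it is

print "after chomp: '$line'\n";
print "after chop:  '$copy'\n";
```

Because the newline was already gone when chop ran, chop removed a real character from the text, which is why chomp is usually the safer choice.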

There are certain situations in Perl in which variables are set up for you automatically. For example, functions in Perl do not have to declare their return type or parameters. If parameters are passed to a function, they will be stored in an array automatically named @_. Similarly, when a line is read from a stream as the condition of a while loop, e.g. while (<STDIN>), it is automatically placed in a variable named $_, and many built-in functions (such as chomp and print) operate on $_ if no argument is given. This can be useful for looping through lines coming from a stream.
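A short sketch of both automatic variables follows. To keep it self-contained, the loop reads from a stream opened on a string in memory (a feature of Perl 5.8 and later) rather than from STDIN; the function name "add" and the sample text are made up.

```perl
use strict;
use warnings;

# The arguments arrive in @_; no parameter list is declared.
sub add {
    my ($x, $y) = @_;
    return $x + $y;
}

# Open a read-only stream over a string instead of a real file.
my $text = "alpha\nbeta\ngamma\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory stream: $!";

my @seen;
while (<$fh>) {        # each line read is placed in $_
    chomp;             # with no argument, chomp works on $_
    push @seen, $_;
}
close($fh);

print "sum: ", add(2, 3), "\n";
print "lines: @seen\n";
```

Replacing <$fh> with <STDIN> would process lines typed at the keyboard (or piped into the program) in exactly the same way.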

One of the things that makes Perl a very useful language is that it has built-in support for regular expressions. Perl understands many symbols in regular expressions, including "*" (zero or more instances of the preceding item), "." (any character other than a newline), etc. We went over some of the syntax of regular expressions in our lecture, but there is really a lot more Perl can do with them. In Perl, a string can be "matched" against a regular expression with the "=~" operator; after a successful match, the pieces of the string that matched parenthesized sections of the regular expression are automatically placed in variables named $1, $2, etc. There are also several special operations that regular expressions can be used to perform in Perl, e.g. search and replace.
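As a rough sketch of matching and of search and replace, the example below pulls the day and room number out of a schedule string. The pattern is just one possible way to write it, and the variable names are invented.

```perl
use strict;
use warnings;

my $entry = "Monday 3:00 - 5:00 Rm 521";

# Each parenthesized section of the pattern fills $1, $2, ... on a match.
my ($day, $room);
if ($entry =~ /^(\w+) .* Rm (\d+)$/) {
    ($day, $room) = ($1, $2);
}
print "$day meets in room $room\n";

# Search and replace: copy the string, then substitute within the copy.
(my $changed = $entry) =~ s/Rm/Room/;
print "$changed\n";
```

The variables $1 and $2 are only meaningful after a successful match, which is why they are copied inside the "if" rather than used unconditionally.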