    $attempt = 1;
    $date = '11/12/2013';
    print "attempt $attempt on $date\n";

    @elements = ('CDS', 'mRNA', 'tRNA');
    print "First element: $elements[0]\n";

    %roman = (1, 'I', 2, 'II', 3, 'III');
    print "Roman for 3: $roman{3}\n";

An easy way to try out Perl code is the debugger. It can be started by typing 'perl -d -e 42' at the command line. This will give a new prompt ('DB<1>') after which Perl statements can be typed for testing. The debugger provides extra functionality, for example examining the content of variables, which can be particularly useful for beginners.
A couple of rules are worth noting from the lines above:
    # some standard mathematical operations
    print 3 * (5 + 10) - 2**4;
    # processing the content of variables
    $total_error = $false_positive + $false_negative;
    # increase value in variable $minutes by 30
    $minutes += 30;
    # increase value in variable $hour by one
    $hour++;
    # decrease by one
    $remaining--;
    # repeat 'CG' 12 times
    $motif = 'CG' x 12;
    # the dot concatenates strings and content of variables
    $chr = 'chr' . $roman{$chr_number};
    # two dots create lists by expanding from lower to higher bound
    @hex = (0..9, 'a'..'f');
    # functions for scalars
    $seq_len = length($seq);
    $rev_seq = reverse($seq);
    $upper_case = uc($seq);
    $lower_case = lc($seq);
    $codon = substr $seq, 0, 3;
    # remove the trailing newline from a line of input
    chomp $input_line;

    # functions for arrays
    @array = split //, $string;
    $first_element = shift @array;
    $last_element = pop @array;
    unshift @array, $first_element;
    push @array, $last_element;
    @alphabetically_sorted = sort @names;
    @numerically_sorted = sort { $a <=> $b } @values;

    # functions for hashes
    if (defined $description{$gene}) {
        print $description{$gene};
    } else {
        print 'not available';
    }
    foreach (keys %headers) {
        print ">$_\n$headers{$_}\n";
    }
    # a progress meter for reading in long files:
    if ($line % 1000 == 0) {
        print STDERR " $line ";
    }
    # collect lines of sequence into one long lower-case string:
    while (<>) {
        chomp;
        $seq .= lc $_;
    }
    # exact motif search
    if (substr($seq, $pos, 10) eq $motif) {
        print "Motif found at position $pos!\n";
    }
    # pad number with zeros at the front
    $num = '0'.$num until (length($num) >= $max_len);

The line 'while (<>) {}' is a special Perl construct that reads line by line from the files named on the command line, or from standard input if no file names are given, and stores each line in the special variable '$_'. Perl opens these files itself, so no explicit 'open' call is needed.
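The behaviour of '<>' can be tried out with a minimal sketch like the one below. It assumes a throw-away file created with the core module File::Temp; the file content and the variable names are made up for illustration. Assigning to '@ARGV' stands in for giving the file name on the command line.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# create a small temporary file with two lines of (hypothetical) sequence
my ($fh, $filename) = tempfile(UNLINK => 1);
print $fh "ACGT\nTTGA\n";
close $fh;

# pretend the file name was given on the command line
@ARGV = ($filename);

# '<>' opens each file named in @ARGV and reads it line by line into $_
my $seq = '';
while (<>) {
    chomp;
    $seq .= lc $_;
}
print "$seq\n";    # prints acgtttga
```

If '@ARGV' is empty, the same loop reads from standard input instead.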
Regular expressions are specified within delimiters ('/' by default) and applied to the content of a variable with the '=~' operator. With the substitution operator 's', a second expression is given after the pattern, and the text matching the pattern is replaced by it. In addition, modifiers can be appended, such as 'i' for case-insensitive matching and 'g' for global matching, which acts on all occurrences instead of just the first one.
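The difference between matching, substitution, and the 'g' modifier may be easier to see in a short sketch. The example string and the variable names are invented for illustration.

```perl
use strict;
use warnings;

my $line = 'Gene ABC1 is a Regulator of the regulator network';

# plain match; 'i' makes the test case-insensitive
my $found = ($line =~ /REGULATOR/i) ? 1 : 0;

# substitution without 'g' replaces only the first occurrence
(my $first_only = $line) =~ s/regulator/repressor/i;

# substitution with 'g' replaces every occurrence
(my $all = $line) =~ s/regulator/repressor/gi;

print "$found\n";         # 1
print "$first_only\n";    # Gene ABC1 is a repressor of the regulator network
print "$all\n";           # Gene ABC1 is a repressor of the repressor network
```

The copies '$first_only' and '$all' are made so that the original string in '$line' stays untouched; applying 's' directly to '$line' would change it in place.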
Special characters are available to match groups of characters, such as '\w' for any alphanumeric character or underscore, '\d' for digits, and '\s' for white space. The negated classes, e.g. anything that is not a digit, are written with capital letters: '\D', '\W', and '\S'.
Occurrences can be specified through numbers in curly brackets, e.g. '{3}' for exactly three, '{4,10}' for four to ten, or '{2,}' for two or more occurrences of a pattern. Special cases are '+' for one or more matches, '*' for zero or more matches, and '?' for zero or one match.
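As a small sketch of how character classes and occurrence counts combine, the patterns below are applied to a list of invented identifiers. The built-in 'grep' function is used here only to collect the elements for which a pattern matches.

```perl
use strict;
use warnings;

# hypothetical identifiers to test the patterns against
my @ids = ('chr12', 'AT1G01010', 'x', 'gene 7');

# four or more digits in a row
my @long_number = grep { /\d{4,}/ } @ids;

# any white space
my @with_space = grep { /\s/ } @ids;

# no digit anywhere
my @no_digit = grep { !/\d/ } @ids;

# word characters, an optional white-space character, then digits
my @name_number = grep { /\w+\s?\d+/ } @ids;

print "long number: @long_number\n";    # AT1G01010
print "with space:  @with_space\n";     # gene 7
print "no digit:    @no_digit\n";       # x
print "name+number: @name_number\n";    # chr12 AT1G01010 gene 7
```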
To refer to the matched text afterwards, the relevant parts of the pattern are enclosed in round brackets. The text matched by each bracketed part is then available in the special variables $1, $2, ..., numbered in the order of the opening brackets.
The following examples illustrate their usage:
    # search $_ for the word regulator (ignoring case) and print if found
    if (/regulator/i) {
        print;
    }
    # check for non-numerical input
    if ($input =~ /\D/) {
        warn "Non-numerical input in '$input'\n";
    }
    # remove all white space
    $input =~ s/\s//g;
    # find a pattern that is repeated at least 3 times and print
    if ($input =~ /((?:CG){3,})/) {
        print "Found pattern $1!\n";
    }
    # split a string at tabulators and collect the elements in an array
    @list = split /\t/, $input;
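The use of round brackets and the variables $1, $2, ... can be sketched as follows, with made-up input strings. One detail worth noting is that an occurrence count applies only to the single character before it unless a longer sequence is grouped; '(?:...)' groups without creating a numbered variable.

```perl
use strict;
use warnings;

# three bracketed parts fill $1, $2 and $3 in order
my $date = '11/12/2013';
my ($first, $second, $year) = ('', '', '');
if ($date =~ /(\d{1,2})\/(\d{1,2})\/(\d{4})/) {
    ($first, $second, $year) = ($1, $2, $3);
}
print "Year: $year\n";             # Year: 2013

# grouped: 'CG' as a unit, repeated at least three times
my $repeat = '';
if ('ATCGCGCGCGTT' =~ /((?:CG){3,})/) {
    $repeat = $1;
}
print "Repeat: $repeat\n";         # Repeat: CGCGCGCG

# ungrouped: a 'C' followed by at least three 'G'
my $c_then_gs = '';
if ('ACGGGT' =~ /(CG{3,})/) {
    $c_then_gs = $1;
}
print "C then Gs: $c_then_gs\n";   # C then Gs: CGGG
```

Copying $1, $2, ... into named variables straight after the match is a cautious habit, since a later successful match overwrites them.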