Forking in Perl 2

Build up an index of files to process, e.g. gzipped SAM files, then fork up to 16 child processes at a time. Each child handles one file, which is removed from the index, and the parent waits for the whole batch to finish before forking the next one. As with all my code, use at your own risk. Comments and suggestions are always welcome.

#!/usr/bin/perl

use strict;
use warnings;

my $fork_process = 16;
my @file_for_processing = ();
my @child = ();

opendir(DIR,'.') || die "Could not open current directory: $!\n";
while(my $file = readdir(DIR)){
   next unless $file =~ /\.sam\.gz$/;
   push(@file_for_processing,$file);
}
closedir(DIR);

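while(scalar(@file_for_processing) > 0){
   # start a fresh batch of at most $fork_process children
   @child = ();
   for (1 .. $fork_process){
      # stop forking once the index is empty
      last unless scalar(@file_for_processing) > 0;
      # take the file off the index before forking, one file per child
      my $file = pop(@file_for_processing);
      my $pid = fork();
      # fork returns undef on failure, so test for that first
      die "Couldn't fork: $!\n" unless defined $pid;
      if ($pid) {
         # parent
         push(@child, $pid);
      } else {
         # child
         print "CHILD: processing $file\n";
         exit(0);
      }
   }
   # wait for the whole batch before starting the next one
   foreach my $pid (@child) {
      waitpid($pid, 0);
   }
}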

exit(0);
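
The script above works in batches: it forks up to 16 children and then waits for all of them before forking again, so one slow file holds up the next batch. Below is a minimal sketch of an alternative that tops up the pool as soon as any child exits. The helper name run_pool, its arguments and the sample file names are my own placeholders for illustration, not part of the script above.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper: fork one child per item in @$items, keeping at most
# $max children alive at once. $work is a code ref run inside each child.
# Returns the number of children that exited with status 0.
sub run_pool {
   my ($max, $items, $work) = @_;
   my @queue   = @$items;
   my $running = 0;
   my $ok      = 0;

   while (@queue || $running > 0) {
      # top up the pool while there is work left and a free slot
      while (@queue && $running < $max) {
         my $item = shift(@queue);
         my $pid  = fork();
         die "Couldn't fork: $!\n" unless defined $pid;
         if ($pid == 0) {
            # child
            $work->($item);
            exit(0);
         }
         # parent
         $running++;
      }
      # block until any one child exits, then free its slot
      my $pid = wait();
      last if $pid == -1;
      $ok++ if $? == 0;
      $running--;
   }
   return $ok;
}

# placeholder file names, not real data
my @files = map { "sample_$_.sam.gz" } 1 .. 5;
my $done  = run_pool(2, \@files, sub { print "CHILD: processing $_[0]\n" });
print "PARENT: $done children finished\n";
```

The order of the CHILD lines is not guaranteed, since the children run concurrently. Whether this is worth it over the batch approach depends on how uneven the file sizes are.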

For more information see Forking in Perl.

Using Perl to log transform data

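Very simple code using Perl to log transform (base 2) a list of numbers. Zero values are replaced with 0.5 before transforming, since the logarithm of 0 is undefined. Perl's built-in log is the natural logarithm, so the base 2 value is obtained by dividing by log(2). For this example the numbers are stored in the array @n.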

#!/usr/bin/env perl

use strict;
use warnings;

#my list of numbers
my @n= qw/0 2 4 6 8 10 20 40 80 160 320/;

#my log transformed numbers
my @l= log2(@n);
print join("\t", @l), "\n";
#outputs
#-1.00   1.00    2.00    2.58    3.00    3.32    4.32    5.32    6.32    7.32    8.32

exit(0);

sub log2 {
   my @n = @_;
   my @t = ();
   foreach my $n (@n){
      if ($n == 0){
         $n = '0.5';
      }
      my $t = log($n)/log(2);
      #rounded to two decimal places
      $t = sprintf("%.2f",$t);
      push(@t,$t);
   }
   return(@t);
}
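
The division by log(2) in the sub above is the change-of-base identity, log_b(x) = ln(x) / ln(b), so the same idea works for any base. Here is a minimal sketch that generalises it. The name log_base and the choice to pass the zero replacement as an argument are my own assumptions for illustration.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical helper: logarithm of each value in an arbitrary base.
# Exact zeros are replaced by $pseudo first, mirroring the 0 -> 0.5
# substitution used in log2 above. Values are returned unrounded.
sub log_base {
   my ($base, $pseudo, @values) = @_;
   return map {
      my $v = $_ == 0 ? $pseudo : $_;
      log($v) / log($base);
   } @values;
}

my @log10 = log_base(10, 0.5, 0, 10, 100, 1000);
print join("\t", map { sprintf("%.2f", $_) } @log10), "\n";
#outputs
#-0.30   1.00    2.00    3.00
```

Rounding is left to the caller here, so the transformed values keep full precision until they are printed.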

Using R

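I know this post is about using Perl to log transform data, but I've been using R more and more and it's much easier. Note that R returns -Inf for the logarithm of 0 rather than substituting 0.5 as the Perl code does.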

> number <- c(0, 2, 4, 6, 8, 10, 20, 40, 80, 160, 320)
> number
 [1]   0   2   4   6   8  10  20  40  80 160 320
> log2(number)
 [1]     -Inf 1.000000 2.000000 2.584963 3.000000 3.321928 4.321928 5.321928
 [9] 6.321928 7.321928 8.321928
> #log base 10
> log(base=10, x=number)
 [1]      -Inf 0.3010300 0.6020600 0.7781513 0.9030900 1.0000000 1.3010300
 [8] 1.6020600 1.9030900 2.2041200 2.5051500