One of my complaints about programming is that it’s often overcomplicated. Sometimes you need a simple solution with minimal dependencies that gets the job done. This is certainly true of utility scripts designed to run on more than one system or in divergent environments.
I have a section of code that must be run over and over. Each run takes several seconds and puts little load on the system, but the overall script run time is too long: a perfect case for multiprocessing or multithreading. It’s written in Perl and the server has plenty of memory and CPU, so I chose multiprocessing.
Perl multiprocessing is implemented on top of UNIX multiprocessing, so the how-to reads like a computer science class on UNIX programming. Here’s a working prototype that took a lot of tweaking to develop. Details of the problems I encountered are below the code.
#!/usr/bin/perl -w
#
# simple multi processes prototype
# Morgan Jones (mjones7@morganjones.org)
# $Id: multi_proc_proto.pl 125 2008-12-29 21:50:41Z morgan $
use strict;
use POSIX ":sys_wait_h";

my $parallelism = 4;  # number of processes to run simultaneously.
my $pids = {};        # keep track of PIDs as child processes run

for (my $i=0; $i<100; $i++) {
    my $proc_running = 0;  # flag to indicate a process has been started.
    do {
        my $pidcount = keys %$pids;
        print "pidcount: $pidcount, parallelism: $parallelism\n";
        if ($pidcount < $parallelism) {
            print "starting \"job\" $i\n";
            my $pid = fork();
            if (defined $pid && $pid == 0) {
                # child: do the meat of the work here..
                my $t;
                do { $t = int (rand(3)); } until ($t>0);
                sleep $t;
                exit 0;
            } elsif (defined $pid && $pid > 0) {
                # parent: keep track of running processes
                $pids->{$pid} = 1;
                $proc_running++;
            } else {
                # TODO: count number of failures and exit after too many
                print "problem forking..Sleeping and retrying..\n";
                sleep 1;
                # no 'next' here: a do-block is not a loop, so 'next' would
                # advance the outer for loop and skip this job. Falling
                # through retries because $proc_running is still 0.
            }
        } else {
            my $proc_reclaimed = 0;
            for my $p (keys %$pids) {
                # be careful waiting on $p or substituting '0' for WNOHANG
                # waitpid returns 0 if no child has finished yet, -1 if
                # there are no children, otherwise the PID it reaped.
                my $ret = waitpid(-1, WNOHANG);
                if ($ret<0 || $ret>0) {
                    print "process $ret finished..\n";
                    delete $pids->{$ret};
                    $proc_reclaimed = 1;
                }
            }
            # only sleep if one or more process didn't finish.. this
            # allows us to spin off a new process if one's available
            # but keeps us from busy waiting if all the processes
            # are busy.
            unless ($proc_reclaimed) {
                print "sleeping: no processes reclaimed...\n";
                sleep 1;
            }
        }
    } until ($proc_running);
}
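One thing the prototype leaves out: when the for loop ends, up to $parallelism children can still be running. Here is a minimal sketch of waiting for them before the script exits, using the same WNOHANG polling as the prototype. The helper name drain_children and the three-job demo are my own assumptions for illustration, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

# Make sure exited children stay reapable (an ignored SIGCHLD would
# make waitpid return -1 instead of the child's PID).
$SIG{CHLD} = 'DEFAULT';

# Hypothetical helper: poll until every child tracked in the hash ref
# has been reaped. Returns the number of children reaped.
sub drain_children {
    my ($pids) = @_;
    my $reaped = 0;
    while (%$pids) {
        my $ret = waitpid(-1, WNOHANG);
        if ($ret > 0) {
            # a child finished; stop tracking it
            delete $pids->{$ret};
            $reaped++;
        } elsif ($ret < 0) {
            # no children left at all; nothing more to wait for
            last;
        } else {
            # children exist but none has finished yet
            sleep 1;
        }
    }
    return $reaped;
}

# Demo: fork three short-lived children, then drain them.
my %pids;
for my $job (1 .. 3) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        sleep 1;   # stand-in for real work
        exit 0;
    }
    $pids{$pid} = 1;
}
my $reaped = drain_children(\%pids);
print "reaped $reaped children\n";
```

In the prototype this would amount to calling the helper with $pids right after the closing brace of the for loop.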
My initial version did a waitpid($p, WNOHANG); I also tried waitpid(-1, 0). In both cases the load on the system went up very quickly, eventually locking the system completely. Searches with result sets of fewer than a few hundred don't exhibit the problem.
Running it through the debugger showed the child process finishing, and strace showed the process completing, yet ps still showed it running. The only symptom was high sys CPU utilization.
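For reference when comparing the waitpid variants, here is a small sketch of what waitpid(-1, WNOHANG) returns at each stage of a child's life. It only illustrates the documented return values and does not reproduce the load problem; the variable names and sleep times are arbitrary choices of mine.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

# Keep default SIGCHLD handling so the exited child can be reaped.
$SIG{CHLD} = 'DEFAULT';

my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    sleep 2;   # child: pretend to work
    exit 0;
}

# Child is still sleeping: a non-blocking wait returns 0.
my $while_running = waitpid(-1, WNOHANG);

sleep 3;       # give the child time to exit

# Child has exited: waitpid returns its PID and reaps it.
my $after_exit = waitpid(-1, WNOHANG);

# Nothing left to wait for: waitpid returns -1.
my $no_children = waitpid(-1, WNOHANG);

print "while running: $while_running\n";
print "after exit:    $after_exit (child pid was $pid)\n";
print "no children:   $no_children\n";
```

Replacing WNOHANG with 0 turns each call into a blocking wait, which is the substitution the comment in the prototype warns about.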