13 July 2010
A Batch Script for Hadoop on VMs
I've been benchmarking modifications to Hadoop in virtual machines lately, while others use the same server to benchmark their own code. Paul suggested I write a batch script we can use to run our benchmark jobs with the PBS batch scheduler, which is already set up on our server. The tricky part is that the VMs should not be running when we aren't using them, to avoid memory and cache contention with everyone else's benchmarks. So, I wrote the following script, which starts the VMs, formats HDFS, starts Hadoop, runs Hadoop commands from a file, shuts down Hadoop, and shuts down the VMs.
#!/bin/zsh
#PBS -N hadoop_job
#PBS -l nodes=1:ppn=8
#PBS -V
###############################################################################
#
# submit_hadoop.sh
# Adam Wolfe Gordon, June 2010
#
# Usage: ./submit_hadoop.sh
#
# Starts hadoop in virtual machines, runs the hadoop commands from a file
# called commands in the current (or PBS working) directory, then shuts down
# hadoop and the virtual machines.
#
# Useful for batch scheduled submission of VM-based hadoop jobs when
# benchmarking on a shared system.
#
# Relies on some zsh-isms, so probably don't run it with another shell.
#
###############################################################################
# Set these to where hadoop lives, and where your VMs live.
# Your VMs must be started by a script called run in $VM_HOME.
HADOOP_HOME=/local/data/awolfe/hadoop_stuff/hadoop-0.20.2/hadoop
VM_HOME=/local/data/awolfe/hadoop_stuff/ubuntu_vms
# If we were submitted with qsub, then go into our work directory.
# (The variable is quoted because an unquoted empty value makes [ -n ] true.)
if [ -n "$PBS_O_WORKDIR" ]; then
    cd "$PBS_O_WORKDIR";
fi;
# Make sure the commands file exists.
if [ ! -e commands ]; then
    echo "commands file not found. Aborting."
    exit 1;
fi;
# Start the virtual machines
$VM_HOME/run;
sleep 2;
# Wait for them to come up
for i in $(cat $HADOOP_HOME/conf/slaves); do
    while true; do
        nc -w0 $i 22;
        if [ $? = 0 ]; then break; fi;
    done;
done;
# Format the HDFS, since it will have gone away on the VMs.
yes Y | $HADOOP_HOME/bin/hadoop namenode -format;
# Start hadoop
$HADOOP_HOME/bin/start-all.sh;
# Wait for hadoop to come up
for i in $(cat $HADOOP_HOME/conf/slaves); do
    # HDFS: datanode data transfer port (50010), then datanode HTTP port (50075)
    while true; do
        nc -w0 $i 50010;
        if [ $? = 0 ]; then break; fi;
    done;
    while true; do
        nc -w0 $i 50075;
        if [ $? = 0 ]; then break; fi;
    done;
    # MR: tasktracker HTTP port (50060)
    while true; do
        nc -w0 $i 50060;
        if [ $? = 0 ]; then break; fi;
    done;
done;
# Now we can run our job
export TIMEFMT="TIME: %J -- %U user %S system %P cpu %*E total";
# Redirect stdout to the file first, then stderr, so both end up in output.txt.
(cat commands |
    while read cmd; do
        d="/usr/bin/time -f '%C -- %U user %S system %P cpu %e total' $HADOOP_HOME/bin/hadoop $cmd 2>&1";
        eval "$d";
    done;
) > output.txt 2>&1;
# Shut down hadoop
$HADOOP_HOME/bin/stop-all.sh
# Shut down the VMs - this requires passwordless ssh and passwordless sudo for /sbin/halt
for i in $(cat $HADOOP_HOME/conf/slaves); do
    ssh $i sudo /sbin/halt;
done;
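
One caveat with the polling loops in the script: they retry with no pause and never give up, so a VM or daemon that fails to start would hang the batch job. Below is a minimal sketch of how the repeated loops could be folded into one helper with a delay and an attempt limit. The names `retry_until` and `wait_for_port` are hypothetical and not part of the original script, the limit of 300 attempts is an arbitrary choice, and the `-z` flag assumes an nc build that supports connect-only probing.

```shell
# retry_until TRIES DELAY CMD [ARGS...]
# Hypothetical helper: runs CMD until it succeeds, sleeping DELAY seconds
# after each failed attempt. Returns 0 on success, or 1 if CMD has still
# not succeeded after TRIES attempts.
retry_until() {
    tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$tries" ]; do
        # Discard the probe's output; only its exit status matters.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}

# wait_for_port HOST PORT
# Polls a TCP port about once per second, giving up after 300 attempts.
# Assumes nc supports -z (connect without sending data).
wait_for_port() {
    retry_until 300 1 nc -z -w 1 "$1" "$2"
}

# Possible use in place of the ssh wait loop (aborts if a VM never comes up):
#   for i in $(cat $HADOOP_HOME/conf/slaves); do
#       wait_for_port $i 22 || exit 1
#   done
```

Keeping the probe command as an argument means the same helper covers the ssh port and the three Hadoop ports, and the job fails instead of occupying the queue forever when something does not come up.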