13 July 2010
A Batch Script for Hadoop on VMs
I've been benchmarking modifications to Hadoop in virtual machines lately, while others use the same server to benchmark their own code. Paul suggested I write a batch script we can use to run our benchmark jobs with the PBS batch scheduler, which is already set up on our server. The tricky part is that, ideally, the VMs should not be running when we're not using them, to avoid memory and cache contention. So I wrote the following script, which starts the VMs, starts Hadoop, runs Hadoop commands from a file, shuts down Hadoop, and shuts down the VMs.
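A minimal sketch of that start/run/stop flow could look like the code below. It is an illustration under assumptions, not a definitive implementation: it assumes the VMs are libvirt domains controlled with `virsh`, that the Hadoop master VM is reachable over passwordless `ssh`, and that Hadoop's `start-all.sh` and `stop-all.sh` live under `$HADOOP_HOME/bin`. The VM names, the `MASTER` host, the `/opt/hadoop` path, and the `COMMANDS_FILE` and `DRY_RUN` variables are all hypothetical. By default the sketch only prints what it would do.

```shell
#!/bin/bash
#PBS -N hadoop-bench
#PBS -l nodes=1
# Sketch only: VM names, paths, and the virsh/ssh setup are assumptions.

VMS="${VMS:-hadoop-vm1 hadoop-vm2}"        # hypothetical libvirt domain names
MASTER="${MASTER:-hadoop-vm1}"             # hypothetical hostname of the Hadoop master VM
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"  # hypothetical install path inside the VMs
DRY_RUN="${DRY_RUN:-1}"                    # 1 = print commands only; 0 = execute them

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

start_vms() {
    for vm in $VMS; do run virsh start "$vm"; done
}

stop_vms() {
    for vm in $VMS; do run virsh shutdown "$vm"; done
}

# Run each non-empty, non-comment line of the file as a hadoop command on the master.
run_jobs() {
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in ''|'#'*) continue ;; esac
        # -n keeps ssh from swallowing the rest of the commands file on stdin
        run ssh -n "$MASTER" "$HADOOP_HOME/bin/hadoop $line"
    done < "$1"
}

main() {
    start_vms
    # A real run would need to wait here until the VMs have booted and accept ssh.
    run ssh -n "$MASTER" "$HADOOP_HOME/bin/start-all.sh"
    run_jobs "$1"
    run ssh -n "$MASTER" "$HADOOP_HOME/bin/stop-all.sh"
    stop_vms
}

# Only act when a commands file was passed in through the environment.
if [ -n "${COMMANDS_FILE:-}" ]; then
    main "$COMMANDS_FILE"
fi
```

Assuming the sketch is saved as `hadoop-bench.sh` and the commands file contains lines such as `fs -ls /`, it could be submitted with something like `qsub -v COMMANDS_FILE=$HOME/jobs.txt,DRY_RUN=0 hadoop-bench.sh`. Shutting down Hadoop and the VMs unconditionally at the end, even if a job fails, is what keeps idle VMs from competing with other people's benchmarks.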