DYNAMITE INSTALLATION AND USE IN ROCKS 2.3.2 CLUSTERS HOWTO
===========================================================
(By Jose Luis Guisado, University of Extremadura, Spain, Aug-2005)
E-mail: jlguisado at unex.es

Dynamite is an automated load balancing system that migrates the tasks
of a parallel program running under PVM.
For more information and downloading:
http://www.science.uva.nl/research/scs/Software/dynamite/index.html


*DYNAMITE INSTALLATION:
-----------------------

- Create directories ~/dynamite and ~/dynamite/distribution

- Download distribution tar files in ~/dynamite/distribution:
    - Checkpointer source:     dynckpt-RH-2.0.2-bin.tar.gz
    - Checkpointer testsuite:  ckpt-test.tar.gz
    - DPVM archive:            dpvm-2.0.tar.gz
    - DPVM testsuite:          dpvm-test.tar.gz
    - Monitor/scheduler:       monitor.tar.gz

- Unpack everything in ~/dynamite

1) DYNAMIC PVM (dpvm) INSTALLATION:

- Add the following lines to ~/.bashrc

    # Added to setup Dynamite:
    # define PVM-related environment variables
    export PVM_ROOT=$HOME/dynamite/dpvm-2.0
    export PVM_RSH=/usr/bin/ssh
    export PVM_ARCH=LINUX
    export PVM_DPATH=$PVM_ROOT/lib/pvmd
    # define Dynamite-related environment variables
    # Directory where the checkpoint files are stored:
    export DPVM_CKPTDIR=$HOME/dynamite/dynckpt-RH-2.0.2-bin/std
    # define monitor-related environment variables
    export HOST=$HOSTNAME
    # add PVM commands directory to your shell path
    export PATH=$PATH:$PVM_ROOT/lib              # generic
    export PATH=$PATH:$PVM_ROOT/lib/$PVM_ARCH    # arch-specific
    # add PVM executable directory to your shell path
    export PATH=$PATH:$PVM_ROOT/bin/$PVM_ARCH
    # add PVM man pages directory to MANPATH
    export MANPATH=$MANPATH:$PVM_ROOT/man
    # aliases and functions
    alias pvm='$PVM_ROOT/lib/pvm'

- Execute ~/.bash_profile (which executes ~/.bashrc):

    $ source ~/.bash_profile

- Modify ~/dynamite/dpvm-2.0/Makefile:

    PVM_ARCH = LINUX
    ...
    CC = /opt/gcc32/bin/gcc

  (The last line selects gcc version 3.2 instead of the standard gcc
  version of Rocks 2.3.2 (gcc 2.96), which is a faulty development
  version.)

- Create a directory to store the output messages from compilation:

    $ mkdir ~/dynamite/output

- Compile DPVM:

    $ cd ~/dynamite/dpvm-2.0
    $ make 2>&1 | tee ../output/dpvm-2.0.out
    $ make clean

2) CHECKPOINTER INSTALLATION:

- After unpacking, the new versions of the dynamic loader (ld.so) with
  checkpointing support are ready to use. (They come in the archive file
  dynckpt-RH-2.0.2-bin.tar.gz.)

3) MONITOR INSTALLATION:

- Modify ~/dynamite/monitor/Makefile as seen here:

    [jlguisado@abacus monitor]$ diff Makefile Makefile.orig
    1c1
    < # ARCH = SUN4SOL2
    ---
    > ARCH = SUN4SOL2
    3,4c3
    < ARCH = LINUX
    < OSTYPE = linux
    ---
    > # ARCH = LINUX
    7,9c6
    < #CC = /opt/gcc32/bin/g++
    < OPTIONS = -I$(PVM_ROOT)/include -I/usr/i386-glibc21-linux/include # -DDO_DUMPS #-DDEBUG
    < #OPTIONS = -I$(PVM_ROOT)/include # -DDO_DUMPS #-DDEBUG
    ---
    > OPTIONS = -I$(PVM_ROOT)/include # -DDO_DUMPS #-DDEBUG
    13c10
    < # LIBS = -lpthread -lpvm3 -lsocket -lnsl
    ---
    > LIBS = -lpthread -lpvm3 -lsocket -lnsl
    15c12
    < LIBS = -lpthread -lpvm3 -lnsl
    ---
    > # LIBS = -lpthread -lpvm3 -lnsl
    43a41,44
    > mon_slave: mon_slave.o mon_util.o mon_global.o
    > $(CC) $(CFLAGS) -o mon_slave mon_slave.o mon_global.o mon_util.o \
    > -L$(PVM_ROOT)/lib/$(PVM_ARCH) $(LIBS) -lkstat
    >
    45,46c46,47
    < # $(CC) $(CFLAGS) -o mon_slave mon_slave.o mon_global.o mon_util.o \
    < # -L$(PVM_ROOT)/lib/$(PVM_ARCH) $(LIBS) -lkstat
    ---
    > # $(CC) $(CFLAGS) -o mon_slave mon_slave.o mon_global.o mon_util.o \
    > # -L$(PVM_ROOT)/lib/$(PVM_ARCH) $(LIBS)
    48,50d48
    < mon_slave: mon_slave.o mon_util.o mon_global.o
    < $(CC) $(CFLAGS) -o mon_slave mon_slave.o mon_global.o mon_util.o \
    < -L$(PVM_ROOT)/lib/$(PVM_ARCH) $(LIBS)

- Compile the monitor:

    $ cd ~/dynamite/monitor
    $ make 2>&1 | tee ../output/monitor.out


*DYNAMITE USE:
--------------

- ENABLE RSH IN THE CLUSTER:

  In order to use Dynamite, you must enable RSH on the master and
  compute nodes.
  a) Enable rsh in the compute nodes following the Rocks documentation
     (section 4.9):

     - Modify the default kickstart graph: edit the file
       /home/install/profiles/2.3.2/graphs/default/rsh.xml
       and uncomment the block of code indicated in that section.

     Now re-install your compute nodes to pick up the changes.

  b) Enable rsh in the master node:

     Edit the file /etc/xinetd.d/rsh and change the "disable" setting
     to "= no".

       # /etc/init.d/xinetd restart
       # cp /usr/bin/rsh /usr/bin/rsh.ant
       # mv /usr/bin/rsh.bak /usr/bin/rsh

- Link your PVM application with the Dynamite dynamic loader and with
  the replacement Dynamite PVM library. Example of Makefile.aimk:

    # Makefile.aimk for PVM example programs.
    #
    # Set PVM_ROOT to the path where PVM includes and libraries are installed.
    # Set PVM_ARCH to your architecture type (SUN4, HP9K, RS6K, SGI, etc.)
    # Set ARCHLIB to any special libs needed on PVM_ARCH (-lrpc, -lsocket, etc.)
    # otherwise leave ARCHLIB blank
    #
    # PVM_ARCH and ARCHLIB are set for you if you use "$PVM_ROOT/lib/aimk"
    # instead of "make".
    #
    # aimk also creates a $PVM_ARCH directory below this one and will cd to it
    # before invoking make - this allows building in parallel on different arches.
    #

    SDIR = ..
    #BDIR = $(HOME)/pvm3/bin
    BDIR = $(SDIR)/../../bin
    XDIR = $(BDIR)/$(PVM_ARCH)

    OPTIONS = -O
    CFLAGS = $(OPTIONS) -I../../../include $(ARCHCFLAGS)

    DYNAMIC_LOADER= $(HOME)/dynamite/dynckpt-RH-2.0.2-bin/std/ld.so

    LIBS = -lpvm3 $(ARCHLIB)
    GLIBS = -lgpvm3

    #F77 = f77
    FORT = `case "$(FC)@$(F77)" in *@) echo $(FC) ;; @*) echo $(F77) ;; *) echo f77;; esac`
    FFLAGS = -g $(ARCHFFLAGS)
    FLIBS = -lfpvm3
    #LFLAGS = $(LOPT) -L../../lib/$(PVM_ARCH)
    LFLAGS = $(LOPT) -L../../../lib/$(PVM_ARCH) -Wl,-dynamic-linker,$(DYNAMIC_LOADER) -rdynamic -L/lib -Wl,-rpath,/lib

    CPROGS = gexample hello hello_other master1 slave1 spmd \
             timing timing_slave nntime

    FPROGS = fgexample fmaster1 fslave1 fspmd hitc hitc_slave testall

    default: hello hello_other

    all: c-all f-all

    c-all: $(CPROGS)

    f-all: $(FPROGS)

    clean:
            rm -f *.o $(CPROGS) $(FPROGS)

    $(XDIR):
            - mkdir $(BDIR)
            - mkdir $(XDIR)

    hello: $(SDIR)/hello.c $(XDIR)
            $(CC) $(CFLAGS) -o $@ $(SDIR)/hello.c $(LFLAGS) $(LIBS)
            mv $@ $(XDIR)

    hello_other: $(SDIR)/hello_other.c $(XDIR)
            $(CC) $(CFLAGS) -o $@ $(SDIR)/hello_other.c $(LFLAGS) $(LIBS)
            mv $@ $(XDIR)

- Initialize Dynamic PVM by invoking the PVM console (using the Dynamite
  version!):

    Start the DPVM console:

      $ ~/dynamite/dpvm-2.0/lib/LINUX/pvm

    Add nodes:

      pvm> add compute-0-0 compute-0-1 ...

- Start your PVM application as usual.

- Perform migrations manually using the "move" command in the PVM console.

- Alternatively, you can start the monitor to migrate tasks automatically:

    - First, modify the resource file (resource.txt) to reflect your
      environment. For instance:

        [options]
        MAX_SEARCH_SECS = 10
        START_SLAVE_COMMAND = /home/jlgldynamite/dynamite/monitor/mon_slave
        CPU_WEIGHT = 1.0
        MIG_WEIGHT = 0.3
        DUMP_DIR = /home/jlgldynamite/dynamite/monitor/dumps

        [hosts]
        #Host          IP address        CPU   MEM
        abacus         10.1.1.1
        compute-0-0    10.255.255.254
        compute-0-1    10.255.255.253
        compute-0-2    10.255.255.252

    - Initialize the DPVM console and add hosts.
    - Start the Dynamite monitor:

        $ ~/dynamite/monitor/monitor /home/jlgldynamite/dynamite/monitor/resource.txt

    - Start your application.

    - The monitor can be started independently of DPVM, but it needs
      DPVM to do anything useful, because it needs a list of PVM tasks
      to monitor. Once both PVM and the monitor are running, "ps a" in
      the DPVM console should display some extra tasks, one per node:
      they are the slave monitor tasks.
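
- Optional sanity check before starting the monitor. The steps above rely
  on several environment variables and files being in place. The following
  is a minimal sketch of a shell function that verifies them, assuming the
  directory layout used in this HOWTO. The function name
  check_dynamite_setup and its optional argument (the monitor directory)
  are hypothetical and not part of Dynamite; the assumption that
  resource.txt lives in the monitor directory follows the example path
  used above.

```shell
# Hypothetical helper (not part of Dynamite): reports anything this HOWTO
# expects that is missing. Returns 0 if everything is found, 1 otherwise.
check_dynamite_setup() {
    # Optional argument: monitor directory (assumed default from this HOWTO).
    dyn_monitor_dir="${1:-$HOME/dynamite/monitor}"
    dyn_missing=0

    # Environment variables defined in ~/.bashrc during installation.
    for dyn_var in PVM_ROOT PVM_ARCH PVM_DPATH DPVM_CKPTDIR; do
        eval "dyn_value=\${$dyn_var:-}"
        if [ -z "$dyn_value" ]; then
            echo "MISSING variable: $dyn_var"
            dyn_missing=1
        fi
    done

    # Files referenced in the steps above: DPVM console, checkpointing
    # dynamic loader, monitor binaries and the monitor resource file.
    for dyn_path in \
        "$PVM_ROOT/lib/$PVM_ARCH/pvm" \
        "$DPVM_CKPTDIR/ld.so" \
        "$dyn_monitor_dir/monitor" \
        "$dyn_monitor_dir/mon_slave" \
        "$dyn_monitor_dir/resource.txt"
    do
        if [ ! -e "$dyn_path" ]; then
            echo "MISSING file: $dyn_path"
            dyn_missing=1
        fi
    done

    if [ "$dyn_missing" -eq 0 ]; then
        echo "Dynamite setup looks complete"
    fi
    return "$dyn_missing"
}
```

  After sourcing a file containing the function, run check_dynamite_setup
  with no arguments to check the default layout, or pass the monitor
  directory explicitly if it lives elsewhere. The check only tests that the
  files exist; it does not verify that they were built correctly.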