Platform LSF examples

When your script is ready, submit your batch job to LSF for scheduling as shown in the Running jobs documentation.

Batch script examples

Script for pure MPI job

To submit a simple MPI batch job, see Submitting jobs.

Script for hybrid job

Use one of these examples to create your batch script for submitting a hybrid MPI/OpenMP job to LSF.

Serial job script

#!/bin/bash
#
#BSUB -a serial # set serial operating environment
#BSUB -J job_name # job name
#BSUB -W 00:10 # wall-clock time (hrs:mins)
#BSUB -n 1 # number of tasks in job
#BSUB -q normal # queue
#BSUB -e serial.error.%J # error file name in which %J is replaced by the job ID
#BSUB -o serial.output.%J # output file name in which %J is replaced by the job ID
#BSUB -x # exclusive execution mode; the job runs alone on its hosts


./hello_seq # Executable name

OpenMP job script

#!/bin/bash
#
#BSUB -J job_name # job name
#BSUB -W 00:10 # wall-clock time (hrs:mins)
#BSUB -n 1 # number of tasks in job
#BSUB -q normal # queue
#BSUB -e omp.error.%J # error file name in which %J is replaced by the job ID
#BSUB -o omp.output.%J # output file name in which %J is replaced by the job ID
#BSUB -x # exclusive execution mode; the job runs alone on its hosts


export OMP_NUM_THREADS=4
export MP_TASK_AFFINITY=core:$OMP_NUM_THREADS


./hello # Executable name
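You can sanity-check the environment settings used above in a login shell before submitting. Note that OMP_NUM_THREADS must be exported first so that it expands correctly inside MP_TASK_AFFINITY:

```shell
# Set the thread count first, then derive the affinity string from it
export OMP_NUM_THREADS=4
export MP_TASK_AFFINITY=core:$OMP_NUM_THREADS
echo "$MP_TASK_AFFINITY"   # prints core:4
```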

Parallel MPI job script

#!/bin/bash
#BSUB -J job_name # job name
#BSUB -W 01:00 # wall-clock time (hrs:mins)
#BSUB -n 64 # number of tasks in job
#BSUB -R "span[ptile=16]" # run 16 MPI tasks per node
#BSUB -q normal # queue
#BSUB -e mpi.error.%J # error file name in which %J is replaced by the job ID
#BSUB -o mpi.output.%J # output file name in which %J is replaced by the job ID
#BSUB -x # exclusive execution mode; the job runs alone on its hosts


rm -f hostfile
cat $LSB_DJOB_HOSTFILE > hostfile


export SAVE_ALL_TASKS=no
export LD_LIBRARY_PATH=/gpfs1/home/Libs/INTEL/DAPL/dapl-2.1.10/lib:$LD_LIBRARY_PATH
ulimit -c unlimited
export I_MPI_HYDRA_BOOTSTRAP=ssh
export I_MPI_DEBUG=5
export I_MPI_DAPL_PROVIDER=ofa-v2-mlx4_0-1u
export I_MPI_FABRICS=shm:dapl
export DAPL_UCM_REP_TIME=2000
export DAPL_UCM_RTU_TIME=2000
export DAPL_UCM_CQ_SIZE=2000
export DAPL_UCM_QP_SIZE=2000
export DAPL_UCM_RETRY=7
export DAPL_ACK_RETRY=7
export DAPL_ACK_TIMER=20
export I_MPI_DAPL_UD=enable
export I_MPI_DAPL_UD_DIRECT_COPY_THRESHOLD=2097152
export I_MPI_FALLBACK=0
export FORT_BUFFERED=yes


/usr/bin/time -p mpiexec.hydra -f ./hostfile -perhost 16 -np 64 -genvall $EXECUTABLE_NAME
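The hostfile logic above can be previewed outside the batch system with a stand-in host list (the host names below are hypothetical; under LSF the list comes from $LSB_DJOB_HOSTFILE, one line per allocated slot):

```shell
# Simulate the LSF-provided host file: one line per allocated slot
printf 'node01\nnode01\nnode02\nnode02\n' > hostfile
# Count the unique hosts to see how many nodes the job would span;
# mpiexec.hydra's -perhost option controls how many ranks land on each
sort -u hostfile | wc -l   # prints 2
```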

Parallel hybrid (MPI+OpenMP) job script

#!/bin/bash
#BSUB -J job_name # job name
#BSUB -W 01:00 # wall-clock time (hrs:mins)
#BSUB -n 64 # number of tasks in job
#BSUB -R "span[ptile=4]" # run four MPI tasks per node
#BSUB -q normal # queue
#BSUB -e hybrid.error.%J # error file name in which %J is replaced by the job ID
#BSUB -o hybrid.output.%J # output file name in which %J is replaced by the job ID
#BSUB -x # exclusive execution mode; the job runs alone on its hosts


rm -f hostfile
cat $LSB_DJOB_HOSTFILE > hostfile


export SAVE_ALL_TASKS=no
export LD_LIBRARY_PATH=/gpfs1/home/Libs/INTEL/DAPL/dapl-2.1.10/lib:$LD_LIBRARY_PATH
ulimit -c unlimited
export I_MPI_HYDRA_BOOTSTRAP=ssh
export I_MPI_DEBUG=5
export I_MPI_DAPL_PROVIDER=ofa-v2-mlx4_0-1u
export I_MPI_FABRICS=shm:dapl
export DAPL_UCM_REP_TIME=2000
export DAPL_UCM_RTU_TIME=2000
export DAPL_UCM_CQ_SIZE=2000
export DAPL_UCM_QP_SIZE=2000
export DAPL_UCM_RETRY=7
export DAPL_ACK_RETRY=7
export DAPL_ACK_TIMER=20
export I_MPI_DAPL_UD=enable
export I_MPI_DAPL_UD_DIRECT_COPY_THRESHOLD=2097152
export OMP_NUM_THREADS=4
export I_MPI_FALLBACK=0
export OMP_STACKSIZE=16000000
export KMP_AFFINITY=verbose,scatter
export FORT_BUFFERED=yes


/usr/bin/time -p mpiexec.hydra -f ./hostfile -perhost 4 -np 64 -genvall $EXECUTABLE_NAME

For tcsh users

#!/bin/tcsh
#
#BSUB -a poe # set parallel operating environment
#BSUB -P project_code # project code
#BSUB -J hybrid_job_name # job name
#BSUB -W 00:10 # wall-clock time (hrs:mins)
#BSUB -n 32 # number of tasks in job
#BSUB -R "span[ptile=4]" # run four MPI tasks per node
#BSUB -q regular # queue
#BSUB -e errors.%J.hybrid # error file name in which %J is replaced by the job ID
#BSUB -o output.%J.hybrid # output file name in which %J is replaced by the job ID

setenv OMP_NUM_THREADS 4
setenv MP_TASK_AFFINITY core:$OMP_NUM_THREADS

mpirun.lsf ./program_name.exe

For bash users

#!/bin/bash
#
#BSUB -a poe # set parallel operating environment
#BSUB -P project_code # project code
#BSUB -J hybrid_job_name # job name
#BSUB -W 00:10 # wall-clock time (hrs:mins)
#BSUB -n 32 # number of tasks in job
#BSUB -R "span[ptile=4]" # run four MPI tasks per node
#BSUB -q regular # queue
#BSUB -e errors.%J.hybrid # error file name in which %J is replaced by the job ID
#BSUB -o output.%J.hybrid # output file name in which %J is replaced by the job ID

export OMP_NUM_THREADS=4
export MP_TASK_AFFINITY=core:$OMP_NUM_THREADS

mpirun.lsf ./program_name.exe

Batch script alternatives

You are not restricted to using regular login shells such as tcsh and bash for writing batch scripts for submitting jobs through Platform LSF. Using more powerful languages can make some things—regular expressions, file manipulations, and so on—easier to handle inside a script.

Here are examples using Perl and Python.


LSF script in Perl

#!/usr/bin/perl -w
use strict;
#
#BSUB -W 00:10
#BSUB -q regular
#BSUB -n 12
#BSUB -J myjob.pl
#BSUB -P project_code
#BSUB -oo output


system("mpirun.lsf ./hello");

LSF script in Python

#!/usr/bin/python
import os
#
#BSUB -W 00:10
#BSUB -q regular
#BSUB -n 12
#BSUB -J myjob.py
#BSUB -P project_code
#BSUB -oo output


os.system("mpirun.lsf ./hello")

Dependent jobs

It is possible to schedule jobs to run only after other jobs have either started running or completed their runs. For example, you might schedule a preprocessing job to run; start a computation job when the preprocessing job is complete; then start a post-processing job when the computation is done.

To schedule such a series, use bsub -w [job-dependency-expression] to specify the job dependencies you need.

To illustrate, let's say you have three jobs to run:

pre.lsf: a preprocessing job
main.lsf: a computation job
post.lsf: a post-processing job

The main job can be run only when the preprocessing job finishes, and the post-processing job can be run only when the computation job finishes.

Dependent jobs example

To schedule the jobs described above to run in sequence, follow this example.

bsub < pre.lsf

You will see a message like this when your job is submitted successfully:

Job <1440> is submitted to default queue <regular>.

Note the job ID, which you will need for the next step.

bsub -w "done(1440)" < main.lsf

You will see a message like this when the main job is submitted successfully:

Job <1441> is submitted to default queue <regular>.

Again, note the job ID for the next step.

bsub -w "done(1441)" < post.lsf

You will see a message like this when your job is submitted successfully:

Job <1442> is submitted to default queue <regular>.
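The three submissions above can also be scripted by capturing each job ID from the submission message instead of copying it by hand. This is a sketch; it assumes bsub prints a "Job <ID> is submitted ..." message as shown above:

```shell
# Extract the numeric job ID from a bsub submission message
get_id() { sed -n 's/^Job <\{0,1\}\([0-9][0-9]*\)>\{0,1\}.*/\1/p'; }

pre_id=$(bsub < pre.lsf 2>&1 | get_id)
main_id=$(bsub -w "done(${pre_id})" < main.lsf 2>&1 | get_id)
bsub -w "done(${main_id})" < post.lsf
```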

Use bjobs to check the status of all of your jobs, or query an individual job in detail:

bjobs
bjobs -l 1441

A dependent job typically shows as pending because the job on which it depends has not yet met the required condition.

Conditions and operators

Frequently used dependency conditions include the following:

done(job_ID | "job_name"): the job completed successfully (DONE state)
ended(job_ID | "job_name"): the job finished, regardless of its exit status
started(job_ID | "job_name"): the job has started running
exit(job_ID | "job_name"): the job terminated abnormally (EXIT state)

LSF also supports the logic operators && (AND), || (OR), and ! (NOT), which can be combined with parentheses for more complicated job-dependency control. For example, this submission runs post.lsf only after jobs 1440 and 1441 both complete successfully:

bsub -w "done(1440) && done(1441)" < post.lsf


Using array syntax

Array syntax is an efficient way to submit multiple jobs simultaneously. It is particularly useful when submitting large numbers of jobs—for ensemble forecasting and forecast evaluation, for example.

Here, the array [1-10] indicates that you are submitting the myjob.csh job script 10 times.

bsub -J "myjob[1-10]" < myjob.csh

The output of a subsequent bjobs command shows that all of the array elements share the same job ID, with the element number appended to the job name.

Script example

When you submit a job array, LSF stores each element's index in the $LSB_JOBINDEX environment variable, as shown in this sample script. This enables you to run an ensemble of forecasts by submitting one job and using $LSB_JOBINDEX as the ensemble member number.

#!/bin/csh
#
#BSUB -W 1
#BSUB -q regular
#BSUB -n 1
#BSUB -P project_code
#BSUB -R "span[ptile=32]"
#BSUB -J fcst
#BSUB -o wrf.%I
#BSUB -e wrf.%I

set mem = $LSB_JOBINDEX

touch mem_${mem}

exit 0
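To see what a given element will do without going through LSF, you can stub the index the scheduler would provide. This is a quick sh sketch; LSB_JOBINDEX is set by LSF at run time, and the value 3 below is illustrative:

```shell
# Stub the element index LSF would set, then reproduce the naming scheme
LSB_JOBINDEX=3
mem=$LSB_JOBINDEX
echo "mem_${mem}"   # prints mem_3, the per-member file the script would create
```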

Array elements

Array elements do not have to be numbered consecutively as in the [1-10] example above.

Here are two alternatives:

bsub -J "myjob[1,3,7-9]" < myjob.csh # selected elements only
bsub -J "myjob[1-10:2]" < myjob.csh # elements 1 through 10 with a step of 2 (1, 3, 5, 7, 9)

Killing jobs

To kill an entire array, just specify the single job number:

bkill 123456

To kill a single element of an array, include the element number. In this example, you're killing job myjob[5].

bkill "123456[5]"

Limiting the number of concurrent jobs

When it's desirable to limit how many array members can run concurrently, specify the limit by adding %val in the submission. In this example, no more than 15 jobs will run simultaneously.

bsub -J "myjob[1-50]%15" < myjob.sh

Dependency conditions

Arrays can be used in dependency conditions. In this example, myjob2 won't execute until all elements of myjob1 are done.

bsub -J "myjob[1-10]" < myjob1.sh
bsub -w "ended(myjob[1-10])" -J "myjob2[1-10]" < myjob2.sh