Multiple Serial Jobs
How to submit multiple serial jobs over more than a single node:
Users sometimes need to submit a large number of independent serial jobs as a single batch. Rather than using a script that calls bsub repeatedly, you can use a self-scheduling utility (“selfsched”) to bundle the serial jobs and schedule them across more than one node with a single bsub command.
Usage:
In your batch script, load the self-scheduler module and run it in parallel mode through the MPI wrapper (“mpirun”):
module load selfsched
mpirun selfsched < YourInputForSelfScheduler
where YourInputForSelfScheduler is a file containing one serial job command per line, such as:
/my/bin/path/Exec_1 < my_input_parameters_1 > output_1.log
/my/bin/path/Exec_2 < my_input_parameters_2 > output_2.log
/my/bin/path/Exec_3 < my_input_parameters_3 > output_3.log
. . .
Each line is limited to 2048 characters, and TAB characters are not allowed.
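The two limits above can be checked before submitting. The following is a minimal sketch, not part of selfsched: check_selfsched_input and demo_input.txt are hypothetical names chosen for illustration, and the check assumes the 2048-character limit applies to the line without its trailing newline.

```shell
#!/bin/bash
# Hypothetical helper (not shipped with selfsched): report every line that is
# longer than 2048 characters or contains a TAB, and return non-zero if any
# such line is found.
check_selfsched_input() {
    awk '
        length($0) > 2048 {
            printf "line %d: %d characters (limit 2048)\n", NR, length($0)
            bad = 1
        }
        index($0, "\t") {
            printf "line %d: contains a TAB\n", NR
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Small demonstration file using the command pattern shown above.
printf '%s\n' \
    '/my/bin/path/Exec_1 < my_input_parameters_1 > output_1.log' \
    '/my/bin/path/Exec_2 < my_input_parameters_2 > output_2.log' \
    > demo_input.txt

if check_selfsched_input demo_input.txt; then
    echo "demo_input.txt: no over-long lines or TABs found"
fi
```

Running the same function on YourInputForSelfScheduler before calling bsub may catch formatting problems early.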
Please note that one of the compute cores is used to monitor and schedule the serial jobs over the remaining cores, so the number of cores available for the actual computation is the total number of cores assigned minus 1.
A simple utility (“PrepINP”) is also provided to help generate the YourInputForSelfScheduler file. The self-scheduler module has to be loaded before PrepINP can be used.
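As an illustration of the pieces described so far, the sketch below writes a possible LSF batch script to a file and works out how many cores remain for computation. The job name, the core count of 48, the wall-time limit and the output file name are assumptions made for this example; your site may also require options such as a queue or project, which are left out here.

```shell
#!/bin/bash
# Assumed values for illustration only.
total_cores=48
workers=$((total_cores - 1))   # one core is reserved for the scheduler itself

# Write a hypothetical batch script. The two command lines at the end are the
# ones given in the usage section; the #BSUB options are example settings.
cat > selfsched_job.lsf <<EOF
#!/bin/bash
#BSUB -J selfsched_demo
#BSUB -n ${total_cores}
#BSUB -W 02:00
#BSUB -o selfsched_demo.%J.out

module load selfsched
mpirun selfsched < YourInputForSelfScheduler
EOF

echo "cores requested: ${total_cores}, cores running serial jobs: ${workers}"

# Submit on the cluster with:
#   bsub < selfsched_job.lsf
```

With 48 cores requested, 47 run serial jobs at any one time, so it is worth accounting for the scheduler core when choosing the core count.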
Usage:
module load selfsched
PrepINP < templ.txt > YourInputForSelfScheduler
templ.txt is a template in which each numeric field of the command is replaced by “#”. PrepINP substitutes the generated numbers for “#” to produce the YourInputForSelfScheduler file.
Example 1:
1 10000 2 F ← start, end, stride, fixed field length?
/my/bin/path/Exec_# < my_input_parameters_# > output_#.log
The output will be
/my/bin/path/Exec_1 < my_input_parameters_1 > output_1.log
/my/bin/path/Exec_3 < my_input_parameters_3 > output_3.log
/my/bin/path/Exec_5 < my_input_parameters_5 > output_5.log
. . .
/my/bin/path/Exec_9999 < my_input_parameters_9999 > output_9999.log
Example 2:
1 10000 1 T ← start, end, stride, fixed field length?
5 ← field length
/my/bin/path/Exec_# < my_input_parameters_# > output_#.log
The output will be
/my/bin/path/Exec_00001 < my_input_parameters_00001 > output_00001.log
/my/bin/path/Exec_00002 < my_input_parameters_00002 > output_00002.log
/my/bin/path/Exec_00003 < my_input_parameters_00003 > output_00003.log
. . .
/my/bin/path/Exec_10000 < my_input_parameters_10000 > output_10000.log
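For comparison, the same two command lists can be produced with a plain shell loop. This is a sketch of an alternative, not a description of how PrepINP works internally; the output file names input_example1.txt and input_example2.txt are hypothetical.

```shell
#!/bin/bash
# Equivalent of Example 1: numbers 1 to 10000 in steps of 2, no padding.
for i in $(seq 1 2 10000); do
    printf '/my/bin/path/Exec_%d < my_input_parameters_%d > output_%d.log\n' \
        "$i" "$i" "$i"
done > input_example1.txt

# Equivalent of Example 2: numbers 1 to 10000 in steps of 1, zero-padded to a
# fixed field length of 5 digits.
for i in $(seq 1 10000); do
    printf '/my/bin/path/Exec_%05d < my_input_parameters_%05d > output_%05d.log\n' \
        "$i" "$i" "$i"
done > input_example2.txt

# Show the first and last generated command of each file.
head -n 1 input_example1.txt
tail -n 1 input_example1.txt
head -n 1 input_example2.txt
tail -n 1 input_example2.txt
```

A loop like this may be convenient when the command pattern needs more than a single numeric substitution; for the simple cases shown above, PrepINP does the same job from a short template.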