# Workload Manager
On HIMster 2 / Mogon 2, load the following module first
```bash
module load lang/Python/3.6.6-foss-2018b
```
On a single node run with
```bash
mpirun -n 4 ./wkmgr.py -v date
```
or on multiple nodes:
```bash
#reserve resources
salloc -p parallel --reservation=himkurs -A m2_himkurs -N 1 -t 1:00:00
#load modules for demo analysis and MPI4Py
module load math/SUNDIALS/2.7.0-intel-2018.03
module load lang/Python/3.6.6-foss-2018b
#run
srun -n 20 ~/workload-manager/wkmgr.py -v -i ~/workload-manager/examples/LGS/Run27_LaPalma_Profile_I50 ~/workload-manager/examples/LGS/PulsedLGS
```
To use the loader (untested on the cluster so far), replace `wkmgr.py` with `wkloader.py`.
## Hints
- Written in Python, which makes it simple for users to adapt
- Only one disadvantage: the number of ranks is fixed during runtime, in contrast to SLURM jobs.
## Usage
First steps (aka hello world):
1. On HIMster 2 / Mogon 2, load the following module first
```bash
module load lang/Python/3.6.6-foss-2018b
```
to enable Python 3.6 and MPI4Py support. You can also add this line to your `~/.bashrc` configuration file so it is loaded automatically when you log in again.
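One way to do that (a minimal sketch; it simply appends the module command to your shell start-up file, run it once):
```bash
# append the module load to ~/.bashrc so it runs in every new login shell
echo 'module load lang/Python/3.6.6-foss-2018b' >> ~/.bashrc
```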
2. Next, test the parameters for the workload manager. To do so, run short tests (with the dry-run option) on the head node. See the next chapter for more examples with different parameters.
* On a head node run with
```bash
./wkmgr.py -n [YOUR EXECUTABLE]
```
* Or reserve a dedicated node for this purpose first, e.g.
```bash
salloc -p devel -A m2_him_exp -N 1 -t 1:30:00
#or during a tutorial
salloc -p parallel --reservation=himkurs -A m2_himkurs -N 1 -t 1:30:00
#and do some test runs like in the head node case
```
3. Once you have found the right launcher arguments, submit the job with
```bash
#load modules for demo analysis and MPI4Py
module purge
module load math/SUNDIALS/2.7.0-intel-2018.03
module load lang/Python/3.6.6-foss-2018b
#run some example provided in the git repository
srun -n 20 ~/workload-manager/wkmgr.py -v -i ~/workload-manager/examples/LGS/Run27_LaPalma_Profile_I50 ~/workload-manager/examples/LGS/PulsedLGS
```
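If you prefer to submit non-interactively, the same run can be wrapped in a batch script (a sketch only; the partition, account and resource values are copied from the examples above and may need adjusting for your project):
```bash
#!/bin/bash
#SBATCH -p parallel
#SBATCH -A m2_himkurs
#SBATCH -N 1
#SBATCH -t 1:00:00
#load modules for the demo analysis and MPI4Py
module purge
module load math/SUNDIALS/2.7.0-intel-2018.03
module load lang/Python/3.6.6-foss-2018b
#run the example provided in the git repository
srun -n 20 ~/workload-manager/wkmgr.py -v -i ~/workload-manager/examples/LGS/Run27_LaPalma_Profile_I50 ~/workload-manager/examples/LGS/PulsedLGS
```
Submit it with `sbatch myjob.sh` and check its progress with `squeue -u $USER`.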
- Missing topic: How to identify the right number of cores
### Examples
#### Input / Output File Example
Task: Run the analysis binary for each input file in `MyInputDir` on 20 cores
```bash
srun -n 20 ~/workload-manager/wkmgr.py -i MyInputDir MyAnalysisBinary
```
Check for `output*` in the current directory for the results, or change this behaviour with the `-o` option.
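For example (a sketch; assuming `-o` takes a target output directory):
```bash
srun -n 20 ~/workload-manager/wkmgr.py -i MyInputDir -o MyOutputDir MyAnalysisBinary
```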
Provide the `-s` option if you do not want the default placeholders (`{inputdir}{inputfilename} {outputdir}{jobid}/outfile.txt`) appended to the execname:
```bash
srun -n 20 ~/workload-manager/wkmgr.py -s MyAnalysisBinary
```
#### Parameters Example
Task: Run the analysis binary for theta from 0 to 180 degrees in 2.5 degree steps and energy from 130 to 150 MeV in 2 MeV steps, with no input files, on 20 cores
```bash
srun -n 20 ~/workload-manager/wkmgr.py -a theta,0,180,2.5 -a energy,130,150,2 -ni MyAnalysisBinary {theta} {energy}
```
- Note: The `-s` option is automatically activated once you add a '{' character in your execname statement.
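For comparison, a plain serial shell loop doing the same sweep would look roughly like the sketch below; the workload manager distributes these individual calls over the available ranks instead of running them one after another:
```bash
# illustrative serial equivalent of the parameter sweep above (not how wkmgr runs it)
for theta in $(seq 0 2.5 180); do
  for energy in $(seq 130 2 150); do
    MyAnalysisBinary "$theta" "$energy"
  done
done
```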
#### Example Shell Environment
Your bash environment is also active within the workload-manager jobs. This enables you to do more sophisticated calls like:
```bash
#quote the whole command so the shell passes the && chain to the workload manager instead of splitting it
srun -n 20 ~/workload-manager/wkmgr.py -a theta,0,180,2.5 -a energy,130,150,2 -ni "MyAnalysisBinaryPrepare && MyAnalysisBinary {theta} {energy} && MyAnalysisBinaryAfter"
```
#### Change Verbosity
Get info with `-v`, or even more with `-vv`. Sample output:
```bash
2019-05-27 16:27:14,842 rank15 INFO Worker startet job.
```
which tells you the time, the rank/worker, the log level (debug/info/warning/error) and the message text.
Please note that output from the individual workers/ranks is not necessarily displayed in order, whereas the order within a single rank is consistent.
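If the interleaved output gets confusing, you can redirect it to a file and filter per rank with standard tools (a sketch, not a wkmgr feature; `MyInputDir` and `MyAnalysisBinary` are placeholders):
```bash
# capture all worker output, then look at one rank in isolation
srun -n 20 ~/workload-manager/wkmgr.py -v -i MyInputDir MyAnalysisBinary > run.log 2>&1
grep 'rank15' run.log
```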
#### Dry-runs
Test everything before you do the calculation with
```bash
~/workload-manager/wkmgr.py -vv -n ...
```
to perform a dry-run with maximum verbosity and check the printed worklist.
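A concrete dry-run of the input/output example above could look like this (illustrative; `MyInputDir` and `MyAnalysisBinary` are placeholders):
```bash
~/workload-manager/wkmgr.py -vv -n -i MyInputDir MyAnalysisBinary
```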
## Full Loader
The full loader is currently under development.
- To use the loader (untested on the cluster so far), replace `wkmgr.py` with `wkloader.py`.
- A handy command for debugging might be: `mpirun -n 4 ./wkmgr.py -vv -n [YOUR EXECUTABLE]`