SLURM scheduler

Prepare a programme that can run on a distributed computing system, such as the following MPI example in C (saved as main.c):

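#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv) {
    int  step, rank, hostlen;
    /* MPI_MAX_PROCESSOR_NAME is the buffer size MPI_Get_processor_name requires. */
    char hostname[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    /* Rank of this process within the job (0 .. ntasks-1). */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    /* Name of the host this process runs on; hostlen receives its length. */
    MPI_Get_processor_name(hostname, &hostlen);

    /* Print four messages, two seconds apart. */
    for (step = 1; step < 5; step++) {
        printf("Hello World from Step %d on Rank %d, (%s)\n", step, rank, hostname);
        sleep(2);
    }

    MPI_Finalize();

    return 0;
}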

Compile it with the following:

mpicc main.c -o mpi_exec

Then you can tell the scheduler how to run the programme with a batch script (saved here as directives_slurm.sbatch), such as:

#!/bin/bash
#SBATCH --job-name=hello-world-job
#SBATCH --ntasks=6
#SBATCH --output=%x_%j.out

mpirun ./mpi_exec

To submit the script, which specifies the resources needed and the commands to be executed, to the Slurm scheduler:

sbatch directives_slurm.sbatch
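
When scripting around sbatch, it can help to capture the job ID so that it can be passed to the other commands later. The following is a minimal sketch, assuming a Bash shell. The function name submit_job is hypothetical; --parsable is a standard sbatch option that prints only the job ID (followed by a semicolon and the cluster name on multi-cluster setups).

```shell
#!/bin/bash
# submit_job is a hypothetical helper name, not part of Slurm.
# It submits a batch script and prints just the job ID.
submit_job() {
    local script="$1"
    # --parsable prints "jobid" or "jobid;cluster"; keep the part before ';'.
    sbatch --parsable "$script" | cut -d';' -f1
}
```

It could be used as job_id=$(submit_job directives_slurm.sbatch), after which $job_id holds the ID reported by the scheduler.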

General utilities

To run a single job or command directly on allocated compute nodes, instead of submitting a batch script, use:

srun
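
As a quick check, srun can launch an ordinary command as several parallel tasks. The sketch below wraps that in a hypothetical helper, run_on_tasks; --ntasks is the same option used in the batch script above.

```shell
#!/bin/bash
# run_on_tasks is a hypothetical helper: run a command as N parallel tasks.
run_on_tasks() {
    local ntasks="$1"
    shift
    # Everything after the task count is passed to srun as the command to run.
    srun --ntasks="$ntasks" "$@"
}
```

For example, run_on_tasks 2 hostname would ask Slurm to run hostname as two tasks and print one line per task.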

srun can be useful for quick tasks or testing purposes.

As the name suggests, the following command allows you to cancel submitted jobs. It can be used for jobs that are pending or running, and it can also send signals to running jobs:

scancel
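
The three uses just described (cancelling one job, cancelling pending jobs, and sending a signal) might look like the sketch below. The function names are hypothetical; only the scancel options (--user, --state, --signal) come from Slurm itself.

```shell
#!/bin/bash
# Hypothetical helpers around scancel; only the scancel options are real.

# Cancel one job by its ID.
cancel_job() {
    scancel "$1"
}

# Cancel every pending job that belongs to the current user.
cancel_my_pending() {
    scancel --user="$(id -un)" --state=PENDING
}

# Send a signal (for example USR1) to a running job instead of terminating it.
signal_job() {
    scancel --signal="$1" "$2"
}
```

For example, signal_job USR1 12345 would send USR1 to job 12345, where 12345 stands in for a real job ID.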

The following command is for checking the status of your submitted jobs. It displays information like job ID, name, user, partition, state, and resource use:

squeue
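
By default squeue lists jobs from all users, so filtering is often useful. A minimal sketch with hypothetical helper names follows; --user, --jobs, --noheader and --format (with %T for the job state) are standard squeue options.

```shell
#!/bin/bash
# Hypothetical helpers around squeue.

# List only the current user's jobs.
my_jobs() {
    squeue --user="$(id -un)"
}

# Print the state (e.g. PENDING, RUNNING) of one job, without the header line.
job_state() {
    squeue --jobs="$1" --noheader --format="%T"
}
```

Note that squeue only reports jobs the scheduler still tracks, so job_state may print nothing or report an error for a job that finished some time ago.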

The next command, by contrast, provides details about the available compute nodes and partitions (queues) on the Slurm cluster. It is helpful for understanding which resources are available and for choosing the right nodes for your jobs:

sinfo
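
Two common ways of narrowing the sinfo output are sketched below, again with hypothetical helper names. --Node, --long, --partition and --summarize are standard sinfo options; the partition name is whatever your cluster defines.

```shell
#!/bin/bash
# Hypothetical helpers around sinfo.

# One line per node with extended details (state, CPUs, memory, ...).
list_nodes() {
    sinfo --Node --long
}

# Summary of a single partition; the partition name is supplied by the caller.
partition_summary() {
    sinfo --partition="$1" --summarize
}
```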