Quickstart Guide for the Harlow cluster

Node configuration:

30 standard compute nodes with:

  • Intel Xeon Silver 4110, 2.1 GHz, 8C/16T, 9.6 GT/s (16 MPI tasks per node)
  • 11 MB cache, Turbo, HT (85 W), DDR4-2400
  • Interconnect: Mellanox FDR InfiniBand

    This cluster is named for Frank Harlow, a giant in the field of computational fluid dynamics. It was commissioned on October 3, 2018.

    Login

    Access to the Harlow system requires a user account. See our computing page for details on how to request one. Access is only possible from within the GSU network.

    File Systems

  • HOME directory with a 100 GB quota and a 10^4 inode quota.
  • WORK directory with a 30 TB capacity. Quotas are set per group: 5 TB soft, 10 TB hard.
  • No permanent storage service is available for Harlow. Please use departmental resources to back up data.
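
Because HOME carries both a size quota and an inode quota, it can help to check usage before a job writes many small files. The following is a minimal sketch using the standard find and wc tools. The helper name count_inodes is hypothetical, and its result only approximates what the file system's own quota accounting reports.

```shell
# count_inodes (hypothetical helper): count files and directories under a path.
# Each entry found uses roughly one inode, so the total is an estimate,
# not the official quota figure.
count_inodes() {
    # -xdev keeps find on the starting file system; tr strips padding from wc
    find "$1" -xdev 2>/dev/null | wc -l | tr -d ' '
}
```

Calling count_inodes "$HOME" gives a number to compare with the 10^4 inode quota, and du -sh "$HOME" gives the size to compare with the 100 GB quota.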

    Software and Environment

    To manage access to pre-installed software such as compilers, libraries, pre- and post-processing tools, and other application software, Harlow uses the module command. This command offers the following functionality:

    1. Show lists of available software
    2. Access software in different versions

    harlow:~ $ module avail
    ... intel/18.0.3.222 ...
    harlow:~ $ module load intel/18.0.3.222
    harlow:~ $ module list
    Currently Loaded Modulefiles: ... intel/18.0.3.222 ...
    

    Job Scripts

    Standard batch jobs are run in the following steps:

    1. Write a batch job script; see the examples below.
    2. Submit the job script with the command sbatch.
    3. Monitor and control the job execution, e.g. with the commands squeue and scancel (to cancel the job).
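
The submit, monitor, and cancel steps above can be sketched as follows. This is an illustration under assumptions: the script name job.sh and the helper extract_job_id are hypothetical, and the helper relies on sbatch printing its usual confirmation line of the form "Submitted batch job <id>".

```shell
# extract_job_id (hypothetical helper): pull the job ID out of the
# confirmation line that sbatch prints, "Submitted batch job <id>".
extract_job_id() {
    printf '%s\n' "$1" | awk '/^Submitted batch job/ { print $4 }'
}

# Only talk to Slurm when it is installed and the (hypothetical) script exists
if command -v sbatch >/dev/null 2>&1 && [ -f job.sh ]; then
    jobid=$(extract_job_id "$(sbatch job.sh)")   # step 2: submit
    squeue -j "$jobid"                           # step 3: monitor this job
    # scancel "$jobid"                           # step 3: cancel, if needed
fi
```

If parsing the confirmation line feels fragile, sbatch --parsable prints the job ID without the surrounding text.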

    A job script is a shell script (written in bash, ksh or csh syntax) containing Slurm keywords, which are passed as arguments to the sbatch command. Parallel applications on the system are started with either mpirun or mpiexec, which forms a substantial part of most job scripts. We recommend using the mpirun command that belongs to the MPI library used to build the program.

    MPI job script

    Requesting 4 nodes in the general4 partition, each with 16 cores (no hyperthreading possible), for 10 minutes, using MPI.

    #!/bin/bash
    #SBATCH -J harlow_mpi_test
    #SBATCH -t 00:10:00
    #SBATCH -N 4
    #SBATCH -p general4      # partition to run in
    #SBATCH --ntasks-per-node=16
    #SBATCH -o job%j.out     # stdout filename (%j is jobid)
    #SBATCH -e job%j.err     # stderr filename (%j is jobid)
    
    module load intel/18.0.3.222
    export SLURM_CPU_BIND=none
    
    mpirun -iface ib0 -env I_MPI_FAULT_CONTINUE=on -n $SLURM_NPROCS hello_world > hello.out
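
The number of MPI ranks passed to -n in the script above is the node count times the tasks per node. A small sketch of that arithmetic follows; the helper name total_tasks is hypothetical, and the fallback values 4 and 16 are taken from the example script for use outside a Slurm job.

```shell
# total_tasks (hypothetical helper): nodes x tasks per node
total_tasks() {
    echo $(( $1 * $2 ))
}

# Inside a job, Slurm sets these variables; outside one, fall back to
# the values used in the example script.
nodes=${SLURM_JOB_NUM_NODES:-4}
per_node=${SLURM_NTASKS_PER_NODE:-16}
echo "expecting $(total_tasks "$nodes" "$per_node") MPI ranks"
```

Echoing this at the top of a job script makes a mismatch between the request and the mpirun line easy to spot in the output file.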
    

    Hybrid MPI+OpenMP job script

    Requesting 2 nodes with 2 MPI tasks per node, and 8 OpenMP threads per MPI task.

    #!/bin/bash
    #SBATCH -J harlow_hyb_test
    #SBATCH -t 00:20:00
    #SBATCH -N 2
    #SBATCH --ntasks-per-node=2
    #SBATCH --cpus-per-task=8
    #SBATCH -o job%j.out     # stdout filename (%j is jobid)
    #SBATCH -e job%j.err     # stderr filename (%j is jobid)
    
    # This binds each thread to one core
    export OMP_PROC_BIND=TRUE
    # Number of threads as given by -c / --cpus-per-task
    export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
    export KMP_AFFINITY=verbose,scatter
    
    mpiexec -iface ib0 -n 4 --perhost 1 ./hello_world > hello.out
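
In a hybrid layout, MPI tasks per node times OpenMP threads per task should not exceed the cores on a node. The check below is a minimal sketch: the helper name fits_on_node is hypothetical, and the figure of 16 cores per node is taken from the node description in this guide.

```shell
# fits_on_node (hypothetical helper): succeed when
# tasks_per_node * threads_per_task <= cores_per_node.
fits_on_node() {
    [ $(( $1 * $2 )) -le "$3" ]
}

# Layout from the hybrid script: 2 tasks per node, 8 threads per task,
# checked against an assumed 16 cores per node.
if fits_on_node 2 8 16; then
    echo "layout fits on one node"
else
    echo "layout oversubscribes the node"
fi
```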
    

    Batch partitions

    Partition    Max. walltime   Nodes             Remark
    devel        12:00:00        30 on demand      high priority development tests, can pre-empt other jobs
    production   24:00:00        30 on demand      normal queue for production of data for research
    general8     24:00:00        maximum 8 nodes   test queue for medium jobs
    general4     24:00:00        maximum 4 nodes   test queue for small jobs / class assignments
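
A partition is selected with sbatch -p (or a matching #SBATCH -p line), and the requested time must stay within that partition's maximum walltime. The sketch below compares a request against a limit from the table; the helper names walltime_to_seconds and within_limit are hypothetical, and only the HH:MM:SS format used in this guide is handled.

```shell
# walltime_to_seconds (hypothetical helper): convert HH:MM:SS to seconds
walltime_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# within_limit (hypothetical helper): succeed if request <= partition maximum
within_limit() {
    [ "$(walltime_to_seconds "$1")" -le "$(walltime_to_seconds "$2")" ]
}

# A 10-minute request checked against the 12:00:00 limit of devel
if within_limit 00:10:00 12:00:00; then
    echo "request fits in devel"
fi
```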

    Help

    For questions, please contact Dr. Justin Cantrell or Dr. Jane Pratt.