Queues guillimin

From Informaticiens département des sciences de la Terre et l'atmosphère
Revision as of 26 April 2014, 13:44 by Katja

Phase 2 of guillimin brought a few new queues.
In general, we can now run on 5 queues: 'hb', 'lm', 'sw', 'sw2', 'lm2'

If you want to know more about them and see how busy they are, execute the command:

    nodes

You will see there are essentially three types of queues:
    - 1 serial Westmere queue for jobs with fewer than 12 cores
    - 3 parallel SandyBridge queues: 'debug', 'sw2', 'lm2'
    - 3 parallel Westmere queues: 'hb', 'lm', 'sw'

The names may look slightly different in the 'nodes' output, but it should be clear which one is which.

Good things to know about these queues:

    - Each Westmere node has two 6-core sockets => 12 cores per node. These are the old phase 1 nodes.
    - Each SandyBridge node has two 8-core sockets => 16 cores per node. These are the new phase 2 nodes.
    - Jobs requesting fewer than 12 cores automatically go to the serial queue, and there is nothing you can do about it.
    - Jobs requesting OpenMP=3 or 6 and 36h or less are automatically sent to a parallel Westmere queue ('hb', 'lm', 'sw').
    - Jobs requesting OpenMP=4 or 8 and 36h or less are automatically sent to a SandyBridge queue ('sw2', 'lm2').
    - Jobs requesting OpenMP=1 or 2 and 36h or less are automatically sent to a parallel Westmere queue OR a SandyBridge queue ('hb', 'lm', 'sw', 'sw2', 'lm2').
    - Jobs requesting OpenMP=3 or 6 and more than 36h are automatically sent to the 'hb' queue.
    - Jobs requesting OpenMP=4 or 8 and more than 36h are automatically sent to the 'sw2' queue.
    - Jobs requesting OpenMP=1 or 2 and more than 36h are automatically sent to the 'hb' queue.
    - Jobs requesting OpenMP=5 or 7 are refused.
    - 'sw2' includes some nodes that belong to certain users. When the owners are not using them, we are allowed to use them, but only for jobs requesting 12 hours or less.
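
The routing rules above can be summarised as a small shell function. This is only a sketch for reasoning about where a job is likely to land; it is not part of the guillimin tooling. The function name `guess_queue` and the label `serial` are made up for illustration (the real serial queue name is not given here), walltime is taken as whole hours, and it assumes the "fewer than 12 cores" rule is checked before the OpenMP rules.

```shell
# Hypothetical helper: predict the queue(s) for a job from the rules above.
# Usage: guess_queue CORES OMP_THREADS WALLTIME_HOURS
guess_queue() {
    local cores="$1" omp="$2" hours="$3"

    # Fewer than 12 cores: always the serial queue ("serial" is just a label).
    if [ "$cores" -lt 12 ]; then
        echo "serial"
        return 0
    fi

    case "$omp" in
        3|6)  # parallel Westmere queues
            if [ "$hours" -le 36 ]; then echo "hb lm sw"; else echo "hb"; fi ;;
        4|8)  # SandyBridge queues
            if [ "$hours" -le 36 ]; then echo "sw2 lm2"; else echo "sw2"; fi ;;
        1|2)  # either node type for short jobs, 'hb' for long ones
            if [ "$hours" -le 36 ]; then echo "hb lm sw sw2 lm2"; else echo "hb"; fi ;;
        5|7)  # these jobs are refused
            echo "refused"
            return 1 ;;
        *)    # not covered by the rules on this page
            echo "unknown"
            return 2 ;;
    esac
}
```

For example, `guess_queue 24 6 48` prints `hb`, because a job with OpenMP=6 and more than 36h goes to the 'hb' queue.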

If you want to send your job to a specific queue, you can do so by setting the environment variable 'SOUMET_EXTRAS', for example:

    export SOUMET_EXTRAS="-q hb"

This sends all jobs using 12 or more cores that are submitted from the same terminal window to the 'hb' queue.
If you linked to the .group_profile, you should already have a few aliases for this. You can check them with:

    alias | grep SOUMET_EXTRAS
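
Because an exported 'SOUMET_EXTRAS' affects every later submission from the same window, it can be handy to set it for a single command only. The sketch below does this with a subshell. The function name `with_queue` is hypothetical, and it assumes your submission command reads 'SOUMET_EXTRAS' from its environment when it starts.

```shell
# Hypothetical helper: run one command with SOUMET_EXTRAS set,
# without changing the variable in the current window.
# Usage: with_queue QUEUE COMMAND [ARGS...]
with_queue() {
    local queue="$1"
    shift
    # The subshell keeps the export from leaking into the calling shell.
    ( export SOUMET_EXTRAS="-q $queue"; "$@" )
}
```

You would call it as `with_queue hb` followed by your usual submission command. Afterwards 'SOUMET_EXTRAS' in the window is whatever it was before.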

WARNING: if you send a job using OpenMP=3 or 6 to a SandyBridge queue, or a job using OpenMP=4 or 8 to a Westmere queue, it will NEVER RUN!
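
One way to guard against this mistake is to check the combination before exporting the variable. The following is a minimal sketch, not an existing alias from the .group_profile. The function name `set_queue` is hypothetical, and the queue-to-node-type mapping is taken from the lists on this page.

```shell
# Hypothetical helper: export SOUMET_EXTRAS only if the queue can run the job.
# Usage: set_queue QUEUE OMP_THREADS
set_queue() {
    local queue="$1" omp="$2" arch

    # Map each queue named on this page to its node type.
    case "$queue" in
        hb|lm|sw)        arch=westmere ;;
        debug|sw2|lm2)   arch=sandybridge ;;
        *) echo "set_queue: unknown queue '$queue'" >&2; return 2 ;;
    esac

    # Combinations this page says will never run.
    case "$arch:$omp" in
        sandybridge:3|sandybridge:6|westmere:4|westmere:8)
            echo "set_queue: OpenMP=$omp would never run on '$queue'" >&2
            return 1 ;;
    esac

    export SOUMET_EXTRAS="-q $queue"
}
```

With this, `set_queue hb 6` exports `SOUMET_EXTRAS="-q hb"`, while `set_queue sw2 3` prints an error and leaves the variable untouched.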