Queues guillimin: Difference between revisions

From Informaticiens département des sciences de la Terre et l'atmosphère

Current revision as of 26 April 2014, 13:44

With the new phase 2 of guillimin came a few new queues.
In general, we can now run on 5 queues: 'hb', 'lm', 'sw', 'sw2', 'lm2'

If you want to know more about them and see how busy they are, execute the command:

    nodes

You will see there are essentially three types of queues:
    - 1 serial Westmere queue for jobs with fewer than 12 cores
    - 3 parallel SandyBridge queues: 'debug', 'sw2', 'lm2'
    - 3 parallel Westmere queues: 'hb', 'lm', 'sw'

The names might look a little different in the 'nodes' output but you will know which one is which.

Good things to know about these queues:

    - Each Westmere node has two 6-core sockets => 12 cores per node. They are the old phase 1 nodes.
    - Each SandyBridge node has two 8-core sockets => 16 cores per node. They are the new phase 2 nodes.
    - Jobs requesting fewer than 12 cores automatically go to the serial queue, and there is nothing you can do about it.
    - Jobs requesting OpenMP=3 or 6 and 36h or less are automatically sent to a parallel Westmere queue ('hb', 'lm', 'sw').
    - Jobs requesting OpenMP=4 or 8 and 36h or less are automatically sent to a SandyBridge queue ('sw2', 'lm2').
    - Jobs requesting OpenMP=1 or 2 and 36h or less are automatically sent to a parallel Westmere queue OR a SandyBridge queue ('hb', 'lm', 'sw', 'sw2', 'lm2').
    - Jobs requesting OpenMP=3 or 6 and more than 36h are automatically sent to the 'hb' queue.
    - Jobs requesting OpenMP=4 or 8 and more than 36h are automatically sent to the 'sw2' queue.
    - Jobs requesting OpenMP=1 or 2 and more than 36h are automatically sent to the 'hb' queue.
    - Jobs requesting OpenMP=5 or 7 are refused.
    - Some 'sw2' nodes belong to specific users. When the owners are not using them, we are allowed to use them, but only for jobs requesting 12 hours or less.
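
The routing rules above can be summarized as a small shell function. This is only a sketch for reasoning about where a job will land: `guess_queue` is a hypothetical helper, not part of the submission tools. It assumes the wall time is given in whole hours and that the core-count rule is checked first, which the rules above do not state explicitly.

```shell
# Hypothetical helper (not part of the real tools): prints the queue(s)
# a job would be routed to, following the rules listed on this page.
# Usage: guess_queue CORES OMP_THREADS HOURS
guess_queue() {
    gq_cores=$1; gq_omp=$2; gq_hours=$3

    # Fewer than 12 cores: always the serial queue.
    # Assumption: this rule takes precedence over the OpenMP rules.
    if [ "$gq_cores" -lt 12 ]; then
        echo "serial"
        return 0
    fi

    case "$gq_omp" in
        3|6)  # Westmere queues only
            if [ "$gq_hours" -le 36 ]; then echo "hb lm sw"; else echo "hb"; fi ;;
        4|8)  # SandyBridge queues only
            if [ "$gq_hours" -le 36 ]; then echo "sw2 lm2"; else echo "sw2"; fi ;;
        1|2)  # either node type for short jobs, 'hb' for long ones
            if [ "$gq_hours" -le 36 ]; then echo "hb lm sw sw2 lm2"; else echo "hb"; fi ;;
        5|7)  # refused
            echo "refused"; return 1 ;;
        *)    # OpenMP values this page does not cover
            echo "unknown"; return 2 ;;
    esac
}
```

For example, `guess_queue 32 8 48` prints `sw2`, and `guess_queue 8 4 10` prints `serial`.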

If you want to send your job to a specific queue, you can do so by setting the environment variable 'SOUMET_EXTRAS', for example:

    export SOUMET_EXTRAS="-q hb"

This will send all jobs using 12 or more cores that are submitted from the same terminal window to the 'hb' queue.
If you linked to the .group_profile you should already have a few aliases for this. You can check them with:

    alias | grep SOUMET_EXTRAS
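
The actual aliases in .group_profile are not shown on this page. As an illustration only, aliases of this kind could look like the sketch below. The names `use_hb`, `use_sw2` and `use_auto` are invented for the example, and it is an assumption that unsetting the variable restores the automatic routing.

```shell
# Hypothetical aliases; the real names in .group_profile may differ.
alias use_hb='export SOUMET_EXTRAS="-q hb"'     # force the Westmere 'hb' queue
alias use_sw2='export SOUMET_EXTRAS="-q sw2"'   # force the SandyBridge 'sw2' queue
alias use_auto='unset SOUMET_EXTRAS'            # assumed to restore automatic routing

# List every alias that touches SOUMET_EXTRAS.
alias | grep SOUMET_EXTRAS
```

Because the variable is exported in the current shell, it only affects jobs submitted from that same window.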

Warning: if you send a job using OpenMP=3 or 6 to a SandyBridge queue, or a job using OpenMP=4 or 8 to a Westmere queue, it will NEVER RUN!
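
To avoid this trap, one option is to check the combination before forcing a queue. The function below is a minimal sketch: `queue_ok` is a hypothetical helper, and it encodes only the two incompatibilities named on this page.

```shell
# Hypothetical pre-submission check (not part of the real tools).
# Returns 0 when the OpenMP thread count is compatible with the queue,
# 1 when the job would never run, 2 for a queue this page does not list.
# Usage: queue_ok QUEUE OMP_THREADS
queue_ok() {
    case "$1" in
        hb|lm|sw) qk_bad="4 8" ;;   # Westmere queues (12 cores per node)
        sw2|lm2)  qk_bad="3 6" ;;   # SandyBridge queues (16 cores per node)
        *)        echo "unknown queue: $1" >&2; return 2 ;;
    esac
    for qk_n in $qk_bad; do
        if [ "$2" -eq "$qk_n" ]; then
            echo "OpenMP=$2 will never run on '$1'" >&2
            return 1
        fi
    done
    return 0
}

# Only force the queue when the combination is safe.
queue_ok hb 6 && export SOUMET_EXTRAS="-q hb"
```

With this guard, a call such as `queue_ok hb 8` prints a message to standard error and returns a non-zero status, so the `export` after `&&` is skipped.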