Superjobs guillimin: Difference between revisions
Revision as of 21:07, 9 November 2012
Superjobs
A "superjob" is a job that runs on one of the normal queues and executes, one after the other, jobs that were submitted to a faked queue.
It runs until the requested wallclock time is used up or until it has found no job to execute for a set idle time.
NEVER KILL A SUPERJOB !!! See below for more information.
A superjob is a very useful tool for executing post-processing jobs. It makes the automatic submission of post-processing jobs by the model independent of guillimin's "moods": no jobs get lost or have to be resubmitted by hand.
How to start a "superjob"
The command to submit a superjob is "u.run_work_stream":
u.run_work_stream [-instances n] -t mseconds -cpus number_of_cpus -name stream_name -maxidle nseconds -queues q1 q2 ... qn [--] "arguments_for_ord_soumet"
Arguments_for_ord_soumet may include -q, -jn, and any other relevant argument.
Submission example:
u.run_work_stream -t 2592000 -cpus 1 -name superjob_1a -maxidle 36000 -queues sj1 -- -q sw -jn superjob_1a
In this case a superjob with the name superjob_1a will get submitted.
'-name' is the internal name of the superjob, '-jn' the name of the listing.
For simplicity I suggest keeping the two names the same.
Make sure to NEVER HAVE TWO SUPERJOBS WITH THE SAME NAME running. Once a superjob has finished, however, you can submit a new one with the same name.
The superjob will get submitted for '-t 2592000' seconds (30 days) on '-cpus 1' cpu to the queue '-q sw'.
If it does not find a job to execute for '-maxidle 36000' seconds it will terminate itself.
The superjob will execute jobs which got submitted to the faked queue '-queues sj1'. You can name the faked queue any way you want.
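The values passed to '-t' and '-maxidle' are plain seconds, so it can help to compute them instead of typing them by hand. The following is only a sketch: the variable names (days, idle_hours, wallclock, maxidle, name, cmd) are made up for illustration, and the script prints the command for review rather than submitting it. It uses only the options shown in the submission example above.

```shell
#!/bin/sh
# Hypothetical helper values, not part of u.run_work_stream itself.
days=30          # requested wallclock time, in days
idle_hours=10    # how long the superjob may sit idle before it terminates

# Convert to the seconds expected by -t and -maxidle.
wallclock=$((days * 24 * 3600))
maxidle=$((idle_hours * 3600))

# Use the same name for -name and -jn, as suggested above.
name=superjob_1a

# Build the command line and print it; nothing is submitted here.
cmd="u.run_work_stream -t $wallclock -cpus 1 -name $name -maxidle $maxidle -queues sj1 -- -q sw -jn $name"
echo "$cmd"
```

Once the printed line looks right, it can be pasted into the shell to submit the superjob.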
How to send jobs to the "faked" queue
At the moment only jobs running on 1-4 cores can get executed by a superjob. But this can easily be changed. Just let me or Michel know.
For example, to have all jobs that are submitted to run on 1 core executed by the superjob submitted above, instead of being submitted for real, set the variable:
QUEUE_1CPU=sj1@
You can export this variable in your ~/.profile.d/.batch_profile:
export QUEUE_1CPU=sj1@
The '@' at the end is very important. This tells 'soumet' that this is a faked queue and not a real one.
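If the export line is added by a script, it is worth making sure it is written only once and that the trailing '@' survives. The sketch below assumes a made-up helper function, add_faked_queue, and tries it on a scratch file created with mktemp, so nothing in your real profile is touched.

```shell
#!/bin/sh
# add_faked_queue FILE (hypothetical helper):
# append the export line to FILE unless an identical line is already there.
add_faked_queue() {
    line='export QUEUE_1CPU=sj1@'
    mkdir -p "$(dirname "$1")"
    touch "$1"
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    grep -qxF -- "$line" "$1" || printf '%s\n' "$line" >> "$1"
}

# Dry run against a scratch file instead of the real profile.
demo=$(mktemp)
add_faked_queue "$demo"
add_faked_queue "$demo"   # second call finds the line and adds nothing

# Source the scratch file and check the result.
. "$demo"
count=$(grep -c 'QUEUE_1CPU' "$demo")
echo "QUEUE_1CPU=$QUEUE_1CPU, lines written: $count"
```

To apply it for real, call the function with ~/.profile.d/.batch_profile as the argument instead of the scratch file.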