Queued Work Stream


Current version dated 29 October 2012 at 15:50

under construction

Work Streams

What is a work stream

A work stream is a series of "jobs" having a similar resource profile (number of cpus).

In order not to overtax the system job scheduler with a myriad of relatively "small" work items, said items are inserted into "pseudo queues" as "jobs" and processed by a "master job".

The work stream's master job looks into the pseudo queues assigned to it (always starting with the first pseudo queue) and picks the jobs (oldest first) that it can run: a job must need no more cpus/cores than the master job has at its disposal, and the master job must have enough time left to run it.

  • a user's work stream(s) will be found in directory $HOME/.job_queues
    This directory in turn contains subdirectories, one for each "pseudo queue".
  • more than one master job can go "fishing" into a "pseudo queue".
  • jobs appear as links inside the "pseudo queue" subdirectories.
    When a job has been executed, its link is removed.
  • Job monitoring on the primary node will be started by the master job using u.job-monitor
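
As a rough illustration of the oldest-first pick described above, the sketch below simulates a pseudo-queue directory with two entries and selects the oldest by modification time. The directory and job names are made up for the demonstration; the real master job logic is more involved.

```shell
#!/bin/sh
# Hypothetical stand-in for a $HOME/.job_queues/<pseudo_queue> directory.
QUEUE_DIR=$(mktemp -d)

# Simulate two queued jobs; job_a is created first, so it is the oldest.
touch "$QUEUE_DIR/job_a"
sleep 1
touch "$QUEUE_DIR/job_b"

# Oldest-first pick: list entries by modification time, oldest first.
oldest=$(ls -tr "$QUEUE_DIR" | head -n 1)
echo "next job: $oldest"   # prints "next job: job_a"
```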

The main characteristics of a work stream are

  • a name (arbitrary)
  • a set of pseudo queues (may be used to implement a priority scheme)
  • a computing surface (number of cpus/cores)
  • a duration (number of hours, days, weeks...)
  • a maximum idle time (if a stream is using a large number of nodes, its maximum idle time should be very short)
  • a number of instances
    this is only applicable to a master job that processes non-MPI "jobs" (useful mostly on colosse)

Inserting work into a work queue

The ord_soumet utility is used to insert work into a "pseudo queue". The syntax is almost the same as for submitting a job to the system's batch scheduler. The "-q pseudo_queue_name@" parameter to ord_soumet is used to indicate that instead of being submitted directly, the piece of work (job) should rather be inserted into the "pseudo_queue_name" work queue.

To activate "queue" inheritance (a job/piece of work will automagically be submitted to the queue it came from):

  1. use "-q" when calling ord_soumet
  2. export SOUMET_EXTRAS="-q" (may be done using ~/.profile.d/.batch_profile)

Example: to start a climate integration using a work queue (p0 in this case) from the current directory:

  1. echo 'export SOUMET_EXTRAS=-q'  >>$HOME/.profile.d/.batch_profile
  2. export SOUMET_EXTRAS=-q
  3. export SOUMET_QUEUE=p0@
  4. Um_lance

Submitting the master job for a work stream

The master job is launched with the u.run_work_stream command.

A stream master job will terminate automatically if

  • no work was found for maxidle seconds
  • there is less than one minute left in the master job

A piece of work will be left in the queue if

  • there are not enough cpus in the master job to do the work
  • there is not enough time left in the master job to run the piece of work (including the safety margin of 1 minute)
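
The two conditions above can be sketched as a small acceptance test. The function name, argument order, and the MARGIN constant are illustrative assumptions, not the actual u.run_work_stream code.

```shell
#!/bin/sh
MARGIN=60  # the one-minute safety margin, in seconds

# can_run JOB_CPUS JOB_SECONDS FREE_CPUS SECONDS_LEFT
# Succeeds only if the piece of work fits in the master job's cpus and time.
can_run() {
    job_cpus=$1; job_secs=$2; free_cpus=$3; secs_left=$4
    [ "$job_cpus" -le "$free_cpus" ] || return 1             # enough cpus?
    [ $((job_secs + MARGIN)) -le "$secs_left" ] || return 1  # enough time left?
    return 0
}

can_run 4 600 8 3600 && echo "run"                    # fits on both counts
can_run 16 600 8 3600 || echo "left in queue: cpus"   # too many cpus requested
can_run 4 3600 8 3600 || echo "left in queue: time"   # not enough time left
```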

Examples:

u.run_work_stream -instances 1 -name stream01 -maxidle 120 -queues p01 p02 p03 -t 7200 -mpi -cpus 144x1 -jn my_stream

A single-instance master job named stream01 will be started for 7200 seconds on 144 cpus; the batch scheduler job name will be my_stream, and pieces of work will be fetched from pseudo queues p01, p02 and p03. If no suitable work is found for more than 120 seconds, the master job will terminate.

u.run_work_stream -instances 4 -name stream02 -maxidle 120 -queues p01 p02 p03 -t 10800 -mpi -cpus 4x2 -jn stream_new

A 4-instance master job named stream02 will be started for 10800 seconds with 8 cpus; each instance will be able to process jobs needing one or two cpus (non-MPI). The batch scheduler job name will be stream_new, and pieces of work will be fetched from pseudo queues p01, p02 and p03. If no suitable work is found for more than 120 seconds, an instance will terminate. When all instances have terminated, the master job terminates.

Layout and control of a work stream

The work stream can be controlled via its control files:

  1. $HOME/.job_queues/.active_name_jobid.instance
  2. $HOME/.job_queues/.active_name_jobid.instance.queues

  • removing the first control file will terminate the instance after the current piece of work is done
  • writing
    MaxIdle=new_value
    into the first control file will set a new value for the maximum idle time
  • writing
    MinCpusInJob=new_value
    into the first control file will prevent "jobs" using fewer than this number of cpus from being picked for execution
  • the second control file is used to change the pseudo queues to get work from
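
For illustration, the snippet below exercises the control-file convention against a temporary file standing in for the first control file; the real file name depends on the stream name, job id and instance number.

```shell
#!/bin/sh
# Temporary stand-in for $HOME/.job_queues/.active_name_jobid.instance.
ctl=$(mktemp)

# Lower the maximum idle time of the running stream to 300 seconds...
echo "MaxIdle=300" >> "$ctl"

# ...and refuse jobs asking for fewer than 4 cpus.
echo "MinCpusInJob=4" >> "$ctl"

cat "$ctl"
```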

The work queues are subdirectories found under $HOME/.job_queues
Work queue "my_queue" would be found in directory
$HOME/.job_queues/my_queue

All work entries in said queue would be soft links pointing to files prepared by ord_soumet and found in directory
$HOME/.ord_soumet.d/wrap/batch

The "job name" part (-jn from ord_soumet) of both the link and its target would be the same, to make it easier to correlate entries.
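
This layout can be mimicked as follows; the paths and the job name my_job are illustrative stand-ins, since the real wrap files are produced by ord_soumet.

```shell
#!/bin/sh
# Stand-ins for $HOME/.ord_soumet.d/wrap/batch and $HOME/.job_queues/my_queue.
base=$(mktemp -d)
mkdir -p "$base/wrap/batch" "$base/job_queues/my_queue"

# The wrapped job file, as ord_soumet would prepare it...
touch "$base/wrap/batch/my_job"

# ...and the queue entry: a soft link carrying the same job-name part.
ln -s "$base/wrap/batch/my_job" "$base/job_queues/my_queue/my_job"

readlink "$base/job_queues/my_queue/my_job"
```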

Aborting and rerunning a piece of work

A piece of work may abort and signal to the master job that it should be rerun (up to N times) with the following command:

. exit_and_rerun_work.dot N

This command will also make sure that the post-work cleanup code inserted by ord_soumet is not performed.

Requeuing a piece of work on guillimin

A previously queued or submitted piece of work may be requeued with the following command:

u.resoumet [-h|--help] [--asis] [-l|--list] [--to=queue] [--from=superjob_queue_name] list of job names

  ex:
      u.resoumet --to=hb '*exp1*'
            resubmit all jobs with a name matching *exp1* to the hb queue
      u.resoumet --to=same '*exp1*'
            resubmit all jobs with a name matching *exp1* to their original queue
      u.resoumet --asis '*exp1*'
            resubmit all jobs with a name matching *exp1* to their original queue
      u.resoumet -l --from=myq@ --to=myotherq@ '*exp1*'    
            requeue all jobs with a name matching *exp1* from work queue myq to work queue myotherq
      u.resoumet -l --from=myq@ --to=sw '*exp1*'
            remove all jobs with a name matching *exp1* from work queue myq
            after submitting them to guillimin in queue sw
      u.resoumet -l --from=myq@ --to=same '*exp1*'
            remove all jobs with a name matching *exp1* from work queue myq
            after submitting them to guillimin in their original queue
      u.resoumet -l --from=myq@
            list all jobs in work queue myq
      u.resoumet -l
      u.resoumet --list
            list all jobs that have been queued, no matter where