Troubleshooting 333: Difference between revisions

An article from Informaticiens département des sciences de la Terre et l'atmosphère
Revision as of 13 March 2013, 13:53

Troubleshooting v_3.3.3


Restart from a previous restart file

First, copy the restart file(s) from which you want to continue (those of the previous month) into the execution directory:

  ~/MODEL_EXEC_RUN/${TRUE_HOST}

Then decompress your restart file(s) with 'gunzip ....ca.gz' and unarchive them with 'cmcarc -x -f ....ca'.
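
The copy, decompress and unarchive steps above can be sketched as a small shell function. This is only a sketch: the function name 'unpack_restart' and both of its arguments are hypothetical, and the real name of your restart archive is not given on this page, so you pass it in yourself. Only 'gunzip' and 'cmcarc -x -f' come from the instructions above.

```shell
# Sketch: copy one gzipped restart archive into the execution directory and unpack it.
# Assumptions: unpack_restart and its two arguments are hypothetical names,
# and the archive path ends in .ca.gz.
unpack_restart() {
    archive_gz=$1      # path to the restart archive (....ca.gz)
    exec_dir=$2        # for example ~/MODEL_EXEC_RUN/${TRUE_HOST}
    name=$(basename "$archive_gz")

    cp "$archive_gz" "$exec_dir"/ || return 1
    # Work in a subshell so the caller's current directory is left unchanged.
    (
        cd "$exec_dir" || exit 1
        gunzip "$name" || exit 1          # ....ca.gz becomes ....ca
        cmcarc -x -f "${name%.gz}"        # unarchive the .ca file
    )
}
```

You would call it once per restart file, giving the path to the archive and the execution directory as the two arguments.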

If you only want to rerun one or a few months in the middle of a simulation, I suggest you create a new directory and copy the original config files into it.
Then set 'CLIMAT_enddate' to the date up to which you want to rerun. But NEVER change 'CLIMAT_startdate'.
You can do this while the original simulation is still running.
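
As a rough sketch of the copy step, and of a check that 'CLIMAT_startdate' was left alone, something like the following could be used. The function names and directory arguments are hypothetical. This page does not say which config file holds 'CLIMAT_startdate', so the check simply searches every file in both directories.

```shell
# Sketch: clone the config directory, then verify CLIMAT_startdate is identical in both.
# Assumptions: clone_config and startdate_unchanged are hypothetical helper names,
# and the directories are whatever paths you use for your config files.
clone_config() {
    orig_dir=$1
    new_dir=$2
    mkdir -p "$new_dir" && cp -r "$orig_dir"/. "$new_dir"/
}

startdate_unchanged() {
    orig_dir=$1
    new_dir=$2
    # Collect every line mentioning CLIMAT_startdate, without file names.
    a=$(cd "$orig_dir" && grep -rh 'CLIMAT_startdate' . | sort)
    b=$(cd "$new_dir" && grep -rh 'CLIMAT_startdate' . | sort)
    # Succeed only if the setting was found and is the same in both directories.
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

After editing 'CLIMAT_enddate' in the new directory, running 'startdate_unchanged' on the original and new directories returns success only if the start date lines still match.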

Then go into your (new) config file directory.

Restart using Chunk_lance

Go into the directory with your config files.
Copy the command to restart the current month from the backup file:

  cp last_continue_with_next_job continue_with_next_job

Make sure the last line in your file 'chunk_job.log' looks like this:

... continue_with_next_job $exp starting at ...

Remove all lines below this one.
And be careful NOT to have a blank line at the end of the file!
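
The log cleanup described above can also be done with a short shell function instead of a text editor. This is a minimal sketch: the function name 'trim_chunk_log' is hypothetical, and the search pattern is inferred from the sample line shown above, so adjust it if your log lines look different. A backup copy is kept next to the log.

```shell
# Sketch: cut chunk_job.log back to its last "continue_with_next_job ... starting at" line.
# Assumptions: trim_chunk_log is a hypothetical helper name, and the pattern below
# is inferred from the sample log line, not taken from the model's documentation.
trim_chunk_log() {
    log=${1:-chunk_job.log}

    # Line number of the last matching line in the log.
    last=$(grep -n 'continue_with_next_job .* starting at' "$log" | tail -n 1 | cut -d: -f1)
    if [ -z "$last" ]; then
        echo "no continue_with_next_job line found in $log" >&2
        return 1
    fi

    cp "$log" "$log.bak"                   # keep a backup before editing
    head -n "$last" "$log.bak" > "$log"    # drop everything below, including blank lines
}
```

Run without arguments in your config directory, it works on 'chunk_job.log'; the untouched original is kept as 'chunk_job.log.bak'.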

Then restart your simulation with:

  Chunk_lance

Restart from another simulation

Restart the post processing

It sometimes happens that you do not get all the model output in your archive.
If you are missing output for one month, it is best to look at the listings for that month, which are mostly still in ~/listings/${TRUE_HOST}.

Click here for a flowchart of the listings.

Once you have found out which job stopped, crashed or never got submitted, you can usually find the submission command in the listing of the previous job.

And if the job still exists, you can submit it again.