Troubleshooting 333

An article from Informaticiens département des sciences de la Terre et l'atmosphère

Troubleshooting v_3.3.3


Restart from a previous restart file

First, copy the restart file(s) from which you want to continue (those of the previous month) into the execution directory:

  ~/MODEL_EXEC_RUN/${TRUE_HOST}

Then decompress the restart file(s) with 'gunzip ....ca.gz' and unpack the resulting archive(s) with 'cmcarc -x -f ....ca'.
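
The two steps above can be wrapped in a small shell function. This is only a sketch: the function name 'unpack_restart' is made up for illustration, the file name you pass in stands for your real '....ca.gz' restart file, and it assumes 'gunzip' and 'cmcarc' are on your PATH.

```shell
# Sketch only: decompress and unpack one restart archive in the current directory.
# 'unpack_restart' is a hypothetical helper, not part of the model scripts.
unpack_restart() {
    archive_gz="$1"                  # your '....ca.gz' restart file
    archive="${archive_gz%.gz}"      # same name without the .gz suffix
    gunzip "$archive_gz" || return 1 # stop if decompression fails
    cmcarc -x -f "$archive"          # unarchive, as described above
}
```

Run it from inside the execution directory, once per restart file you copied there.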

If you only want to rerun one or a few months in the middle of a simulation, I suggest you create a new directory and copy the original config files into it.
Then set 'CLIMAT_enddate' to the date up to which you want to rerun. But NEVER change 'CLIMAT_startdate'.
You can do this while the original simulation is still running.

Restart using Chunk_lance

Go into the directory with your (new) config files.
Copy the command to restart the current month from the backup file:

  cp last_continue_with_next_job continue_with_next_job

Make sure the last line in your file 'chunk_job.log' looks like this:

... continue_with_next_job $exp starting at ...

Remove all lines below the above one.
And be careful NOT to have a blank line at the end of the file!
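
The cleanup described above can be sketched as a shell function. The function name 'trim_chunk_log' is made up for illustration, and the search pattern is an assumption derived from the example line shown above; check that it matches the lines in your own 'chunk_job.log' before relying on it. A backup copy ('chunk_job.log.bak') is written first.

```shell
# Sketch only: keep everything up to the last 'continue_with_next_job' line
# and drop whatever follows it, including trailing blank lines.
# 'trim_chunk_log' is a hypothetical helper, not part of the model scripts.
trim_chunk_log() {
    log="${1:-chunk_job.log}"
    # Line number of the last matching line (empty if there is none).
    last=$(grep -n 'continue_with_next_job .* starting at' "$log" | tail -n 1 | cut -d: -f1)
    if [ -z "$last" ]; then
        echo "no continue_with_next_job line found in $log" >&2
        return 1
    fi
    cp "$log" "$log.bak"                 # keep a backup before editing
    head -n "$last" "$log.bak" > "$log"  # rewrite the log up to that line
}
```

Calling 'trim_chunk_log' without arguments in the directory with your config files works on 'chunk_job.log'; afterwards 'tail -n 1 chunk_job.log' should show the 'continue_with_next_job' line.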

Then restart your simulation with:

  Chunk_lance

Restart from another simulation

Restart the post processing

It sometimes happens that you do not get all the model output in your archive.
If you are missing output for one month, it is best to look at the listings for that month, which are usually still in ~/listings/${TRUE_HOST}.

Click here for a flowchart of the listings.

Once you have found out which job stopped, crashed, or never got submitted, you can usually find its submission command in the listing of the previous job.

If the job still exists, you can submit it again.