HOWTO Set Up and Run an Ensemble MetUM-GOML Experiment

There are two methods for starting an ensemble of MetUM-GOML jobs on ARCHER: a manual method and an automatic method. Either method involves two steps: generating the atmosphere initial conditions and launching the ensemble.

Atmosphere Initial Conditions

To generate the initial atmospheric conditions, first copy your base UMUI job (e.g. the experiment or control job).
In this copy, go to
 Atmosphere -> Control - post processing, dumping & meaning -> dumping and meaning
Change 'Restart dumps every' to 1 day, then close and save.
Next, change:
 Compilation and Run -> UM scripts build
     - turn ON enable build of UM scripts
 Compilation and Run -> Compile and run options...
     - turn OFF Run the model
Save, process and submit.
Once the script compilation is complete, turn build of UM scripts OFF and turn Run the model back ON.
Save, process and submit again.
To submit this job to the short queue for 20 minutes (the short queue maximum), use:
    ~lrdlrh/bin/short 1 20
This submits the most recently uploaded job to the SHORT queue for 20 minutes.
It should generate about 15 days of atmosphere dumps.
We will use these next to initialize the ensemble.
However, we want to start each member of the ensemble from the 1st of the month (e.g. 1st September). If we use the atmosphere dumps directly but set the model to start on 1st September, the model will crash, because the model start date will not be consistent with the date stored inside the atmosphere dump file. We therefore need to overwrite the date in each atmosphere dump so that all dates are consistent.
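To do this, we use:
    /work/n02/n02/hum/bin/change_dump_date

The usage is:
    change_dump_date dumpfile year month day hour minute second
So, to change all the dump dates to 1st September 2000, run the following in the directory containing the dumps (this assumes change_dump_date is on your PATH):
    for i in *.da*; do change_dump_date "$i" 2000 9 1 0 0 0; done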
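
The loop above can be wrapped so that it works on copies and leaves the original dumps untouched. This is a minimal sketch: it assumes change_dump_date rewrites the file it is given, and the function name redate_dumps, the CHANGE_DUMP_DATE override variable and the directories in the example are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch: re-date copies of the atmosphere dumps, keeping the originals.
# CHANGE_DUMP_DATE is a hypothetical override; by default it points at the
# tool named in the text above.
CHANGE_DUMP_DATE="${CHANGE_DUMP_DATE:-/work/n02/n02/hum/bin/change_dump_date}"

redate_dumps() {
    # usage: redate_dumps src_dir dest_dir year month day
    local src="$1" dest="$2" year="$3" month="$4" day="$5"
    local dump copy
    mkdir -p "$dest" || return 1
    for dump in "$src"/*.da*; do
        [ -f "$dump" ] || continue          # skip if the glob matched nothing
        copy="$dest/$(basename "$dump")"
        cp "$dump" "$copy" || return 1
        # Arguments: dumpfile year month day hour minute second
        "$CHANGE_DUMP_DATE" "$copy" "$year" "$month" "$day" 0 0 0 || {
            echo "change_dump_date failed for $copy" >&2
            return 1
        }
    done
}

# Example with hypothetical directories:
# redate_dumps /work/n02/n02/$USER/dumps_raw /work/n02/n02/$USER/dumps_sept 2000 9 1
```

Keeping the re-dated copies in their own directory also gives a clean directory of initial conditions to point the ensemble at.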

Launching the Ensemble

Manual Method

In the manual method, create a copy of the base job for each ensemble member. Then alter the start dump location in
    Atmosphere -> Ancillary and input data files -> start dump
     -> Specify the input dump for the atmosphere model
Ensure that the directory path is on the /work disk (ARCHER can't see /home).
Then set the start date at the top of this window to the common ensemble start date that you chose and set using change_dump_date.
Next, for each job, run the UM scripts build.
Then set up the ocean for each job, using a copy of the setup script used for the base job.
Finally, submit the UMUI job to the queue.
(Don't forget to set the correct run length in both 3d_ocn.nml AND the UMUI under
 Input/Output control and resources -> Re-Submission Pattern
and make sure that the Target run length in
 Input/Output control and resources -> Start date and Run length Options
is more than 10 years.)
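
Before typing a start dump path into the UMUI for each member, it can help to confirm that the file exists and sits under /work. The following is only a sketch: check_start_dump and the WORK_ROOT variable are hypothetical names, and the check is a simple prefix match on the path, so it does not resolve symbolic links.

```shell
#!/bin/sh
# Sketch: confirm a start dump is readable and lives under the /work disk.
# WORK_ROOT is a hypothetical override, defaulting to /work as in the text.
WORK_ROOT="${WORK_ROOT:-/work}"

check_start_dump() {
    # usage: check_start_dump /full/path/to/dump
    local dump="$1"
    case "$dump" in
        "$WORK_ROOT"/*) ;;                      # path starts with the work root
        *) echo "not on $WORK_ROOT: $dump" >&2
           return 1 ;;
    esac
    if [ ! -r "$dump" ]; then
        echo "missing or unreadable: $dump" >&2
        return 1
    fi
    echo "ok: $dump"
}

# Example with a hypothetical path:
# check_start_dump /work/n02/n02/$USER/dumps_sept/mydump.da20000901_00
```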
In the standard queue, the model runs for about 4 years 6 months in 24 hours. There is also a LONG 48-hour queue, in which the model runs for about 8 years 8 months. To submit to this queue, use my long command:
    /home/n02/n02/lrdlrh/bin/long
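
As a rough guide to choosing a queue and re-submission pattern, the approximate throughput figures quoted above can be turned into a number of submissions for a given target run length. This is a back-of-envelope sketch only; the function name submissions_needed is hypothetical and the 10-year target is just an example value.

```shell
#!/bin/sh
# Sketch: estimate how many queue submissions a target run length needs.
submissions_needed() {
    # usage: submissions_needed target_months months_per_job
    local target="$1" per_job="$2"
    # Integer ceiling division, so a final partial job is still counted.
    echo $(( ($target + $per_job - 1) / $per_job ))
}

target=$(( 10 * 12 ))         # example target: 10 years, in months
standard=$(( 4 * 12 + 6 ))    # about 4 yrs 6 months per 24-hour standard job
long=$(( 8 * 12 + 8 ))        # about 8 yrs 8 months per 48-hour long job

echo "standard queue: $(submissions_needed "$target" "$standard") jobs"
echo "long queue:     $(submissions_needed "$target" "$long") jobs"
```

With these figures, a 10-year run needs 3 submissions in the standard queue and 2 in the long queue.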

Automatic Method

I've developed another method to launch ensembles automatically. This uses a base job and a directory of atmosphere initial conditions (as generated above). It copies the base job, converts each copy into a unique job, creates directories, compiles scripts, substitutes the chosen atmosphere initial conditions and then launches all the ensemble members as a single large job.
An example script to submit an ensemble job is here:
    /home/n02/n02/lrdlrh/work/hadgem3-kpp/do_scripts/ENSEMBLE_Example
This script uses XNOHC as a base job, makes four copies of this job (XNOHD - XNOHG) and initialises each with a unique, randomly selected atmosphere start dump from $dump_dir. It then submits all these jobs as a single large job to the STANDARD queue for 1 hour.
The atmosphere start dumps used will be listed in
    <JOB_BASE>_DUMP_LIST
e.g.
    XNOH_DUMP_LIST
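
To illustrate the idea of giving each member a unique, randomly selected start dump and recording the choice, here is a minimal sketch. It is not the ENSEMBLE_Example script and makes no claim about how that script works or what format its dump list has; the function name pick_dumps is hypothetical, and shuf is assumed to be available (it is part of GNU coreutils).

```shell
#!/bin/sh
# Sketch only: pick a distinct random start dump per ensemble member and
# write the chosen paths, one per line, to a list file.
pick_dumps() {
    # usage: pick_dumps dump_dir n_members list_file
    local dump_dir="$1" n="$2" list_file="$3"
    local available
    available=$(find "$dump_dir" -maxdepth 1 -type f -name '*.da*' | wc -l)
    if [ "$available" -lt "$n" ]; then
        echo "only $available dumps in $dump_dir, need $n" >&2
        return 1
    fi
    # shuf -n samples without replacement, so no dump is used twice.
    find "$dump_dir" -maxdepth 1 -type f -name '*.da*' | shuf -n "$n" > "$list_file"
}

# Example: four members, with the list named as in the text above.
# pick_dumps "$dump_dir" 4 XNOH_DUMP_LIST
```

Keeping a list like this makes it possible to rerun a member later from exactly the same initial conditions.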


    There are queue limits on the size of a job (in terms of processors), but at N96 you should
    be able to submit up to 16 jobs to the 48hr queue.

You can submit a test ensemble to the short queue using
    number_of_ensemble_members=2
    queue=short
This will submit a 2-member ensemble to the short queue. You can then check the leave file to confirm that everything worked before resubmitting a full ensemble.