In the comments you link to your related question, where you're trying to use Linux process priority to make jobs run faster / use more CPU.
There you ask:
CQLOAD (what does it mean too?)
The docs for this are hard to find, but you link to the spec of your cluster, which tells us that its scheduling engine is Sun's "Grid Engine". The man pages are available online (you can access them locally too, in particular by typing man qstat).
If you search through them for qstat -g c, you will see the outputs described. In particular, the second column (CQLOAD) is described as:
OUTPUT FORMATS
...
an average of the normalized load average of all queue hosts.
In order to reflect each hosts different significance the
number of configured slots is used as a weighting factor when
determining cluster queue load. Please note that only hosts
with a np_load_value are considered for this value. When queue
selection is applied only data about selected queues is
considered in this formula. If the load value is not available
at any of the hosts '-NA-' is printed instead of the value from
the complex attribute definition.
This means that CQLOAD gives an indication of how utilized the processors in the queue are. Your output shows 0.84: the slot-weighted average load on the processors in all.q is roughly 84%. That doesn't seem too low.
You state colleagues are complaining that your processes are not using enough CPU. I'm not sure what that's based on, but I wonder if it's just because you're using a lot of nodes (even if just for a short time).
You might want to experiment with using fewer nodes (unless your results are very slow). That is achieved by altering the line #$ -pe mpi 30; maybe take the number 30 down. You can work out (roughly) how many nodes you need by timing how long 1 model run takes on your computer and then using
N = ((time to run 1 job) * (number of runs in experiment)) / (time you want the run to take)
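As a rough sketch, the estimate can be computed like this in Python. The function name nodes_needed and the example timings are hypothetical, and the calculation assumes the runs are independent, scale perfectly across nodes, and take about as long on a cluster core as on your own machine, so treat the result as a starting point rather than a precise figure.

```python
import math


def nodes_needed(seconds_per_run, number_of_runs, target_seconds):
    """Estimate N = (time per run * number of runs) / target wall time.

    The result is rounded up, and at least 1 is always returned.
    """
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")
    total_work = seconds_per_run * number_of_runs
    return max(1, math.ceil(total_work / target_seconds))


# Made-up example: 600 runs of 2 minutes each, wanted within 1 hour.
print(nodes_needed(120, 600, 3600))  # 72000 / 3600 = 20
```

With those made-up numbers you would put 20 in place of the 30 in the -pe line and see whether the turnaround is still acceptable.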