Message boards : Number crunching : Aborting Tasks
Author | Message |
---|---|
Joined: 15 Jan 11 Posts: 175 Credit: 6,242,691 RAC: 699 |
Does aborting a task make it immediately available for re-issue - or thereabouts? |
Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
Pretty much, as long as the work unit hasn't reached its "Max # of errors" total, which is usually 3 for the work units being sent out nowadays. |
Joined: 15 Jan 11 Posts: 175 Credit: 6,242,691 RAC: 699 |
Thanks, I thought that was probably the case. |
Joined: 22 Feb 06 Posts: 491 Credit: 30,975,898 RAC: 14,500 |
Following the recent outage, the work units I have received include some from earlier batches: 660, 709 and 719. All are on their third go. Should I allow these to run or should I abort them? |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Nothing in the 700 range has been closed yet, so those 2 are OK. I don't see 660 in the closed list either, but it is getting a bit old. Best to get it run soonish. Yesterday I completed a 647 that one of my computers picked up 12 hours before the nam50's came out, so good to get that out of the way. Which leaves a 719 still running with 2 days to go. Just under 8 days run time so far to reach 85%, if that helps. |
Joined: 22 Feb 06 Posts: 491 Credit: 30,975,898 RAC: 14,500 |
Thanks Les. The 660 is about 25% complete after just under 2 days, so it shouldn't take much longer. The 709 and 719 haven't started yet. The 735s that I have had have been taking between 3 and 4 days to complete. I'll keep plodding. |
Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
Hi folks, I got a hadcm3s_a18n_203412_120_599 on my Linux box. It is on its 3rd attempt and I wonder whether I should let it finish, or is this batch of no scientific interest any more? |
Joined: 15 May 09 Posts: 4538 Credit: 19,002,360 RAC: 21,497 |
"I got a hadcm3s_a18n_203412_120_599 on my Linux box. It is on its 3rd attempt and I wonder whether I should let it finish, or is this batch of no scientific interest any more?" It is still in the current list, not in the "batches to be closed" or "closed" lists. But as you note, it is quite old. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
But it's a "short", which are usually a challenge to complete, so you could have a go at it just to see if your computer can do it. :) |
Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
I ruined it. I did a system update before the WU finished, and after the restart it ended with a computation error. Here is the error. I updated from 14.04 to 16.04 LTS, and after the crash I checked for missing libraries and had to install gcc.4.7-multilib. Not sure if this was the reason, but I really should have let it finish first; it was on its 3rd attempt. |
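
The re-issue rule described in this thread (an aborted task goes back out unless the work unit has reached its "Max # of errors", usually 3) can be sketched roughly as below. This is only an illustration of the rule as worded here, not the project server's actual code: the function names `can_reissue` and `abort_task` are hypothetical, and it assumes that an abort is counted as one more error against the work unit.

```python
from typing import Tuple


def can_reissue(error_count: int, max_errors: int = 3) -> bool:
    """Return True while the work unit has not reached its error limit.

    max_errors=3 mirrors the "usually 3" mentioned in the thread; the
    real value is set per work unit by the project.
    """
    return error_count < max_errors


def abort_task(error_count: int, max_errors: int = 3) -> Tuple[int, bool]:
    """Count an abort as one more error (an assumption of this sketch)
    and report whether the work unit could still be sent out again."""
    new_count = error_count + 1
    return new_count, can_reissue(new_count, max_errors)


# A fresh work unit: aborting the first task still leaves room for a re-issue.
first_abort = abort_task(0)

# A task on its third go has two earlier errors behind it, so aborting
# (or crashing) it uses up the last attempt.
third_abort = abort_task(2)
```

Under these assumptions, a task on its "third go" is the last chance for that work unit, which is why the advice above is to let such tasks run rather than abort them.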
©2024 cpdn.org