Recycled Work Units
Author | Message |
---|---|
Joined: 2 Jul 15 Posts: 21 Credit: 4,188,925 RAC: 2,009 |
I've observed what looks to me like an anomaly: failed tasks are immediately reassigned to another user. This topic may have been covered in previous threads; I apologize if I'm covering old ground.

For example, the sole Work Unit I'm currently running, wah2_eas25_h24l_201412_24_1006_012261565, had two previous incarnations. The first started on 13 Feb. and errored out on 31 March. It was reassigned 27 seconds later and errored out on 22 April. It was reassigned again 26 seconds later, this time to me, and has been running ever since.

Task | Computer | Sent | Time reported / deadline | Status | Run time (sec)
---|---|---|---|---|---
22400282 | 1542765 | 13 Feb 2024, 9:24:13 | 31 Mar 2024, 14:02:04 | Error computing | 171,716.22
22415741 | 1454376 | 31 Mar 2024, 14:02:31 | 22 Apr 2024, 22:43:55 | Error computing | 159,099.67
22430790 | 1367467 | 22 Apr 2024, 22:44:21 | 20 Aug 2024, 22:44:21 | In progress | ---

Similarly, other Work Units that had compute errors on my computer were picked up by others within seconds of the error. It seems that there are no more than three attempts to complete a Work Unit, however. Is this the normal paradigm? |
Joined: 29 Oct 17 Posts: 1044 Credit: 16,196,312 RAC: 12,647 |
That's correct. That's normal operation. Each workunit has a maximum of three attempts at a successful run before being declared a failure. Note that unlike some other projects, CPDN do not need more than 1 successful task to treat the workunit as succeeded. --- CPDN Visiting Scientist |
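For anyone who finds it easier to read as code, the rule described in the reply can be sketched roughly as below. This is only an illustration of the policy as stated in this thread (at most three attempts, one success is enough), not the actual server code; the names `WorkUnit`, `Outcome`, `MAX_ATTEMPTS` and `SUCCESSES_NEEDED` are made up for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    ERROR = "error"


# Both values are taken from the reply in this thread; the constant
# names are hypothetical, not real server settings.
MAX_ATTEMPTS = 3       # a workunit is given up on after three attempts
SUCCESSES_NEEDED = 1   # a single successful task is enough


@dataclass
class WorkUnit:
    name: str
    outcomes: list = field(default_factory=list)

    def report(self, outcome: Outcome) -> str:
        """Record one finished task and return what happens next."""
        self.outcomes.append(outcome)
        successes = sum(o is Outcome.SUCCESS for o in self.outcomes)
        if successes >= SUCCESSES_NEEDED:
            return "succeeded"
        if len(self.outcomes) >= MAX_ATTEMPTS:
            return "failed"
        # Still has attempts left, so a new task goes out to another host.
        return "reissue"


wu = WorkUnit("wah2_eas25_h24l_201412_24_1006_012261565")
print(wu.report(Outcome.ERROR))    # first attempt errored
print(wu.report(Outcome.ERROR))    # second attempt errored
print(wu.report(Outcome.SUCCESS))  # hypothetical: third attempt completes
```

With the history in the first post (two errors so far), the third task is the last chance: if it completes, the workunit counts as succeeded; if it errors too, the workunit is declared a failure and is not sent out again.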
©2024 cpdn.org