climateprediction.net (CPDN) home page
Thread 'hadcm3s problems'

Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 58017 - Posted: 29 Mar 2018, 12:46:03 UTC

I occasionally get a work unit of this kind. They seem to run correctly for me even though they failed for someone else first. For example,

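Task 20224863
Computer 1159364
Sent 21 Feb 2017, 4:39:26 UTC
Time reported or deadline 3 Feb 2018, 9:59:26 UTC
Status Timed out - no response
Run time (sec) 0.00
CPU time (sec) 0.00
Credit 4,355.42
Application UK Met Office HadCM3 short v8.29 windows_intelx86

Task 21046924
Computer 1256552
Sent 14 Feb 2018, 11:43:33 UTC
Time reported or deadline 1 Mar 2018, 0:55:46 UTC
Status Completed
Run time (sec) 1,252,823.67
CPU time (sec) 1,139,922.00
Credit 4,355.42
Application UK Met Office HadCM3 short v8.34 i686-pc-linux-gnu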

My machine is 1256552.

But there was something wrong with the way it actually worked, and this has happened with all "recent" hadcm3s work units. They start up, and in a day or so they deliver a trickle, and I get credit for that trickle. Then they run for a week or two and complete normally, but deliver no more trickles and get no more credit. So what was it doing with all the extra computing time?

Name hadcm3s_71wg_200012_168_514_010909121_1
Workunit 10909121
Created 14 Feb 2018, 11:43:21 UTC
Sent 14 Feb 2018, 11:43:33 UTC
Report deadline 27 Jan 2019, 17:03:33 UTC
Received 1 Mar 2018, 0:55:46 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 1256552
Run time 14 days 12 hours 0 min 23 sec
CPU time 13 days 4 hours 38 min 42 sec
Validate state Initial
Credit 4,355.42
Device peak FLOPS 1.28 GFLOPS
Application version UK Met Office HadCM3 short v8.34
i686-pc-linux-gnu
stderr out

<core_client_version>7.2.33</core_client_version>
<![CDATA[
<stderr_txt>
Processing restart Year 2014 Month 12 Day 1
Calling boinc_finish...18:44:34 (16936): called boinc_finish(0)
In boinc_exit called with status 0
Calloing set_signal_exit_code with status 0

</stderr_txt>

15 Feb 2018 14:16:45 1256552 21046924 hadcm3s_71wg_200012_168_514_010909121_1 1 362,952 81,899 0.2256
]]>
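
For anyone who wants to check that trickle line by hand, here is a minimal Python sketch. The computer and task IDs match the ones listed above. Reading the three trailing numbers as model timesteps, CPU seconds, and seconds per timestep is only an assumption (the last value does equal 81,899 / 362,952 to four places), and the field names are made up for illustration.

```python
# Trickle line copied from the stderr output above.
TRICKLE_LINE = (
    "15 Feb 2018 14:16:45 1256552 21046924 "
    "hadcm3s_71wg_200012_168_514_010909121_1 1 362,952 81,899 0.2256"
)


def parse_trickle(line):
    """Split one trickle line into named fields.

    The names 'timesteps', 'cpu_seconds' and 'reported_ratio' are
    assumptions about what the columns mean, not documented facts.
    """
    parts = line.split()
    timestamp = " ".join(parts[:4])  # day, month, year, time of day
    host_id, task_id, name, seq, steps, cpu, ratio = parts[4:]
    return {
        "timestamp": timestamp,
        "host_id": int(host_id),
        "task_id": int(task_id),
        "name": name,
        "trickle_number": int(seq),
        "timesteps": int(steps.replace(",", "")),   # assumed meaning
        "cpu_seconds": int(cpu.replace(",", "")),   # assumed meaning
        "reported_ratio": float(ratio),             # assumed: sec per timestep
    }


fields = parse_trickle(TRICKLE_LINE)
computed = fields["cpu_seconds"] / fields["timesteps"]
print(fields["trickle_number"], round(computed, 4), fields["reported_ratio"])
```

If the assumed column meanings are right, the computed ratio and the reported one agree, which at least suggests this single trickle covers only the first 81,899 CPU seconds of the run.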

geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 58020 - Posted: 29 Mar 2018, 14:46:56 UTC

If you look at the message logs, you'll see that BOINC is sending up trickles and file uploads after every model year, so the data should be reaching the server. But the only trickle listed on the web page, and credited, is the first one, after 1 model year. The problem has something to do with the other models getting trickles and credit after each model month, while hadcm3s gets them after each model year. They know about the problem; it was reported long ago and has never been fixed.
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 58025 - Posted: 29 Mar 2018, 19:16:41 UTC - in response to Message 58020.  

But is no credit awarded for full completion? It seems not. It appears it did send up 13 trickles... Here is the last of a bunch. (This may not be the same work unit, but they look the same.)

27-Feb-2018 15:27:54 Started upload of hadcm3s_75ok_200012_168_514_010914021_2_r1993309999_13.zip
27-Feb-2018 15:28:10 Finished upload of hadcm3s_75ok_200012_168_514_010914021_2_r1993309999_13.zip
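
To pull the same information out of a longer message log, a rough Python sketch along these lines might help. It assumes every upload is logged as a "Started upload of" / "Finished upload of" pair in the format shown above, and it treats the number just before ".zip" as the upload's sequence number, which is a guess based on the file name. The function names are hypothetical.

```python
import re
from datetime import datetime

# The two log lines quoted above.
LOG = """\
27-Feb-2018 15:27:54 Started upload of hadcm3s_75ok_200012_168_514_010914021_2_r1993309999_13.zip
27-Feb-2018 15:28:10 Finished upload of hadcm3s_75ok_200012_168_514_010914021_2_r1993309999_13.zip
"""

LINE = re.compile(r"^(\S+ \S+) (Started|Finished) upload of (\S+)$")


def upload_seconds(log_text):
    """Return {file name: seconds between its Started and Finished lines}."""
    started, elapsed = {}, {}
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if m is None:
            continue  # not an upload line
        # %b expects English month abbreviations (the default C locale).
        when = datetime.strptime(m.group(1), "%d-%b-%Y %H:%M:%S")
        event, name = m.group(2), m.group(3)
        if event == "Started":
            started[name] = when
        elif name in started:
            elapsed[name] = (when - started.pop(name)).total_seconds()
    return elapsed


def trailing_index(file_name):
    """Number just before '.zip' (assumed to be the upload sequence number)."""
    m = re.search(r"_(\d+)\.zip$", file_name)
    return int(m.group(1)) if m else None


for name, seconds in upload_seconds(LOG).items():
    print(name, trailing_index(name), seconds)
```

Run over a whole log, this would list every upload that finished, so the count can be compared against the single trickle shown on the task's web page.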
