climateprediction.net (CPDN) home page
Thread 'New work Discussion'

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4535
Credit: 18,989,107
RAC: 21,788
Message 57109 - Posted: 12 Oct 2017, 6:45:57 UTC

Some more tasks that will run on Linux seem to have been put out from batch 599 (Hadcm3s), but all six I have received crashed: five at 19 seconds across two machines, and one managed a whole two minutes or so!
ID: 57109

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4535
Credit: 18,989,107
RAC: 21,788
Message 57111 - Posted: 12 Oct 2017, 14:15:27 UTC - in response to Message 57109.  

And all gone now. I hope some had better luck than I in running them.
ID: 57111

Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 57121 - Posted: 14 Oct 2017, 14:22:47 UTC - in response to Message 57109.  

I got two work units a few days ago. They are running and have not crashed. They each have over 56 hours of Elapsed time on them.

11-Oct-2017 17:08:04 Starting task hadcm3s_5045_200012_168_671_011310038_0 using hadcm3s version 834 in slot 5
11-Oct-2017 17:08:04 Starting task hadcm3s_504w_200012_168_671_011310065_0 using hadcm3s version 834 in slot 6

They seem to be uploading trickles.

12-Oct-2017 16:31:07 Started upload of hadcm3s_504w_200012_168_671_011310065_0_r1419425307_1.zip
12-Oct-2017 16:31:21 Finished upload of hadcm3s_504w_200012_168_671_011310065_0_r1419425307_1.zip
12-Oct-2017 16:31:46 Started upload of hadcm3s_5045_200012_168_671_011310038_0_r830022721_1.zip
12-Oct-2017 16:31:56 Finished upload of hadcm3s_5045_200012_168_671_011310038_0_r830022721_1.zip

13-Oct-2017 15:41:18 Started upload of hadcm3s_504w_200012_168_671_011310065_0_r1419425307_2.zip
13-Oct-2017 15:42:07 Finished upload of hadcm3s_504w_200012_168_671_011310065_0_r1419425307_2.zip
13-Oct-2017 15:43:05 Started upload of hadcm3s_5045_200012_168_671_011310038_0_r830022721_2.zip
13-Oct-2017 15:43:40 Finished upload of hadcm3s_5045_200012_168_671_011310038_0_r830022721_2.zip
ID: 57121

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,803,756
RAC: 5,187
Message 57130 - Posted: 16 Oct 2017, 23:26:37 UTC

There are 2 x 2,370 13-month ANZ models at 50 km in batches #672 and #673 (batch list).
ID: 57130

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,803,756
RAC: 5,187
Message 57143 - Posted: 19 Oct 2017, 16:37:21 UTC

Some 13-month Africa models at 50 km resolution have been added to the queue - 3,900 in batch #674, 3,375 in batch #675 and 3,375 in batch #676; there are also 200 HADCM3S models (batch list).
ID: 57143

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4535
Credit: 18,989,107
RAC: 21,788
Message 57274 - Posted: 31 Oct 2017, 21:24:36 UTC - in response to Message 57143.  
Last modified: 31 Oct 2017, 21:35:24 UTC

A couple of hundred hadcm3s tasks have been released. There may be more, with the rest still being loaded into the queue. They are not showing up on the server stats yet, and if it is only a couple of hundred they might never do so.

Anyway, good news for two machines I have that have been without work since the WAH2s were withdrawn from Linux.

Perhaps not such good news. Two batch 602 tasks on one machine crashed just under 3 minutes in. Now waiting to see what happens on the other box... Three on the other, slightly faster machine are now 7 minutes in, so may be OK. Will check in the morning.
ID: 57274

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2186
Credit: 64,822,615
RAC: 5,275
Message 57275 - Posted: 31 Oct 2017, 21:37:33 UTC - in response to Message 57274.  

My Phenom II picked up 5 of these new tasks from batch 602. All crashed immediately with segmentation violations.
ID: 57275

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4535
Credit: 18,989,107
RAC: 21,788
Message 57282 - Posted: 1 Nov 2017, 6:02:52 UTC - in response to Message 57275.  

I picked up six across two machines. On one, two crashed at once. On the other, three from batch 602 and one from batch 618 are still running and are past five hours in. It is nothing to do with the machines: last time around, the machine that now has four running crashed everything it got with segmentation faults.
ID: 57282

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4535
Credit: 18,989,107
RAC: 21,788
Message 57290 - Posted: 2 Nov 2017, 7:22:11 UTC - in response to Message 57282.  

One of the four has crashed, interestingly with an invalid theta just before the segmentation fault. This was a bit over 17 hours in. I don't remember seeing the two together before, though I am pretty certain I have seen tasks that crashed with invalid theta on Windows machines crash with a segmentation fault on Linux ones.
ID: 57290

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,803,756
RAC: 5,187
Message 57352 - Posted: 10 Nov 2017, 12:54:54 UTC
Last modified: 10 Nov 2017, 22:52:20 UTC

A small batch of 160 60-month Pacific North-West models at 25 km has been added - batch #678 (batch list).

[Edit: Plus 50 x PNW25/49 in batch #679, and 400 x PNW25/60 as an extension of batch #665.]
ID: 57352

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,803,756
RAC: 5,187
Message 57354 - Posted: 13 Nov 2017, 12:29:57 UTC
Last modified: 13 Nov 2017, 17:52:16 UTC

A batch of 372 21-month Pacific North-West models at 25 km has been added - batch #680 (batch list).

[Edit: ... and 7,200 SAS50/3.]
ID: 57354

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,803,756
RAC: 5,187
Message 57360 - Posted: 14 Nov 2017, 11:47:12 UTC

A batch of 3,600 3-month South Asia models at 50 km has been added - batch #682 (batch list).
ID: 57360

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4535
Credit: 18,989,107
RAC: 21,788
Message 57375 - Posted: 17 Nov 2017, 16:24:47 UTC - in response to Message 57360.  

And now some PNW tasks, batch 683. Still none for us penguin types though.
ID: 57375

Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 57376 - Posted: 18 Nov 2017, 13:47:48 UTC - in response to Message 57375.  

"Still none for us penguin types though."

I tried to set up my BOINC client to spend 50% of my machine's spare time on climate prediction. Since I run Linux, though, I seldom run any climate prediction at all these days, and not for lack of trying. I used to run three climate prediction tasks at a time, month in, month out, but have not done so in a long time.
ID: 57376

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,803,756
RAC: 5,187
Message 57382 - Posted: 20 Nov 2017, 21:24:01 UTC

A varied selection of batches has been released in the last few days: 10 x WUS25/120, 1650 x NAM50/13 and the return of East Asia with 500 x EAS50/12 (batch list).
ID: 57382

MossyRock
Joined: 4 Oct 13
Posts: 27
Credit: 2,301,681
RAC: 7,632
Message 57385 - Posted: 22 Nov 2017, 5:25:55 UTC

Three of my most recently downloaded batch of 11 models have crashed. These models also crashed for my "wingmen" if that is an accurate term to use at CPDN.

Task 11361810 - wah2
Signal 11 received: Segment violation

Task 11339935 - pnw25
Unknown error

Task 11278191 - wah2
Signal 4 received: Illegal instruction - invalid function image
Signal 4 received: Floating point exception
Signal 4 received: Segment violation

Not sure what is going on. There have been no interruptions in processing at all (i.e., suspends, reboots, etc.).
ID: 57385

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4535
Credit: 18,989,107
RAC: 21,788
Message 57386 - Posted: 22 Nov 2017, 8:25:50 UTC - in response to Message 57385.  

At least two different problems here. The one from batch 683 is a create thread error, which I think is caused by a dodgy line in one of the files for the task. As for the segmentation error in batch 686, I don't think the root cause of that one has been found. I did a quick root around, but it may be too early to see whether any of these tasks are completing.
ID: 57386

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,803,756
RAC: 5,187
Message 57389 - Posted: 24 Nov 2017, 11:16:20 UTC

Batch #687 has just been added and has 1,600 Central America 13-month models at 50 km resolution - i.e. CAM50/13 (batch list).
ID: 57389

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4535
Credit: 18,989,107
RAC: 21,788
Message 57437 - Posted: 4 Dec 2017, 17:48:03 UTC

A few thousand pnw tasks released, batch 688.

Thinking about going back to WINE :(
ID: 57437

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,803,756
RAC: 5,187
Message 57462 - Posted: 11 Dec 2017, 11:32:55 UTC
Last modified: 11 Dec 2017, 18:59:22 UTC

A new format, Central America at 25 km for 18 months - batch #689, 325 off (batch list).

[Edit: plus 780 x EAS50/12.]
ID: 57462