climateprediction.net (CPDN) home page
Thread 'New work Discussion'

Thomas Wiegand
Joined: 4 Jul 19
Posts: 31
Credit: 252,192
RAC: 0
Message 60766 - Posted: 1 Aug 2019, 2:51:27 UTC
Last modified: 1 Aug 2019, 2:57:32 UTC

I just got new tasks at 31 Jul 2019, 22:54:56 UTC, but the PC has crashed 3 times since.
My other PC, which is also running and has the same CPU, got nothing. ???

At first, and even after 2 restarts, 4 tasks were running; now, after another crash/freeze, only 2 came up.
See: https://www.cpdn.org/cpdnboinc/results.php?hostid=1487884 (it shows 4)
ID: 60766
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 60769 - Posted: 1 Aug 2019, 4:20:53 UTC - in response to Message 60766.  

They're not "new" tasks, but some that have failed on other computers.

All tasks are issued in the hope that they'll get run right through to the finish.
But lots don't for various reasons. These are re-issued, and if they fail again, then they get issued for a 3rd time.
After this, they get dumped.

You can see this by the number at the right hand end of the name.
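A rough sketch of reading that trailing number in Python. It assumes the number counts from 0 for the first issue (as the _0 at the end of wah2_pnw25_c15j_201709_16_833_011891432_0, a task name quoted later in this thread, suggests) and that three issues is the limit described above. The function names are made up for illustration and are not part of any BOINC tool.

```python
# Limit taken from the post: issued, re-issued, issued a 3rd time, then dumped.
MAX_ISSUES = 3


def issue_number(task_name: str) -> int:
    """Return the number at the right-hand end of a task name.

    Assumption: the name ends in "_<n>", with n = 0 for the first issue.
    """
    suffix = task_name.rsplit("_", 1)[-1]
    return int(suffix)


def is_reissue(task_name: str) -> bool:
    """True if this task has already been sent out at least once before."""
    return issue_number(task_name) > 0


def is_final_attempt(task_name: str) -> bool:
    """True if a failure of this task would leave no further re-issues."""
    return issue_number(task_name) >= MAX_ISSUES - 1


name = "wah2_pnw25_c15j_201709_16_833_011891432_0"
print(issue_number(name), is_reissue(name), is_final_attempt(name))
```

A name ending in _1 or _2 would therefore be one of the re-issues described above rather than fresh work.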

As for some of your computers getting nothing, they weren't asking for work at the exact moment that those re-issues appeared, and your computer that got them was asking for a lot of work.

********************

There is NO new work at present, and may not be for a while yet.
ID: 60769
Wilgard
Joined: 30 Mar 10
Posts: 12
Credit: 2,609,109
RAC: 87
Message 60812 - Posted: 8 Aug 2019, 7:26:59 UTC - in response to Message 60769.  


There is NO new work at present, and may not be for a while yet.


Do you have an estimate of when it might be possible to get new work?

Best Regards,
ID: 60812
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 60813 - Posted: 8 Aug 2019, 9:22:06 UTC - in response to Message 60812.  

Do you have an estimate of when it might be possible to get new work?


Afraid not. While moderators have fairly frequent contact with those at Oxford (mostly about issues that crop up on these boards or about things happening in testing), the work comes from researchers at universities all over the world, and we are not privy to their timetables. Sometimes a testing batch is followed fairly quickly by work on this, the main site; sometimes there seems to be no connection. The discussion board used by the people at Oxford among themselves and with the moderators rarely gives more than a few hours' notice that a batch of work is on the way. Just occasionally we get hints a day or two in advance, but that is it.

As George stated in another thread, there is some testing happening at the moment, but as to whether that is in anticipation of some work here, tossing a coin is about as good as it gets.
ID: 60813
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 60818 - Posted: 8 Aug 2019, 12:57:10 UTC - in response to Message 60813.  
Last modified: 8 Aug 2019, 22:32:08 UTC

Batch 831: 500 pnw tasks, but two crunchers have posted about download failures on them, so a carrier pigeon is on its way from Cambridge to Oxford to let them know.

Edit: Also, some hadam4 tasks for Linux are just about ready to go, but the above batch has been pulled till the download failures issue is resolved, and these are not going out till the download problems are fixed either. Don't know the numbers of Linux tasks yet.

See the thread adjacent to this one for the download failures problem.

Edit2: It is looking increasingly likely that the problems causing the download failures will not be resolved till next week, when Andy is back, and there is another issue with the Linux tasks in testing which may delay the main site release.

Edit3: And that is why most moderators don't post information about future batches because so often the information changes and avoiding misleading people requires further posts!
ID: 60818
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 60941 - Posted: 19 Sep 2019, 10:42:20 UTC
Last modified: 19 Sep 2019, 10:44:44 UTC

Batch 832
600 cam25 tasks, 18-month runs, so they will take several days, maybe even a week, on the fastest machines.
I notice someone has already had problems uploading zips to the Mexico server. This has been reported back to Oxford.

Edit: All 600 came and went before I had a chance to get any. With luck some more will be on the way soon. Attempts are going on now to get a hadcm3s batch out, which, if successful, will please Linux users.
ID: 60941
Byron Leigh Hatch @ team Carl ...
Joined: 17 Aug 04
Posts: 289
Credit: 44,103,664
RAC: 0
Message 60948 - Posted: 19 Sep 2019, 15:32:26 UTC
Last modified: 19 Sep 2019, 15:48:26 UTC

I'm running 28 other BOINC projects, so I have no shortage of work,
but I would like to get some climate work, to help the cause :)
Any sign of a new batch?
I guess I missed the last batch?
Server Status:
Tasks ready to send - 0
ID: 60948
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 60949 - Posted: 19 Sep 2019, 15:56:13 UTC - in response to Message 60941.  

Batch 832



Hopefully the noble 600 were just a test batch for a much bigger release to come shortly. After such a long work drought there will be lots of hungry computers waiting to suck them down fast.
ID: 60949
Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 60950 - Posted: 19 Sep 2019, 16:17:33 UTC - in response to Message 60949.  

After such a long work drought there will be lots of hungry computers waiting to suck them down fast.

That is why I am no longer on Windows, but have switched to Linux. Maybe they need me for the large work units.
If not, there is no point in tying up their server with useless requests. Other people can do them as well as I can.
ID: 60950
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 60956 - Posted: 19 Sep 2019, 21:07:15 UTC

Batch 833, of 500 "CA camp fire attribution" models, has shown up out of the blue, but they're nearly all gone.

I didn't see anything on Project stats, just a private list, so I'm going to go and whinge about it.
ID: 60956
Byron Leigh Hatch @ team Carl ...
Joined: 17 Aug 04
Posts: 289
Credit: 44,103,664
RAC: 0
Message 60968 - Posted: 23 Sep 2019, 2:36:39 UTC
Last modified: 23 Sep 2019, 3:24:19 UTC

<edit> for spelling and trying to fix Link.

Batch 833

Name: wah2_pnw25_c15j_201709_16_833_011891432_0

My Computer just returned 15 WU - [Batch 833] - to Climate Server as follows:

5 ---- Error while computing
10 -- Completed

https://www.cpdn.org/results.php?hostid=1485364

https://www.cpdn.org/server_status.php

<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
The system cannot find the drive specified.
 (0xf) - exit code 15 (0xf)</message>
<stderr_txt>
No Process Handle
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=15916, selfPID=15916, iMonCtr=1
No Process Handle
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=15916, selfPID=14744, iMonCtr=1

</stderr_txt>
]]>
ID: 60968
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 60970 - Posted: 23 Sep 2019, 6:28:12 UTC - in response to Message 60968.  

My Computer just returned 15 WU - [Batch 833] - to Climate Server as follows:

5 ---- Error while computing
10 -- Completed


Overall, 64 (13%) have completed so far, and 5% have failed. Without looking at each failure individually, I can't see how many of that 5% are the same tasks failing twice, only that none are hard fails that have gone belly up three times. Will keep an eye on this.
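For reference, the percentages work out as follows, assuming the batch size of 500 given earlier in the thread for batch 833. The 5% figure is already rounded, so the failure count derived from it is only approximate.

```python
BATCH_SIZE = 500   # batch 833 size, as stated earlier in the thread
completed = 64

completed_pct = 100 * completed / BATCH_SIZE   # 12.8, quoted as 13%
approx_failed = round(0.05 * BATCH_SIZE)       # 5% of the batch, roughly 25 tasks

# One host's results from the quoted post: 5 errors out of 15 returned.
host_error_pct = 100 * 5 / 15

print(f"completed: {completed_pct:.1f}% (rounds to {round(completed_pct)}%)")
print(f"failed: about {approx_failed} tasks")
print(f"single host error rate: about {host_error_pct:.0f}%")
```

The single host's error rate is well above the batch-wide figure, which is consistent with the failures being concentrated on particular machines rather than spread evenly.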
ID: 60970
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 60972 - Posted: 24 Sep 2019, 4:46:25 UTC - in response to Message 60968.  


5 ---- Error while computing
10 -- Completed


That "system cannot find the drive specified" error is quite vexing. I get that every once in a while. Based on pure observation, it seems to happen when lots of disk writes from multiple tasks are occurring at the same time, and almost always when some task (the one that fails) is at the end of a month, perhaps writing upload files. A Google search doesn't seem to lead to much in the way of explanation or solutions to this problem.
ID: 60972
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 60973 - Posted: 24 Sep 2019, 5:57:17 UTC - in response to Message 60972.  

That "system cannot find the drive specified" error is quite vexing. I get that every once in a while. Based on pure observation, it seems to happen when lots of disk writes from multiple tasks are occurring at the same time, and almost always when some task (the one that fails) is at the end of a month, perhaps writing upload files. A Google search doesn't seem to lead to much in the way of explanation or solutions to this problem.


That might explain why I haven't experienced them yet. Neither of my computers has more than 4 cores. I wonder if PCI Express solid state drives, with speeds up to about five times those of ordinary SATA drives, would reduce the problem?
ID: 60973
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,826,970
RAC: 5,066
Message 60974 - Posted: 24 Sep 2019, 8:50:50 UTC

My memory, which may be very out of date, is that the BOINC system allows a switch in which an error code in one system is reported as an error code in another. Thus, for example, IBM FORTRAN error 15:

15
Indicates an XL Fortran message

... might be reported as Windows error 15 ...

ERROR_INVALID_DRIVE
15 (0xF)
The system cannot find the drive specified.

Replace FORTRAN compiler as necessary or if people here who build their own client know better then take their word for it.

In other words, the error number may matter but the words may not.
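A minimal Python sketch of that point: the same integer looked up in two different tables gives two unrelated texts. The two table entries are the ones quoted above; the table and function names are invented for illustration, and nothing here claims to be how BOINC or the CPDN application actually chooses the message.

```python
# Each table maps the same kind of integer to a different meaning.
# Entries are taken from the post above; real tables are far larger.
WINDOWS_SYSTEM_ERRORS = {
    15: ("ERROR_INVALID_DRIVE", "The system cannot find the drive specified."),
}
XL_FORTRAN_RETURN_CODES = {
    15: "Indicates an XL Fortran message",
}


def format_code(code: int) -> str:
    """Render a code the way the task output does, e.g. '15 (0xF)'."""
    return f"{code} (0x{code:X})"


def interpretations(code: int) -> dict:
    """Collect every reading of a code across the known tables."""
    found = {}
    if code in WINDOWS_SYSTEM_ERRORS:
        name, text = WINDOWS_SYSTEM_ERRORS[code]
        found["Windows system error"] = f"{name}: {text}"
    if code in XL_FORTRAN_RETURN_CODES:
        found["XL Fortran return code"] = XL_FORTRAN_RETURN_CODES[code]
    return found


print(format_code(15))
for source, meaning in interpretations(15).items():
    print(f"  {source}: {meaning}")
```

Which reading is right depends on which program produced the number, and the number alone does not say.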
ID: 60974
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 60975 - Posted: 24 Sep 2019, 9:03:32 UTC - in response to Message 60974.  

Replace FORTRAN compiler as necessary or if people here who build their own client know better then take their word for it.


I have built my own Linux client and manager. I haven't yet managed to determine whether it is possible to roll your own Windows client using WINE and, even if it is possible, I am a long way off gathering the prerequisites needed to do so.
ID: 60975
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,826,970
RAC: 5,066
Message 60976 - Posted: 24 Sep 2019, 9:14:24 UTC - in response to Message 60975.  

Replace FORTRAN compiler as necessary or if people here who build their own client know better then take their word for it.


I have built my own Linux client and manager. I haven't yet managed to determine whether it is possible to roll your own Windows client using WINE and, even if it is possible, I am a long way off gathering the prerequisites needed to do so.

It will be interesting to see what error text you get ...

... the Web site reports Byron’s failed models as a more cautious “15 (0x0000000F) Unknown error code”, and only commits to an interpretation in the stderr listing.
ID: 60976
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 60977 - Posted: 24 Sep 2019, 11:21:52 UTC
Last modified: 24 Sep 2019, 12:51:42 UTC

Batch 834 (micro batch of 10 IFS tasks). And all gone. They came and went during the hour's back-off on my laptop, so I'm waiting to see if the hadcm3s make it this time.

Edit: Any observations from those who picked up tasks from this test batch will be most welcome.

Edit2: Two of them have failed at the file unpacking stage, so 0 seconds run time. I guess that was why this was a small test batch.
ID: 60977
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 60980 - Posted: 24 Sep 2019, 13:56:06 UTC - in response to Message 60977.  

Edit2: Two of them have failed at the file unpacking stage, so 0 seconds run time. I guess that was why this was a small test batch.


But five have completed successfully so far.
ID: 60980
Bonsai911
Joined: 9 Sep 04
Posts: 228
Credit: 30,756,611
RAC: 3,303
Message 60987 - Posted: 25 Sep 2019, 12:23:23 UTC
Last modified: 25 Sep 2019, 12:49:31 UTC

No WU transfer?

BOINC Manager shows:

25.09.2019 14:18:47 | climateprediction.net | No tasks sent
25.09.2019 14:18:47 | climateprediction.net | No tasks are available for UK Met Office Coupled Model Full Resolution Ocean
25.09.2019 14:18:47 | climateprediction.net | No tasks are available for UK Met Office HadCM3 short

But the server status shows 1607 UK Met Office HadCM3 short tasks.

Is this a server failure?

Ahhhhh, only for Linux/x86[/color] [color=indigo]!!!!
(Layout tags shown?!)
Greetings

bonsai911
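For anyone checking their own event log for the same symptom, here is a small Python sketch that pulls out the application names from "No tasks are available for ..." lines. It assumes the "date time | project | message" layout shown in the log above; the date format depends on the local settings of the machine, and the function name is made up for this example.

```python
import re

# Matches lines such as:
# 25.09.2019 14:18:47 | climateprediction.net | No tasks sent
LOG_LINE = re.compile(
    r"^(?P<stamp>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}) \| "
    r"(?P<project>[^|]+?) \| (?P<message>.*)$"
)
PREFIX = "No tasks are available for "


def unavailable_apps(lines):
    """Return the application names the scheduler had no tasks for."""
    apps = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("message").startswith(PREFIX):
            apps.append(match.group("message")[len(PREFIX):])
    return apps


log = [
    "25.09.2019 14:18:47 | climateprediction.net | No tasks sent",
    "25.09.2019 14:18:47 | climateprediction.net | No tasks are available for UK Met Office Coupled Model Full Resolution Ocean",
    "25.09.2019 14:18:47 | climateprediction.net | No tasks are available for UK Met Office HadCM3 short",
]
for app in unavailable_apps(log):
    print(app)
```

As the post concludes, an application can show tasks on the server status page and still appear in this list if the waiting tasks are for a different platform from the requesting host.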
ID: 60987
©2024 cpdn.org