Message boards : Number crunching : New work Discussion
Author | Message |
---|---|
Joined: 4 Jul 19 Posts: 31 Credit: 252,192 RAC: 0 |
I just got new tasks at 31 Jul 2019, 22:54:56 UTC, but the PC has crashed 3 times so far, while my other PC, which is also running and has the same CPU, got nothing. At first, and again after 2 restarts, 4 tasks were running; now, after another crash/freeze, only 2 came up. See https://www.cpdn.org/cpdnboinc/results.php?hostid=1487884 (shows 4). |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
They're not "new" tasks, but some that have failed on other computers. All tasks are issued in the hope that they'll get run right through to the finish. But lots don't for various reasons. These are re-issued, and if they fail again, then they get issued for a 3rd time. After this, they get dumped. You can see this by the number at the right hand end of the name. As for some of your computers getting nothing, they weren't asking for work at the exact moment that those re-issues appeared, and your computer that got them was asking for a lot of work. ******************** There is NO new work at present, and may not be for a while yet. |
Joined: 30 Mar 10 Posts: 12 Credit: 2,609,109 RAC: 87 |
Do you have "an estimate" about when it could be possible to get new works ? Best Regards, |
Joined: 15 May 09 Posts: 4538 Credit: 19,008,987 RAC: 21,524 |
Quote: "Do you have 'an estimate' about when it could be possible to get new works ?" Afraid not. While moderators have fairly frequent contact with those at Oxford (mostly about issues that crop up on these boards or things happening in testing), the work comes from researchers at universities all over the world and we are not privy to their timetables. Sometimes a testing batch is followed fairly quickly by work on the main site; sometimes there seems to be no connection. The discussion board used by the people at Oxford among themselves and with the moderators rarely gives more than a few hours' notice that a batch of work is on the way. Just occasionally we get hints a day or two in advance, but that is it. As George stated in another thread, there is some testing happening at the moment, but as to whether that is in anticipation of some work here, tossing a coin is about as good as it gets. |
Joined: 15 May 09 Posts: 4538 Credit: 19,008,987 RAC: 21,524 |
Batch 831: 500 pnw tasks, but two crunchers have posted about download failures on them, so a carrier pigeon is on its way from Cambridge to Oxford to let them know. Edit: Also, some hadam4 tasks for Linux are just about ready to go, but the above batch has been pulled until the download failures are resolved, and the Linux tasks are not going out until the download problems are fixed either. I don't know the number of Linux tasks yet. See the thread adjacent to this one for the download failures problem. Edit2: It is looking increasingly likely that the problems causing the download failures will not be resolved until next week when Andy is back, and there is another issue with the Linux tasks in testing which may delay the main site release. Edit3: And that is why most moderators don't post information about future batches: so often the information changes, and avoiding misleading people requires further posts! |
Joined: 15 May 09 Posts: 4538 Credit: 19,008,987 RAC: 21,524 |
Batch 832: 600 cam25 tasks. These are 18-month runs, so they will take several days, maybe even a week, on the fastest machines. I notice someone has already had problems uploading zips to the Mexico server. This has been reported back to Oxford. Edit: All 600 came and went before I had a chance to get any. With luck some more will be on the way soon. Attempts are going on now to get a hadcm3s batch out which, if successful, will please Linux users. |
Joined: 17 Aug 04 Posts: 289 Credit: 44,103,664 RAC: 0 |
I'm running 28 other BOINC projects, so I have no shortage of work, but I would like to get some climate work to help the cause :) Any sign of a new batch? I guess I missed the last batch? Server Status: Tasks ready to send - 0 |
Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
Quote: "Batch 832" Hopefully the noble 600 were just a test batch for a much bigger release to come shortly. After such a long work drought there will be lots of hungry computers waiting to suck them down fast. |
Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
Quote: "After such a long work drought there will be lots of hungry computers waiting to suck them down fast." That is why I am no longer on Windows, but have switched to Linux. Maybe they need me for the large work units. If not, there is no point in tying up their server with useless requests. Other people can do them as well as I can. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Batch 833, of 500 "CA camp fire attribution" models, has shown up out of the blue, but they're nearly all gone. I didn't see anything on Project stats, just a private list, so I'm going to go and whinge about it. |
Joined: 17 Aug 04 Posts: 289 Credit: 44,103,664 RAC: 0 |
<edit> for spelling and trying to fix Link. Batch 833 Name: wah2_pnw25_c15j_201709_16_833_011891432_0 My Computer just returned 15 WU - [Batch 833] - to Climate Server as follows: 5 ---- Error while computing 10 -- Completed https://www.cpdn.org/results.php?hostid=1485364 https://www.cpdn.org/server_status.php <core_client_version>7.14.2</core_client_version> <![CDATA[ <message> The system cannot find the drive specified. (0xf) - exit code 15 (0xf)</message> <stderr_txt> No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=15916, selfPID=15916, iMonCtr=1 No Process Handle Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=15916, selfPID=14744, iMonCtr=1 </stderr_txt> ]]> |
Joined: 15 May 09 Posts: 4538 Credit: 19,008,987 RAC: 21,524 |
Quote: "My Computer just returned 15 WU - [Batch 833] - to Climate Server as follows:" Overall, 64 (13%) have completed so far, with 5% failures. Without looking at each failure individually, I can't see how many of that 5% are the same tasks failing twice, only that none are hard fails that have gone belly up three times. Will keep an eye on this. |
Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
That "system cannot find the drive specified" error is quite vexing. I get that every once in a while. Based on pure observation, it seems to happen when lots of disk writes from multiple tasks are occurring at the same time, and almost always as some task (the one that fails) is at the end of a month, perhaps writing upload files. Google search doesn't seem to lead to much in the way of explanation or solutions to this problem. |
Joined: 15 May 09 Posts: 4538 Credit: 19,008,987 RAC: 21,524 |
Quote: "That 'system cannot find the drive specified' error is quite vexing. I get that every once in a while. Based on pure observation, it seems to happen when lots of disk writes from multiple tasks are occurring at the same time, and almost always as some task (the one that fails) is at the end of a month, perhaps writing upload files. Google search doesn't seem to lead to much in the way of explanation or solutions to this problem." That might explain why I haven't experienced them yet. Neither of my computers has more than 4 cores. I wonder if PCI Express solid state drives would reduce the problem, with speeds up to about five times those of ordinary SATA drives? |
Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
My memory, which may be very out of date, is that the BOINC system allows a switch in which an error code in one system is reported as an error code in another. Thus, for example, IBM FORTRAN error 15, "15 Indicates an XL Fortran message ...", might be reported as Windows error 15, "ERROR_INVALID_DRIVE 15 (0xF) The system cannot find the drive specified." Replace the FORTRAN compiler as necessary, or if people here who build their own client know better, then take their word for it. In other words, the error number may matter but the words may not. |
Joined: 15 May 09 Posts: 4538 Credit: 19,008,987 RAC: 21,524 |
Quote: "Replace FORTRAN compiler as necessary or if people here who build their own client know better then take their word for it." I have built my own Linux client and manager. I haven't yet managed to determine whether it is possible to roll your own Windows client using WINE, and even if it is possible, I am a long way off gathering the prerequisites needed to do so. |
Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
Quote: "Replace FORTRAN compiler as necessary or if people here who build their own client know better then take their word for it." It will be interesting to see what error text you get ... the Web site reports Byron’s failed models as a more cautious “15 (0x0000000F) Unknown error code”, and only commits to an interpretation in the stderr listing. |
Joined: 15 May 09 Posts: 4538 Credit: 19,008,987 RAC: 21,524 |
Batch 834 (micro batch of 10 IFS tasks). And all gone. They came and went during the hour-long back-off on my laptop, so I'm waiting to see if the hadcm3s make it this time. Edit: Any observations by those who picked up tasks from this test batch will be most welcome. Edit2: Two of them have failed at the file unpacking stage, so 0 seconds run time. I guess that was why this was a small test batch. |
Joined: 15 May 09 Posts: 4538 Credit: 19,008,987 RAC: 21,524 |
Quote: "Edit2: Two of them have failed at the file unpacking stage, so 0 seconds run time. I guess that was why this was a small test batch." But five have completed successfully so far. |
Joined: 9 Sep 04 Posts: 228 Credit: 30,750,791 RAC: 3,898 |
No WU transfer? BOINC Manager shows: 25.09.2019 14:18:47 | climateprediction.net | No tasks sent 25.09.2019 14:18:47 | climateprediction.net | No tasks are available for UK Met Office Coupled Model Full Resolution Ocean 25.09.2019 14:18:47 | climateprediction.net | No tasks are available for UK Met Office HadCM3 short But the server status page shows 1607 UK Met Office HadCM3 short tasks. Is this a server failure? Ahhhhh, only for Linux/x86[/color] [color=indigo]!!!! (Layout tags shown?!) Greetings bonsai911 |
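
The re-issue scheme described earlier in the thread (a failed task is issued again, up to a third time, and the count is visible as the number at the right-hand end of the task name) can be sketched in Python. This is only an illustration: the function names are made up here and are not part of BOINC or CPDN, and it assumes the trailing number starts at 0 for the first issue, which the thread does not state explicitly. The sample name is the Batch 833 task quoted above.

```python
# Hypothetical helpers for illustration; not part of BOINC or CPDN.
MAX_ISSUES = 3  # per the thread: a task is issued up to a 3rd time, then dumped


def issue_index(task_name: str) -> int:
    """Return the number at the right-hand end of a task name."""
    _, _, suffix = task_name.rpartition("_")
    if not suffix.isdigit():
        raise ValueError(f"no trailing issue number in {task_name!r}")
    return int(suffix)


def describe_issue(task_name: str) -> str:
    """Describe which attempt a task name represents."""
    # Assumption: suffix 0 is the first issue, 1 the second, 2 the third.
    attempt = issue_index(task_name) + 1
    kind = "first issue" if attempt == 1 else "re-issue"
    last = " (last attempt)" if attempt >= MAX_ISSUES else ""
    return f"attempt {attempt} of {MAX_ISSUES}: {kind}{last}"


if __name__ == "__main__":
    print(describe_issue("wah2_pnw25_c15j_201709_16_833_011891432_0"))
```

Under that assumption, a name ending in `_0` is a task on its first issue, so tasks that "show up" with `_1` or `_2` are re-issues of earlier failures and not new work.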
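
The point made above about exit code 15, that the number may matter while the words attached to it may not, can be illustrated with a small Python sketch. The tables here are hand-written for illustration and are not real BOINC or operating system lookups; the `application` entry is a placeholder, since the thread does not establish what 15 means to the model itself. The fallback string copies the format the Web site is reported to use.

```python
# Illustrative only: tiny hand-written tables, not real BOINC or OS lookups.
MESSAGE_TABLES = {
    # Windows system error 15 is ERROR_INVALID_DRIVE.
    "windows": {15: "The system cannot find the drive specified."},
    # Placeholder: what 15 means to the model itself is not known from the thread.
    "application": {15: "application-specific meaning (unknown here)"},
}


def interpret(code: int, table: str) -> str:
    """Look up an exit code in the named table, falling back to a neutral label."""
    messages = MESSAGE_TABLES.get(table, {})
    if code in messages:
        return messages[code]
    # Neutral fallback in the style quoted from the Web site.
    return f"{code} (0x{code:08X}) Unknown error code"


if __name__ == "__main__":
    for table in ("windows", "application", "website"):
        print(f"{table}: {interpret(15, table)}")
```

The same integer produces three different texts depending on which table is consulted, which is why the stderr wording alone is weak evidence of an actual drive problem.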