Message boards : Number crunching : New work Discussion
Author | Message |
---|---|
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,001,532 RAC: 21,726 |
One of the w/u has crashed with Signal 11 received: Segment violation. There have been batches in the past where all have failed with this. Several other batches have had a small percentage with this error. I and other mods will keep an eye open to see which this is. I don't have any of the SAFR tasks running at the moment, but 43 of batch 789 have completed successfully so far. |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,001,532 RAC: 21,726 |
I also see that both seem to have crashed shortly after uploading the 8th zip file. 43 of batch 789 had finished successfully when I last looked, but about 9% have failed. The project has been informed and will be keeping an eye on this. |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
The first three of my 789 (as well as the first of my 790) that have run on my new Ryzen 2600 machine have failed. https://www.cpdn.org/cpdnboinc/results.php?hostid=1480861&offset=0&show_names=0&state=6&appid= The run times look great for the remaining 789's; about 4 days 8 hours. That is running on 9 cores, with 3 cores free. |
Send message Joined: 3 Sep 04 Posts: 105 Credit: 5,646,090 RAC: 102,785 |
It seems that all 3 w/u have failed with: Segment violation. I have one left to do but may as well abort it. 3 out of 3 failing is not encouraging for the 4th w/u, which is a 789 I think. |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
I would keep it going. Mine failed at 7, 8 or 9 trickles. The others have gone to 10 trickles or more by now, and might make it. |
Send message Joined: 3 Sep 04 Posts: 105 Credit: 5,646,090 RAC: 102,785 |
I was finding that the Intel i5 Win10 laptop was having many, many wireless dropouts while doing the w/u. Now that they have all died, the machine seems to be fine again. A coincidence maybe. It was fine with other climate w/u. Once, back in the depths of time, I had what was called a "farm" of some 55-60 computers doing several projects. That was before the arrival of the mighty Pentium 4, when a Pentium Pro was still good enough for SETI work. Rising eleccy prices mean I usually only use 1 laptop nowadays, with other machines joining in occasionally. I have not found many aliens yet. |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
My first 793 resulted in a download error. I may never get to find out if they run or not. |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,001,532 RAC: 21,726 |
Another message to the project I think. Steven thinks 790's may fail more than 789's just because they are longer at twenty months. |
Send message Joined: 27 Feb 08 Posts: 41 Credit: 1,402,356 RAC: 0 |
I too am getting errors: https://www.cpdn.org/cpdnboinc/result.php?resultid=21509158 https://www.cpdn.org/cpdnboinc/result.php?resultid=21518555 Regards, Bob P. |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,803,756 RAC: 5,187 |
And another slug of Australia and New Zealand (batch #793: 8320 x ANZ at 50 km resolution for 20 months, batch list). [Edit: As mentioned by geophi above, this is batch #794 not #793.] |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
I believe Iain meant batch 794 in his last post. Also, 480, 60-month SAM25 tasks released in batch 795. These are like the long-running ones released in early November of last year. Will likely take 25+ days on the fastest PCs. |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 30,971,756 RAC: 14,149 |
Also got a failure: 08-Mar-2019 18:22:40 [climateprediction.net] Computation for task wah2_eas50_e2cw_201704_18_787_011733412_0 finished 08-Mar-2019 18:22:40 [climateprediction.net] Output file wah2_eas50_e2cw_201704_18_787_011733412_0_r48652267_16.zip for task wah2_eas50_e2cw_201704_18_787_011733412_0 absent 08-Mar-2019 18:22:40 [climateprediction.net] Output file wah2_eas50_e2cw_201704_18_787_011733412_0_r48652267_17.zip for task wah2_eas50_e2cw_201704_18_787_011733412_0 absent 08-Mar-2019 18:22:40 [climateprediction.net] Output file wah2_eas50_e2cw_201704_18_787_011733412_0_r48652267_18.zip for task wah2_eas50_e2cw_201704_18_787_011733412_0 absent 08-Mar-2019 18:22:40 [climateprediction.net] Output file wah2_eas50_e2cw_201704_18_787_011733412_0_r48652267_restart.zip for task wah2_eas50_e2cw_201704_18_787_011733412_0 absent 08-Mar-2019 18:22:40 [climateprediction.net] Output file wah2_eas50_e2cw_201704_18_787_011733412_0_r48652267_out.zip for task wah2_eas50_e2cw_201704_18_787_011733412_0 absent |
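A log like the one above can be summarised with a short script. This is only a sketch, assuming the messages keep the exact wording shown in the post ("Output file ... for task ... absent"); the function name `absent_outputs` is made up for illustration and is not part of BOINC.

```python
import re

# Pattern taken from the wording of the log lines quoted in the post above.
ABSENT = re.compile(r"Output file (\S+) for task (\S+) absent")


def absent_outputs(log_text):
    """Map each task name to the list of output files reported absent."""
    missing = {}
    for match in ABSENT.finditer(log_text):
        filename, task = match.group(1), match.group(2)
        missing.setdefault(task, []).append(filename)
    return missing
```

Pasting the event log text into `absent_outputs` gives one entry per task, which makes it easier to see which zip files never got written before the task ended.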
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
"Also, 480, 60-month SAM25 tasks released in batch 795. These are like the long-running ones released in early November of last year. Will likely take 25+ days on the fastest PCs." I presume that many people will try to run them on notebooks. It will be a disaster, or at least a waste of computing time with all the errors. They should allow us to select the ones we want to do. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
I think that a better idea is to be selective about the computers. The Requirements page is still OK for the older models, but will be hopeless for the new ones. I'm going to start a discussion about it, starting with at least upgrading the minimum requirements. And I think a new thread here to talk about the monster machines that are starting to show up. |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
Great idea. I think anything you do to better match the wide range of work to the wide range of computers will help. The crunchers will be happier too. |
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,001,532 RAC: 21,726 |
"I believe Iain meant batch 794 in his last post." 793 are the actual runs, 794 the Natural, but it is too early and I can't remember exactly what that means. |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,803,756 RAC: 5,187 |
Small batch #796 of 38 global models at 25 km resolution for 1 month (batch list). |
Send message Joined: 13 Jan 07 Posts: 195 Credit: 10,581,566 RAC: 0 |
With these longer models now coming on stream, I have gone back to the old days and once more started making regular backups of BOINC_Data so that power outages, vacuum cleaner interference et al don't lose a shed load of invested time. |
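For anyone wanting to script the kind of backup described above, here is a minimal sketch, not a recommended procedure. It assumes the BOINC client has been stopped first (copying the data directory while models are running may capture an inconsistent state) and that the backup folder sits outside the data directory. The function name and archive naming are made up for illustration.

```python
import shutil
import time
from pathlib import Path


def backup_boinc_data(data_dir, backup_dir, stamp=None):
    """Zip the contents of data_dir into backup_dir; return the archive path.

    Assumption: BOINC is not running, and backup_dir is not inside data_dir.
    """
    data_dir = Path(data_dir)
    backup_dir = Path(backup_dir)
    if not data_dir.is_dir():
        raise FileNotFoundError(f"no such directory: {data_dir}")
    backup_dir.mkdir(parents=True, exist_ok=True)
    # A timestamp in the name keeps older backups instead of overwriting them.
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    base = backup_dir / f"boinc-data-{stamp}"
    archive = shutil.make_archive(str(base), "zip", root_dir=str(data_dir))
    return Path(archive)
```

Restoring is the reverse: with BOINC stopped, unpack the chosen archive over the data directory. The actual location of BOINC_Data varies by installation, so it is passed in rather than guessed here.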
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
Batches 797 and 798 are SAM 25km models of 24 and 13 months respectively. There are 3000+ tasks in each batch. My i7 grabbed 2 of the batch 797 tasks and they each failed with a Signal 11 error 2 minutes into the run. |
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
"With these longer models now coming on stream, I have gone back to the old days and once more started making regular backups of BOINC_Data so that power outages, vacuum cleaner interference et al don't lose a shed load of invested time." Me too. Losing a WU after 2 or 3 months of crunching due to a power failure is heartbreaking. |