Message boards : Number crunching : Miscellaneous problems
Author | Message |
---|---|
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Reusing an old post of mine, with a new title. Use this for any general problems not about credits or uploads. Hopefully the "can't create a new thread" problem will be solved before too long. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Project's front page is down. Reported. Time for stronger string to hold everything together. :) |
Send message Joined: 9 Apr 07 Posts: 7 Credit: 1,630,807 RAC: 0 |
1) Reporting finished WUs takes about 20-30 seconds; it used to take 5 seconds at most. 2) Reporting a finished WU does not erase the WU from your computer. |
Send message Joined: 15 May 09 Posts: 4542 Credit: 19,039,635 RAC: 18,944 |
Because I only see reporting when I have had network activity suspended, I haven't noticed 1 yet. I haven't noticed 2 recently either, though it always happened with the "short models" on Linux. Currently the uploading problems seem to be preventing a task from reporting, or perhaps it just won't report as finished until all the uploads have gone? |
Send message Joined: 15 May 09 Posts: 4542 Credit: 19,039,635 RAC: 18,944 |
2) Report of a finished WU does not erase the WU from your computer... Could you give the batch numbers of the ones that are leaving the folders behind? I am assuming you mean the folders are left, rather than the task staying on the list in BOINC after being reported. I have experienced neither problem recently, so I assume it is just some batches. (I will check all my machines when they next complete tasks, in case I have to get into hat eating.) |
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
Yes, some tasks are not cleaning up after themselves when they finish. Task wah2_eu25_m304_202312_12_316_010289235 has finished and reported. All zip files and trickles have uploaded. It is gone from the list in BOINC Manager, but it is still present in the “Projects” folder. |
Send message Joined: 15 May 09 Posts: 4542 Credit: 19,039,635 RAC: 18,944 |
Just checked all my machines and there are no batch 316 tasks on any of them. I will wait a day to see if any more batch numbers are reported as affected, and then let Andy know. Of course the real problem is the people who don't check the fora and find themselves running out of disk space. Those of us in the know can just delete the folders. |
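For anyone who wants to check a machine for leftover folders from a given batch, here is a minimal Python sketch. It rests on assumptions: the field layout of the task name is inferred only from the names quoted in this thread (batch number second from last in the folder name, workunit number last, optional attempt suffix on task names), the other field names are hypothetical labels, and the leftover folders are assumed to carry the task name. It only lists matching folders and deletes nothing.

```python
import re
from pathlib import Path

# Layout inferred from names quoted in this thread, e.g.
# wah2_eu25_m304_202312_12_316_010289235 (batch 316, workunit 010289235).
# Only "batch" and "workunit" are confirmed by the posts; the other
# group names are hypothetical labels for illustration.
TASK_NAME = re.compile(
    r"^(?P<app>[a-z0-9]+)_(?P<region>[a-z0-9]+)_(?P<model_id>[a-z0-9]+)_"
    r"(?P<field4>\d+)_(?P<field5>\d+)_(?P<batch>\d+)_(?P<workunit>\d+)"
    r"(?:_(?P<attempt>\d+))?$"
)


def parse_task_name(name):
    """Return the name's fields as a dict, or None if it does not match."""
    match = TASK_NAME.match(name)
    return match.groupdict() if match else None


def leftover_folders(project_dir, batch):
    """List sub-folder names in project_dir that belong to the given batch."""
    found = []
    for entry in sorted(Path(project_dir).iterdir()):
        if not entry.is_dir():
            continue  # skip plain files such as zips
        fields = parse_task_name(entry.name)
        if fields is not None and int(fields["batch"]) == batch:
            found.append(entry.name)
    return found
```

Calling `leftover_folders(path_to_project_folder, 316)` would list candidates for batch 316. Compare the result with the task list in BOINC Manager before removing anything, since a folder for a task that is still running would match too.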
Send message Joined: 28 Nov 15 Posts: 50 Credit: 4,099,809 RAC: 0 |
Hi all. Last night 3 tasks crashed simultaneously, displaying some sort of Visual Fortran error, during an AVG upgrade. Somewhat strange, as the tasks are on an SD card (known as Drive H on my desktop) and that drive is on AVG's exception list. I can't recall the task names this morning, but I know that 1 ended with _0 and the others ended in _1. Is this a known bug? Should I be using a different antivirus product? Any thoughts welcome. Enjoy your day, Vicki |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
should I be using a different antivirus product? I avoid anti-viruses as much as possible, which is easy since all my CPDN work is on dedicated machines that are not exposed to the usual modes of transmission. But on my main PC, I have tried various AVs with minimal success. They are always monitoring something in the system, even with the "exceptions". At present, I am just using Microsoft Security Essentials, which generally does not cause problems (Win7 64-bit). I am of the opinion that if you need a very aggressive AV, then you are doing something wrong and should change your habits anyway. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
I am of the opinion that if you need a very aggressive AV, then you are doing something wrong and should change your habits anyway. Or just doesn't know the level of protection needed, what each one really does, etc. Vicki, generally it's a good idea to Suspend everything one by one, and then Exit from BOINC before doing ANY upgrades. The Visual Fortran error is because the main climate program is written in Visual Fortran. Years ago I used AVG, until an upgrade started deleting models, in spite of all the excluding I tried. So within a few hours of downloading that upgrade, I got rid of AVG. I did a lot of reading and thinking before deciding to use Microsoft Security Essentials, based mostly on a feeling that I was being careful anyway and had other programs for checking everything. It might not have been the best, but it did do one thing very well: stop Windows from complaining about no AV. Towards the end, I even found out that I could keep changing the day of the week when it ran, so that it NEVER ran, and didn't complain about not having been run. :) |
Send message Joined: 4 Oct 13 Posts: 27 Credit: 2,301,681 RAC: 7,632 |
I had 2 WAH2 tasks fail back-to-back after I restarted Boinc Manager after a machine reboot for Windows updates. I always perform a normal, routine, controlled Boinc Manager shutdown before a reboot: File > Exit Boinc > Stop running tasks when exiting the BOINC Manager (checked). The failed tasks are WUs 10371899 and 10351844. Outcome: Client error; Client State: Compute error; Validate State: Invalid. Here is what is in the Boinc Manager Event Log:
5/11/2016 7:40:32 AM | climateprediction.net | Message from task: 0
5/11/2016 7:40:32 AM | climateprediction.net | Computation for task wah2_eu25_d756_193612_13_366_010351844_0 finished
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_3.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_4.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_5.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_6.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_7.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_8.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_9.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_10.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_11.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_12.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_13.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_14.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:41:02 AM | climateprediction.net | Message from task: 0
5/11/2016 7:41:02 AM | climateprediction.net | Computation for task wah2_eu25_i76p_198612_13_366_010371899_1 finished
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_2.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_3.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_4.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_5.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_6.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_7.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_8.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_9.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_10.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_11.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_12.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_13.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_14.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 8:40:49 AM | climateprediction.net | Sending scheduler request: To report completed tasks.
5/11/2016 8:40:49 AM | climateprediction.net | Reporting 2 completed tasks
That's a lot of processing time down the drain. Any ideas why this happened? Thanks. |
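A long Event Log excerpt like the one above is easier to read once it is summarised per task. The following is a rough Python sketch that counts the "Output file ... absent" messages for each task in pasted log text. The function name is made up for illustration, and the pattern is based only on the message wording shown in this post, so other BOINC versions may word it differently.

```python
import re
from collections import Counter

# Matches the wording seen in the log above. \s+ is used instead of a
# single space so that entries wrapped across lines are still found.
ABSENT = re.compile(r"Output\s+file\s+(\S+)\s+for\s+task\s+(\S+)\s+absent")


def count_absent_outputs(log_text):
    """Return {task_name: number of output files reported absent}."""
    counts = Counter()
    for match in ABSENT.finditer(log_text):
        counts[match.group(2)] += 1  # group 2 is the task name
    return dict(counts)
```

Run on the log in this post, it should report 12 absent files for the task ending in 010351844_0 and 13 for the task ending in 010371899_1. That pattern is what one would expect if the later zip files were never produced because the model stopped early.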
Send message Joined: 4 Oct 13 Posts: 27 Credit: 2,301,681 RAC: 7,632 |
Ah, I just read Les' recommendation to suspend everything before exiting Boinc Manager. Would doing this prevent the crashes I experienced? Thanks. |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,901,585 RAC: 2,106 |
Ah, I just read Les' recommendation to suspend everything before exiting Boinc Manager. Not necessarily. The same thing just happened to a suspended WAH2 model on one of my Windows 10 machines after one of these forced updates: they are an absolute curse. "3 AM seems a good time to reboot", says the idiotic Windows Update. Actually, that's for me to decide IMHO. |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
Iain, can't you choose to have Windows notify you when an update requires a reboot? Then schedule it for some time in the future, but reboot before that time, when you have a chance to shut down BOINC cleanly? I haven't any system on Windows 10 yet, but likely will in the near future. |
Send message Joined: 4 Oct 13 Posts: 27 Credit: 2,301,681 RAC: 7,632 |
Iain, thanks for your response. Was BOINC Manager up and running at the time of the reboot, even though the WAH2 model was suspended? There's a setting in Win 10 to tell it NOT to reboot after updates until you explicitly give it the OK to do so. I have it set on my Win 10 machine that doesn't do any BOINC, and it has never rebooted on its own, at least not yet. Have you set this option on yours? If you have, are you saying that it went ahead and rebooted on its own without your input? |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,901,585 RAC: 2,106 |
[geophi wrote:]Can't you choose to have Windows notify you when an update requires a reboot? Then schedule it in the future some time, but reboot before that time when you have a chance to shutdown boinc cleanly? I haven't any system on Windows 10 yet, but likely will in the near future. That's pretty much what I do. However, the recent WAH2 casualty on restart makes me rather resent the required intervention: all WAH2 tasks will now have to be completed before manually rebooting. I'll do nothing next patch Tuesday and look at the event log to see what if anything happens automatically and remind myself of the time periods offered. Perhaps I've interpreted the language as being stricter than it is ... |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,901,585 RAC: 2,106 |
[MossyRock wrote:]Was BOINC Manager up and running at the time of the reboot, even though the WAH2 model was suspended? No. I suspend the models individually (i.e. I don't suspend the project) then close the BOINC client and exit BOINC Manager before a reboot. This is because starting two WAH2 models at the same time can crash one of them (i.e. the model fails on my machine but the reissue has sometimes succeeded on a similar machine, arguing that the model was not an inevitable failure), so on restarting I start each WAH2 then wait a few minutes then start the next etc. This has worked up until recently. |
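The staggered restart described above (resume one WAH2 model, wait a few minutes, then resume the next) could be scripted roughly as follows. This is only a sketch under assumptions: it uses the `boinccmd --task URL task_name resume` form of BOINC's command-line tool, it assumes `boinccmd` is on the PATH and allowed to talk to the local client, and the project URL and the 300 second delay are placeholders rather than recommended values.

```python
import subprocess
import time

# Assumption: use the project URL exactly as your own client lists it.
PROJECT_URL = "http://climateprediction.net/"


def resume_command(task_name, project_url=PROJECT_URL):
    """Build the boinccmd argument list that resumes one suspended task."""
    return ["boinccmd", "--task", project_url, task_name, "resume"]


def staggered_resume(task_names, delay_seconds=300,
                     run=subprocess.run, sleep=time.sleep):
    """Resume tasks one at a time, pausing between consecutive tasks."""
    for index, name in enumerate(task_names):
        if index > 0:
            sleep(delay_seconds)  # give the previous model time to start
        run(resume_command(name), check=True)
```

The `run` and `sleep` parameters exist so the sequence can be tried out with stand-in functions before letting it touch a real client.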
Send message Joined: 4 Oct 13 Posts: 27 Credit: 2,301,681 RAC: 7,632 |
Suspending work units individually causes new work units, that are ready to start, to begin running to "fill in the hole." This can cause quite a mess, especially if there are new CPDN models in your queue that are ready to start. You will end up with the ones that were running originally, now suspended, plus the new ones that start that you have to suspend also. Is there a way to prevent new work units from starting as you go about suspending work units individually? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
"Individually suspend work units" is only the short form of the answer, to jog people's memory. The full answer depends on several things, which need to be worked out by individuals. Some of these, not necessarily needed here, are: suspend network access (in the menu); suspend the project (in the Projects tab); suspend all pending models FIRST, then the running ones. And, because of the huge number of data sets waiting to be downloaded at present, there's no need for a large queue. I wait until my current models finish before downloading more. This lets me see what's available, which may be newer than what would have been downloaded way back, and some of them may be of more interest, for lots of reasons. |
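The "pending first, then running" order is what stops a waiting model from starting to fill the gap left by a suspended one. As a small Python sketch of that ordering only: the function names and the `(name, is_running)` input shape are invented for illustration, the `boinccmd --task URL task_name suspend` form is assumed to be available, and the project URL is a placeholder.

```python
# Assumption: use the project URL exactly as your own client lists it.
PROJECT_URL = "http://climateprediction.net/"


def suspend_order(tasks):
    """Order task names so that pending ones come before running ones.

    tasks is a list of (name, is_running) pairs. Relative order within
    each group is preserved.
    """
    pending = [name for name, is_running in tasks if not is_running]
    running = [name for name, is_running in tasks if is_running]
    return pending + running


def suspend_commands(tasks, project_url=PROJECT_URL):
    """Build one boinccmd suspend command per task, in a safe order."""
    return [["boinccmd", "--task", project_url, name, "suspend"]
            for name in suspend_order(tasks)]
```

With every waiting model suspended before any running one, nothing is left in the queue that is eligible to start when the running models are finally suspended.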
Send message Joined: 15 May 09 Posts: 4542 Credit: 19,039,635 RAC: 18,944 |
[Ian Inglis Wrote] so on restarting I start each WAH2 then wait a few minutes then start the next etc Interesting. I will try this next time I reboot one of my two machines running tasks directly under Linux, as opposed to the other two that are pretending to be Windows machines. On both of these machines I have had some tasks fail when I restarted two tasks of the same type within seconds of each other. At least Linux just gives me a message telling me that I need to reboot to use my updated software. |
©2024 cpdn.org