Questions and Answers : Windows : Comments for 'Generic solutions to models' sticky
Author | Message |
---|---|
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
This is the comments thread for the 'generic solutions' sticky post. Please post any queries, suggestions, and so forth here. I'm a volunteer and my views are my own. News and Announcements and FAQ |
Joined: 4 Dec 05 Posts: 1 Credit: 49,802 RAC: 0 |
* Before playing games or other heavy duty applications (high CPU or memory usage), set 'no more work' against the project and 'suspend' the model - that way they won't tread on each other's toes. Sometimes simultaneous use of graphics drivers from two different programs seems to cause problems. This one worked for me. I found that my CP project started playing up after I ran GoogleEarth, which is pretty hungry for CPU, memory and graphics resources. It resulted in the sulphur_um_4.22_windows_intelx86.exe process constantly spawning a new process which closed almost immediately. I had to set the project to Run Always to fix it, but even then it reset from 40% to 0% and took a while to sort itself out properly. Next time I ran GoogleEarth I suspended the CP project first and it Resumed quite happily afterwards. |
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
From the BBC Boards: Martin Smith |
Joined: 12 Jul 05 Posts: 1 Credit: 0 RAC: 0 |
If you have multiple processors and therefore might have several Climate Prediction work processes, do not run the graphics for more than one of them at a time. This causes at least one of them to crash and you have to start all over again. |
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
Error code 99, AKA the 'killer trickle': This is a remote kill command sent to shut down models which contain errors. Many of these have been sent out today (April 16th). For the reason why, please see the announcement: http://www.climateprediction.net/board/viewtopic.php?t=4697 Please do not restore models shut down in this way. |
Joined: 5 Aug 04 Posts: 172 Credit: 4,023,611 RAC: 0 |
I have noticed one thing about the crash during heavy CPU usage. The CPDN application is split into two pieces: one that uses almost no CPU (M, if I recall correctly) and one that uses all of the available CPU (UM, if I recall). In any case, on a single-CPU system with one CPDN WU running, multiple UM processes were started. I believe that if this could be prevented, this crash would stop happening. The only time that CPDN crashed on that machine, multiple CPDN processes were started for the same result. I believe that if the first thing that happened during the execution of UM were to create a mutex based on the name of the result, then this crash would stop. Let us know when it has been fixed. |
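The mutex idea in the post above can be sketched roughly as follows. This is only an illustration under stated assumptions, not CPDN's or BOINC's actual code: the function names `acquire_result_lock` and `release_result_lock`, the lock directory, and the result name `example_result` are all hypothetical, and the sketch swaps the Windows named mutex the poster describes for a portable lock file created atomically with `O_EXCL`.

```python
import os
import tempfile


def acquire_result_lock(result_name, lock_dir):
    """Try to claim the lock for result_name (hypothetical helper).

    Returns the lock file path on success, or None if another
    process already holds the lock for the same result.
    """
    lock_path = os.path.join(lock_dir, result_name + ".lock")
    try:
        # O_CREAT | O_EXCL makes creation atomic: the call fails
        # if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))  # record the owner for debugging
    return lock_path


def release_result_lock(lock_path):
    """Release a lock previously returned by acquire_result_lock."""
    os.remove(lock_path)


def demo():
    """Show that a second start for the same result is refused."""
    with tempfile.TemporaryDirectory() as lock_dir:
        first = acquire_result_lock("example_result", lock_dir)
        second = acquire_result_lock("example_result", lock_dir)  # refused
        release_result_lock(first)
        third = acquire_result_lock("example_result", lock_dir)  # allowed again
        release_result_lock(third)
        return first is not None, second is None, third is not None
```

One difference from a real operating-system mutex is worth noting: a lock file is left behind if the process crashes, whereas a named mutex is released when its owning process exits, so a real implementation would need some form of stale-lock handling.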
Joined: 5 Aug 04 Posts: 22 Credit: 7,271,105 RAC: 0 |
Hi there, as suggested in a linked thread, I tried the solution of updating the graphics driver. My machine produced some -1073741819 (0xc0000005) errors, but Prime95 was stable for over 24 hours. A new driver (Catalyst 5.9, the last one without CCC, for the Radeon 7000) seems to have solved the problem. But what the heck does the graphics driver have to do with CPDN? And why does this error appear only now? Greetings, Micha |
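For anyone wondering how the two numbers in the post above relate: -1073741819 and 0xc0000005 are the same 32-bit value, shown once as a signed decimal and once as unsigned hexadecimal (0xC0000005 is the Windows access-violation status code). A minimal sketch of the conversion, with hypothetical helper names `to_unsigned32` and `to_signed32`:

```python
def to_unsigned32(code):
    """Reinterpret a (possibly negative) exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


def to_signed32(code):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    code &= 0xFFFFFFFF
    # If the top bit is set, the value is negative in two's complement.
    return code - 0x100000000 if code & 0x80000000 else code


print(hex(to_unsigned32(-1073741819)))  # 0xc0000005
print(to_signed32(0xC0000005))          # -1073741819
print(hex(128))                         # 0x80
```

The last line shows the same relationship for the "exit code 128 (0x80)" message quoted elsewhere in this thread; small positive codes are unchanged by the conversion.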
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
There are no child processes to wait for. (0x80) - exit code 128 (0x80) There is an announcement on the BOINC news site: Microsoft Windows has a component called DirectX that manages graphics and sound. If you are running BOINC on Windows and you don't have a recent version (9.0c or later) of DirectX, applications may crash with error messages like the one above. |
Joined: 8 Dec 05 Posts: 21 Credit: 215,749 RAC: 0 |
Hmmm, after 45% or so, my hadcm crashed... The PC wasn't being used any differently and I do have BOINC excluded from Norton... |
Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
"Hmmm, after 45% or so, my hadcm crashed... The PC wasn't being used any differently and I do have BOINC excluded from Norton..." Yep, it was a -161 error. Did you work through Mike's recommendations (first post in this Thread)? Hope you had a recent backup, that's a large investment to lose... "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Joined: 21 Feb 05 Posts: 24 Credit: 991,032 RAC: 0 |
About the 161 error: I had 10 of 12 runs terminated by a 161 error and had a look at the yabsd.out file, but there is a flood of messages I could not make any sense of. Do you know of a description or explanation of the content of this file? My last job, hadcm3lbm_4eh3_05161524, terminated today without leaving any out folder or file. Do you know what happened there? Thanks for the help, Jochen from Old Germany *** Since I'm a fool I proved that the system is not foolproof ;-) *** |
Joined: 10 Oct 05 Posts: 3 Credit: 26,902 RAC: 0 |
Well, the work units just keep returning Client Error -161 over and over on this machine. I'm getting a bit tired of this, since I understand most unfinished models (<10%) are useless for research, I don't enjoy credit which wasn't actually beneficial, and it's swallowing time on this dual-processor XP system which could benefit other BOINC projects. There might be a breakthrough: recently the computer started crashing again and again with a blue screen. I followed the clues on the crash screen and ended up downloading a display/graphics driver (I don't remember the details). Since then the project has been running for more than 20 days. The fact that I set 'no more work' might also be related, in that there's no way for two CP models to run now (whereas previously they did many times, and only now do I see I should've set one of their graphics off). I hope it'll get better, 'cause only about a third of my 22k credit is justified: a big unit on a single-CPU machine, and a small one. I hope it'll be possible to solve this at the BOINC/CP end; this is one of the projects I most want to contribute my infinitesimal help to. Thanks, gady b thaye.net |
Joined: 3 Sep 04 Posts: 126 Credit: 26,610,380 RAC: 3,377 |
Leaving the graphics window open while switching to another user (Windows XP) crashed a workunit. It happened with workunits from other projects as well. |
Joined: 1 Sep 06 Posts: 11 Credit: 4,627 RAC: 0 |
I have run climateprediction twice before and started again today. With both previous attempts the models hung my computer: mouse, keyboard and graphics were all frozen solid, and I had to reset or switch off. I have not seen this mentioned here and wondered if anybody else has had this problem or if anybody has any advice for me. Thanks |
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
The two most common reasons are: * Memory (but you have well over the 512MB recommended amount) * CPU overheating. Try the monitoring tools mentioned in an earlier post to get your CPU temp. |
Joined: 1 Sep 06 Posts: 11 Credit: 4,627 RAC: 0 |
I checked and it does not seem to be a temperature problem. I am running several projects, and running both cores at 100% for hours keeps the processor at 50°C; only once did I see it at 56°C. The processor fan speed also rarely exceeds 1400 rpm. The MB temp is usually below 40°C, so I don't see a problem there. My problem is that everything hangs and I have to reset or switch off with the power button; nothing else responds. This means I get no error messages. I am running 4 SATA drives and my pagefile and temp directories are on separate drives, separate from program files ... not sure if this can be a problem though, as it is set in the control panel. It happens while using no graphic display, so I doubt that could be the problem. The problem also occurs whether only CPDN is running or other projects are running too. If it is because I use BAM, is it possible to detach only CPDN from BAM and Boincstats? Thank you for the assistance so far. I really want to run this project. |
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
Hi, The temperatures sound OK. I have a similar setup to yours in terms of the paging file and so forth, and it works fine. It may be worth going through the stability checks; Prime95's Torture Test is very good... Does the same thing happen with the Seasonal Attribution project? (attribution.cpdn.org) |
Joined: 12 Dec 05 Posts: 4 Credit: 93,834 RAC: 0 |
Hmmmmf. I just crashed a BBC sim AGAIN. I've read the posts about backups and temperatures and so on, but I think that's all missing the point: this software is very fragile. That is not a compliment. There are many applications that make my machine (AMD 3700+, 3GB RAM) work much harder. It has never failed any app due to heat. I've had four or five sims get up to around 10% and die of something or other. I don't have time to babysit my screensaver. It's frustrating that this app, alone of all the BOINC apps I've tried, is so prone to crashing. Further, I really don't subscribe to the "some programs crash, you know" proposition. If I had paid for sw that behaved like this, I'd be furious. Honestly, wouldn't you? I would like to help CPDN and BBC et al. save the world from climate change. Seriously. However, if my sims all die, my machine might just as well be folding proteins or tracking mosquitoes or eavesdropping on ET or something. My preferred solution is robust code from CPDN. My interim solution will be to drop this project after the next crash. Harrumph. -Rick |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Perhaps you should send a copy of your complaints to the UK Met office. After all, it IS their code, and they run it daily on their supercomputers for weather and climate forecasting. So the sooner they know how terrible it is the better. |
Joined: 12 Dec 05 Posts: 4 Credit: 93,834 RAC: 0 |
"Perhaps you should send a copy of your complaints to the UK Met office." Hi Les, Fine. Let's assume then that the core prediction sw runs perfectly on their supercomputers. That doesn't change a word of my points, it only indicates that the problem is elsewhere, for example in the Windows wrapper or the screensaver functionality. I'm assuming that the UK Met does not run this daily on their supercomputers as a Windows screensaver. -Rick |
©2025 cpdn.org