Questions and Answers : Windows : Is my computer working uselessly?
Author | Message |
---|---|
Joined: 28 May 08 Posts: 16 Credit: 32,985 RAC: 0 |
Hello. I'm new to BOINC. I registered on 2008-05-28 and, 3 days later, I don't see anything change on my account: my quota is stuck at 1 per day and I see messages like "Output file hadsm3fub_jotf_005950294_7_2.zip for task hadsm3fub_jotf_005950294_7 absent". Is everything ok or is there a problem, and how can I solve it? -- Frederic |
Joined: 5 Feb 05 Posts: 465 Credit: 1,914,189 RAC: 0 |
Frederic, your machine is crashing every result you have tried to crunch. The crashes are all the -107 crashes, which are usually hardware related. I noticed you have a mobile CPU, is this a laptop? I also noticed it has a shared memory space for video, are you using the screen saver? Your machine has a light amount of memory for this project. If you want to continue with this project, a few suggestions: Do not use the BOINC / CPDN screen saver; the recommended setting is BLANK. Any of the screen savers on that machine are going to take up valuable resources and could cause crashes. If a laptop, get it off the "desk". If you can use something to raise it a little and allow it to "breathe" more, or better yet buy a laptop cooler. Make sure all the airflow spots are free of dust. If you continue to crash models, then you may want to start doing some intensive testing of your machine; memtest86+ and Prime95 are a couple of good testers. I just noticed your speeds are low for that machine... it is having issues. Make sure you do some of the above cleaning / testing. Good luck, and know that not all machines can handle CPDN. |
Joined: 28 May 08 Posts: 16 Credit: 32,985 RAC: 0 |
Thanks for your answer. Your machine is crashing every result you have tried to crunch. The crashes are all the -107 crashes, which are usually hardware related. Sorry, I don't see the crashes as such in the Messages pane. All I see are "Started download ... / Finished download ... / Starting ... / Starting task ... / Computation for ... finished" and then the "Output ... absent" messages. Nothing I recognize as error messages (except for "absent"). Is this as expected? I noticed you have a mobile CPU, is this a laptop? I also noticed it has a shared memory space for video, are you using the screen saver? Your machine has a light amount of memory for this project. Yes, it is a laptop. Sorry, that's all I have to offer :-( I don't use the BOINC screen saver, though. I used the standard XP screen saver, but I just switched to the blank screen saver to see if it improves memory usage. If a laptop, get it off the "desk". If you can use something to raise it a little and allow it to "breathe" more, or better yet buy a laptop cooler. Make sure all the airflow spots are free of dust. I don't experience any crashes. I own two laptops, but I currently use this one for BOINC because the other one is having stability issues (probably OS-related), and I use this one to remotely access (TS client) a remote server, so it has more CPU time available. Since I use this laptop to access my work, I can assure you that any crash would be immediately noticed! I just noticed your speeds are low for that machine... it is having issues. Make sure you do some of the above cleaning / testing. I am going to send the other laptop to be fixed. When it comes back, I will reinstall the OS and switch to it. Good luck, and know that not all machines can handle CPDN. Thanks. Frederic |
Joined: 5 Feb 05 Posts: 465 Credit: 1,914,189 RAC: 0 |
Frederic, go to your account and view the tasks. They all say "Compute error", which means the result crashed while working. These results should take a few months to finish on your computer, not the few minutes they are taking. You never get to a stage to even send back a trickle (a bit of data that shows the progress your result has made at a certain point in time). Your computer isn't crashing, but the result itself is. |
Joined: 28 May 08 Posts: 16 Credit: 32,985 RAC: 0 |
Thanks, it's much, much clearer now. I hope the aborted tasks are not lost. And I am a bit disappointed by the lack of hints about what is going wrong. I suppose there is nothing you can do about it, but it is frustrating when software aborts without any information about why it did so. I checked the whole directory tree without finding anything. I found a stderr_um.txt, but it is empty :( If tomorrow's job does not work properly, I'm going to unsubscribe from climateprediction; consuming watts uselessly is definitely not a way to help the climate! Frederic |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hello Frédéric, welcome to the forum and to the project. Here is the detailed task page for one of your models: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=7468399 If you click the + beside stderr out you'll see some details about what caused the crash. (These details only show when a model has finished computation, either crashed or completed.) CPDN has 5 collections of README posts containing lots of useful advice. You can get to them by clicking on the link in my signature at the bottom of this post. I recommend you look at the collection about Crashes and Problems. In that collection go to link #6, where MikeMars explains all the frequent causes of model crashes, including -107 code crashes. Link #7 in the same problems collection is a post by Thyme Lawn, who explains how to update graphics card drivers. This is a free download from the web, like Windows updates. Updating these drivers often cures -107 errors, and it's good for the computer. Hope that helps. Mo Cpdn news |
Joined: 28 May 08 Posts: 16 Credit: 32,985 RAC: 0 |
Ah, at least I feel we are going somewhere! I like that! There are a number of common errors which cause many people problems. The first is the Windows Stop message (appears as a Microsoft Send / Don't Send dialogue, and -1073741819 in the log) I believe I have something like it: I found this - exit code -1073741819 (0xc0000005) in stderr.out. BUT I don't have any dialog. I am using XP SP3; maybe it is related. Anyhow, this made me understand that my issue could be firewall- or rather Kerio-related. I found here some information about Kerio and I set my rules accordingly. I will check tomorrow if this solves the issue and report here. Frederic |
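The two numbers in that stderr line are the same 32-bit value written two ways: -1073741819 is the signed decimal reading and 0xc0000005 is the unsigned hexadecimal one. A minimal Python sketch of the conversion, with a function name made up for illustration (it is not part of BOINC or CPDN):

```python
def exit_code_to_hex(code: int) -> str:
    """Reinterpret a signed 32-bit exit code as unsigned hexadecimal."""
    # Masking with 0xFFFFFFFF keeps the low 32 bits, which turns a
    # negative two's-complement value into its unsigned equivalent.
    return f"0x{code & 0xFFFFFFFF:08x}"


print(exit_code_to_hex(-1073741819))  # 0xc0000005
```

On Windows, 0xC0000005 is the status code for an access violation, which is easier to search for in its hexadecimal form than as the negative decimal number.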
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
I think a graphics problem is more probable than your firewall. But it's also a very good idea to exclude the BOINC folder from your anti-virus scans if you do these scans while BOINC is running. Cpdn news |
Joined: 28 May 08 Posts: 16 Credit: 32,985 RAC: 0 |
Here is where I am, now:
- I ran memcheck86 and superpi (32M) without finding anything abnormal (I did not expect to find anything since I had not experienced any crashes on this machine). If my new task crashes too, I'll test with Prime95, just to be sure.
- Since the hardware did not seem to be the culprit, I attacked the firewall. Here is what I did:
Frederic |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
The graphical display only opens and shows your globe if the model's running. Using it and playing round with its display modes shouldn't cause problems, but members who've had possibly graphics-related error codes like 107 and -107 should avoid maximising the graphics window. Keep the globe window small. Have you updated your graphics card drivers? Have a look at the README about backing up the BOINC folder. Les's manual method is easy and works well. With regular backups, if you do have another model crash, you could restore the backup and continue the same model. I back up my BOINC folders regularly even though my computers are running very stably at the moment. Because the models are so long, the probability that they will crash before they complete is quite high even on an exceptionally good computer. Cpdn news |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Frederic, you have a laptop. These use a few chips on the mainboard for the display, as well as some of the mainboard's memory. When anything needing a large amount of memory for the display runs, such as a picture, larger amounts of memory are suddenly needed, and this can "pull the carpet out from under the climate program". The program (and its models) was originally developed for big supercomputers, so people running it on desktops need to have a reasonably well resourced computer to prevent problems. Laptops are worse, because of their more limited memory, their lack of a separate graphics card, their smaller, "lighter duty" HDs, and their reduced cooling capabilities. And the "107..." errors are Windows "stop" errors. (Look them up on the net, or in the Microsoft Knowledge Database.) There seem to be 2 main reasons why they cause problems: 1) Shutting down a computer without first shutting down BOINC, 2) (More often) something to do with the display, usually something using a lot of memory, or an outdated (or generic MS) graphics driver, which isn't handling the requirements of the model's display very well. These errors are not firewall related. Backups: Here |
Joined: 13 Sep 04 Posts: 228 Credit: 354,979 RAC: 0 |
I am running fine on an XP SP3 machine. |
Joined: 28 May 08 Posts: 16 Credit: 32,985 RAC: 0 |
Thanks for all your answers, everyone. Frankly, when I calculated the number of days which I would need to finish my task, I started thinking about quitting, but the help I am getting here makes me want to stay on board. I like the graphics explanation better too, but I want to be sure. Now that I know there is a way to back up and recover from a crash, I am going to check if the interface is the culprit as soon as I have a little time. Frederic |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
On projects where the workunits are short there isn't much time for a disaster to occur, and most computers complete most tasks. For example, if you play graphics-intensive games for 2 hours per day and your graphics card can't manage the games + the task simultaneously, you would probably still complete most short tasks successfully while you're not playing games and never realise that there's a problem. But with a climate model that lasts for months, every day that user could create a possible model crash situation. Fortunately in the case of the HADCM 160 and 80-year models, they upload a decadal zip file to the server after every 10 years. The information contained in these files is used by the researchers even if the model later crashes. In the case of the shorter HADAM and HADSM models I think they need to complete in order to be useful to the researchers. CPDN is almost without doubt the most difficult project to learn to crunch successfully because of the length of the tasks. But we do get credits for every successful trickle even if the model later crashes. I think most new CPDN members probably crash several models before they learn what to do and what not to do. I crashed a few. Members who would prefer a shorter model next time they download one can select the model type in their CPDN preferences in their account. This may not be possible today because there are some server database problems at the moment. Cpdn news |
Joined: 28 May 08 Posts: 16 Credit: 32,985 RAC: 0 |
If I continue at my current rate, my next download should be in about 270 days (plus the unplanned but unavoidable delays). I guess I have time enough to choose an easier job for next time :-D Frederic |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
The choice of models may then be different. Cpdn news |
Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
Thanks for all your answers, everyone. Frankly, when I calculated the number of days which I would need to finish my task, I started thinking about quitting, but the help I am getting here makes me want to stay on board. Hi, I am glad to see that you are continuing with the project. Backups are vital. I am running the 3 coupled models on 2 laptops and I can tell you that I never get through one without restoring it half a dozen times! Making backups is easy and once you have done it a few times you can do it in your sleep. I make one every morning (that way I only lose a few hours of crunching time). It only takes about 5 minutes. If you only make a backup once a week, the model will proudly crash on day 6 and you have to repeat the whole week. Happy crunching. Jim |
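Jim's point about backup frequency can be put as simple arithmetic: the work lost on a restore is the time elapsed since the most recent backup. A minimal sketch, assuming backups are taken at fixed intervals starting at hour 0 and that the model resumes from the last backup before the crash (the function name and the numbers are illustrative only):

```python
def hours_lost(crash_hour, backup_interval_hours):
    """Crunching time lost when restoring the most recent backup.

    Assumes backups happen at hours 0, interval, 2*interval, ... and that
    the model restarts from the last backup taken before the crash.
    """
    return crash_hour % backup_interval_hours


# Crash late on day 6 (hour 143): weekly backups versus daily backups.
print(hours_lost(143, 7 * 24))  # 143
print(hours_lost(143, 24))      # 23
```

With a weekly backup almost six days of crunching are repeated; with a daily backup the loss stays under one day.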
Joined: 28 May 08 Posts: 16 Credit: 32,985 RAC: 0 |
Thanks for the idea, Jim. To make things easier, I just created a small batch file which suspends, backs up and restarts the project. Now that I know how to do it, I will create another to suspend and back up BOINC before shutting off the computer. I believe I saw tools that allow you to launch tasks before closing Windows. Frederic |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi again. Make sure your batch file also exits from BOINC, which is what you do manually by File > Exit in BOINC manager, or by right-clicking on the system tray fried-egg-on-grill icon then selecting Exit. If you make a backup without exiting from BOINC first, there's no real guarantee that it will restore successfully and that the tasks will run. Cpdn news |
Joined: 28 May 08 Posts: 16 Credit: 32,985 RAC: 0 |
Funny, I just found this out! I was using 7zip (command line version of course) to compress and I saw that 7zip complained that some files were still locked (although in the BOINC GUI the project is suspended). I believe there are some zip switches which could solve the lock issue, but I'd rather compress when all locks are removed. So is the Suspend-then-Exit procedure completely ok? Is it any use to do: boinccmd --project http://climateprediction.net suspend, then boinccmd --quit, then (zip), then boincmgr; or is --quit enough ... or is there something else required? Frederic |
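The suspend, quit, zip sequence described in this post could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the batch file from the thread: the function names, the data folder path and the fixed pause before zipping are all hypothetical, and Python's zipfile module stands in for 7zip. The two boinccmd invocations are the ones written in the post.

```python
import subprocess
import time
import zipfile
from pathlib import Path

PROJECT_URL = "http://climateprediction.net"  # URL as written in the post


def boinc_commands(project_url=PROJECT_URL):
    """Commands to run before the backup, in order."""
    return [
        ["boinccmd", "--project", project_url, "suspend"],
        ["boinccmd", "--quit"],
    ]


def zip_folder(src, dest_zip):
    """Zip every file under src into dest_zip; return the file count.

    dest_zip should live outside src so the archive does not include itself.
    """
    src = Path(src)
    count = 0
    with zipfile.ZipFile(dest_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(src).as_posix())
                count += 1
    return count


def backup(data_dir, dest_zip, wait_seconds=30.0):
    """Suspend the project, quit BOINC, pause, then zip data_dir."""
    for cmd in boinc_commands():
        subprocess.run(cmd, check=True)
    # Assumption: a fixed pause is long enough for the client to exit and
    # release its file locks; a real script may need a sturdier check.
    time.sleep(wait_seconds)
    return zip_folder(data_dir, dest_zip)
```

Calling `backup(r"C:\path\to\BOINC", r"D:\backups\boinc.zip")` (both paths are placeholders) would run the sequence, after which BOINC can be restarted with boincmgr as in the post. Zipping only after the client has exited follows Mo's advice above that a backup taken while BOINC is still running may not restore cleanly.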