Message boards : Number crunching : Computer wasting multiple models
Author | Message |
---|---|
Send message Joined: 4 Oct 09 Posts: 73 Credit: 7,242,427 RAC: 0 |
This newcomer may need assistance. Started on the 21st but has lost the first 4 models after just a few trickles. All with the general exit code 1. Computer Id = 1095430 |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
It's a laptop, and I think that error 1 is: turned off the computer without first exiting from BOINC. Possibly: closed the lid and hibernated. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Here's what Jorden says. Are these explanations possible when the computer doesn't appear to have a CUDA card? I don't think they're relevant in this case. Cpdn news |
Send message Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
I've had a few exit code 1's in my time, but none for at least 2 years. Every one happened when the controller process stopped and left an orphaned worker process running. The error happened when the controller was restarted and tried to start a second worker. "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer |
Send message Joined: 4 Oct 09 Posts: 73 Credit: 7,242,427 RAC: 0 |
"It's a laptop, and I think that error 1 is: turned off the computer without first exiting from BOINC. Possibly: closed the lid and hibernated." Maybe depends on hibernation settings. Just tested closing the lid. No issue (Win XP). But pulling the plug would almost certainly mess up the models. Noted that the member appears to be running Malaria in rotation (being a single core CPU) and has also lost WUs there with exit 1. Did complete two WUs though! Times of last trickle for each of the 4 crashed CPDN tasks are at random times during the day. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
In a future News post I'll add a reminder about exiting from Boinc before shutting down a computer. I suspect that hundreds of people don't exit first. Usually a shutdown without a Boinc exit leaves the model intact but sooner or later one will crash. I don't think we can do anything about this person at the moment as he needs time to notice the crashes and take action. But in a month or so we should check what's happening on this computer and if necessary ask Milo to send him the email. Cpdn news |
Send message Joined: 4 Oct 09 Posts: 73 Credit: 7,242,427 RAC: 0 |
This member is in our team - we like to see newcomers getting smoothly off the mark! When posting the computer id earlier, had forgotten that our leader gets email addresses. Subsequently the member has now been emailed with the BOINC exit warning as a first suggestion. Hopefully we shall get a response from the newcomer soon but will keep monitoring anyway. Will update this thread if we can help the member resolve the issue. Meantime, thanks anyway for your and the team's support! |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Tell him how to exit completely as it would be very easy to think that closing the Boinc manager is enough. Belonging to an active and keen team is of great help to people. Cpdn news |
Send message Joined: 9 Jan 07 Posts: 497 Credit: 342,899 RAC: 0 |
It might also be worth reminding people that it's a good idea to click on "No new tasks" when you've already got a task running (or as many tasks as your computer can cope with). When your task is completed and has reported, you can click on "Allow new tasks" to get another one. By the way, when the server was down recently and I wanted to suspend network activity till it was working again, I couldn't remember how to do that and had quite a hard time finding out! It's in the "Activity" tab in BOINC Manager (which seemed obvious once I had found it). Visit the Scotland team |
Send message Joined: 9 Jan 07 Posts: 497 Credit: 342,899 RAC: 0 |
mo.v wrote: ... Yes indeed - sometimes new people are hesitant about posting in open forums (I was myself, hard to believe now folks, isn't it!). But, if you're in a team, you can always email, or send a private message to, the team "founder". Or post in the team's own forum, if it has one. And NOBODY should worry that they might ask a "silly" question and look foolish - we've all done that too, and CPDN people are very understanding. Visit the Scotland team |
Send message Joined: 5 Aug 04 Posts: 250 Credit: 93,274 RAC: 0 |
I was checking the computers I was paired against. Found 3 bad ones. Someone please check hostID 565904. Nice that it's an anonymous host, but it's wasting models by the hundreds. As is hostID 941450. And hostID 1021107. Jord. |
Send message Joined: 5 Aug 04 Posts: 250 Credit: 93,274 RAC: 0 |
Whow, more.. hostID 975193 (340 bad) hostID 1087147 is dubious. What is this person doing? hostID 843760 keeps wasting them. hostID 866343 needs a slap on the fingers. hostID 1026857 does as well. Jord. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
We'll have to leave #1087147 at least for the time being. The tasks may have got the detached designation by the owner restoring a Boinc Data folder backup; all the detached models can be crunched. All the others in Ageless's two posts need the email and to be minussed though. Cpdn news |
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
Done. |
Send message Joined: 5 Feb 05 Posts: 17 Credit: 1,582,791 RAC: 0 |
I had some time so I took another look at some of the WUs. None of these computers has a successfully completed task listed: 975228, 1063016, 1080305. Last successfully completed a task on 13 Jan 2010: 644869. The last one I'm not so sure about: 1056295. This computer was only created in February and hasn't had a successfully completed task yet. The tasks I've looked into failed with exit status -2 (Could not launch model process. Last Error=193). However it belongs to a member who has more than 20 active computers and a high RAC. His other computers appear to be running smoothly. Maybe a simple info message about that particular computer would be enough in this case. Starfire |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
I agree completely about the first 4. Email and a -1 quota. I think it would be better not to send the email to the owner of the last computer though as he hasn't downloaded any new models for two months. He has probably realised that there's a problem with this machine. As you say, he has lots of other problem-free computers to use for CPDN. Cpdn news |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
|
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
The most recent ones are done - apologies for the delay. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
|
Send message Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
All done, thanks. |
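
For anyone who prefers the command line to BOINC Manager, the three tips from earlier in the thread (exit BOINC completely before shutting down, set "No new tasks", suspend network activity) can be sketched with `boinccmd`. This is only a rough sketch: it assumes `boinccmd` is installed and on the PATH, and that the client accepts local commands. The helper names and the example project URL are made up for illustration; use the project URL shown in your own BOINC Manager. The functions only build the command lines and do not run anything.

```python
from typing import List, Optional

# Assumption: the boinccmd tool that ships with BOINC is on the PATH.
BOINCCMD = "boinccmd"


def quit_client() -> List[str]:
    """Command that asks the BOINC client itself to exit (not just the Manager)."""
    return [BOINCCMD, "--quit"]


def no_new_tasks(project_url: str) -> List[str]:
    """Command equivalent to clicking "No new tasks" for one project."""
    return [BOINCCMD, "--project", project_url, "nomorework"]


def allow_new_tasks(project_url: str) -> List[str]:
    """Command equivalent to clicking "Allow new tasks" for one project."""
    return [BOINCCMD, "--project", project_url, "allowmorework"]


def set_network_mode(mode: str, duration_s: Optional[int] = None) -> List[str]:
    """Command that sets network activity: 'always', 'auto' or 'never'.

    duration_s is optional; if given, the mode applies for that many seconds.
    """
    if mode not in ("always", "auto", "never"):
        raise ValueError("mode must be 'always', 'auto' or 'never'")
    cmd = [BOINCCMD, "--set_network_mode", mode]
    if duration_s is not None:
        cmd.append(str(duration_s))
    return cmd


if __name__ == "__main__":
    # Hypothetical project URL, for illustration only.
    url = "http://example.org/project/"
    for cmd in (no_new_tasks(url), set_network_mode("never"), quit_client()):
        print(" ".join(cmd))
```

To actually run one of these, pass the list to `subprocess.run(cmd, check=True)`. Running the quit command before shutting the computer down is the command-line version of exiting BOINC properly rather than just closing the Manager window.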
©2024 cpdn.org