Message boards : Number crunching : Error while computing
Author | Message |
---|---|
Send message Joined: 15 May 09 Posts: 4538 Credit: 19,008,987 RAC: 21,524 |
I found it was necessary to use the most recent development Wine versions, the earlier releases didn't work on my system. (Using Ubuntu 15.04/15.10 on stock hardware.) Interesting. I am also running Ubuntu 15.10 on pretty stock hardware, and Wine 1.6.2, the standard offering, seems to work just fine with no messing about with settings needed whatsoever. Edit: I am running the latest BOINC. Don't know if that makes a difference? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
My Haswell is running Mint 16, Wine 1.6.1, and BOINC 7.6.9. The Windows version is XP Pro, as it's the one that I'm most familiar with. The hard parts were finding where the Wine stuff was installed, and remembering to right-click and install with the Wine option, rather than the usual double left-click. Everything else was a bit of a letdown, as it all was "just there". The only 2 failures have been due to model problems. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
jrapdx, I only looked at one of your failed models, and that had a long list of Suspends in it, indicating that your option for "Suspend work if CPU usage is above" is probably set at the default of 25%, which isn't a good idea with climate models. I'm also starting to get suspicious of how Windows 10 interacts with BOINC and the tasks it runs. Although that may not matter with Wine, as it's a rebuild and not the Real Deal. |
Send message Joined: 9 Sep 04 Posts: 228 Credit: 30,750,791 RAC: 3,898 |
After two minutes, two WUs crashed. This is the stderr: <core_client_version>7.6.22</core_client_version> <![CDATA[ <stderr_txt> Signal 11 received, exiting... 16:16:49 (39024): called boinc_finish(193) Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=36120, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=39024, selfPID=40312, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 16:16:53 (40312): called boinc_finish(0) |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
Batch 341 was a misconfigured batch. So the three recent failures from that PC have nothing to do with your PC; they come from that week-old bad batch. |
Send message Joined: 4 Jul 15 Posts: 63 Credit: 3,223,760 RAC: 0 |
"Suspend work if CPU usage is above" is probably set at the default of 25%, which isn't a good idea with climate models. It was set to 25%. On my other computer, a setting of 60% seems OK, and I reset this one to 60% too. However, the computer wasn't being used for much except Wine/BOINC, without which CPU usage was well below 25%. I doubt the occasional OS activity exceeded 25%, but there's no harm in using the higher setting. Yesterday one task completed (yay!), but subsequent downloads errored out with the message "couldn't start app: CreateProcess() failed - Internal error.(0x54f)". Around that time I had trouble getting Wine to run (after a system reboot), which probably accounts for these errors. Wine does seem unreliable on my system, perhaps due to configuration issues, but I haven't found anything notable. I've considered deleting and reinstalling Wine (and BOINC), but hesitate because of losing the CPDN work underway. Maybe there's a way to save and resume it, but I haven't dug into the question yet. It could be coincidental, but all the failures under Wine have been with wah2 tasks. However, on the positive side, two wah2 tasks are still running and with any luck will complete successfully. |
Send message Joined: 4 Jul 15 Posts: 63 Credit: 3,223,760 RAC: 0 |
Terribly sorry about the multiple posts. The browser kept timing out and I didn't know the (partial) messages were sent. Maybe a moderator could delete all but the last one, I'd appreciate it. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Done. My trick with the "spinning wheel" is to open a new window and look at the forum. If the post made it, then I cancel the original post. (It seems that it's mostly the reply to the poster that gets held up.) The present problem is a firewall issue, which looks like dragging on for a while, while whoever supplies and installs the hardware does whatever needs to be done. And I have oodles of zips to upload. Sigh. |
Send message Joined: 4 Jul 15 Posts: 63 Credit: 3,223,760 RAC: 0 |
Thank you! I've tried the trick of opening another tab to load the URL, but it doesn't always work. I mean the indicator will be spinning on that tab instead of this one. The info about the firewall is interesting, and I can certainly relate: it's sometimes hard getting things to work like we believe they ought to. Anyway, it sheds some light on how slow CPDN has been lately, which is obviously the problem I was recently having... |
Send message Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
However I think it's worth pointing out that BOINC/CPDN under Wine is not all a bed of roses. I've experienced numerous "error while computing" task failures, some of which are likely attributable to Wine-related interruptions. Wine itself can be tricky to set up, I am still working on getting boincmgr.exe to start correctly when the computer unexpectedly reboots (as we are subject to random power failures here). While this might actually belong under the Wine discussion, I noticed that your Wine computer (ID: 1389186) is reported as Windows 10, and since Windows 10 is new, that might be a cause of your Wine troubles. Perhaps it would be better to choose a different Windows version for Wine to present to applications. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Bernard, according to the Wiki article on Wine: "It duplicates functions of Windows by providing alternative implementations of the DLLs that Windows programs call, and a process to substitute for the Windows NT kernel. This method of duplication differs from other methods that might also be considered emulation, where Windows programs run in a virtual machine. Wine is predominantly written using black-box testing reverse-engineering, to avoid copyright issues." As none of the many versions of "Wine Windows" available are Microsoft Windows, I don't think that any comparison of problems can be made. Although this IS about computers, so who knows. |
Send message Joined: 4 Jul 15 Posts: 63 Credit: 3,223,760 RAC: 0 |
It's not really clear what the Windows version means for Wine. I've looked at the documentation, but it's difficult to sort out how it affects the execution of apps like boinc*. I could change it from Win10, but my hunch is it won't matter. At this point BOINC/CPDN are running OK. Today a task completed, and after a new one started 4 tasks are going, so I'm inclined to leave things alone for now. The main problem I've had with Wine is reliably starting BOINC. I was trying to set it up so that if the computer reboots (it's prone to random power outages in this location), BOINC would be automatically restarted. However, despite various attempts with shell scripts, etc., it hasn't worked. The only way that does work requires manually changing to the BOINC program directory and starting BOINC in a terminal. I need to learn more about the intricacies of Wine; I've hardly used it up to now. BTW, with the Ubuntu PPA the Wine version is now 1.9.4; I updated it yesterday. |
Send message Joined: 5 Jul 09 Posts: 63 Credit: 6,091,274 RAC: 0 |
Just lost 2 tasks, both afr50, one yesterday and one today, but both in the final 15 min of processing. A Windows error message appeared on screen saying that a program had failed, one related to the file. I closed down BOINC before dismissing the message; on restarting BOINC the message reappeared and then the task failed with a computation error. http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=19799983 http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=19810807 Kevin |
Send message Joined: 5 Jul 09 Posts: 63 Credit: 6,091,274 RAC: 0 |
Just lost 2 tasks, both afr50, one yesterday one today, but both in the final 15 min of processing. And another two. http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=19811590 http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=19813607 The error message I am getting is "hadam3p_afr_7.22_windows_intelx86.exe has stopped working". ATM the error message is still on the screen, and the last work unit is showing in BOINC as 100% completed with "---" time remaining, but it is still running. I have now shut down BOINC Manager, waited until all BOINC programs had closed in Task Manager, and then restarted BOINC. The error message reappeared after a few seconds and the work unit has restarted at 99.723 with 11 min remaining. Any ideas anyone?? Kevin |
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
Abort |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
Kevin wrote: And another two. No ideas, but same error as one of your models (AFR): <core_client_version>7.6.22</core_client_version> <![CDATA[ <message> (unknown error) - exit code -1073740791 (0xc0000409) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Leaving CPDN_Main::Monitor... </stderr_txt> ]]> |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
I've had a couple of pop-ups about Windows having a problem. I could still get at the BOINC menus though, so I left the message alone, and uploaded all of the files, THEN clicked on the message. I forget what happened then. Another message? This was on the Linux machine running Wine. |
Send message Joined: 5 Jul 09 Posts: 63 Credit: 6,091,274 RAC: 0 |
Oh well, I messed up. I tried to delete a couple of exe files within the BOINC CPDN folder and it blew out the rest of the WUs, so I have removed CPDN and then added it back. I have picked up one afr and have set NNT; I will see how this processes. They normally run for three days on this machine. Kevin |
Send message Joined: 22 Mar 06 Posts: 144 Credit: 24,695,428 RAC: 0 |
I've had a couple of pop-ups about Windows having a problem. Funny, I had some but ignored them as there was nothing in the event log at the time. I've never seen pop-ups from CPDN/BOINC before. Over the last few days I had two hadam3p_afr50 tasks go down with the same stderr message, "The extended attributes are inconsistent. (0xff) - exit code 255 (0xff)", e.g. task 19819095. Any thoughts? Sounds like a model-related issue to me. I also had a wah2 go down with "The system cannot find the drive specified. (0xf) - exit code 15 (0xf)", which of course could be PC related. Task 19795175 |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
Also had a wah2 go down with "The system cannot find the drive specified. (0xf) - exit code 15 (0xf)", which of course could be PC related. Task I have seen the same thing, when three tasks failed upon a reboot. Two of them had that error message: http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=19801173 http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=19806053 The third one had a different error message: http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=19808034 In fact, a fourth one failed earlier, but had already been reported about 35 minutes before the reboot. However, it did not show up in BoincTasks History, so there may have been something strange about it. http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=19810870 I am not sure of cause and effect, but I have rebooted a couple of times since without any problems, so it seems to be somehow connected to the work units themselves. I can't figure it out beyond that. |
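
For anyone trying to read the exit codes quoted in this thread, here is a minimal Python sketch of how they relate to each other. BOINC prints the exit code as a signed 32-bit number alongside its hex form, so -1073740791 and 0xc0000409 are the same value. The symbolic names in the table are the standard Windows SDK names for those numbers; the helper names `to_unsigned32` and `describe` are made up for this illustration. Note the assumption here: the quoted text (for example "The extended attributes are inconsistent") may simply be the Windows message that happens to match the number the model exited with, so it is not necessarily the real cause of the failure.

```python
# Windows codes quoted in this thread, keyed by their unsigned 32-bit value.
# Names are the standard Windows SDK constants for these numbers.
KNOWN_CODES = {
    0x0000000F: "ERROR_INVALID_DRIVE",          # "cannot find the drive specified"
    0x000000FF: "ERROR_EA_LIST_INCONSISTENT",   # "extended attributes are inconsistent"
    0x0000054F: "ERROR_INTERNAL_ERROR",         # CreateProcess() "Internal error"
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",  # NTSTATUS seen on the afr tasks
}


def to_unsigned32(exit_code: int) -> int:
    """Reinterpret a (possibly negative) exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF


def describe(exit_code: int) -> str:
    """Format an exit code the way the stderr listings show it, plus a name."""
    value = to_unsigned32(exit_code)
    name = KNOWN_CODES.get(value, "unknown")
    return f"{exit_code} -> 0x{value:08x} ({name})"


if __name__ == "__main__":
    # Hypothetical hadam3p/wah2 results would be looked up the same way.
    for code in (-1073740791, 255, 15, 0x54F):
        print(describe(code))
```

Looking a code up this way only tells you what Windows calls that number; whether the model, Wine, or the PC is at fault still has to be judged from the rest of the stderr output.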
©2024 cpdn.org