Message boards : Number crunching : No Tasks Available
Author | Message |
---|---|
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
... The only quibble is that, unless these are somehow shorter than previous hadam3p_eu models, the estimated time to completion of only 78 hours is a bit short for that machine; 100 hours is more likely. This will probably self-correct after the first WUs finish. The beta versions of these models ran at the same speed as previous EU models, on my machine at least. |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 30,984,181 RAC: 14,575 |
Current speed on my machine is approx 1.83sec/time step. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,011,472 RAC: 21,368 |
Wonder what caused the errors on mine. Daily quota reached so I won't be able to check again till tomorrow by which time WCG task will be almost finished and current pnw will be less than a day to go. |
Send message Joined: 31 Oct 04 Posts: 336 Credit: 3,316,482 RAC: 0 |
One ghost WU at 15:28, one arrived properly at 16:14; both WUs are brand new, generated today. Ghost WUs are usually a sign of server or network overload, which could explain temporary HTTP errors. A permanent HTTP error usually means that the file does not actually exist on the server, or has insufficient access permissions for web users, so this is usually not a client-side or communication problem. It might be bad timing, if the files arrived _after_ the scheduler knew about the fresh results. P.S.: Just in theory, another possible reason for such a permanent download error would be that the download server's IP was cached by your BOINC client some time ago, but in the meantime the IP has changed and the old IP points to a still-existing web server. In this case only a restart of the BOINC client would help. |
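The stale-IP theory in the post above can be checked by comparing an address that was seen earlier with what the hostname resolves to now. The following is only a minimal sketch of that comparison: the function names, the example hostname and the example addresses are all hypothetical, and it does not inspect anything inside the BOINC client itself (whether and how the client caches addresses is the poster's assumption, not something confirmed here).

```python
import socket


def current_ips(hostname, resolver=socket.getaddrinfo):
    """Return the set of IP addresses the hostname resolves to right now."""
    infos = resolver(hostname, 80, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return {info[4][0] for info in infos}


def cached_ip_is_stale(cached_ip, hostname, resolver=socket.getaddrinfo):
    """True if a previously seen IP is no longer among the current answers."""
    return cached_ip not in current_ips(hostname, resolver)


if __name__ == "__main__":
    # Hypothetical stand-in resolver so the example runs without network access.
    def fake_resolver(host, port, proto=0):
        return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.10", 80))]

    print(cached_ip_is_stale("192.0.2.99", "downloads.example.invalid", fake_resolver))
    print(cached_ip_is_stale("192.0.2.10", "downloads.example.invalid", fake_resolver))
```

If the old address is no longer returned, restarting the client, as suggested in the post, is the simplest way to force a fresh lookup.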
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
I am getting download errors on 9 hadam3p_eu tasks on my second-fastest machine. Downloads to the fastest machine went OK. A permanent HTTP error seems to be the problem. Messages as follows:
3/3/2014 9:10:12 AM | climateprediction.net | Sending scheduler request: Requested by user.
3/3/2014 9:10:12 AM | climateprediction.net | Requesting new tasks for CPU
3/3/2014 9:10:14 AM | climateprediction.net | Scheduler request completed: got 0 new tasks
3/3/2014 9:10:14 AM | climateprediction.net | Not sending work - last request too recent: 83 sec
3/3/2014 10:10:54 AM | climateprediction.net | Sending scheduler request: To fetch work.
3/3/2014 10:10:54 AM | climateprediction.net | Requesting new tasks for CPU
3/3/2014 10:11:47 AM | climateprediction.net | Scheduler request failed: HTTP gateway timeout
3/3/2014 10:13:18 AM | climateprediction.net | Sending scheduler request: To fetch work.
3/3/2014 10:13:18 AM | climateprediction.net | Requesting new tasks for CPU
3/3/2014 10:13:21 AM | climateprediction.net | Scheduler request completed: got 0 new tasks
3/3/2014 10:13:21 AM | climateprediction.net | Not sending work - last request too recent: 144 sec
3/3/2014 11:14:01 AM | climateprediction.net | Sending scheduler request: To fetch work.
3/3/2014 11:14:01 AM | climateprediction.net | Requesting new tasks for CPU
3/3/2014 11:14:05 AM | climateprediction.net | Scheduler request completed: got 9 new tasks
3/3/2014 11:14:07 AM | climateprediction.net | Started download of hadam3p_eu_l1ka_2013_1_008537577.zip
3/3/2014 11:14:07 AM | climateprediction.net | Started download of o3_n96_pers_1959_1999_2020.gz
3/3/2014 11:14:09 AM | climateprediction.net | Giving up on download of hadam3p_eu_l1ka_2013_1_008537577.zip: permanent HTTP error
3/3/2014 11:14:09 AM | climateprediction.net | Giving up on download of o3_n96_pers_1959_1999_2020.gz: permanent HTTP error
3/3/2014 11:14:09 AM | climateprediction.net | Started download of ic19610406_16_N96.gz
3/3/2014 11:14:09 AM | climateprediction.net | Started download of atmos_n0nh.day.gz
3/3/2014 11:14:10 AM | climateprediction.net | Giving up on download of ic19610406_16_N96.gz: permanent HTTP error
3/3/2014 11:14:10 AM | climateprediction.net | Giving up on download of atmos_n0nh.day.gz: permanent HTTP error
3/3/2014 11:14:10 AM | climateprediction.net | Started download of so2dms_N96_2013_12_2015_02f_1900rescale.gz
3/3/2014 11:14:10 AM | climateprediction.net | Started download of region_n0nh.day.gz
3/3/2014 11:14:11 AM | climateprediction.net | Giving up on download of so2dms_N96_2013_12_2015_02f_1900rescale.gz: permanent HTTP error
3/3/2014 11:14:11 AM | climateprediction.net | Giving up on download of region_n0nh.day.gz: permanent HTTP error
3/3/2014 11:14:11 AM | climateprediction.net | Started download of OSICE_natural_2013_12_2014_12.gz
3/3/2014 11:14:11 AM | climateprediction.net | Started download of ancil_OSTIA_deltaSST_2014_HadGEM2-ES.gz
3/3/2014 11:14:12 AM | climateprediction.net | Giving up on download of OSICE_natural_2013_12_2014_12.gz: permanent HTTP error
3/3/2014 11:14:12 AM | climateprediction.net | Giving up on download of ancil_OSTIA_deltaSST_2014_HadGEM2-ES.gz: permanent HTTP error
3/3/2014 11:14:12 AM | climateprediction.net | Started download of hadam3p_eu_l1kj_2013_1_008537586.zip
3/3/2014 11:14:12 AM | climateprediction.net | Started download of ic19610624_11_N96.gz
3/3/2014 11:14:13 AM | climateprediction.net | Giving up on download of hadam3p_eu_l1kj_2013_1_008537586.zip: permanent HTTP error
3/3/2014 11:14:13 AM | climateprediction.net | Giving up on download of ic19610624_11_N96.gz: permanent HTTP error
3/3/2014 11:14:13 AM | climateprediction.net | Started download of hadam3p_eu_l1ki_2013_1_008537585.zip
3/3/2014 11:14:13 AM | climateprediction.net | Started download of ic19611008_12_N96.gz
3/3/2014 11:14:14 AM | climateprediction.net | Giving up on download of hadam3p_eu_l1ki_2013_1_008537585.zip: permanent HTTP error
3/3/2014 11:14:14 AM | climateprediction.net | Giving up on download of ic19611008_12_N96.gz: permanent HTTP error
3/3/2014 11:14:14 AM | climateprediction.net | Started download of hadam3p_eu_l1kh_2013_1_008537584.zip
3/3/2014 11:14:14 AM | climateprediction.net | Started download of ic19611222_14_N96.gz
3/3/2014 11:14:15 AM | climateprediction.net | Giving up on download of hadam3p_eu_l1kh_2013_1_008537584.zip: permanent HTTP error
3/3/2014 11:14:15 AM | climateprediction.net | Giving up on download of ic19611222_14_N96.gz: permanent HTTP error
3/3/2014 11:14:15 AM | climateprediction.net | Started download of hadam3p_eu_l1kg_2013_1_008537583.zip
3/3/2014 11:14:15 AM | climateprediction.net | Started download of ic19610314_14_N96.gz
3/3/2014 11:14:16 AM | climateprediction.net | Giving up on download of hadam3p_eu_l1kg_2013_1_008537583.zip: permanent HTTP error
3/3/2014 11:14:16 AM | climateprediction.net | Giving up on download of ic19610314_14_N96.gz: permanent HTTP error
3/3/2014 11:14:16 AM | climateprediction.net | Started download of hadam3p_eu_l1kf_2013_1_008537582.zip
3/3/2014 11:14:16 AM | climateprediction.net | Started download of ic19610803_11_N96.gz
3/3/2014 11:14:18 AM | climateprediction.net | Giving up on download of hadam3p_eu_l1kf_2013_1_008537582.zip: permanent HTTP error
3/3/2014 11:14:18 AM | climateprediction.net | Giving up on download of ic19610803_11_N96.gz: permanent HTTP error
Stderr follows:
<core_client_version>7.2.39</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error> <file_name>hadam3p_eu_l1kj_2013_1_008537586.zip</file_name> <error_code>-224 (permanent HTTP error)</error_code> <error_message>permanent HTTP error</error_message> </file_xfer_error>
<file_xfer_error> <file_name>o3_n96_pers_1959_1999_2020.gz</file_name> <error_code>-224 (permanent HTTP error)</error_code> <error_message>permanent HTTP error</error_message> </file_xfer_error>
<file_xfer_error> <file_name>ic19610624_11_N96.gz</file_name> <error_code>-224 (permanent HTTP error)</error_code> <error_message>permanent HTTP error</error_message> </file_xfer_error>
<file_xfer_error> <file_name>atmos_n0nh.day.gz</file_name> <error_code>-224 (permanent HTTP error)</error_code> <error_message>permanent HTTP error</error_message> </file_xfer_error>
<file_xfer_error> <file_name>so2dms_N96_2013_12_2015_02f_1900rescale.gz</file_name> <error_code>-224 (permanent HTTP error)</error_code> <error_message>permanent HTTP error</error_message> </file_xfer_error>
<file_xfer_error> <file_name>region_n0nh.day.gz</file_name> <error_code>-224 (permanent HTTP error)</error_code> <error_message>permanent HTTP error</error_message> </file_xfer_error>
<file_xfer_error> <file_name>OSICE_natural_2013_12_2014_12.gz</file_name> <error_code>-224 (permanent HTTP error)</error_code> <error_message>permanent HTTP error</error_message> </file_xfer_error>
<file_xfer_error> <file_name>ancil_OSTIA_deltaSST_2014_HadGEM2-ES.gz</file_name> <error_code>-224 (permanent HTTP error)</error_code> <error_message>permanent HTTP error</error_message> </file_xfer_error>
</message> |
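A pasted event log like the one above is easier to read once the failures are counted per file. The sketch below assumes only the entry layout visible in the post (`date time AM/PM | project | message`) and the literal phrase "Giving up on download of"; the function names are made up for illustration, and it works whether the entries are on separate lines or run together.

```python
import re
from collections import Counter

# Start of one event-log entry, as seen in the post:
#   3/3/2014 11:14:09 AM | climateprediction.net | <message>
ENTRY = re.compile(
    r"(\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M) \| ([^|]+?) \| "
)
GIVE_UP = re.compile(r"Giving up on download of (\S+): (.+)")


def split_entries(text):
    """Split pasted log text into (timestamp, project, message) tuples."""
    matches = list(ENTRY.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # The message runs until the next entry's timestamp (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        message = " ".join(text[m.end():end].split())
        entries.append((m.group(1), m.group(2), message))
    return entries


def failed_downloads(text):
    """Count abandoned downloads, keyed by (file name, reason)."""
    counts = Counter()
    for _, _, message in split_entries(text):
        m = GIVE_UP.match(message)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "3/3/2014 11:14:07 AM | climateprediction.net | Started download of "
        "o3_n96_pers_1959_1999_2020.gz 3/3/2014 11:14:09 AM | "
        "climateprediction.net | Giving up on download of "
        "o3_n96_pers_1959_1999_2020.gz: permanent HTTP error"
    )
    for (name, reason), n in failed_downloads(sample).items():
        print(f"{name}: {reason} x{n}")
```

Grouping by reason as well as file name keeps permanent errors apart from any other failure text the client might log.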
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,011,472 RAC: 21,368 |
Thanks Ananas, I will restart Boinc tomorrow morning and see what happens. Hopefully with the way the number of tasks available is going up there will still be some for me! |
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
Six more hadam3p_eu download errors. Same machine as before. Same error messages as before. What good is it that work is available if all the WUs fail due to a permanent HTTP error? Is this happening to others or just me? |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
JIM, some of the tasks from the work units your computer errored on have returned trickles on other PCs. None of the other computers that downloaded tasks from those work units have had download errors. So it would appear to be a problem with that BOINC installation. I'd suggest doing a project reset, but I see you are still running other cpdn tasks. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,011,472 RAC: 21,368 |
I did as Ananas suggested and have downloaded three tasks successfully since. The first of them seems to be running without problems so far. |
Send message Joined: 5 Aug 04 Posts: 127 Credit: 24,441,759 RAC: 23,771 |
Not aware of any download errors, but I had 4 models crash out with the following message:
<stderr_txt> Model crashed: INITTIME: Atmosphere basis time mismatch tmp/xaakm.pipe_dummy 2048 Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt>
The WUs are 8683247, 8683249, 8683250 and 8683251. On the same computer some of the other models had already been running for a few hours, and another model started successfully a few seconds after the 4 crashing ones. No idea if there are any other problems, since there is no way to know how many of the models have started crunching (no access from here). |
Send message Joined: 31 Dec 09 Posts: 12 Credit: 17,214 RAC: 0 |
I don't know if this has already been posted somewhere, but are there any potential problems with the hadam3p project I have to be aware of? Similar to the hadcm3n model, where you shouldn't suspend the WU while it's creating the decadal zip files. |
Send message Joined: 17 Nov 07 Posts: 142 Credit: 4,271,370 RAC: 0 |
My experience has been that the HadAM3P code is less sensitive to conditions on your computer than is the HadCM3N. The usual advice applies still: ensure that your virus checker ignores the Boinc data folder. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,011,472 RAC: 21,368 |
I would second what Greg says. Following all the same precautions will maximise your success rate. That said, it is not guaranteed, but also not unusual, for the regional models to survive a power outage, whereas I have only ever had one hadcm3n do so. |
Send message Joined: 31 Dec 09 Posts: 12 Credit: 17,214 RAC: 0 |
I see, thanks, Greg and Dave. Yes, I made sure that my virus scanner ignores the BOINC data folder. And power outages shouldn't be a problem since I'm running this on a notebook (which is almost always on the grid). |
Send message Joined: 12 Feb 08 Posts: 66 Credit: 4,877,652 RAC: 0 |
There seems to be a lot of work available, but after about 20 failed downloads (permanent HTTP error) this computer has reached its daily quota, have a nice day, come back tomorrow. Nothing to see here, move along! |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Follow the advice below: Stop BOINC and then re-start it. Perhaps even re-boot your computer while BOINC is stopped. The permanent http error is only happening to a few people, so is most likely a problem with their computer. |
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
Dear Les, I just had an additional 8 download failures. That's most likely my quota for the day. Do you still advise resetting the project? The cost to me would be the loss of one hadcm3n model that is at 52%; that's about 250 hours of crunching. I have rebooted the computer. Do you think that will be enough to clear the problem? |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
Until Les signs on -- For what it's worth, my machines (of varied longevity, Vista/W7/W8, and medieval boinc versions) downloaded 151 tasks without download error. That included intermittent periods when my DSL 'service' choked down to (speedtest.net) ping 1148 ms, download 0.25Mbps, upload 0.06Mbps (more typical numbers are 26/10.6/0.60). Despite pathetic transit times and interruptions, all survived. I think Les recommends 'reset project' only when the project's queues are empty. (In my experience, 'reset project' doesn't do a very good job of cleanup, so manual purge is tried. Typically, I'll overlook something and the server accommodates by downloading an obsolete file --> or many.) From DOS 1 days, when in doubt, 'reboot' has been good advice. Good luck. "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
As Astro says: Only when there's nothing else running. Sorry, but this is so basic that I assumed that everyone knows this, and will make allowances for whatever they have running. I don't think that I've ever had to do a Reset, so I don't know how effective it is. Usually a re-boot works. Until you finish the long model, perhaps you shouldn't try to get any of the short models. Or wait until the hadcm3 is into the next year, is well away from the checkpoint, gently shut down, and then re-boot. As Clint Eastwood said once, Are you feeling lucky? :) However, looking at that model, you're the last hope for completing it, and you're way past where everyone else got to. Me, I'd stick with the long model and ignore the short ones. There's bound to be some more later this century. :) |
Send message Joined: 5 Aug 04 Posts: 127 Credit: 24,441,759 RAC: 23,771 |
"The permanent http error is only happening to a few people, so is most likely a problem with their computer." Well, taking a look at the WUs I've downloaded: while I've not had any download errors myself, the current results are: 90 WUs downloaded, of these: 38 error-free (at least for now); 39 WUs with download errors; 21 WUs with computing errors; 48 total download errors; 27 total computing errors; 3 WUs errored out due to too many errors. 43% of the WUs having download errors is in my opinion too high, so even if only a "few" users have problems, they're managing to generate lots of errors. Since at least some of these users seem to have no problems crunching other BOINC projects, it's a little strange if there's a problem with their computers. Now, I've not checked every download error, but at least the ones I checked were from users running BOINC version 7.2.39 or 7.2.42. Whether this indicates a problem with current BOINC clients or with CPDN's server setup I've no idea; it could also just be that the errors I didn't check are from different BOINC versions. BTW, apart from all the download errors, 23% of WUs generating at least one computing error seems on the high side to me. |
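The percentages quoted in the post above follow directly from the counts given. The short sketch below just redoes that arithmetic. The function name is made up, and the overlap figure rests on an assumption not stated in the post: that every one of the 90 WUs is either error-free or has at least one download or computing error.

```python
def error_shares(total, error_free, with_download, with_computing):
    """Recompute the shares quoted in the post from the raw WU counts."""
    with_any_error = total - error_free
    return {
        "download_pct": 100 * with_download / total,
        "computing_pct": 100 * with_computing / total,
        "with_any_error": with_any_error,
        # Assumption: WUs counted in both error categories explain why
        # 38 + 39 + 21 adds up to more than 90.
        "with_both_kinds": with_download + with_computing - with_any_error,
    }


if __name__ == "__main__":
    shares = error_shares(total=90, error_free=38, with_download=39, with_computing=21)
    print(f"download errors:  {shares['download_pct']:.1f}% of WUs")
    print(f"computing errors: {shares['computing_pct']:.1f}% of WUs")
    print(f"WUs with any error: {shares['with_any_error']}")
    print(f"WUs with both kinds (under the stated assumption): {shares['with_both_kinds']}")
```

This gives 43.3% and 23.3%, matching the rounded 43% and 23% in the post; under the stated assumption, 8 WUs would have seen both kinds of error.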
©2024 cpdn.org