ANZ model upload problems
Author | Message |
---|---|
Send message Joined: 31 Aug 04 Posts: 391 Credit: 219,896,461 RAC: 649 |
Error reported by file upload server: [hadam3p_anz_a48n_2012_1_008561644_0_6.zip] locked by file_upload_handler PID=-1 Of the 3 completed ANZ models on this host, all 3 are reporting this error on 1 or 2 upload files. All files are going to <upload_url>http://rwah0.rdsi.tpac.org.au/cgi-bin/file_upload_handler</upload_url>. Of the 39 upload files from these 3 ANZ models, 5 are stuck with this ongoing error. All the rest uploaded OK. No pattern as to which files fail upload by sequence number. Might be that the upload handler fails to recover after a transient error on a particular file? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
I'll email the project. They'll need to get this sorted fast. Hmmm. 3.14 am there, so it's going to be a long wait. |
Send message Joined: 22 Mar 06 Posts: 144 Credit: 24,695,428 RAC: 0 |
I had 6 complete OK on the 24th. |
Send message Joined: 28 Mar 11 Posts: 35 Credit: 82,588 RAC: 0 |
This appears to be an NFS file locking issue. It currently affects about 1% of the files that have uploaded. The solution would be to disable file locking on the NFS-mounted storage device on the ANZ server, but I am not yet sure of the implications this would have. I am guessing the effect would be minimal, but I am checking with the server's admin. Jonathan CPDN-sysadmin |
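For readers wondering what a "locked by" error means in general, here is a minimal Python sketch of non-blocking advisory file locking. It is not the project's upload handler: `try_lock` and `unlock` are hypothetical helper names, the sketch is Unix-only, and it uses `flock` on a local temporary file purely for illustration. On an NFS mount, lock requests are typically passed on to the file server, so a lock can fail or linger for reasons unrelated to another process actually working on the file.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive, non-blocking advisory lock on path.

    Returns the open file descriptor on success, or None if the lock is
    already held through another open file description.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Someone else holds the lock; report that instead of waiting.
        os.close(fd)
        return None
    return fd


def unlock(fd):
    """Release the lock taken by try_lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "upload.zip")
    holder = try_lock(demo_path)
    print("second attempt while held:", try_lock(demo_path))
    unlock(holder)
```

If the holder never releases the lock (for example because its connection dropped mid-transfer), every later attempt keeps failing until the lock is cleared, which matches the kind of symptom reported in this thread.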
Send message Joined: 11 Dec 05 Posts: 5 Credit: 714,983 RAC: 0 |
3/26/2014 2:19:44 PM | climateprediction.net | Started upload of hadam3p_anz_a4qy_2012_1_008562303_0_1.zip
3/26/2014 2:19:46 PM | climateprediction.net | [error] Error reported by file upload server: [hadam3p_anz_a4qx_2012_1_008562302_0_7.zip] locked by file_upload_handler PID=-1
3/26/2014 2:19:46 PM | climateprediction.net | Temporarily failed upload of hadam3p_anz_a4qx_2012_1_008562302_0_7.zip: transient upload error
3/26/2014 2:19:46 PM | climateprediction.net | Backing off 00:18:18 on upload of hadam3p_anz_a4qx_2012_1_008562302_0_7.zip
3/26/2014 2:19:58 PM | climateprediction.net | [error] Error reported by file upload server: [hadam3p_anz_a4qy_2012_1_008562303_0_1.zip] locked by file_upload_handler PID=-1
3/26/2014 2:19:58 PM | climateprediction.net | Temporarily failed upload of hadam3p_anz_a4qy_2012_1_008562303_0_1.zip: transient upload error
3/26/2014 2:19:58 PM | climateprediction.net | Backing off 04:03:14 on upload of hadam3p_anz_a4qy_2012_1_008562303_0_1.zip |
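When several files are stuck, it can help to pull the affected file names and their latest back-off out of the event log. The following is a rough Python sketch that assumes the `date | project | text` line layout seen in the log excerpt above. The function name `stuck_uploads` and both regular expressions are illustrative only and are not part of BOINC.

```python
import re

# Matches the "locked by" error text and captures the file name in brackets.
LOCKED_RE = re.compile(
    r"Error reported by file upload server: \[(?P<name>[^\]]+)\] locked by"
)
# Matches the back-off notice and captures the delay and the file name.
BACKOFF_RE = re.compile(
    r"Backing off (?P<delay>\d+:\d{2}:\d{2}) on upload of (?P<name>\S+)"
)


def stuck_uploads(lines):
    """Map each file reported as locked to its latest back-off (or None)."""
    stuck = {}
    for line in lines:
        # Keep only the message text after the date and project columns.
        text = line.split(" | ", 2)[-1]
        match = LOCKED_RE.search(text)
        if match:
            stuck.setdefault(match.group("name"), None)
            continue
        match = BACKOFF_RE.search(text)
        if match and match.group("name") in stuck:
            stuck[match.group("name")] = match.group("delay")
    return stuck
```

Feeding it the lines of a saved event log returns a dictionary keyed by file name, which makes it easier to report exactly which uploads are affected.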
Send message Joined: 11 Dec 05 Posts: 5 Credit: 714,983 RAC: 0 |
Can you check whether a server is down, and whether that relates to my previous message, please? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
The preceding 2 posts have been moved here from a thread in the Science section. |
Send message Joined: 28 Mar 09 Posts: 126 Credit: 9,825,980 RAC: 0 |
I also have 2 stuck with the same error, reported from different machines: 526 climateprediction.net 27-03-2014 04:39 PM [error] Error reported by file upload server: [hadam3p_anz_n7dq_2012_1_008583254_0_1.zip] locked by file_upload_handler PID=-1 This one got a "transient upload error" at 05:17 (UTC + 11 hours) and has been getting the locked file error since 05:30. And 604 climateprediction.net 27-03-2014 07:30 AM [error] Error reported by file upload server: [hadam3p_anz_n7ot_2012_1_008583653_1_1.zip] locked by file_upload_handler PID=-1 This one appeared after a "transient upload error" at 07:26 (UTC + 11), and from 07:30 it's been getting the locked file error. It looks like a common theme: it gets an upload error and then the file isn't getting released. Running BOINC 7.2.42 on one and 7.3.11 on the other. BOINC blog |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
One of mine is now past 20 minutes of back off. |
Send message Joined: 4 Mar 14 Posts: 5 Credit: 1,459,572 RAC: 0 |
Just to add my voice - getting the same. I did start another thread (sorry), but Lee pointed me to this one. |
Send message Joined: 4 Mar 14 Posts: 5 Credit: 1,459,572 RAC: 0 |
And now, mine has just uploaded (14:00 GMT). Thanks to whoever did whatever was necessary to fix it :) |
Send message Joined: 31 Oct 04 Posts: 336 Credit: 3,316,482 RAC: 0 |
With the large upload files and the high server load here, broken uploads can easily happen: here on the server, at the ISP, or who knows where else those bytes can disappear on their way (maybe the NSA eats some too). The timeout of the upload handler seems to be somewhat longer than the retry delay of the BOINC core client. I've had it a few times lately too; it always fixed itself after some time. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Jonathan re-booted the server some hours ago, and also removed the file locking. There are still some transient failures, but that will be due to the large influx of data. |
Send message Joined: 31 Aug 04 Posts: 391 Credit: 219,896,461 RAC: 649 |
All uploads from here finished ok, including one that was stuck for several days. Thanks Jonathan. |
Send message Joined: 17 Aug 04 Posts: 289 Credit: 44,103,664 RAC: 0 |
Les, thank you very much for that information :) |
Send message Joined: 28 Mar 09 Posts: 126 Credit: 9,825,980 RAC: 0 |
After the file locks disappeared I still had transient upload failures. I had to dig out the old proxy server and hook it up to the dial-up to clear them. Personally I think it's an issue with my ISP and they don't have a clue. Anyway, all files cleared as of 2 hours ago. Work units progressing, and I added 2 more machines to help out. BOINC blog |
Send message Joined: 18 Feb 06 Posts: 73 Credit: 63,054,745 RAC: 28,792 |
Hello, as you can see below, none of my anz files will upload, while eu and pnw files do! At the moment only anz models are running on my 2 machines. Shall I carry on and hope they upload in the future, or should I change something, or abort and wait for better times? Thanks
29/03/2014 10:22:56 | climateprediction.net | Started upload of hadam3p_eu_e1nf_2013_1_008547582_1_7.zip
29/03/2014 10:33:23 | climateprediction.net | Finished upload of hadam3p_eu_e1nf_2013_1_008547582_1_7.zip
29/03/2014 11:27:55 | climateprediction.net | Started upload of hadam3p_anz_nb4h_2012_1_008588105_0_1.zip
29/03/2014 11:28:36 | climateprediction.net | Temporarily failed upload of hadam3p_anz_nb4h_2012_1_008588105_0_1.zip: transient HTTP error
29/03/2014 11:28:36 | climateprediction.net | Backing off 05:24:11 on upload of hadam3p_anz_nb4h_2012_1_008588105_0_1.zip
29/03/2014 11:28:39 | | Project communication failed: attempting access to reference site
29/03/2014 11:28:41 | | Internet access OK - project servers may be temporarily down.
29/03/2014 11:41:42 | climateprediction.net | Started upload of hadam3p_anz_a46k_2012_1_008561569_0_2.zip
29/03/2014 11:42:21 | climateprediction.net | Temporarily failed upload of hadam3p_anz_a46k_2012_1_008561569_0_2.zip: transient HTTP error
29/03/2014 11:42:21 | climateprediction.net | Backing off 05:51:47 on upload of hadam3p_anz_a46k_2012_1_008561569_0_2.zip
29/03/2014 11:42:35 | | Project communication failed: attempting access to reference site
29/03/2014 11:42:38 | | Internet access OK - project servers may be temporarily down.
29/03/2014 13:40:01 | climateprediction.net | Started upload of hadam3p_anz_nb4i_2012_1_008588106_0_1.zip
29/03/2014 13:41:00 | climateprediction.net | Temporarily failed upload of hadam3p_anz_nb4i_2012_1_008588106_0_1.zip: transient HTTP error
29/03/2014 13:41:00 | climateprediction.net | Backing off 03:08:42 on upload of hadam3p_anz_nb4i_2012_1_008588106_0_1.zip
29/03/2014 13:41:14 | | Project communication failed: attempting access to reference site
29/03/2014 13:41:17 | | Internet access OK - project servers may be temporarily down. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Please don't abort any of the models. The files will very probably upload eventually and everything you've crunched will be of use to the project. The uploads will retry automatically after specified backoff delays and the files will come to no harm waiting to upload. Some people sometimes have files waiting to upload for more than a week or two. Having files waiting to upload shouldn't prevent your computer from receiving new work. Some of these delays can be caused by the large number of files waiting in queue to upload and the server can't handle the simultaneous volume. Cpdn news |
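As a rough illustration of why automatic retries spread out over hours instead of hammering a busy server, here is a generic randomized exponential back-off in Python. It sketches the general technique only and is not BOINC's actual retry code; the function name `backoff_seconds`, the 60-second base and the 6-hour cap are assumptions picked for the example.

```python
import random


def backoff_seconds(failures, base=60.0, cap=6 * 3600.0, rng=random):
    """Return a randomized delay after `failures` consecutive failures.

    The upper bound doubles with each failure until it reaches `cap`.
    The delay is drawn uniformly between half the bound and the bound,
    so that many clients do not all retry at the same moment.
    """
    if failures < 1:
        raise ValueError("failures must be at least 1")
    # Clamp the exponent so very long failure streaks cannot overflow.
    exponent = min(failures - 1, 32)
    bound = min(cap, base * 2 ** exponent)
    return rng.uniform(bound / 2, bound)
```

With these assumed numbers the first retry comes after 30 to 60 seconds, while a file that has failed many times waits between 3 and 6 hours, which is the same order of magnitude as the back-off values quoted in the logs in this thread.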
Send message Joined: 31 Aug 04 Posts: 391 Credit: 219,896,461 RAC: 649 |
Upload speed here ranges from dialup speed to full available bandwidth. Sometimes have multiple transient http errors, most uploads no problems. Could be server sometimes overloaded, could be ISP or local problem. Last few weeks, all uploads get there sooner or later. If other than "transient http" please report. Long experience leads me to say "give it a couple days" and if error message not yet reported, please report immediately. <edit> Also, have seen uploads that get temp http errors restart where they left off, rather than restarting from 0. Is this the newer better CPDN version? |
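The "restart where they left off" behaviour can be pictured as resuming from a byte offset. The Python sketch below shows only the client-side idea of skipping bytes the server already holds. It assumes the server can report how many bytes it has received; `read_from_offset` is a hypothetical helper and not part of BOINC or CPDN.

```python
def read_from_offset(path, offset, chunk_size=64 * 1024):
    """Yield the bytes of the file at `path` that come after `offset`.

    `offset` is the number of bytes the receiving side already has, so
    only the remainder needs to be sent again after an interruption.
    """
    with open(path, "rb") as source:
        source.seek(offset)
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

For a large file that failed near the end of its transfer, resuming this way re-sends only the last part rather than the whole file.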
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
I get the occasional transient failure, and I'm only a thousand miles or so north of the server. And often by the time I notice, it's already well on its way again. So, no worries. |