Message boards : Number crunching : Stuck upload issue
Author | Message |
---|---|
Joined: 13 Jan 07 Posts: 195 Credit: 10,581,566 RAC: 0 |
I have had a stuck upload from batch 691 for so many weeks I have lost track. My stuck upload (cam25) is also from batch 691. It's the restart.zip upload and it's stuck at 27.64%. Do we abort these uploads or continue to have patience? |
Joined: 31 Aug 04 Posts: 10 Credit: 2,538,005 RAC: 0 |
I still have two stuck uploads from the following tasks: https://www.cpdn.org/cpdnboinc/result.php?resultid=20919737 https://www.cpdn.org/cpdnboinc/result.php?resultid=20919722 Both uploads are approx. 105 MB in size. One of them is stuck at 53 MB, the other at 46 MB. As no one seems willing to look into this or tell us what to do about it, I have set no new tasks for CPDN until this is resolved. Tom 17/02/2018 10:15:07 | climateprediction.net | Started upload of wah2_cam25_a03s_200405_18_689_011368580_0_r616315434_17.zip 17/02/2018 10:15:07 | climateprediction.net | Started upload of wah2_cam25_a047_200405_18_689_011368595_0_r2068114942_7.zip 17/02/2018 10:15:29 | | Project communication failed: attempting access to reference site 17/02/2018 10:15:29 | climateprediction.net | Temporarily failed upload of wah2_cam25_a03s_200405_18_689_011368580_0_r616315434_17.zip: transient HTTP error 17/02/2018 10:15:29 | climateprediction.net | Backing off 05:16:01 on upload of wah2_cam25_a03s_200405_18_689_011368580_0_r616315434_17.zip 17/02/2018 10:15:30 | | Internet access OK - project servers may be temporarily down. 17/02/2018 10:15:34 | | Project communication failed: attempting access to reference site 17/02/2018 10:15:34 | climateprediction.net | Temporarily failed upload of wah2_cam25_a047_200405_18_689_011368595_0_r2068114942_7.zip: transient HTTP error 17/02/2018 10:15:34 | climateprediction.net | Backing off 03:50:57 on upload of wah2_cam25_a047_200405_18_689_011368595_0_r2068114942_7.zip 17/02/2018 10:15:35 | | Internet access OK - project servers may be temporarily down. |
Joined: 13 Jan 07 Posts: 195 Credit: 10,581,566 RAC: 0 |
Yes, I would like to add my frustration with these CAM25 stuck uploads. Does anybody have a clue about these? I know from earlier posts that these are uploaded to a server that is distant from Oxford so it isn't of itself an Oxford problem. And previous posts have said that Oxford doesn't have the capacity to divert the CAM25 uploads to their servers. I am tempted to abort the transfers to clear it from my end but I am reluctant to abort if it can be solved. Any advice would be welcome. |
Joined: 15 May 09 Posts: 4538 Credit: 19,007,330 RAC: 21,449 |
I don't know whether any of the current problems are down to the recent outage of the virtual machine at Oxford, and I probably won't be able to find out until Monday. The only tasks I have at the moment are SAS50s, which seem to be fine. I will, however, post the question to the Oxford team. |
Joined: 13 Jan 07 Posts: 195 Credit: 10,581,566 RAC: 0 |
Dave, My stuck uploads have been stuck since before the recent outage and are still stuck. |
Joined: 15 May 09 Posts: 4538 Credit: 19,007,330 RAC: 21,449 |
I will request that someone nudges the people in Mexico. Don't know how much effect it will have though. |
Joined: 13 Jan 07 Posts: 195 Credit: 10,581,566 RAC: 0 |
A more recent CAM25 has completed and fully uploaded successfully. The successful task was batch 694. The stuck upload task is batch 691. That may help. I note that BetelgeuseFive's stuck uploads are from batch 689. Perhaps someone in Mexico has gone to sleep regarding the earlier batches. EDIT. I also have a later CAM25 model from the same batch, 691, which has finished and fully uploaded successfully. |
Joined: 22 Feb 06 Posts: 491 Credit: 30,976,726 RAC: 14,201 |
Mine from batch 691 is still stuck, and has been since Jan 6th! |
Joined: 15 May 09 Posts: 4538 Credit: 19,007,330 RAC: 21,449 |
For anyone happy to play with their config files, it might be worth looking at this post, which concerns stuck uploads on another Mexican batch: https://www.cpdn.org/cpdnboinc/forum_thread.php?id=8251#54582 From the discussion that followed my report that the issue is still unresolved: on one previous batch, which went to a different server, what got the stuck uploads going again was the possibly extreme option of rebooting the server, which made the uploads start again from scratch. The size of the uploads may also have something to do with it. Andy is going to liaise with those in Mexico, presumably this afternoon, seeing as they are six hours behind the UK. |
Joined: 13 Jan 07 Posts: 195 Credit: 10,581,566 RAC: 0 |
<quote> For anyone happy with playing with their config files, might be worth looking at this post which is to do with stuck uploads on another Mexican batch https://www.cpdn.org/cpdnboinc/forum_thread.php?id=8251#54582 <unquote> I have been running with that fix installed since Les Bayliss first published it and have never taken it out again. So it can't be that. |
Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
@WB8ILI, @Lockleys, @Alan K, Does part of the message/event log about this have "locked by file_upload_handler PID=" in the output? Just trying to make sure this is the same problem as other cam25 upload problems here and on the dev site. Thanks. |
Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
<quote> @WB8ILI, @Lockleys, @Alan K, Does part of the message/event log about this have "locked by file_upload_handler PID=" in the output? Just trying to make sure this is the same problem as other cam25 upload problems here and on the dev site. Thanks. <unquote> Note that for my stuck CAM25, which showed that error on the first upload failure, the error messages are now only “transient HTTP error” etc. The rest of that model’s uploads cleared without difficulty. |
Joined: 13 Jan 07 Posts: 195 Credit: 10,581,566 RAC: 0 |
<quote> @WB8ILI, @Lockleys, @Alan K, Does part of the message/event log about this have "locked by file_upload_handler PID=" in the output? Just trying to make sure this is the same problem as other cam25 upload problems here and on the dev site. Thanks. <unquote> I haven't seen this message in the Event Log. Just 19/02/2018 21:28:22 | climateprediction.net | Temporarily failed upload of wah2_cam25_a05e_200405_18_691_011369743_0_r1614740798_restart.zip: transient HTTP error |
Joined: 22 Feb 06 Posts: 491 Credit: 30,976,726 RAC: 14,201 |
Don't know, as my log file doesn't go back to 6th Jan when the problem first occurred. Just getting the transient HTTP error messages now. |
Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
I lost patience with my stuck CAM25 Zip file and aborted the upload. The model is now available for download here so someone can now get it right ... |
Joined: 15 May 09 Posts: 4538 Credit: 19,007,330 RAC: 21,449 |
Zips for both 708 and 709 are not uploading for me. With HTTP debug enabled, all I get is: 21/03/2018 07:08:51 | climateprediction.net | Started upload of wah2_eu25_qi1y_200712_13_709_011481271_1_r1628680715_1.zip 21/03/2018 07:08:52 | | [http_xfer] [ID#18] HTTP: wrote 93 bytes 21/03/2018 07:09:14 | | [http_xfer] [ID#18] HTTP: wrote 221 bytes 21/03/2018 07:09:15 | climateprediction.net | Backing off 00:24:36 on upload of wah2_eu25_qi1y_200712_13_709_011481271_1_r1628680715_1.zip |
Joined: 13 Jan 07 Posts: 195 Credit: 10,581,566 RAC: 0 |
<quote> Zips for both 708 and 709 are not uploading for me. with http debug enabled all I get is <unquote> My 708 and 709 zips uploaded as normal during the UK night. |
Joined: 15 May 09 Posts: 4538 Credit: 19,007,330 RAC: 21,449 |
<quote> My 708 and 709 zips uploaded as normal during the UK night. <unquote> I am still getting the same on 709. On 708 I am now getting the "Internet access OK - project servers may be temporarily down" message. |
Joined: 29 Nov 17 Posts: 82 Credit: 14,467,735 RAC: 90,100 |
<quote> I still have one task that is stuck uploading to upload6; it has sent 6.66/104.92 MB so far... <unquote> This one has finally uploaded successfully! |
Joined: 8 Dec 05 Posts: 3 Credit: 732,203 RAC: 951 |
These two have been stuck for a couple of weeks trying to upload. I doubt they ever sent a single trickle either. Looks like it hasn't worked since May! Meow General: URL http://ithaqua.oerc.ox.ac.uk/cpdnboinc/; User name Chairmanmeow; Team name Project Blue Book; Resource share 100; Scheduler RPC deferred for 23:59:06; Disk usage 240.62 MB; Computer ID 1460825; Suspended via GUI no; Don't request tasks no; Trickle-up pending yes; Host location home; Tasks completed 2; Tasks failed 0. Credit: User 443,499 total, 326.90 average; Host 0 total, 0.00 average. Scheduling: Scheduling priority -0.00; Last scheduler reply 5/23/2018 11:16:16 PM |
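
For anyone who wants to check how often a particular file has failed and how long its latest back-off is, the event log lines quoted in this thread can be tallied with a short script. This is only a rough sketch: it assumes the "date time | project | message" layout and the "Temporarily failed upload of" / "Backing off" wording exactly as they appear in the excerpts above, and the function name `summarise_uploads` is made up for illustration. It is not part of BOINC or CPDN.

```python
import re
from datetime import timedelta

# Assumed layout, copied from the log excerpts in this thread:
#   "DD/MM/YYYY HH:MM:SS | project | message"  (project may be empty)
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) \| ([^|]*)\| (.*)$")
FAILED_RE = re.compile(r"^Temporarily failed upload of (\S+): (.+)$")
BACKOFF_RE = re.compile(r"^Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)$")


def summarise_uploads(lines):
    """Tally failures and the most recent back-off for each upload file.

    Returns {filename: {"failures": int, "last_reason": str or None,
                        "last_backoff": timedelta or None}}.
    Lines that do not match the assumed layout are skipped.
    """
    summary = {}

    def entry(name):
        # Create the per-file record the first time a file is seen.
        return summary.setdefault(
            name, {"failures": 0, "last_reason": None, "last_backoff": None}
        )

    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        message = match.group(3).strip()

        failed = FAILED_RE.match(message)
        if failed:
            record = entry(failed.group(1))
            record["failures"] += 1
            record["last_reason"] = failed.group(2)
            continue

        backoff = BACKOFF_RE.match(message)
        if backoff:
            hours, minutes, seconds = (int(backoff.group(i)) for i in (1, 2, 3))
            entry(backoff.group(4))["last_backoff"] = timedelta(
                hours=hours, minutes=minutes, seconds=seconds
            )

    return summary
```

To use it, copy the relevant lines out of the Event Log into a text file, one log entry per line, and pass the lines to `summarise_uploads`. A file that shows many failures with the same "transient HTTP error" reason over several weeks matches the stuck uploads described in this thread.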