Message boards : Number crunching : Upload Failure
Author | Message |
---|---|
Send message Joined: 11 Nov 04 Posts: 8 Credit: 15,267,364 RAC: 0 |
From yesterday I can see:
3. 5. 2012 20:31:16 | climateprediction.net | [error] Error reported by file upload server: can't open log file '../log_uploader1/file_upload_handler.log' (errno: 9)
3. 5. 2012 20:31:16 | climateprediction.net | Temporarily failed upload of hadam3p_eu_a3mc_1997_1_007861165_2_11.zip: transient upload error
4. 5. 2012 19:11:00 | climateprediction.net | [error] Error reported by file upload server: can't open log file '../log_uploader1/file_upload_handler.log' (errno: 9)
4. 5. 2012 19:11:00 | climateprediction.net | Temporarily failed upload of hadam3p_eu_a3mc_1997_1_007861165_2_12.zip: transient upload error
4. 5. 2012 19:53:29 | climateprediction.net | [error] Error reported by file upload server: can't open log file '../log_uploader1/file_upload_handler.log' (errno: 9)
4. 5. 2012 19:53:29 | climateprediction.net | Temporarily failed upload of hadam3p_eu_a3mc_1997_1_007861165_2_11.zip: transient upload error
The Server Status page shows no problem. Where is the problem? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
It should just be temporary, (transient), at a time when the server was overloaded with computers wanting to upload files. I've been having no problems. Backups: Here |
Send message Joined: 28 Mar 09 Posts: 126 Credit: 9,825,980 RAC: 0 |
"It should just be temporary, (transient), at a time when the server was overloaded with computers wanting to upload files."
Nope. I am getting them too on some eu work units for zip files 11 and 12. I think the admins need to move the log file or something. It's probably run out of disk space (with the backlog of uploads that wouldn't be surprising). Hopefully Jonathan or one of the other guys will notice and fix it soon. BOINC blog |
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,730,664 RAC: 6,969 |
"It should just be temporary, (transient), at a time when the server was overloaded with computers wanting to upload files."
If it's either a log file problem or a lack of disk space, the reason for the upload failure would appear in your local BOINC client message/event log (which the admins can't see directly). Why not help them find the cause of the problem by quoting the error message which you can see? |
Send message Joined: 5 May 10 Posts: 69 Credit: 1,169,103 RAC: 2,258 |
"Why not help them find the cause of the problem by quoting the error message which you can see?"
Hi. I'm getting the same thing with files 10 to 13 of a HADAM3P EU. The error's: "[error] Error reported by upload server: can't open log file '../log_uploader1/file_upload_handler.log' (errno: 9)". NG |
Send message Joined: 18 Feb 09 Posts: 4 Credit: 97,447 RAC: 0 |
"I'm getting the same thing with files 10 to 13 of a HADAM3P EU. The error's: '[error] Error reported by upload server: can't open log file '../log_uploader1/file_upload_handler.log' (errno: 9)'."
Same problem for me too. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
A server always seems to go down on a bank holiday weekend. Is rain getting into the system? |
Send message Joined: 21 Oct 10 Posts: 53 Credit: 2,101,753 RAC: 3,985 |
I get an HTTP error when trying to upload a zip file, and one of the upload servers is down (http://climateapps2.oerc.ox.ac.uk/cpdnboinc/server_status.html), but there are others online... The last time I tried to upload was 1 or 2 weeks ago (I crunch on an offline machine, move finished WUs by USB key to a Windows VM on my home Mac, and request new work from the VM), and it was down then too... I now have 4 WUs waiting for upload... the good thing is that the deadline is faaaaar away ;) |
Send message Joined: 31 Aug 04 Posts: 391 Credit: 219,896,461 RAC: 649 |
Seems that the problem is only with the eu uploads: pnw uploads work here, eu uploads don't, and even the _01 uploads fail. The one server is reported as down. @Dave: yeah, Murphy's law says that all system failures will happen at the worst time. And on a reassuring note: in the last 8 years of crunching for climateprediction, the staff have always fixed problems without losing data. Usually in a day or 2; once it took almost 2 weeks when they had a serious hardware failure. They've got mirrors, they've got logs, they've got backups. I've learned to trust their backups. Wish my own home backup system was as good. Please keep posting any upload problems here. And keep on crunching. Eric |
Send message Joined: 21 Oct 10 Posts: 53 Credit: 2,101,753 RAC: 3,985 |
As I said, since there is such a long deadline it's not a real problem, only a pain :D |
Send message Joined: 31 Aug 04 Posts: 1 Credit: 1,083,806 RAC: 0 |
Is there any update on this? I have a few EU uploads waiting and I have WUs still working, but I don't want to waste time crunching the WUs if the issue isn't going to be resolved and the data will be lost. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
Jonathan and the team will get this sorted. I know it has been a bit longer than usual on this occasion; maybe he is on holiday or something. In the years I have been with the project these problems have always been resolved without data loss: all the data is stored on a mirrored RAID array and regularly backed up externally as well. Hopefully one day the project will get the money it deserves, which would allow newer, better hardware and more staff to look after it. The only advice I can offer is a phrase I heard regularly when in the forces, "Hurry up and wait!" Dave |
Send message Joined: 28 Mar 09 Posts: 126 Credit: 9,825,980 RAC: 0 |
uploader1.atm seems to be back online and one of my eu work units that's been trying to upload for a week has finally gone through. A big thank you to the guys for fixing it up. BOINC blog |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
Good to see the server back online. Have I been unobservant, or is srv1.cpdn.psu.edu new in the past week or so? I must get back in the habit of backing up my work units - just lost a couple due to a power outage. "The chances of a power or disk failure is proportional to at least the square of the time since the last backup." |
Send message Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
The project team believe they have resolved all of the upload server issues. If anyone is still having problems uploading please let us know.
"have I been unobservant or is srv1.cpdn.psu.edu new in the past week or so?"
That's an upload server which hasn't been used for some time, Dave. I don't pay much attention to the green on the server status page, but it may well have been returned to active service. "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer |
Send message Joined: 21 Oct 10 Posts: 53 Credit: 2,101,753 RAC: 3,985 |
Wow, I can upload! Good news. It's been so long since I was able to do it that I have almost 800 MB to upload now; it's going to take a while with my DSL 110 KBPS upload... Thanks for fixing this! |
Send message Joined: 14 Sep 10 Posts: 11 Credit: 1,812,972 RAC: 0 |
It's still broken here. I get logs full of:
[file_xfer_debug] URL: http://cpdn-restarts.oerc.ox.ac.uk/cgi-bin/file_upload_handler
[file_xfer_debug] FILE_XFER_SET::poll(): http op done, retval -107
[file_xfer_debug] file transfer status -107
when attempting to upload _13 hada* results. I can ping http://cpdn-restarts.oerc.ox.ac.uk successfully from the machines running cpdn. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Welcome to the Twilight Zone. Don't adjust your horizontal, your vertical, or your mind, just some spellings on your computer. There's a problem with some flavours of Linux that causes a character string to become corrupted when it's stored on the computer in question. This has been discussed in several threads (probably under several Topics) on our php board. This is one thread. The cure is to **carefully** edit client_state.xml and correct the corruption. One spelling is hnndler, but there are others. Do a search on the 4 character model name, then check each line until you find the upload section for zip 13, then look at the spelling. Backups: Here |
Send message Joined: 31 Aug 04 Posts: 391 Credit: 219,896,461 RAC: 649 |
Oh dear, I thought this problem had gone away. Follow the advice on the thread Les referred to. And please follow the advice there about stopping BOINC and the CPDN models and doing a backup before editing client_state.xml. If you follow the procedure to correct the misspellings, uploads will start working. Keep on keeping on. |
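For anyone hunting for the misspelling described above, here is a minimal, read-only sketch in Python of one way to spot it. This is not an official BOINC or CPDN tool: the function name `find_suspect_urls`, the 0.8 similarity threshold, and the tags in the sample text are all made up for illustration. It assumes only that the corrupted name is a near-miss of `file_upload_handler` at the end of a URL, as in the `hnndler` example. It lists candidate lines and changes nothing, so the advice about stopping BOINC and making a backup before editing still applies.

```python
import re
from difflib import SequenceMatcher

# Handler name as it appears in the upload URL quoted in this thread.
EXPECTED = "file_upload_handler"

# Rough URL matcher: stops at whitespace, angle brackets, and quotes.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def find_suspect_urls(text, expected=EXPECTED, threshold=0.8):
    """Return (line_number, url) pairs whose last path segment is
    similar to, but not exactly, the expected handler name."""
    suspects = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for url in URL_RE.findall(line):
            last = url.rstrip("/").rsplit("/", 1)[-1]
            if last == expected:
                continue  # correctly spelled, nothing to report
            # Hypothetical cut-off: 0.8 is an arbitrary "near miss" threshold.
            if SequenceMatcher(None, last, expected).ratio() >= threshold:
                suspects.append((lineno, url))
    return suspects


# Made-up sample text; the tags here are NOT the real client_state.xml layout.
sample = (
    "<a>http://cpdn-restarts.oerc.ox.ac.uk/cgi-bin/file_upload_hnndler</a>\n"
    "<b>http://cpdn-restarts.oerc.ox.ac.uk/cgi-bin/file_upload_handler</b>\n"
)
for lineno, url in find_suspect_urls(sample):
    print(f"line {lineno}: {url}")
```

To check a real file, read a copy of client_state.xml into a string and pass it to `find_suspect_urls`. Treat the result as a list of places to look at by eye, not as a complete or authoritative check, since other kinds of corruption would not be caught by this comparison.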