Message boards : Number crunching : Several jobs uploads in project backoff
Author | Message |
---|---|
Send message Joined: 30 Jan 12 Posts: 38 Credit: 10,197,388 RAC: 0 |
I'm sure this has been discussed before, but I have 4 different WUs whose uploads go to 100% and then either start over or go into "Project backoff". This is what the log said: 4/26/2013 8:41:12 AM | climateprediction.net | [error] Error reported by file upload server: can't open file /storage/incoming/uploader/hadam3p_eu_qfqb_2009_1_008346176_1_2.zip: No such file or directory 4/26/2013 8:41:12 AM | climateprediction.net | Temporarily failed upload of hadam3p_eu_qfqb_2009_1_008346176_1_2.zip: transient upload error 4/26/2013 8:41:12 AM | climateprediction.net | Backing off 3 min 54 sec on upload of hadam3p_eu_qfqb_2009_1_008346176_1_2.zip 4/26/2013 8:17:33 AM | climateprediction.net | [error] Error reported by file upload server: can't open file /storage/incoming/uploader/hadam3p_eu_qf8n_2010_1_008345540_1_12.zip: No such file or directory 4/26/2013 8:17:33 AM | climateprediction.net | Temporarily failed upload of hadam3p_eu_qf8n_2010_1_008345540_1_12.zip: transient upload error 4/26/2013 8:17:33 AM | climateprediction.net | Backing off 24 min 56 sec on upload of hadam3p_eu_qf8n_2010_1_008345540_1_12.zip All this happened right around the same time, which is why I'm hoping it's a server issue, but no one else has complained about it yet. If it's the work units, do I delete everything? TYA |
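For anyone who wants to see at a glance which files are in backoff and for how long, the "Backing off" messages in a log like this can be summarised with a few lines of Python. This is only a sketch: the function name `backoffs` is made up for illustration, and the pattern handles just the "N min N sec" wording seen in these messages, so longer intervals using other units would need it extended.

```python
import re

# Sample lines copied from the event log quoted in this thread.
LOG = """\
4/26/2013 8:41:12 AM | climateprediction.net | Temporarily failed upload of hadam3p_eu_qfqb_2009_1_008346176_1_2.zip: transient upload error
4/26/2013 8:41:12 AM | climateprediction.net | Backing off 3 min 54 sec on upload of hadam3p_eu_qfqb_2009_1_008346176_1_2.zip
4/26/2013 8:17:33 AM | climateprediction.net | Backing off 24 min 56 sec on upload of hadam3p_eu_qf8n_2010_1_008345540_1_12.zip
"""

# Assumption: only the "N min N sec" / "N sec" forms are handled here.
BACKOFF_RE = re.compile(r"Backing off (?:(\d+) min )?(\d+) sec on upload of (\S+)")


def backoffs(log_text):
    """Return {filename: backoff in seconds} for every 'Backing off' line."""
    result = {}
    for line in log_text.splitlines():
        match = BACKOFF_RE.search(line)
        if match:
            minutes = int(match.group(1) or 0)
            seconds = int(match.group(2))
            result[match.group(3)] = minutes * 60 + seconds
    return result


if __name__ == "__main__":
    for name, delay in backoffs(LOG).items():
        print(f"{name}: {delay} s")
```

Lines that are not backoff messages are simply skipped, so the whole event log can be pasted in unchanged.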
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
"... All this happened right around the same time that's why I'm hoping it's a server issue, but no one else has complained about it yet. ..." "...Error reported by file upload server..." Yes, a server issue. This sort of thing typically happens at the weekend. The client will keep retrying ('project backoff') for 2 weeks, which is usually enough LOL. And if 2 weeks is not enough time for the staff at Oxford to fix it, you can give it more time by editing the task config files. I'm a volunteer and my views are my own. |
Send message Joined: 30 Jan 12 Posts: 38 Credit: 10,197,388 RAC: 0 |
"Yes, a server issue. This sort of thing typically happens at the weekend." Boy, isn't that the truth. Well, I'm relieved that it is a server issue rather than a model that I would have to abort for the 100th time. I take it that others are experiencing this problem? It seems to be happening only with the shorter regional models. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
There was a problem a day ago in one of the server rooms. It only affected the final 13th zip of the regional models. It was reported as being fixed. As Mike said, it's the weekend, so, here we go again. :( |
Send message Joined: 30 Jan 12 Posts: 38 Credit: 10,197,388 RAC: 0 |
I can't believe it's just me having problems. About 45 minutes ago I noticed that a very small upload made it through (a 7.54MB regional upload), so I tried to see if it would take a 31MB upload, and it didn't work. Anyway, thanks guys. |
Send message Joined: 30 Jan 12 Posts: 38 Credit: 10,197,388 RAC: 0 |
Les, you said there was a problem like this not long ago. Where is that thread? I have looked and can't find it. I have close to 500MB of uploads waiting, and they keep trying to upload over and over, sucking up bandwidth from the other project. I can't get more work for GPU-Grid until I upload results, and my connection's being choked by CPDN results. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
It was mentioned in two emails from Andy to the moderators. The first one said that there was a problem and the IT people who look after that equipment room were looking into it. The 2nd one a few hours later said that the problem had been fixed. Whatever is wrong at the moment will NOT get looked at until business hours on Monday. The University of Oxford IS the City of Oxford. And vice versa. There are departments all over, most with their own IT section and equipment rooms, and this project has servers in several of them, wherever they could get space. The only cure for your problem is to turn off Network access and wait it out. Setting the project to No new work, and then Suspending climate models before they finish will minimise the transfer backlog, but it looks like that's too late for you. |
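The "No new work" and "turn off Network access" steps described above can also be issued from a script through BOINC's `boinccmd` tool rather than BOINC Manager. The following is a rough sketch, not an official procedure: it assumes `boinccmd` is on the PATH and can talk to a locally running client, the project URL is an assumption (check what your own client reports with `boinccmd --get_project_status`), and the helper names are made up for illustration. Suspending individual models is not covered.

```python
import subprocess

# Assumption: replace with the project URL your own client reports.
PROJECT_URL = "http://climateprediction.net/"


def wait_it_out_commands(project_url=PROJECT_URL):
    """Commands to stop fetching new work and then disable network activity."""
    return [
        ["boinccmd", "--project", project_url, "nomorework"],
        ["boinccmd", "--set_network_mode", "never"],
    ]


def restore_commands(project_url=PROJECT_URL):
    """Commands to undo the above once the upload server is healthy again."""
    return [
        ["boinccmd", "--set_network_mode", "auto"],
        ["boinccmd", "--project", project_url, "allowmorework"],
    ]


def run(commands, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run by default: shows the commands without touching the client.
    run(wait_it_out_commands())
```

Note that the network mode applies to the whole client, so every attached project is cut off while it is set to `never`, not just this one.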
Send message Joined: 30 Jan 12 Posts: 38 Credit: 10,197,388 RAC: 0 |
Okay, thanks, sorry to bother you. I'll just keep on keeping on. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
The time limit for uploading files from any project was extended. I can't remember whether the limit is now two or three months, but in any case it's far longer than we need. But, but, but... each file is still only allowed 100 upload attempts, after which it expires. That's the BOINC rule. 100 is plenty but please don't use up the files' lives by repeatedly pressing the Retry now button in the Transfers tab. The files come to no harm while they wait. Cpdn news |
Send message Joined: 16 Oct 11 Posts: 254 Credit: 15,954,577 RAC: 0 |
Thanks. Yes, my job in back-off is the 13th zip result file for a Pacific North West Regional Model. |
Send message Joined: 21 Aug 11 Posts: 10 Credit: 26,553,404 RAC: 1,491 |
Yeah, here too, with two WUs in back-off mode... |
Send message Joined: 30 Jan 12 Posts: 38 Credit: 10,197,388 RAC: 0 |
There's nothing I can do about that. Every time I re-enable my internet connection to upload GPUGrid WUs, they try to upload too and slow my connection. I wish someone had the foresight to give us an option to stop certain results from uploading while allowing others to go through. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
"I wish someone had the foresight to give us an option to stop certain results from uploading while allowing others to go through." That option was asked for at BOINC/dev and refused. |
Send message Joined: 30 Jan 12 Posts: 38 Credit: 10,197,388 RAC: 0 |
"That option was asked for at BOINC/dev and refused." I wonder why that is? They must not trust us enough to use it correctly, and that really, really bothers me. I have 4 purpose-built machines, built by me just for BOINC. I have about $15,000 tied up in these computers plus a $350.00 a month electric bill, and they won't let us have a feature like that, which I'm sure 90% of the other crunchers would want. It just doesn't make sense. I'm sure the benefits would far outweigh their reasons for not wanting it. |
Send message Joined: 15 Nov 10 Posts: 43 Credit: 6,118,949 RAC: 0 |
I have the same problem, with one WU trying to upload since Friday night. |
Send message Joined: 19 Aug 05 Posts: 104 Credit: 1,866,495 RAC: 0 |
Flashhawk, many of us would like that, but they will not build it in. It could be that if someone wrote it for them, it would go into production. If you know how to write that and have a compiler, you can download the code and put it in. |
Send message Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Thyme Lawn, who is one of the CPDN moderators, provided a patch that could have been incorporated to do the job of project-specific network suspend. He added it to a ticket that had been initiated by MikeMarsUK, who posted in this thread a few days ago. He added a couple of extra patches which may have been for the Linux and Mac versions of BOINC. Dr David Anderson, who is our BOINC boss, refused the request on the grounds that the transfer backoff system renders it unnecessary. I know he's also keen to keep the buttons in BOINC Manager as few and as simple as possible. I've had some tickets accepted and some refused. For example, I've always thought it's confusing to have two folders with different contents both called BOINC. I asked for the BOINC data folder to be renamed BOINC Data. My request was refused on the grounds that giving the same name to both was standard industry practice. Hmmm... BOINC is open-source but we still have our boss in Berkeley. |
Send message Joined: 16 Oct 11 Posts: 254 Credit: 15,954,577 RAC: 0 |
Thanks Mo. Agree with your comments on BOINC, but is there a known problem with the CPDN uploads that needs to be fixed? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
"is there a known problem with the CPDN uploads that needs to be fixed?" That is the suspicion. It's under discussion. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Nearly 12 hours ago, Jonathon said that the upload server was accepting uploads normally. My solitary PNW has just finished uploading, which confirms it. So the servers are OK. |