Thread 'ANOTHER UPLOAD PROBLEM'

JS
Joined: 4 Mar 14
Posts: 7
Credit: 183,494
RAC: 0
Message 49464 - Posted: 30 Jun 2014, 9:25:56 UTC - in response to Message 49438.  

Should have replied to this message.

Yes, I am still having upload problems.
ID: 49464
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,808,726
RAC: 5,192
Message 49465 - Posted: 30 Jun 2014, 10:10:14 UTC

Andy has rebooted the upload machine: files should be uploading now.
ID: 49465
Alex Plantema
Joined: 3 Sep 04
Posts: 126
Credit: 26,610,380
RAC: 3,377
Message 49468 - Posted: 30 Jun 2014, 19:25:09 UTC - in response to Message 49465.  

It didn't help. My queue of 58 files still doesn't upload.
ID: 49468
MartinNZ
Joined: 22 Mar 06
Posts: 144
Credit: 24,695,428
RAC: 0
Message 49469 - Posted: 30 Jun 2014, 19:36:37 UTC - in response to Message 49465.  

1 or 2 moved, but still loads waiting with transient HTTP error.

ID: 49469
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,019,755
RAC: 20,934
Message 49470 - Posted: 30 Jun 2014, 19:42:54 UTC - in response to Message 49469.  

"1 or 2 moved, but still loads waiting with transient HTTP error."

Isn't that just a symptom of server overload now?

My current tasks are all anz so can't give any info from own experience of this lot.
ID: 49470
JS
Joined: 4 Mar 14
Posts: 7
Credit: 183,494
RAC: 0
Message 49471 - Posted: 30 Jun 2014, 20:22:31 UTC

It has not helped. Still a pile of uploads that will not move. I cannot see an HTTP error - just a message saying "Upload: pending" or "Upload:retry".

Used all the disk space allocated so it is affecting other projects. Had to cease accepting work and extend allocated disk space.

ID: 49471
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 49472 - Posted: 30 Jun 2014, 20:53:14 UTC - in response to Message 49471.  

The error messages are in the Event Log.

And I suspect that the current problem is one that goes back several weeks.
Will email the project.


ID: 49472
Kathryn Tombaugh-Weber
Joined: 12 Aug 11
Posts: 18
Credit: 95,107
RAC: 0
Message 49473 - Posted: 30 Jun 2014, 23:01:26 UTC

2 waiting to upload here. Has been several days. Just retried and no luck.
ID: 49473
JS
Joined: 4 Mar 14
Posts: 7
Credit: 183,494
RAC: 0
Message 49474 - Posted: 30 Jun 2014, 23:26:27 UTC - in response to Message 49471.  

Files are being uploaded now, for me. It is working through the backlog.
ID: 49474
Rick
Joined: 25 Mar 14
Posts: 3
Credit: 280,087
RAC: 0
Message 49475 - Posted: 1 Jul 2014, 0:24:12 UTC

Yep, mine are uploading now. Thanks all.
ID: 49475
SekeRob
Joined: 21 Nov 06
Posts: 20
Credit: 318,377
RAC: 0
Message 49476 - Posted: 1 Jul 2014, 9:28:17 UTC - in response to Message 49475.  
Last modified: 1 Jul 2014, 9:29:14 UTC

I'm a bit puzzled about trickles and eot zips. Like others, I have been having upload issues for days, but I noticed that there are two 36+ MB zips sitting in the upload queue:

climateprediction.net hadam3p_eu_i3j1_2013_1_008769009_1_1.zip 22,182 36425,99 K 00:06:36 42,49 Kbps Uploading 2750123 Home
climateprediction.net hadam3p_eu_i36j_2013_1_008768559_1_1.zip 24.477 36409,51 K 00:06:37 47,01 Kbps Uploading 2750123 Home

Got 2 tasks running with exactly the same job names received during the night, i.e. far from complete:

climateprediction.net 6.09 hadam3p_eu hadam3p_eu_i3j1_2013_1_008769009_1 09:21:34 (09:16:24) 99,08 16,134 03d,09:36:34 346d,19:45:10 7/1/2014 1:41:30 AM [58] 00:03:18 Running 2750123 Home 154.33 MB 156.87 MB
climateprediction.net 6.09 hadam3p_eu hadam3p_eu_i36j_2013_1_008768559_1 09:16:59 (09:11:19) 98,98 15,937 03d,09:48:03 346d,19:45:10 7/1/2014 1:41:30 AM [57] 00:05:43 Running 2750123 Home 154.37 MB 156.87 MB

Looking in the job trickle detail see only 1 listed for each:

01 Jul 2014 05:50:30 1328724 16697420 hadam3p_eu_i36j_2013_1_008768559_1 1 11,616 17,420 1.4997
01 Jul 2014 05:50:30 1328724 16697421 hadam3p_eu_i3j1_2013_1_008769009_1 1 11,616 17,323 1.4913

Previously I thought that the trickles were just small snippets and that the zip was only produced at the end of a task, so these 36.4 MB uploads don't make sense to me. They finally seem to be going, albeit slowly at just 45 Kbps, and after they finish nothing on the web side indicates that these tasks received anything special: still only 1 trickle, from hours before.

6325 climateprediction.net 7/1/2014 11:22:59 AM Finished upload of hadam3p_eu_i36j_2013_1_008768559_1_1.zip
6326 climateprediction.net 7/1/2014 11:23:07 AM Finished upload of hadam3p_eu_i3j1_2013_1_008769009_1_1.zip
Coelum Non Animum Mutant, Qui Trans Mare Currunt
ID: 49476
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,706,621
RAC: 9,524
Message 49477 - Posted: 1 Jul 2014, 9:47:56 UTC - in response to Message 49476.  

These tasks produce both trickles and intermediate data files.

The trickles are small progress reports, sent to the scheduler - they don't show up in BOINC's transfer tab, but they are logged on the task page on the server.

The intermediate data files are visible on the transfer tab (sometimes for longer than we'd like), but they go to one or more different servers. They aren't specifically logged on the task details page, but in practice trickles and data files are produced at approximately the same time, and if the upload servers have space available and are working, the times should correlate.
ID: 49477
SekeRob
Joined: 21 Nov 06
Posts: 20
Credit: 318,377
RAC: 0
Message 49478 - Posted: 1 Jul 2014, 10:10:22 UTC - in response to Message 49477.  

OK, given the regular space issue, doing away with trickles for these 4-day results might be worth considering, which would also save bandwidth on the volunteers' side. A trickle every 6 hours or so, at 36.4 MB each, over 89 hours of total computing, is 89 / 6 * 36.4 = about 540 MB per task. What's retained at the end, just the final zip? If so, this may be worth thinking harder about.
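
The estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the post: the three figures are the poster's own estimates, the 36.4 MB value is the size of the zip uploads listed earlier in the thread, and the variable names are made up for illustration.

```python
# Rough upload volume per task, using the estimates quoted in the post.
zip_size_mb = 36.4    # size of one upload, taken from the transfer list (estimate)
interval_hours = 6    # assumed time between uploads ("every 6 hours or so")
total_hours = 89      # assumed total computing time per task

uploads_per_task = total_hours / interval_hours
total_mb = uploads_per_task * zip_size_mb

print(f"{uploads_per_task:.1f} uploads x {zip_size_mb} MB = {total_mb:.0f} MB per task")
```

The product comes to just under 540 MB per task, in line with the figure given in the post.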
Coelum Non Animum Mutant, Qui Trans Mare Currunt
ID: 49478
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,808,726
RAC: 5,192
Message 49479 - Posted: 1 Jul 2014, 10:28:53 UTC

The trickles are very small and there would be no significant bandwidth saving by removing them. They are also a relatively fine-grained method for allocating credits, so that a model that doesn't complete still gets something approximately proportional to the run time.

In terms of the science data it is the interim Zip files that are valuable. The final Zip file on models such as HADAM3P provides the data needed to start the next model going. This system is the response to repeated requests by users to have shorter models: longer models make better use of bandwidth by eliminating restart files and machine-to-machine variability - but are not popular. Be careful what you wish for!
ID: 49479
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 49480 - Posted: 1 Jul 2014, 10:29:14 UTC - in response to Message 49478.  
Last modified: 1 Jul 2014, 10:30:28 UTC

There are several types of trickle_up files.
The type is identified by the first piece of data in the file, which is labelled "variety".
The first to be used was "orig", and just says: "I'm still running, and the following info is where I'm up to".

Since then 2 or 3 other "varieties" have been created, and each of these contains actual climate data. (In addition to what's in the zip files.)

Use a text editor to have a look.

**************

The trickle_up files go to their own server.

They're usually around 170 bytes for the simple "variety".

Trickle_up files are kept, so that the credit script can count them each time it runs, and provide something or other, which escapes me at the moment.

edit
You beat me to it, Iain. :)
ID: 49480
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 49483 - Posted: 2 Jul 2014, 6:22:54 UTC
Last modified: 2 Jul 2014, 6:26:39 UTC

I have 4 new hadam3p_eu zip files waiting in my transfer tab. Is this just server overload due to the recent outage or is the server down again?

P.S. Make that 8 zip files, I forgot to check my other machine.
ID: 49483
Eirik Redd
Joined: 31 Aug 04
Posts: 391
Credit: 219,896,461
RAC: 649
Message 49484 - Posted: 2 Jul 2014, 9:46:03 UTC

What I see - is "the server" is being transitioned to some kind of virtual thingy. With better software and lots of mojo.
"It" has been up most daytimes in the Zulu (GMT) timezone, and overloaded then.
Otherwise, it's been down.
See the warnings already posted about restricted service during upgrades.
Wait a week or so..
The backlog on my clients has been slowly declining at my small max bandwidth - daytimes at ox.ac.uk.

Things should get better soonish, more or less.

Nothing's being wasted.

ID: 49484
UXJnHL
Joined: 1 Nov 06
Posts: 11
Credit: 579,556
RAC: 1,322
Message 49684 - Posted: 2 Aug 2014, 5:37:45 UTC

It's happening again...

8/2/2014 1:29:21 AM | climateprediction.net | Started upload of hadam3p_eu_lcz2_2013_1_008828039_0_13.zip
8/2/2014 1:36:04 AM | climateprediction.net | [error] Error reported by file upload server: can't open file /storage/cpdn-restarts/incoming/uploader/hadam3p_eu_lcz2_2013_1_008828039_0_13.zip: No such file or directory
8/2/2014 1:36:04 AM | climateprediction.net | Temporarily failed upload of hadam3p_eu_lcz2_2013_1_008828039_0_13.zip: transient upload error
8/2/2014 1:36:04 AM | climateprediction.net | Backing off 00:10:08 on upload of hadam3p_eu_lcz2_2013_1_008828039_0_13.zip

that was the fourth attempt.
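
For anyone who wants to track these retries, a short Python sketch along the following lines could pull the back-off delays out of a saved copy of the Event Log. It is a minimal example only: the line format is inferred from the excerpt above and may differ between BOINC versions, and `parse_backoffs` is a hypothetical helper, not part of BOINC.

```python
import re
from datetime import timedelta

# Pattern for the "Backing off" lines in the excerpt above.
# Assumption: the delay is always written as HH:MM:SS.
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")

def parse_backoffs(log_text):
    """Return a list of (filename, delay) pairs, one per back-off line."""
    results = []
    for line in log_text.splitlines():
        match = BACKOFF_RE.search(line)
        if match:
            hours, minutes, seconds, filename = match.groups()
            delay = timedelta(hours=int(hours), minutes=int(minutes), seconds=int(seconds))
            results.append((filename, delay))
    return results

# Two lines copied from the log excerpt in this post.
sample = """\
8/2/2014 1:36:04 AM | climateprediction.net | Temporarily failed upload of hadam3p_eu_lcz2_2013_1_008828039_0_13.zip: transient upload error
8/2/2014 1:36:04 AM | climateprediction.net | Backing off 00:10:08 on upload of hadam3p_eu_lcz2_2013_1_008828039_0_13.zip
"""

for filename, delay in parse_backoffs(sample):
    print(f"{filename}: next retry in {int(delay.total_seconds())} s")
```

Run against a longer log, the same function would show how the back-off interval changes from one failed attempt to the next.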
ID: 49684
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 49685 - Posted: 2 Aug 2014, 6:51:25 UTC - in response to Message 49684.  

Yes, "me too". I reported this about 8 hours ago, but I guess we'll have to wait until next week.

I'd suggest turning off the net access if possible, otherwise it'll keep uploading from the start, wasting your data allocation. :(

ID: 49685
Niall
Joined: 18 Dec 13
Posts: 62
Credit: 1,078,935
RAC: 0
Message 49686 - Posted: 2 Aug 2014, 9:00:58 UTC

I've got a different problem, and I'm not sure if it's an upload problem or the result of a crash. Last night my system crashed. I don't know what caused it: it may have been BOINC or it may have been either of a couple of other things.

The stdoutdae.txt file logged the following:
"01-Aug-2014 17:13:06 [climateprediction.net] Task hadam3p_eu_h62y_2013_1_008861707_1 exited with zero status but no 'finished' file
01-Aug-2014 17:13:06 [climateprediction.net] If this happens repeatedly you may need to reset the project.
01-Aug-2014 17:13:07 [climateprediction.net] Task hadam3p_eu_h62y_2013_1_008861707_1 exited with a DLL initialization error.
01-Aug-2014 17:13:07 [climateprediction.net] If this happens repeatedly you may need to reboot your computer.
01-Aug-2014 17:13:08 [climateprediction.net] Task hadam3p_eu_h62y_2013_1_008861707_1 exited with a DLL initialization error.
01-Aug-2014 17:13:08 [climateprediction.net] If this happens repeatedly you may need to reboot your computer.
01-Aug-2014 17:13:10 [climateprediction.net] Task hadam3p_eu_lbq3_2013_1_008826420_1 exited with zero status but no 'finished' file
01-Aug-2014 17:13:10 [climateprediction.net] If this happens repeatedly you may need to reset the project.
01-Aug-2014 17:13:10 [climateprediction.net] Task hadam3p_eu_l7u5_2013_1_008821382_0 exited with zero status but no 'finished' file
01-Aug-2014 17:13:10 [climateprediction.net] If this happens repeatedly you may need to reset the project.
01-Aug-2014 17:13:10 [climateprediction.net] Task hadam3p_eu_l51h_2013_1_008817758_0 exited with zero status but no 'finished' file"

Then I get repeated messages that
01-Aug-2014 17:13:10 [climateprediction.net] Task hadam3p_eu_h62y_2013_1_008861707_1 exited with a DLL initialization error.
01-Aug-2014 17:13:10 [climateprediction.net] If this happens repeatedly you may need to reboot your computer.
01-Aug-2014 17:13:12 [climateprediction.net] Task hadam3p_eu_lbq3_2013_1_008826420_1 exited with a DLL initialization error.
01-Aug-2014 17:13:12 [climateprediction.net] If this happens repeatedly you may need to reboot your computer.
01-Aug-2014 17:13:12 [climateprediction.net] Task hadam3p_eu_l7u5_2013_1_008821382_0 exited with a DLL initialization error.
01-Aug-2014 17:13:12 [climateprediction.net] If this happens repeatedly you may need to reboot your computer.
01-Aug-2014 17:13:12 [climateprediction.net] Task hadam3p_eu_l51h_2013_1_008817758_0 exited with a DLL initialization error.

This happened repeatedly over the space of 25 seconds. Then I get what looks like BOINC's usual startup readout as the system reboots after the crash.

The work units themselves are now running normally, but this all occurred just a couple of minutes after unit hadam3p_eu_o4kg_2013_1_008832505_0 finished its run.
01-Aug-2014 17:11:36 [climateprediction.net] Finished upload of hadam3p_eu_o4kg_2013_1_008832505_0_13.zip

This is fine, but it's been reporting this, repeatedly, all night:
01-Aug-2014 22:25:30 [climateprediction.net] Sending scheduler request: To report completed tasks.
01-Aug-2014 22:25:30 [climateprediction.net] Reporting 1 completed tasks
...
02-Aug-2014 09:17:56 [climateprediction.net] Sending scheduler request: To report completed tasks.
02-Aug-2014 09:17:56 [climateprediction.net] Reporting 1 completed tasks

The remains of task hadam3p_eu_o4kg_2013_1_008832505_0 are still on my system, with a "ready to report" status more than 16 hours after completion.

Is this the result of the crash, or is this part of your server glitch? I seem to be uploading zips from the working WUs normally.
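
When a log repeats like this, it can help to summarise which tasks reported which exit message and how often. The Python sketch below does that for lines in the format quoted above. It is an illustration only: the pattern is inferred from these excerpts, and `count_task_exits` is a hypothetical helper rather than anything supplied by BOINC.

```python
import re
from collections import Counter

# Line format inferred from the excerpts in this post:
#   DD-Mon-YYYY HH:MM:SS [project] Task <name> exited with <reason>
TASK_EXIT_RE = re.compile(r"\[[^\]]+\] Task (?P<task>\S+) exited with (?P<reason>.+)")

def count_task_exits(log_text):
    """Count how often each (task, reason) pair appears in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TASK_EXIT_RE.search(line)
        if match:
            # Drop surrounding whitespace and the trailing full stop some lines have.
            reason = match.group("reason").strip().rstrip(".")
            counts[(match.group("task"), reason)] += 1
    return counts

# A few lines copied from the excerpt in this post.
sample = """\
01-Aug-2014 17:13:06 [climateprediction.net] Task hadam3p_eu_h62y_2013_1_008861707_1 exited with zero status but no 'finished' file
01-Aug-2014 17:13:06 [climateprediction.net] If this happens repeatedly you may need to reset the project.
01-Aug-2014 17:13:07 [climateprediction.net] Task hadam3p_eu_h62y_2013_1_008861707_1 exited with a DLL initialization error.
01-Aug-2014 17:13:08 [climateprediction.net] Task hadam3p_eu_h62y_2013_1_008861707_1 exited with a DLL initialization error.
"""

for (task, reason), n in count_task_exits(sample).items():
    print(f"{task}: '{reason}' x{n}")
```

A summary like this makes it easier to see whether every running task was hit in the same 25 seconds, which would point to a system-wide event rather than a fault in one work unit.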
ID: 49686