Message boards : Number crunching : ANOTHER UPLOAD PROBLEM
Author | Message |
---|---|
Send message Joined: 4 Mar 14 Posts: 7 Credit: 183,494 RAC: 0 |
Should have replied to this message. Yes, I am still having upload problems. |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
Andy has rebooted the upload machine: files should be uploading now. |
Send message Joined: 3 Sep 04 Posts: 126 Credit: 26,610,380 RAC: 3,377 |
It didn't help. My queue of 58 files still doesn't upload. |
Send message Joined: 22 Mar 06 Posts: 144 Credit: 24,695,428 RAC: 0 |
1 or 2 moved, but still loads waiting with transient HTTP error. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,019,755 RAC: 20,934 |
"1 or 2 moved, but still loads waiting with transient HTTP error." Isn't that just a symptom of server overload now? My current tasks are all anz so I can't give any info from my own experience of this lot. |
Send message Joined: 4 Mar 14 Posts: 7 Credit: 183,494 RAC: 0 |
It has not helped. Still a pile of uploads that will not move. I cannot see an HTTP error - just a message saying "Upload: pending" or "Upload:retry". Used all the disk space allocated so it is affecting other projects. Had to cease accepting work and extend allocated disk space. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
The error messages are in the Event Log. And I suspect that the current problem is one that goes back several weeks. Will email the project. |
Send message Joined: 12 Aug 11 Posts: 18 Credit: 95,107 RAC: 0 |
2 waiting to upload here. Has been several days. Just retried and no luck. |
Send message Joined: 4 Mar 14 Posts: 7 Credit: 183,494 RAC: 0 |
Files are being uploaded now, for me. It is working through the backlog. |
Send message Joined: 25 Mar 14 Posts: 3 Credit: 280,087 RAC: 0 |
Yep, mine are uploading now. Thanks all. |
Send message Joined: 21 Nov 06 Posts: 20 Credit: 318,377 RAC: 0 |
Bit puzzled about trickles and eot zips. Like others, I have been having upload issues for days, but noticed there are two 36+ MB zips sitting in the upload queue: climateprediction.net hadam3p_eu_i3j1_2013_1_008769009_1_1.zip 22,182 36425,99 K 00:06:36 42,49 Kbps Uploading 2750123 Home climateprediction.net hadam3p_eu_i36j_2013_1_008768559_1_1.zip 24.477 36409,51 K 00:06:37 47,01 Kbps Uploading 2750123 Home Got 2 tasks running with exactly the same job names, received during the night, i.e. far from complete: climateprediction.net 6.09 hadam3p_eu hadam3p_eu_i3j1_2013_1_008769009_1 09:21:34 (09:16:24) 99,08 16,134 03d,09:36:34 346d,19:45:10 7/1/2014 1:41:30 AM [58] 00:03:18 Running 2750123 Home 154.33 MB 156.87 MB climateprediction.net 6.09 hadam3p_eu hadam3p_eu_i36j_2013_1_008768559_1 09:16:59 (09:11:19) 98,98 15,937 03d,09:48:03 346d,19:45:10 7/1/2014 1:41:30 AM [57] 00:05:43 Running 2750123 Home 154.37 MB 156.87 MB Looking in the job trickle detail, I see only 1 listed for each: 01 Jul 2014 05:50:30 1328724 16697420 hadam3p_eu_i36j_2013_1_008768559_1 1 11,616 17,420 1.4997 01 Jul 2014 05:50:30 1328724 16697421 hadam3p_eu_i3j1_2013_1_008769009_1 1 11,616 17,323 1.4913 Before, I thought that the trickles were just small snippets and the zip was only produced at the end of the task, so these 36.4 MB uploads don't make sense? They finally seem to be going, albeit slowly at just 45 Kbps, and after finishing nothing web-side indicates these tasks got something special: still only 1 trickle from hours before. 6325 climateprediction.net 7/1/2014 11:22:59 AM Finished upload of hadam3p_eu_i36j_2013_1_008768559_1_1.zip 6326 climateprediction.net 7/1/2014 11:23:07 AM Finished upload of hadam3p_eu_i3j1_2013_1_008769009_1_1.zip Coelum Non Animum Mutant, Qui Trans Mare Currunt |
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,707,449 RAC: 9,333 |
These tasks produce both trickles and intermediate data files. The trickles are small progress reports, sent to the scheduler - they don't show up in BOINC's transfer tab, but they are logged on the task page on the server. The intermediate data files are visible on the transfer tab (sometimes for longer than we'd like), but they go to one or more different servers. They aren't specifically logged on the task details page, but in practice trickles and data files are produced at approximately the same time, and if the upload servers have space available and are working, the times should correlate. |
Send message Joined: 21 Nov 06 Posts: 20 Credit: 318,377 RAC: 0 |
OK, given the regular space issue, doing away with trickles for these 4-day results might be worth considering, including for the bandwidth savings on the volunteers' side. A trickle every 6 hours or so at 36.4 MB, over 89 hours of total computing, is 89 / 6 * 36.4 ≈ 540 MB per task. What's retained at the end, just the final zip? If so, maybe something to think harder about. Coelum Non Animum Mutant, Qui Trans Mare Currunt |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,808,726 RAC: 5,192 |
The trickles are very small and there would be no significant bandwidth saving by removing them. They are also a relatively fine-grained method for allocating credits, so that a model that doesn't complete still gets something approximately proportional to the run time. In terms of the science data it is the interim Zip files that are valuable. The final Zip file on models such as HADAM3P provides the data needed to start the next model going. This system is the response to repeated requests by users to have shorter models: longer models make better use of bandwidth by eliminating restart files and machine-to-machine variability - but are not popular. Be careful what you wish for! |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
There are several types of trickle_up files. The type is given by the first piece of data in the file, which is labelled "variety". The first to be used was "orig", and just says: "I'm still running, and the following info is where I'm up to". Since then 2 or 3 other varieties have been created, and each of these contains actual climate data. (In addition to what's in the zip files.) Use a text editor to have a look. ************** The trickle_up files go to their own server. They're usually around 170 bytes for the simple "variety". Trickle_up files are kept, so that the credit script can count them each time it runs, and provide something or other, which escapes me at the moment. Edit: You beat me to it, Iain. :) |
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
I have 4 new hadam3p_eu zip files waiting in my transfer tab. Is this just server overload due to the recent outage or is the server down again? P.S. Make that 8 zip files, I forgot to check my other machine. |
Send message Joined: 31 Aug 04 Posts: 391 Credit: 219,896,461 RAC: 649 |
What I see - is "the server" is being transitioned to some kind of virtual thingy. With better software and lots of mojo. "It" has been up most daytimes in the Zulu (GMT) timezone, and overloaded then. Otherwise, it's been down. See the warnings already posted about restricted service during upgrades. Wait a week or so.. The backlog on my clients has been slowly declining at my small max bandwidth - daytimes at ox.ac.uk. Things should get better soonish, more or less. Nothing's being wasted. |
Send message Joined: 1 Nov 06 Posts: 11 Credit: 579,556 RAC: 1,322 |
It's happening again... 8/2/2014 1:29:21 AM | climateprediction.net | Started upload of hadam3p_eu_lcz2_2013_1_008828039_0_13.zip 8/2/2014 1:36:04 AM | climateprediction.net | [error] Error reported by file upload server: can't open file /storage/cpdn-restarts/incoming/uploader/hadam3p_eu_lcz2_2013_1_008828039_0_13.zip: No such file or directory 8/2/2014 1:36:04 AM | climateprediction.net | Temporarily failed upload of hadam3p_eu_lcz2_2013_1_008828039_0_13.zip: transient upload error 8/2/2014 1:36:04 AM | climateprediction.net | Backing off 00:10:08 on upload of hadam3p_eu_lcz2_2013_1_008828039_0_13.zip that was the fourth attempt. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Yes, "me too". I reported this about 8 hours ago, but I guess we'll have to wait until next week. I'd suggest turning off the net access if possible, otherwise it'll keep uploading from the start, wasting your data allocation. :( |
Send message Joined: 18 Dec 13 Posts: 62 Credit: 1,078,935 RAC: 0 |
I've got a different problem, and I'm not sure if it's an upload problem or the result of a crash. Last night my system crashed. I don't know what caused it: it may have been BOINC or it may have been either of a couple of other things. The sdoutdae.txt file logged the following: "01-Aug-2014 17:13:06 [climateprediction.net] Task hadam3p_eu_h62y_2013_1_008861707_1 exited with zero status but no 'finished' file 01-Aug-2014 17:13:06 [climateprediction.net] If this happens repeatedly you may need to reset the project. 01-Aug-2014 17:13:07 [climateprediction.net] Task hadam3p_eu_h62y_2013_1_008861707_1 exited with a DLL initialization error. 01-Aug-2014 17:13:07 [climateprediction.net] If this happens repeatedly you may need to reboot your computer. 01-Aug-2014 17:13:08 [climateprediction.net] Task hadam3p_eu_h62y_2013_1_008861707_1 exited with a DLL initialization error. 01-Aug-2014 17:13:08 [climateprediction.net] If this happens repeatedly you may need to reboot your computer. 01-Aug-2014 17:13:10 [climateprediction.net] Task hadam3p_eu_lbq3_2013_1_008826420_1 exited with zero status but no 'finished' file 01-Aug-2014 17:13:10 [climateprediction.net] If this happens repeatedly you may need to reset the project. 01-Aug-2014 17:13:10 [climateprediction.net] Task hadam3p_eu_l7u5_2013_1_008821382_0 exited with zero status but no 'finished' file 01-Aug-2014 17:13:10 [climateprediction.net] If this happens repeatedly you may need to reset the project. 01-Aug-2014 17:13:10 [climateprediction.net] Task hadam3p_eu_l51h_2013_1_008817758_0 exited with zero status but no 'finished' file" Then I get repeated messages that 01-Aug-2014 17:13:10 [climateprediction.net] Task hadam3p_eu_h62y_2013_1_008861707_1 exited with a DLL initialization error. 01-Aug-2014 17:13:10 [climateprediction.net] If this happens repeatedly you may need to reboot your computer. 01-Aug-2014 17:13:12 [climateprediction.net] Task hadam3p_eu_lbq3_2013_1_008826420_1 exited with a DLL initialization error. 
01-Aug-2014 17:13:12 [climateprediction.net] If this happens repeatedly you may need to reboot your computer. 01-Aug-2014 17:13:12 [climateprediction.net] Task hadam3p_eu_l7u5_2013_1_008821382_0 exited with a DLL initialization error. 01-Aug-2014 17:13:12 [climateprediction.net] If this happens repeatedly you may need to reboot your computer. 01-Aug-2014 17:13:12 [climateprediction.net] Task hadam3p_eu_l51h_2013_1_008817758_0 exited with a DLL initialization error. This happened repeatedly over the space of 25 seconds. Then I get what looks like BOINC's usual startup readout as the system reboots after the crash. The work units themselves are now running normally, but this all occurred just a couple of minutes after unit hadam3p_eu_o4kg_2013_1_008832505_0 finished its run. 01-Aug-2014 17:11:36 [climateprediction.net] Finished upload of hadam3p_eu_o4kg_2013_1_008832505_0_13.zip This is fine, but it's been reporting this, repeatedly, all night: 01-Aug-2014 22:25:30 [climateprediction.net] Sending scheduler request: To report completed tasks. 01-Aug-2014 22:25:30 [climateprediction.net] Reporting 1 completed tasks ... 02-Aug-2014 09:17:56 [climateprediction.net] Sending scheduler request: To report completed tasks. 02-Aug-2014 09:17:56 [climateprediction.net] Reporting 1 completed tasks The remains of task hadam3p_eu_o4kg_2013_1_008832505_0 are still on my system, with a "ready to report" status more than 16 hours after completion. Is this the result of the crash, or is this part of your server glitch? I seem to be uploading zips from the working WUs normally. |
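
For anyone wanting to redo the bandwidth estimate from earlier in the thread, here is a rough Python sketch using the figures quoted there (89 hours of computing, an upload every 6 hours or so, 36.4 MB per zip, and about 170 bytes for a simple trickle). The function names are made up for illustration, and it assumes one trickle is sent alongside each zip, which the thread only describes as approximately true.

```python
import math

# Figures quoted in the thread; the interval is approximate ("every 6 hours or so").
RUN_HOURS = 89
UPLOAD_INTERVAL_HOURS = 6
ZIP_SIZE_MB = 36.4
TRICKLE_BYTES = 170  # typical size of a simple trickle, per the thread


def estimate_zip_upload_mb(run_hours, interval_hours, zip_mb):
    """Return (fractional, whole_uploads) estimates of zip traffic in MB.

    fractional reproduces the 89 / 6 * 36.4 calculation from the thread;
    whole_uploads counts only complete intervals.
    """
    fractional = run_hours / interval_hours * zip_mb
    whole_uploads = math.floor(run_hours / interval_hours) * zip_mb
    return fractional, whole_uploads


def estimate_trickle_bytes(run_hours, interval_hours, trickle_bytes):
    """Total trickle traffic in bytes, assuming one trickle per complete interval."""
    return math.floor(run_hours / interval_hours) * trickle_bytes


fractional_mb, whole_mb = estimate_zip_upload_mb(
    RUN_HOURS, UPLOAD_INTERVAL_HOURS, ZIP_SIZE_MB
)
trickle_total = estimate_trickle_bytes(
    RUN_HOURS, UPLOAD_INTERVAL_HOURS, TRICKLE_BYTES
)

print(f"Zip traffic (fractional): {fractional_mb:.1f} MB")
print(f"Zip traffic (whole uploads): {whole_mb:.1f} MB")
print(f"Trickle traffic: {trickle_total} bytes")
```

Under these assumptions the zips account for roughly 540 MB per task while the trickles add up to only about 2.4 KB, which is consistent with the reply that removing trickles would save no significant bandwidth.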
©2024 cpdn.org