Thread 'incredibly annoyingly slow uploads - like dial-up speed.'

Message boards : Number crunching : incredibly annoyingly slow uploads - like dial-up speed.
Eirik Redd
Joined: 31 Aug 04
Posts: 391
Credit: 219,896,461
RAC: 649
Message 53558 - Posted: 2 Mar 2016, 11:57:59 UTC

Last few days, upload rates have been about 6-8 KB/sec (near dial-up speed, nowhere near broadband). Problem not on this user's end.
One machine here has been trying for 3 days and has only uploaded three 60MB files in that time. Meanwhile, the models created so many more that there are 25-30 upload files waiting here (30-90MB each, and growing) --
Saw some mention here about a problem, but not sure what?

Another underfunded infrastructure failure?

Or maybe just me?



ID: 53558
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 53559 - Posted: 2 Mar 2016, 12:34:05 UTC

Or maybe just me?


Don't know if it is just you, but this morning I had four uploads go through at over 100KB/s, which is about as fast as I get to anywhere.
ID: 53559
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 53565 - Posted: 3 Mar 2016, 7:41:59 UTC

Just thought, Eirik: my results will be completely irrelevant if they are a different model type and going to a different server.
ID: 53565
Eirik Redd
Joined: 31 Aug 04
Posts: 391
Credit: 219,896,461
RAC: 649
Message 53568 - Posted: 4 Mar 2016, 9:05:33 UTC

Now, the problem is definitely mostly on my end.
Don't know yet what triggered it, but at least one of my machines got into a situation where uploads were timing out partway through, and kept retrying and rarely completing the uploads. This slowed the uploads from my other 6 hosts to the point where most of them had uploads waiting, and the original problem machine got to having 26 files trying to upload, two at a time. Ugh.
Unusually, all this attempted traffic (maybe 5-7 uploads trying at once) didn't slow other traffic noticeably. I did mess with some QoS settings on the router a couple of weeks ago.
Anyhow, I'll limit the uploads until the queues here clear, and then report back. Most of the tasks uploading zip files were wah2_eu25<xxx> going to <upload_url>http://upload3.cpdn.org/cpdn_cgi/file_upload_handler</upload_url>

Apologies for false alarm.

ID: 53568
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 53569 - Posted: 4 Mar 2016, 9:11:24 UTC - in response to Message 53568.  

Seems to confirm it - mine were eu25's also.

Apologies for false alarm.


No need for apologies - if it had been at the CPDN end, it would have meant a quick check and, hopefully, a quicker fix.
ID: 53569
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 53570 - Posted: 4 Mar 2016, 9:56:20 UTC

When I have slow uploads, I shut down the net on all but one machine and let that one slowly clear. Then repeat with the others.
It's fiddly, but more reliable.

ID: 53570
Eirik Redd
Joined: 31 Aug 04
Posts: 391
Credit: 219,896,461
RAC: 649
Message 53571 - Posted: 4 Mar 2016, 10:33:26 UTC - in response to Message 53570.  
Last modified: 4 Mar 2016, 10:52:04 UTC

When I have slow uploads, I shut down the net on all but one machine and let that one slowly clear. Then repeat with the others.
It's fiddly, but more reliable.



Gotcha - doing that. I'll figure out what went wrong later.
"one upload at a time"

Thx

<edit> Looking more closely, some 90MB files took 4 hours or more and uploaded OK, at 9KB/sec. On another host all uploads failed and retried, again and again. Need to manage my tiny upload pipe, it seems. </edit>
ID: 53571
astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 53572 - Posted: 5 Mar 2016, 2:51:36 UTC
Last modified: 5 Mar 2016, 2:57:17 UTC

Fortunately, a (HadCM3n task) #3 .zip went up at ~1/3 of the usual DSL speed, not the roll-a-peanut-uphill-on-hands-and-knees-with-one's-nose 'speed' I too often see. This sucker was 155.43MB! (We knew upload size would balloon when tasks were chopped into pieces, but ...)

EDIT: for typo.
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 53572
astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 53587 - Posted: 6 Mar 2016, 5:50:19 UTC
Last modified: 6 Mar 2016, 6:48:25 UTC

Well, they did it!

3/5/2016 8:56:20 PM|climateprediction.net|Output file hadcm3n_sbeb_198012_480_352_010334930_0_4.zip for task hadcm3n_sbeb_198012_480_352_010334930_0 exceeds size limit.
3/5/2016 8:56:20 PM|climateprediction.net|File size: 162957120.000000 bytes. Limit: 150000000.000000 bytes
3/5/2016 8:56:26 PM|climateprediction.net|[error] Couldn't delete file projects/climateprediction.net/hadcm3n_sbeb_198012_480_352_010334930_0_4.zip
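For scale, the overshoot reported in the log above works out as follows (a quick sketch in Python, using only the byte counts copied from the log lines):

```python
# Byte counts copied from the BOINC log lines above.
file_size = 162_957_120   # "File size:" for the #4 zip, in bytes
limit = 150_000_000       # "Limit:" (max_nbytes), in bytes

overshoot = file_size - limit
print(f"file: {file_size / 1_000_000:.1f} MB")           # ~163 MB
print(f"limit: {limit / 1_000_000:.1f} MB")              # 150 MB
print(f"over the limit by: {overshoot / 1_000_000:.1f} MB")
```

So the zip missed the limit by roughly 13 MB, not by a rounding error.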

Be assured, I intend to throw some tacks on staff's chairs.

[EDIT] Times are Pacific Standard time (Z-8). --- I hope no one has slow uploads with beasts of this size.
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 53587
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 53588 - Posted: 6 Mar 2016, 7:30:27 UTC - in response to Message 53587.  

That would be the "stash file", i.e. the number of different items that the researcher wants. Probably number-of-items x individual-file-sizes.
I foretold this here.

In this case, it looks like someone forgot to increase the size limit. Probably doesn't even know about this aspect of it.
But it should have been picked up somewhere.

Or perhaps not, thinking a bit further while typing.
During in-house testing, there may not be a lot of data to fill the file, so it couldn't be foreseen. And it depends on where the data (zips) from the alpha models are "aimed". (Perhaps to "null" ?)

It seems that this "cutting edge" modelling has developed its own built-in "cutting edge" programming problems.

ID: 53588
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 53592 - Posted: 6 Mar 2016, 16:17:42 UTC

Hi Jim,

Looking at client_state.xml for 4 currently running hadcm3n's on a Linux PC, the max_nbytes for all 4 decadal uploads of all 4 tasks is 150,000,000 bytes. So far, the uploaded 1st and 2nd decadal zips exceeded max_nbytes (something over 160 MB) but didn't list any error in the message log. The transfer was a "success" according to boinc. This was with boinc 7.2.42 in Linux.

I'm not sure why the final decade upload exceeding max_nbytes would give an error if the others did not.

I hadn't been paying attention to upload file sizes, since they aren't normally listed in the boinc manager message log. Wow, those are big!
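The check described above could be sketched as a small script that scans client_state.xml for the per-file limits. This is illustrative only: the default path and the <file_info> element name are assumptions based on older BOINC clients, not taken from this thread, and the file should only be read while BOINC is shut down.

```python
# Sketch: list the per-file upload limits geophi inspected, by scanning
# client_state.xml for <max_nbytes> entries. The <file_info> element name
# and the default path are assumptions (they match older BOINC clients);
# check your own client_state.xml, and only read it with BOINC stopped.
import xml.etree.ElementTree as ET

def list_upload_limits(path="client_state.xml"):
    """Return (file_name, max_nbytes) pairs for files that carry a limit."""
    root = ET.parse(path).getroot()
    limits = []
    for info in root.iter("file_info"):
        name = info.findtext("name")
        nbytes = info.findtext("max_nbytes")
        if name and nbytes:
            limits.append((name, float(nbytes)))
    return limits

# Hypothetical usage:
# for name, limit in list_upload_limits():
#     print(f"{name}: {limit / 1_000_000:.0f} MB limit")
```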
ID: 53592
astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 53596 - Posted: 6 Mar 2016, 21:26:29 UTC - in response to Message 53592.  

Hi, George,

My experience was the same as yours. No error message on the first three oversize .zips, as was seen with #4. All four were uploaded for each completed task, all apparently truncated at 150MB. (Tasks were, after the #4 upload, marked as "Error.")

If memory serves, when we last experienced a too-small max_nbytes value, the current value was chosen because it was so ridiculously large that it would never be exceeded. If I might borrow from Robert Burns (and be forgiven a US paraphrase): The best laid plans of mice and men often go awry ...

I hope the responsible scientists weigh in and tell us whether the .zip files actually were truncated on upload.

"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 53596
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 53597 - Posted: 6 Mar 2016, 22:17:43 UTC

I'm presently running 2 of the tasks and am wondering: if the zips are being truncated, are they still usable by the scientists, and is it worth finishing them? Each is going to take about 21 days of computing time to finish.

ID: 53597
astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 53598 - Posted: 6 Mar 2016, 22:35:49 UTC - in response to Message 53597.  


We don't know the answer to that yet, JIM. Stay tuned...

"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 53598
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 53599 - Posted: 7 Mar 2016, 3:07:58 UTC

If I recall correctly, we can manually edit client_state.xml and change the max_nbytes for those file uploads to something larger, and the rest of the files should upload correctly. Of course this has to be done when boinc has been shut down. Perhaps Richard or Ian can chime in on that as they are more boinc knowledgeable and have better memories than I do.

Problem is... I have no trust that hadcm3 models that are stopped and restarted won't error out, they are so sensitive. I've lost too many in the distant past when cleanly shutting down boinc and restarting.
ID: 53599
astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 53600 - Posted: 7 Mar 2016, 3:44:44 UTC - in response to Message 53599.  

Hi, again, George,

I'm aware of the capability of editing that parameter and that we did so 'back when.' I considered it this time and chose not to do so -- because I don't like what I perceive as a "slap-dash" approach to the project, from above, in recent years.

Some on staff are trying to do the right thing and get things organized. They are, unfortunately, not holding the reins guiding this project ...

Re. HadCM3 models -- my machines don't suffer the "shutdown and die on restart" we too often see reported... I remember shutting those beauties down every two or three days, for months, to make backups (in the days of much slower machines) -- when they ran for 160 years (or 200 years in the case of the "Spinup" project). I have not the slightest clue as to the difference between then and now, or between your experience and mine, George -- wish I did.

I'm aware my comments here are the sort of thing better placed on the mail-list. On the other hand, it is said that sunlight is the best disinfectant.


I fear I've steered this Thread on an oblique course from Eirik's topic ... Apologies, Eirik.
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 53600
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 53601 - Posted: 7 Mar 2016, 4:52:25 UTC - in response to Message 53599.  

If I recall correctly, we can manually edit client_state.xml and change the max_nbytes for those file uploads to something larger, and the rest of the files should upload correctly. Of course this has to be done when boinc has been shut down. Perhaps Richard or Ian can chime in on that as they are more boinc knowledgeable and have better memories than I do.



You're right, I remember doing that several years ago. I think they were beta models, so the instructions are probably lost with the boards from the now-defunct beta site. It worked well, so if someone could come up with those instructions I am game to try it again.


ID: 53601
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 53602 - Posted: 7 Mar 2016, 6:11:53 UTC - in response to Message 53600.  
Last modified: 7 Mar 2016, 13:48:53 UTC

I'm aware of the capability of editing that parameter and that we did so 'back when.' I considered it this time and chose not to do so -- because I don't like what I perceive as a "slap-dash" approach to the project, from above, in recent years.

Yeah, my advice only helps those who would read this forum, very few of the people running the models.

Re. HadCM3 Models -- my machines don't suffer the "shutdown and die on restart" we too-often see reported... I remember shutting those beauties down every two or three days, for months, to make backups (in the days of much slower machines) -- when they ran for 160 years (or 200 years in the case of "Spinup" project). I have not the slightest clue as to the difference then and now, or the difference in your experience and mine, George -- wish I did.

The early hadcm3 versions didn't have a problem; it came for me with some later version. And maybe it was most prevalent on Linux, since that was where I ran most of these.

JIM, after cleanly shutting boinc down, open client_state.xml in notepad and search for all the instances of max_nbytes for cpdn tasks. Change the value of all those instances from
150000000.000000
to
250000000.000000

Save and then exit notepad. After restarting, BOINC then shouldn't complain about the decade uploads being too big.
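For anyone who prefers not to hand-edit in notepad, the procedure above could be sketched as a small script. Treat it as illustrative only: the path is an assumption about your BOINC data directory and the function name is made up; as stated above, BOINC must be fully shut down first, and the script writes a .bak backup before touching anything.

```python
# Sketch of geophi's edit: raise every 150000000.000000 max_nbytes entry
# in client_state.xml to 250000000.000000. Run only with BOINC fully shut
# down; a .bak safety copy is written first. Path and function name are
# illustrative, not from this thread.
import shutil

OLD = "<max_nbytes>150000000.000000</max_nbytes>"
NEW = "<max_nbytes>250000000.000000</max_nbytes>"

def raise_upload_limits(path="client_state.xml"):
    """Replace every old limit with the new one; return how many changed."""
    shutil.copy2(path, path + ".bak")   # keep a safety copy first
    with open(path) as f:
        text = f.read()
    count = text.count(OLD)
    with open(path, "w") as f:
        f.write(text.replace(OLD, NEW))
    return count
```

A plain string replacement is used deliberately: it leaves every other byte of client_state.xml untouched, whereas re-serializing the XML could reorder or reformat fields the client expects.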
ID: 53602
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 53603 - Posted: 7 Mar 2016, 12:32:29 UTC - in response to Message 53602.  

JIM, after cleanly shutting boinc down, open client_state.xml in notepad and search for all the instances of max_nbytes for cpdn tasks. Change the value of all those instances from
150000000.000000
to
250000000.000000

Save and then exit notepad. After restarting, BOINC then shouldn't complain about the decade uploads being too big.


Thanks, I'll give it a try and see if I can still do it.

ID: 53603
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 53606 - Posted: 7 Mar 2016, 20:28:51 UTC

The edits to the client_state.xml file have been made. The problem is that one of the models was at 37% and had already sent the first (presumably) truncated decadal zip file. Any word from the scientists on whether it is still usable and worth the time (about 15 days) needed to finish it?

ID: 53606

©2024 cpdn.org