climateprediction.net (CPDN) home page
Thread 'The uploads are stuck'

Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,483,915
RAC: 15,324
Message 67562 - Posted: 11 Jan 2023, 19:24:38 UTC - in response to Message 67554.  
Last modified: 11 Jan 2023, 19:33:36 UTC

Stony666 wrote:
I waited nearly two weeks before posting here because I hoped it would be fixed in a few days.
My hosts reside in Germany. They are in 3 different locations and with different providers.
On all boxes:
69972 climateprediction.net 11.01.2023 19:11:41 Temporarily failed upload of oifs_43r3_ps_0932_2013050100_123_982_12199576_0_r933785131_43.zip: transient HTTP error
The upload server is still up and functioning ok. There is an enormous backlog, so please be patient for a couple of days. I'm sure your uploads will happen soon.

wujj123456 wrote:
Best case: Server is saturated and we just need to be patient and wait for our turn.
Worst case: It was not the storage in the first place and the actual issue is still unknown.
It was the storage. The cloud provider over-provisioned the disks and we suspect they also put the OS on the same bare metal as the data devices (slaps palm to forehead). But I didn't say that...

leloft wrote:
Doubly unlucky: I've just had the same refusal from both the first machine and now a second one, both refer to the same value 7168.00 MB. The good news is that one of the hosts has managed to upload 8 tasks.
I'll keep trying, but I'm limited by the 3636 seconds rule.
The 7168 MB tasks are fine. Perhaps the disk is just too full of results files at the moment and it will sort itself out in good time.
ID: 67562
Landjunge

Joined: 17 Aug 07
Posts: 8
Credit: 37,226,489
RAC: 12,927
Message 67563 - Posted: 11 Jan 2023, 19:28:48 UTC

I am currently uploading at 32 Mbit/s. It's all my internet connection can provide. Location is Germany.
ID: 67563
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 67564 - Posted: 11 Jan 2023, 19:42:36 UTC - in response to Message 67563.  
Last modified: 11 Jan 2023, 19:56:53 UTC

Landjunge wrote:
I am currently uploading at 32 Mbit/s. It's all my internet connection can provide. Location is Germany.


I am currently uploading at 6.3 megabytes/second (about 50 megabits/second). My internet connection can provide 75 megabits/second. Location: near New York City.

Actual speeds are:

Timestamp 	     Download 	Upload 	   Latency  Jitter Quality Score Test Server
1/11/2023 14:35:27  76.28 Mbps  79.91 Mbps 5 ms     1 ms   Excellent     newyork02.speedtest.windstream.net
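
Since this thread mixes megabytes and megabits per second, here is a minimal sketch of the conversion, assuming decimal prefixes and 8 bits per byte. The function names are made up for illustration.

```python
def mbytes_to_mbits(mbytes_per_s: float) -> float:
    """Convert megabytes/second to megabits/second (1 byte = 8 bits)."""
    return mbytes_per_s * 8.0


def mbits_to_mbytes(mbits_per_s: float) -> float:
    """Convert megabits/second to megabytes/second."""
    return mbits_per_s / 8.0


if __name__ == "__main__":
    # Figures quoted by posters in this thread.
    print(f"6.3 MB/s  = {mbytes_to_mbits(6.3):.1f} Mbit/s")
    print(f"32 Mbit/s = {mbits_to_mbytes(32):.1f} MB/s")
    print(f"3 Mbit/s  = {mbits_to_mbytes(3):.3f} MB/s")
```

So 6.3 MB/s is roughly the 50 Mbit/s stated, and a 3 Mbit/s uplink moves well under half a megabyte per second.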

ID: 67564
SolarSyonyk

Joined: 7 Sep 16
Posts: 262
Credit: 34,915,412
RAC: 16,463
Message 67567 - Posted: 11 Jan 2023, 20:11:51 UTC

... both of my connections are pegged at 3Mbit up. No, that's not MB/s. That's Mbit. I'm hoping Starlink will clear up overnight, but it's gonna be a while.
ID: 67567
xii5ku

Joined: 27 Mar 21
Posts: 79
Credit: 78,306,920
RAC: 297
Message 67568 - Posted: 11 Jan 2023, 20:22:17 UTC
Last modified: 11 Jan 2023, 20:37:01 UTC

FWIW, when I came home from work 3 hours ago it worked somewhat. It quickly worsened, and now _all_ transfer attempts fail. I've got the exact same situation now as the one @Stony666 described.

Glenn Carver wrote:
The upload server is still up and functioning ok. There is an enormous backlog, so please be patient for a couple of days. I'm sure your uploads will happen soon.
There is some good news and some bad news:

The good: Relative to the length of the upload server outage (now almost three weeks), the backlog can't actually be very big. That's because many of us had to stop computing soon after the outage started: everybody whose production is constrained either by Internet connection throughput or by disk space.

The bad: The fact that _nothing_ is moving anymore for me, and evidently for some others, doesn't make me optimistic that the backlog (however modest or enormous it might be) will clear anytime soon. And as long as nothing is moving, folks like me can't resume computation.
ID: 67568
Bryn Mawr

Joined: 28 Jul 19
Posts: 150
Credit: 12,830,559
RAC: 228
Message 67569 - Posted: 11 Jan 2023, 20:28:04 UTC

I’ve now cleared my backlog and files are still trickling up slowly - only trouble is that I’m generating new files just slightly more often than the link is clearing them :-)

Definitely progress as all of the outstanding WUs have gone and I can see my task list again.
ID: 67569
PDW

Joined: 29 Nov 17
Posts: 82
Credit: 14,773,034
RAC: 86,510
Message 67570 - Posted: 11 Jan 2023, 20:30:02 UTC

All mine have uploaded this afternoon and new ZIP files are disappearing and not being held up.
ID: 67570
leloft

Joined: 7 Jun 17
Posts: 23
Credit: 44,434,789
RAC: 2,600,991
Message 67571 - Posted: 11 Jan 2023, 20:46:56 UTC - in response to Message 67558.  

Fix for - Need more disk space. You currently have 0.00 MB available.

In BOINC Manager, go to Options -> Computing Preferences -> Disk and memory.

Check the box "Use no more than" and put a number in the number box equal to about 3/4 of your disk size (or some other number you are comfortable with).

If you leave this box UNCHECKED, it is the same as having it checked with 100 (GB) in the number box.

At least that is how it works for me.


Good advice, thank you. I set 'use no more than' to 1000G, 'leave at least' to 1G free, 'use at most' to 99% of disk, updated the project, and now the two hosts in question are processing new units.
2/4 hosts uploading as well.
Onwards and forwards.
fraser
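
The three disk settings mentioned in this post interact. As far as I understand the client, it applies whichever limit is most restrictive. Here is a minimal sketch of that arithmetic under that assumption. The function and parameter names are made up for illustration and are not BOINC identifiers, and the disk figures in the example are hypothetical.

```python
def allowed_disk_gb(total_gb, free_gb, boinc_used_gb,
                    use_no_more_than_gb, leave_at_least_free_gb,
                    use_at_most_pct):
    """Disk space (GB) BOINC may occupy under the three preferences.

    Assumption: the most restrictive of the three limits wins.
    """
    limit_abs = use_no_more_than_gb
    limit_pct = total_gb * use_at_most_pct / 100.0
    # Space BOINC could hold while still leaving the requested amount free.
    limit_free = boinc_used_gb + free_gb - leave_at_least_free_gb
    return max(0.0, min(limit_abs, limit_pct, limit_free))


def available_gb(allowed_gb, boinc_used_gb):
    """Space left for new work; never negative."""
    return max(0.0, allowed_gb - boinc_used_gb)


if __name__ == "__main__":
    # Hypothetical host: 2000 GB disk, 800 GB free, BOINC already holds 150 GB.
    disk = dict(total_gb=2000.0, free_gb=800.0, boinc_used_gb=150.0)

    # A 100 GB cap (the behaviour described above for the unchecked box).
    capped = allowed_disk_gb(**disk, use_no_more_than_gb=100.0,
                             leave_at_least_free_gb=1.0, use_at_most_pct=99.0)
    print("100 GB cap:", available_gb(capped, disk["boinc_used_gb"]), "GB available")

    # The settings from this post: 1000 GB, 1 GB free, 99 %.
    raised = allowed_disk_gb(**disk, use_no_more_than_gb=1000.0,
                             leave_at_least_free_gb=1.0, use_at_most_pct=99.0)
    print("1000 GB cap:", available_gb(raised, disk["boinc_used_gb"]), "GB available")
```

With a backlog of un-uploaded results larger than the cap, the first case leaves nothing available, which would match the "0.00 MB available" message. Raising the cap moves the limit to one of the other two settings.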
ID: 67571
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,483,915
RAC: 15,324
Message 67572 - Posted: 11 Jan 2023, 20:57:21 UTC - in response to Message 67568.  

xii5ku wrote:
FWIW, when I came home from work 3 hours ago it worked somewhat. It quickly worsened, and now _all_ transfer attempts fail. I've got the exact same situation now as the one @Stony666 described.

The bad: The fact that _nothing_ is moving anymore for me, and evidently for some others, doesn't make me optimistic that the backlog (however modest or enormous it might be) will clear anytime soon. And as long as nothing is moving, folks like me can't resume computation.
There may be some boinc-ness things going on. I vaguely remember Richard saying something about uploads will stop processing if it tries & fails 3 times? Or something like that? Uploads are still ok for me, I've got another 1000 to do.

Richard?
ID: 67572
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 67574 - Posted: 11 Jan 2023, 21:00:52 UTC - in response to Message 67572.  
Last modified: 11 Jan 2023, 21:19:19 UTC

Glenn Carver wrote:
There may be some boinc-ness things going on. I vaguely remember Richard saying something about uploads will stop processing if it tries & fails 3 times? Or something like that? Uploads are still ok for me, I've got another 1000 to do.

I have vague memories of a limit on the number of tries and fails but it is a lot more than 3!

Edit: Cleared all from the VM. The others after a nudge have just started again.

Edit2: Something like 300 tasks an hour are being uploaded (based on the reduction in tasks in progress). I didn't check tasks available to send, which would increase it, but I'm not sure by how much. Will perhaps check that tomorrow.
ID: 67574
wujj123456

Joined: 14 Sep 08
Posts: 127
Credit: 42,019,421
RAC: 69,142
Message 67577 - Posted: 11 Jan 2023, 21:11:00 UTC

Curious for folks with no problem connecting, are you able to ping upload11.cpdn.org?

I still couldn't connect, but given the latency, I more or less expect to be the last wave. I just don't want to keep retrying in BOINC since it has the tendency to upload partial files. If ping can give me the signal, that would be much easier for me to know when I might be able to upload.
ID: 67577
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,483,915
RAC: 15,324
Message 67580 - Posted: 11 Jan 2023, 21:25:09 UTC - in response to Message 67577.  
Last modified: 11 Jan 2023, 21:26:11 UTC

wujj123456 wrote:
Curious for folks with no problem connecting, are you able to ping upload11.cpdn.org?

I still couldn't connect, but given the latency, I more or less expect to be the last wave. I just don't want to keep retrying in BOINC since it has the tendency to upload partial files. If ping can give me the signal, that would be much easier for me to know when I might be able to upload.
I've got 10 concurrent uploads going (all day in fact), and if I try ping upload11.cpdn.org now it just hangs. So either it's too busy or they have disabled ping on the system for security.

traceroute does 30 hops and ends in (several tries, same result):
....
27  * * *
28  * * *
29  * * *
30  * * *
But both ping & traceroute were working fine for me yesterday evening when the server was up but httpd was not enabled on the machine. Probably load. Still, as you say, might be prudent just to wait a few more hours for the load to ease a bit.

p.s. I'm in Cambridge so Oxford's not too far...
ID: 67580
gemini8

Joined: 4 Dec 15
Posts: 52
Credit: 2,490,276
RAC: 2,122
Message 67581 - Posted: 11 Jan 2023, 21:27:19 UTC - in response to Message 67577.  
Last modified: 11 Jan 2023, 21:28:42 UTC

wujj123456 wrote:
Curious for folks with no problem connecting, are you able to ping upload11.cpdn.org?

I still couldn't connect, but given the latency, I more or less expect to be the last wave. I just don't want to keep retrying in BOINC since it has the tendency to upload partial files. If ping can give me the signal, that would be much easier for me to know when I might be able to upload.

I was able to upload my files a while ago, but atm a ping isn't getting through.
Trying from Hanover, Germany.
- - - - - - - - - -
Greetings, Jens
ID: 67581
wujj123456

Joined: 14 Sep 08
Posts: 127
Credit: 42,019,421
RAC: 69,142
Message 67582 - Posted: 11 Jan 2023, 21:30:54 UTC - in response to Message 67580.  

Thanks. Looks like ICMP packets are blocked then. Guess I'll just let the BOINC client handle the retries on its own. Given others are making progress, hopefully it won't take too long until it's my turn.
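
If ICMP is blocked, one alternative signal is whether a plain TCP connection to the server can be opened at all. Here is a minimal sketch; the function name is made up, and port 80 is an assumption based on the mention of httpd in this thread (use 443 if the server speaks HTTPS).

```python
import socket


def tcp_reachable(host: str, port: int = 80, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    host = "upload11.cpdn.org"  # host name taken from this thread
    # Port 80 is an assumption, not something confirmed in the thread.
    print(host, "reachable" if tcp_reachable(host) else "not reachable")
```

A successful connection only shows that the port is open. It does not guarantee that an upload will go through, for example if the server is saturated or its disk is full.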
ID: 67582
ncoded.com

Joined: 16 Aug 16
Posts: 73
Credit: 53,404,292
RAC: 2,217
Message 67583 - Posted: 11 Jan 2023, 21:43:22 UTC - in response to Message 67582.  

I haven't seen any uploads in the last week either, wujj123456.
ID: 67583
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,483,915
RAC: 15,324
Message 67584 - Posted: 11 Jan 2023, 21:54:05 UTC - in response to Message 67583.  

ncoded.com wrote:
I haven't seen any uploads in the last week either, wujj123456.
FYI: the upload server only came back at about 10:00 GMT today (Wed 11th).
ID: 67584
JagDoc

Joined: 21 Dec 22
Posts: 5
Credit: 7,830,854
RAC: 4,533
Message 67585 - Posted: 11 Jan 2023, 21:54:10 UTC

I was able to upload all files, 100GB gone.
ID: 67585
Richard Haselgrove

Joined: 1 Jan 07
Posts: 1061
Credit: 36,718,239
RAC: 8,054
Message 67586 - Posted: 11 Jan 2023, 22:13:00 UTC - in response to Message 67572.  

Glenn Carver wrote:
There may be some boinc-ness things going on. I vaguely remember Richard saying something about uploads will stop processing if it tries & fails 3 times? Or something like that? Uploads are still ok for me, I've got another 1000 to do.

Richard?
If three uploads in succession fail to get through, BOINC will go into 'Project backoff' for a stated period - starting at a few minutes, increasing to several hours. During this time, existing uploads will not be retried.

But any new upload will be tried just once. If that succeeds, the project backoff will be cleared, and existing uploads will be retried - until the next batch of failures.

This is purely a temporary thing - the user can override it at any time, either by retrying a single transfer, or by choosing 'retry pending transfers' from the Tools menu in BOINC Manager.
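
The behaviour described above can be modelled roughly as follows. This is a toy sketch of the description in this post, not the BOINC client's actual code. The class name, the doubling rule, and the 5-minute and 4-hour bounds are all assumptions standing in for "a few minutes" and "several hours".

```python
FAILURES_BEFORE_BACKOFF = 3   # "three uploads in succession", per the post
INITIAL_BACKOFF_S = 5 * 60    # assumption: "a few minutes"
MAX_BACKOFF_S = 4 * 60 * 60   # assumption: "several hours"


class UploadBackoff:
    """Toy model of a per-project upload backoff."""

    def __init__(self):
        self.consecutive_failures = 0
        self.backoff_s = 0         # current backoff length; 0 means none yet
        self.backoff_until = 0.0   # time before which existing uploads wait

    def in_backoff(self, now):
        return now < self.backoff_until

    def record_failure(self, now):
        self.consecutive_failures += 1
        if self.consecutive_failures >= FAILURES_BEFORE_BACKOFF:
            # Assumed growth rule: double the previous backoff, with a cap.
            self.backoff_s = min(max(self.backoff_s * 2, INITIAL_BACKOFF_S),
                                 MAX_BACKOFF_S)
            self.backoff_until = now + self.backoff_s
            self.consecutive_failures = 0

    def record_success(self):
        # A successful upload clears the backoff entirely.
        self.consecutive_failures = 0
        self.backoff_s = 0
        self.backoff_until = 0.0

    def retry_now(self):
        # User override, e.g. 'retry pending transfers'.
        self.backoff_until = 0.0
```

In this model a newly finished upload would simply be attempted once regardless of `in_backoff`, with `record_success` or `record_failure` called on the result.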
ID: 67586
Yeti

Joined: 5 Aug 04
Posts: 178
Credit: 18,897,792
RAC: 44,938
Message 67588 - Posted: 12 Jan 2023, 0:30:37 UTC - in response to Message 67586.  

climateprediction.net 12-01-2023 01:22 [error] Error reported by file upload server: Server is out of disk space
Supporting BOINC, a great concept !
ID: 67588
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,483,915
RAC: 15,324
Message 67591 - Posted: 12 Jan 2023, 0:54:56 UTC - in response to Message 67588.  

Yeti wrote:
climateprediction.net 12-01-2023 01:22 [error] Error reported by file upload server: Server is out of disk space
Yes, I've seen it and reported it back. I believe that's 27 TB filled then. The uploaded tasks should be moving off to a transfer server; maybe that's not working.
ID: 67591