Message boards : Number crunching : The uploads are stuck
Author | Message |
---|---|
Send message Joined: 4 Dec 15 Posts: 52 Credit: 2,489,447 RAC: 2,080 |
Mine go up at about 1.5 MiB/s, meaning around twelve seconds per file. Nice it's working again! - - - - - - - - - - Greetings, Jens |
Send message Joined: 6 Jul 06 Posts: 147 Credit: 3,615,496 RAC: 420 |
Yes I am still seeing "connect(): failed" messages on all upload tries. It has changed to "transient HTTP error" now so still not working here yet (Australia). Server Status has not changed yet, still showing nothing. Conan PS: Some files are now moving, so possibly due to the load, some fail then must retry later, others are going through, some as low as 17 kB/s to as high as 1,700 kB/s. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
Right now I cannot ping them ... $ ping -c 5 upload11.cpdn.org PING upload11.cpdn.org (192.171.169.187) 56(84) bytes of data. --- upload11.cpdn.org ping statistics --- 5 packets transmitted, 0 received, 100% packet loss, time 4116ms Time to get up, get dressed, make breakfast. |
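A failed ping only shows that ICMP echo requests are not being answered; some servers and firewalls drop them even while TCP services are up. A complementary check is to attempt a TCP connection directly. The following is a minimal sketch, not a CPDN or BOINC tool: the function name `can_connect` is hypothetical, and the thread does not say which port the upload server listens on, so any port passed in (for example 80 or 443) is an assumption.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    DNS failures, refused connections and timeouts all raise OSError
    (or a subclass), so each of them is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call (port numbers are assumptions, not taken from the thread):
#   can_connect("upload11.cpdn.org", 443)
```

A `True` result would mean only that something accepted the connection; it does not show that an upload would succeed.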
Send message Joined: 7 Jun 17 Posts: 23 Credit: 44,434,789 RAC: 2,600,991 |
Hi. I'm seeing an error message that there is insufficient space on one of my hosts from the project update process, but df, boinccmd and boinctui all report that there is over 17GB available. No movement on all four hosts, three of which are in the 'too many uploads' loop. update requested by user 11-Jan-2023 12:27:39 [climateprediction.net] Sending scheduler request: Requested by user. 11-Jan-2023 12:27:39 [climateprediction.net] Requesting new tasks for CPU 11-Jan-2023 12:27:41 [climateprediction.net] Scheduler request completed: got 0 new tasks 11-Jan-2023 12:27:41 [climateprediction.net] No tasks sent 11-Jan-2023 12:27:41 [climateprediction.net] OpenIFS 43r3 Perturbed Surface needs 38146.97MB more disk space. You currently have 0.00 MB available and it needs 38146.97 MB. 11-Jan-2023 12:27:41 [climateprediction.net] OpenIFS 43r3 Perturbed Surface needs 7168.00MB more disk space. You currently have 0.00 MB available and it needs 7168.00 MB. 11-Jan-2023 12:27:41 [climateprediction.net] Project requested delay of 3636 seconds boinccmd --get_disk_usage ======== Disk usage ======== total: 47000.71MB free: 18054.40MB 1) ----------- master URL: https://climateprediction.net/ disk usage: 26511.11MB Any ideas? fraser |
Send message Joined: 22 May 21 Posts: 39 Credit: 1,197,645 RAC: 4,143 |
Right now I cannot ping them ... $ ping -c 5 upload11.cpdn.org PING upload11.cpdn.org (192.171.169.187) 56(84) bytes of data. --- upload11.cpdn.org ping statistics --- 5 packets transmitted, 0 received, 100% packet loss, time 4116ms Yup. Same here. No uploads. No pings. Bull |
Send message Joined: 3 Sep 04 Posts: 105 Credit: 5,646,090 RAC: 102,785 |
Well, here in slow land, one zip file at a time is being uploaded....... Huuurrray. It's happening as I write this. |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,476,460 RAC: 15,681 |
Hi Fraser, I suggest removing any boinc limits on disk space (temporarily if need be). In the boincmgr app (or equiv for boinccmd), untick to remove any disk limits for: 'Use no more than' (GB), 'Leave at least', & 'Use no more than' (% of total). If those are all disabled, the messages about insufficient disk should disappear. I'm puzzled boinc gave you the tasks if there wasn't enough memory. Did you by any chance change your disk limits lately? If that doesn't work, let us know. Hi. I'm seeing an error message that there is insufficient space on one of my hosts from the project update process, but df, boinccmd and boinctui all report that there is over 17GB available. No movement on all four hosts, three of which are in the 'too many uploads' loop. |
Send message Joined: 7 Jun 17 Posts: 23 Credit: 44,434,789 RAC: 2,600,991 |
Thanks for your reply. Hi Fraser, There are no limits on disk space: /var/lib/boinc-client has its own 46G partition. These restrictions have been 'unticked' in the account preferences for all 'locations' for a while (days/weeks) since the upload issues went long term. I'm puzzled boinc gave you the tasks if there wasn't enough memory. Did you by any chance change your disk limits lately? It's not a memory issue, the refusal was based on disk space. I've checked to see if it was a swap issue but swap is at 0.45% (of 12G). Host has 16G RAM, of which 10.5% (1.7G) in use. If that doesn't work, let us know. It's not working, but I haven't changed anything, so no surprises there. The host is this one ID: 1523000. If you want any logs, let me know and I'll send you the last 12 hours' worth. I'll report any changes if it clears itself. Best fraser |
Send message Joined: 2 Oct 06 Posts: 54 Credit: 27,309,613 RAC: 28,128 |
Yup. Same here. No uploads. No pings. Same here. Not sure if it is the same problem or a new problem. But whatever the case, it is not fixed. |
Send message Joined: 4 Dec 15 Posts: 52 Credit: 2,489,447 RAC: 2,080 |
The uploads seem to behave like an on/off relationship between my clients and the server. When they do upload they seem fine. But sometimes they just won't. -shrug- I'll baby-sit one of the two machines with tasks as I want to shut it down soon, and it's ok by me if the other just takes its time. - - - - - - - - - - Greetings, Jens |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
I have uploaded and reported one task from my VM Ubuntu guest under Ubuntu host. It has three more to finish uploading. I suspended uploads from the host machine till these are cleared to reduce the number of connections to the server. Changing to internet access always, the host machine has only managed to get one out of four (the maximum I have allowed) uploads going. This suggests to me that there is still a problem with congestion and the number of machines trying to upload zips and once the backlog has cleared a bit things should improve. (Something over 1,000 tasks have reported since it started working again but there are still a lot to go!) |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,476,460 RAC: 15,681 |
The uploads seem to behave like an on/off relationship between my clients and the server. It's going to take time. I have 20,000 files to upload, scale that up to >700 clients etc...... The upload server seems stable, I've not heard of any issues from CPDN, I'm guessing Dave & the other moderators haven't. So, all good. |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,476,460 RAC: 15,681 |
leloft wrote: There are no limits on disk space: /var/lib/boinc-client has its own 46G partition. These restrictions have been 'unticked' in the account preferences for all 'locations' for a while (days/weeks) since the upload issues went long term. Fraser, my brain wasn't quite in gear this morning from the excitement of my uploads starting again. So here's the issue (going back to your original post): 11-Jan-2023 12:27:39 [climateprediction.net] Requesting new tasks for CPU The client requested more tasks but the server said no because there's not enough space. Note how the first message says the task needs another ~38GB, the second says only 7GB. In your first message the free disk space is ~18GB, that's clearly not enough for the first task which claims it wants 38GB. It should be enough for the second task which wants 7GB but you don't get that either. My guess here is that the server tried to send you both, added up their total space, and then said it couldn't send either. The first message 'needs 38GB' tells me that's a resent task from batch 950, because this batch had a mistake in the disk size requirement, it was ~9-10 times too high. This was corrected for later batches to be ~7GB. I think you were just unlucky you got a resend from the first batch. I suspect if you try again, you might get a couple of 'corrected' tasks from the other batches. Try it? Cheers, Glenn |
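The arithmetic behind that guess can be checked with the figures from Fraser's original post. This is only a sketch of the reasoning, not how the scheduler actually decides: the function name `fits` is hypothetical, and the "added together" rule is Glenn's guess rather than documented server behaviour.

```python
# Figures copied from the log and boinccmd output in the original post.
FREE_MB = 18054.40                  # "free" reported by boinccmd --get_disk_usage
REQUESTS_MB = [38146.97, 7168.00]   # the two "needs ... more disk space" values


def fits(need_mb: float, free_mb: float) -> bool:
    """Return True if a request of need_mb fits within free_mb."""
    return need_mb <= free_mb


for need in REQUESTS_MB:
    print(f"{need:>9.2f} MB alone fits: {fits(need, FREE_MB)}")

combined = sum(REQUESTS_MB)
print(f"{combined:>9.2f} MB combined fits: {fits(combined, FREE_MB)}")
```

On these numbers the 7168.00 MB task fits on its own, while the 38146.97 MB task and the two tasks combined do not. It does not explain why the server reported 0.00 MB available.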
Send message Joined: 9 Feb 21 Posts: 9 Credit: 10,686,195 RAC: 3,343 |
Hi, only for information... I waited nearly two weeks before posting here because I had hoped that it could be fixed in a few days. My hosts reside in Germany. They are in 3 different locations and with different providers. On all boxes: 69972 climateprediction.net 11.01.2023 19:11:41 Temporarily failed upload of oifs_43r3_ps_0932_2013050100_123_982_12199576_0_r933785131_43.zip: transient HTTP error traceroute on all boxes shows traceroute upload11.cpdn.org traceroute to upload11.cpdn.org (192.171.169.187), 30 hops max, 60 byte packets 1 45.84.199.3 (45.84.199.3) 0.444 ms 0.421 ms 0.413 ms 2 45.135.200.25 (45.135.200.25) 0.407 ms 0.527 ms 0.392 ms 3 unn-84-17-33-58.cdn77.com (84.17.33.58) 0.385 ms unn-84-17-33-62.cdn77.com (84.17.33.62) 0.505 ms unn-84-17-33-60.cdn77.com (84.17.33.60) 0.497 ms 4 ae12-460.fra20.core-backbone.com (5.56.19.81) 0.489 ms 0.678 ms ae22-449.fra10.core-backbone.com (5.56.19.237) 0.547 ms 5 ae3-2072.lon10.core-backbone.com (80.255.15.166) 10.697 ms 10.815 ms 10.675 ms 6 linx-gw1.ja.net (195.66.224.15) 10.661 ms 10.698 ms 10.676 ms 7 ae23.londtt-sbr1.ja.net (146.97.35.169) 13.399 ms 13.390 ms 13.382 ms 8 ae27.erdiss-sbr2.ja.net (146.97.33.14) 18.783 ms 18.594 ms 18.572 ms 9 * * * 10 ral-r26.ja.net (146.97.41.34) 19.316 ms 19.311 ms 19.179 ms 11 * * * 12 * * * 13 * * * 14 * * * 15 * * * 16 * * * 17 * * * 18 * * * 19 * * * 20 * * * 21 * * * 22 * * * 23 * * * 24 * * * 25 * * * 26 * * * 27 * * * 28 * * * 29 * * * 30 * * * All boxes try no more than four transmissions at the same time. No box triggers the server via a script. From my point of view, nothing is ok! When thinking about how long we have had the problem... what has support been doing all this time? I have more than 400 WUs to upload. Should I send them via USB stick or is there a chance to upload... before the 20th of January when they all run out? :( Cheers |
Send message Joined: 7 Jun 17 Posts: 23 Credit: 44,434,789 RAC: 2,600,991 |
Doubly unlucky: I've just had the same refusal from both the first machine and now a second one, both refer to the same value 7168.00 MB. The good news is that one of the hosts has managed to upload 8 tasks. I'll keep trying, but I'm limited by the 3636 seconds rule. Best fraser |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
I am now getting "Project servers may be temporarily down" again. Shame as I was making use of the fact that I have over 15GB of my 20GB allowance on my phone to upload ten times faster than my bored band can manage. |
Send message Joined: 14 Sep 08 Posts: 127 Credit: 42,006,146 RAC: 68,974 |
Best case: Server is saturated and we just need to be patient and wait for our turn. Worst case: It was not the storage in the first place and the actual issue is still unknown. TBH, it would be nice if the server status page were more useful, like showing bandwidth usage, etc. Then it would be much easier to know if it's making progress. |
Send message Joined: 1 Sep 04 Posts: 161 Credit: 81,522,141 RAC: 1,164 |
Fix for - Need more disk space. You currently have 0.00 MB available. In the BOINC Manager, Options -> Computing Preferences -> Disk and memory - Check the box "Use no more than" and put a number in the number box equal to about 3/4 of your disk size (or some other number you are comfortable with). If you leave this box UNCHECKED, it is the same as having it checked with 100 (GB) in the number box. At least that is how it works for me. |
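To see how disk preferences could produce a "0.00 MB available" message even when the partition has free space, here is a simplified model of three such limits combining. It is an assumption-laden sketch, not BOINC's actual code: the function `available_mb` and its parameter names are hypothetical, and the rule "take the smallest limit, then subtract what BOINC already uses" is an illustrative assumption.

```python
from typing import Optional


def available_mb(total_mb: float, free_mb: float, boinc_usage_mb: float,
                 max_used_mb: Optional[float] = None,
                 min_free_mb: float = 0.0,
                 max_used_pct: float = 100.0) -> float:
    """Estimate the disk space BOINC may still claim under this model."""
    limits = [
        total_mb * max_used_pct / 100.0,          # "use no more than X %" limit
        boinc_usage_mb + free_mb - min_free_mb,   # "leave at least X free" limit
    ]
    if max_used_mb is not None:                   # "use no more than X" limit
        limits.append(max_used_mb)
    allowed = max(0.0, min(limits))
    return max(0.0, allowed - boinc_usage_mb)


# Figures from Fraser's boinccmd output: total, free and project disk usage.
print(available_mb(47000.71, 18054.40, 26511.11))
# The same host with a hypothetical 50 % cap, which is below current usage.
print(available_mb(47000.71, 18054.40, 26511.11, max_used_pct=50.0))
```

In this model the available figure drops to zero as soon as any single limit falls below what BOINC already uses, which is why checking each of the three settings separately can be worthwhile.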
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
I am now getting "Project servers may be temporarily down" again. Shame as I was making use of the fact that I have over 15GB of my 20GB allowance on my phone to upload ten times faster than my bored band can manage. Must have been saturation. Now working again. I have only one task left on VM. Once they have gone I can concentrate on the host machine. |
Send message Joined: 4 Dec 15 Posts: 52 Credit: 2,489,447 RAC: 2,080 |
And my first machine is done. Second one seems to have made a lot of ground, so maybe it'll be done when I sit down in the living-room. - - - - - - - - - - Greetings, Jens |
©2024 cpdn.org