Message boards : Number crunching : The uploads are stuck
Joined: 2 Oct 06 · Posts: 54 · Credit: 27,309,613 · RAC: 28,128

> My uploads are building up again =(

Same here. Frustrating.

Edit: Maybe it suddenly cleared up? Nice.
Joined: 9 Feb 21 · Posts: 9 · Credit: 10,689,509 · RAC: 3,567

I still have a box with 240+ WUs that cannot upload. :(
Joined: 7 Jun 17 · Posts: 23 · Credit: 44,434,789 · RAC: 2,600,991

> The 100 GB limit is -

When I learnt of this limit, I just set it to 1000 GB in the preferences and controlled disk usage through 'use no more than x GB' or 'leave at least y GB free'. The preferences clearly state that the lowest of the three limits will be used.
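For anyone looking for where those three limits live: they can be set on the website, or locally in BOINC's `global_prefs_override.xml` in the data directory. A sketch with example values (the client enforces whichever limit is most restrictive):

```xml
<global_preferences>
  <disk_max_used_gb>1000</disk_max_used_gb>  <!-- 'use no more than x GB' -->
  <disk_max_used_pct>90</disk_max_used_pct>  <!-- 'use no more than x% of total disk' -->
  <disk_min_free_gb>10</disk_min_free_gb>    <!-- 'leave at least y GB free' -->
</global_preferences>
```

After editing the file, "Read local prefs file" in the BOINC Manager (or restarting the client) picks up the new values.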
Joined: 1 Jan 07 · Posts: 1061 · Credit: 36,731,493 · RAC: 6,912

> If you don't install the 32-bit libraries your tasks will eventually keep crashing, and hence your device will get jailed.

It depends on what sort of work you're being offered to crunch. If it's UK Met Office 'Hadley' tasks, they'll always fail. If it's the newer IFS tasks, they're not guaranteed to be successful - but it won't be for lack of 32-bit libraries.
Joined: 16 Aug 16 · Posts: 73 · Credit: 53,408,433 · RAC: 2,038

Thank you Richard, that is a very useful clarification.
Joined: 1 Jan 07 · Posts: 1061 · Credit: 36,731,493 · RAC: 6,912

> It's actually even simpler in Linux. Stop the client, mv the whole directory to the new location, create a symlink pointing to the new location with the name of the previous directory, and then start the client. The client will continue to operate on the old directory name, except that's now just a link to the new directory. (Of course you can go the other route of changing the BOINC client config to use the new directory name, similar to the Windows setup you described, but that involves config editing.)

I've just tried that, and it didn't work. My problem is that neither SuperUser nor BOINC can follow the symlink to the new drive after a reboot: if the logged-in user (me) mounts the drive manually, it works for SuperUser, but not for BOINC. All the gory details are in my 'Help requested' thread in the Linux area - it would be much appreciated if you could take a look and suggest what I might be doing wrong.
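The move-then-symlink trick can be sketched like this (run in a throwaway temp directory so it's safe to try; in real use the old path would be something like `/var/lib/boinc-client`, the client would be stopped first with `sudo systemctl stop boinc-client`, and restarted afterwards):

```shell
OLD_PARENT=$(mktemp -d)
NEW_PARENT=$(mktemp -d)
OLD="$OLD_PARENT/boinc-client"   # stands in for the original data directory
NEW="$NEW_PARENT/boinc-client"   # stands in for the location on the new drive

mkdir -p "$OLD"
echo "client state" > "$OLD/client_state.xml"

mv "$OLD" "$NEW"                 # move the whole data directory
ln -s "$NEW" "$OLD"              # old name now resolves to the new location

cat "$OLD/client_state.xml"      # the old path still works through the link
```

The client keeps using the old path and never notices the data has moved; the catch reported above is that the new drive must actually be mounted (by fstab, not by a manual login-time mount) before the client starts.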
Joined: 1 Jan 07 · Posts: 1061 · Credit: 36,731,493 · RAC: 6,912

@wujj123456 - problem solved, no need to follow up. But for the record: you also have to add the new disk to fstab, and if using UUIDs, use the UUID of the formatted partition, not the UUID of the underlying hardware.
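For anyone hitting the same thing: `blkid` on the *partition* device reports the filesystem UUID that fstab wants (the device path, mount point and UUID below are placeholders, not real values):

```
# sudo blkid /dev/sdb1        <- partition, not the whole /dev/sdb disk
# then add a line like this to /etc/fstab:
UUID=<uuid-from-blkid>  /mnt/bigdisk  ext4  defaults,nofail  0  2
```

The `nofail` option keeps the machine booting even if the disk is absent; `sudo mount -a` tests the entry without rebooting.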
Joined: 17 Aug 07 · Posts: 8 · Credit: 37,253,824 · RAC: 11,789

All my uploads have cleared up. Yippee! =D
Joined: 14 Sep 08 · Posts: 127 · Credit: 42,294,577 · RAC: 73,464

> @wujj123456 - problem solved, no need to follow up. But for the record - you also have to add the new disk to fstab, and if using UUIDs, use the UUID of the formatted partition, not the UUID of the underlying hardware

Congrats! Yeah, mounting on boot (and sometimes permissions) can be a problem when migrating to a new disk; glad you sorted it out.
Joined: 14 Sep 08 · Posts: 127 · Credit: 42,294,577 · RAC: 73,464

Finally cleared all of my backlog. Got decent speed for the past 24 hours, especially during the last 12 hours, which maxed out my upload link. Yay!
Joined: 15 May 09 · Posts: 4540 · Credit: 19,039,635 · RAC: 18,944

> Finally cleared all of my backlog. Got decent speed for the past 24 hours, especially during the last 12 hours that maxed out my upload link. Yay!

36 hours of unattended crunching with a limit of two tasks running at once has left me with about 300 files to upload. I think this is just my slow connection, which can just cope with two of the standard uploads but starts falling behind every time a 122.zip, which is almost twice the size, comes up. I have suspended crunching till things clear a bit.
Joined: 27 Mar 21 · Posts: 79 · Credit: 78,311,890 · RAC: 633

Here we go again:

21 Jan 2023 17:43 UTC | Error reported by file upload server: can't write file oifs_43r3_ps_[…].zip: No space left on server
Joined: 3 Sep 04 · Posts: 105 · Credit: 5,646,090 · RAC: 102,785

Just had 3 uploads fail at 100% with the same message: "No space left on server".
Joined: 5 Aug 04 · Posts: 1120 · Credit: 17,202,915 · RAC: 2,154

> Just had 3 uploads fail at 100% with same message

Me too, but 8 are having trouble. All 8 are 100% uploaded, but these are the most recent Event Log messages:

Sat 21 Jan 2023 01:20:32 PM EST | climateprediction.net | Started upload of oifs_43r3_ps_0973_2009050100_123_978_12195617_0_r724200919_36.zip
Sat 21 Jan 2023 01:20:36 PM EST | climateprediction.net | [error] Error reported by file upload server: can't write file oifs_43r3_ps_0973_2009050100_123_978_12195617_0_r724200919_36.zip: No space left on server
Sat 21 Jan 2023 01:20:36 PM EST | climateprediction.net | Temporarily failed upload of oifs_43r3_ps_0973_2009050100_123_978_12195617_0_r724200919_36.zip: transient upload error
Sat 21 Jan 2023 01:20:36 PM EST | climateprediction.net | Backing off 00:02:33 on upload of oifs_43r3_ps_0973_2009050100_123_978_12195617_0_r724200919_36.zip

Edit 1: They have all gone through now. It actually re-uploaded each one again.
Joined: 7 Sep 16 · Posts: 262 · Credit: 34,915,412 · RAC: 16,463

What's annoying is that my boxes are still sending upload traffic, it seems - the upload runs, and then fails at the end. Oh well. Suspend network traffic and crunch on (or finish WUs and put the CPUs to something else - it's time for a maintenance cycle on my boxes). Not like this is new to any of us.

Wasn't the upload rate supposed to be monitored and kept below the offload rate, so this wouldn't happen again? Seems like the sort of thing one would set up to page the admin at 10% free space or such...

I've been playing with some GCP "Spot" instances (like preemptible, but they won't power off automatically at 24h, especially if they're small) to add some cycles, and the AMD boxes are churning along hard. I suppose I'll shut those off; they're not exactly long on disk space. :/
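The "page the admin at 10% free" idea is a few lines of shell. A minimal sketch (the threshold and the `/storage` path are hypothetical examples; `df --output=pcent` reports *used* percent, so free percent is 100 minus that):

```shell
# Alert when free space on the upload partition drops below a threshold.
free_space_alert() {   # args: used_percent threshold_free_percent
  free=$((100 - $1))
  if [ "$free" -lt "$2" ]; then
    echo "ALERT: only ${free}% free"
  else
    echo "OK: ${free}% free"
  fi
}

# From cron one might feed it the real used% of a hypothetical /storage:
#   free_space_alert "$(df --output=pcent /storage | tail -1 | tr -dc 0-9)" 10
free_space_alert 95 10
free_space_alert 60 10
```

Hooking the `echo` up to mail or a paging service is left to taste; the point is the check is cheap enough to run every minute.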
Joined: 1 Jan 07 · Posts: 1061 · Credit: 36,731,493 · RAC: 6,912

The occasional one gets through:

21/01/2023 18:23:12 | climateprediction.net | Finished upload of oifs_43r3_ps_0802_1993050100_123_962_12179446_0_r599420434_12.zip

I think it must be that the transfer to backing storage is still running, but too slowly - compared to the rate of incoming uploads, at any rate.
Joined: 5 Aug 04 · Posts: 1120 · Credit: 17,202,915 · RAC: 2,154

It is in yo-yo mode, I guess one could say. Another bunch uploaded to 100% but failed. Massaging the Retry button got them all to go up, but... Now I just have three more. 8-(
Joined: 14 Sep 08 · Posts: 127 · Credit: 42,294,577 · RAC: 73,464

Perhaps it's casual weekend crunchers turning on their computers and finally starting to offload their backlog after three weeks. Hopefully the transfer process can eventually win out...
Joined: 14 Sep 08 · Posts: 127 · Credit: 42,294,577 · RAC: 73,464

> I've been playing with some GCP "Spot" instances (like preemptible, but won't power off automatically at 24h, especially if they're small) to add some cycles, and the AMD boxes are churning along hard. I suppose I'll shut those off, they're not exactly long on disk space. :/

Curious what your $ per WU is. I've also recently checked EC2, GCP and Azure, and they all have that nice catch of bandwidth cost. Their bandwidth costs around $0.08-0.10 per GB, and that would mean around $0.15-0.20 per WU. That alone already exceeds the cost per WU of whatever I can get with my own equipment, electricity and home network. Azure covers the first 100 GB; the others' free allowances are negligible. I honestly wonder if I missed some great deals hidden in their pages and pages of pricing lists. Would be nice to cross-check. Otherwise, until the bandwidth-to-compute ratio drops significantly, OpenIFS probably makes no sense on the major cloud vendors.
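For cross-checking, the arithmetic behind those per-WU figures is simply egress price times upload volume. A back-of-envelope sketch, assuming roughly 2 GB of result uploads per OpenIFS task (an assumption inferred from the $/GB and $/WU numbers quoted above, not a measured value):

```shell
awk 'BEGIN {
  gb_per_wu = 2.0                      # assumed upload volume per task
  prices[1] = 0.08; prices[2] = 0.10   # $/GB egress, low and high quotes
  for (i = 1; i <= 2; i++)
    printf "$%.2f/GB egress -> $%.2f per WU\n", prices[i], prices[i] * gb_per_wu
}'
```

Plugging in a vendor's actual egress rate and a measured upload size per task gives the comparable number for any provider.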
Joined: 5 Aug 04 · Posts: 1120 · Credit: 17,202,915 · RAC: 2,154

> Perhaps it's casual weekend crunchers turning on their computers and finally started to offload their backlog after three weeks. Hopefully the transfer process can eventually win out...

I suppose so. I run my machines 24/7 and take them down at most once a week for updates. The little Windows 10 box has not run any CPDN in a very long time. My big RHEL 8.7 box has been running 5 OIFS jobs at a time for quite a while. Over the last few days it was having no trouble uploading the "trickles", but today it got as far as 13 trickles behind. Right now it is four behind with a 9-minute backoff.

Average upload rate: 3840.21 KB/sec
Average download rate: 5713.52 KB/sec
Average turnaround time: 2.64 days
©2024 cpdn.org