Trickles stop new work arriving

Author	Message
Mr. P Hucker Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918	Message 70298 - Posted: 3 Feb 2024, 4:16:07 UTC Last modified: 3 Feb 2024, 4:16:20 UTC If you're running several WAH tasks, the trickles are resetting the 1 hour timer for getting new work, so it's very difficult to get more work.

Richard Haselgrove Joined: 1 Jan 07 Posts: 1061 Credit: 36,690,861 RAC: 10,559	Message 70299 - Posted: 3 Feb 2024, 9:30:33 UTC - in response to Message 70298. Suspend network activity until the timer runs down, and everything can be done in a single burst when you allow it again.

Richard Haselgrove Joined: 1 Jan 07 Posts: 1061 Credit: 36,690,861 RAC: 10,559	Message 70300 - Posted: 3 Feb 2024, 10:28:07 UTC Actually, forget that. I think your basic premise is wrong. If a trickle becomes due while scheduler contact is backed off (perhaps because of a trickle report from another task), it will be held in a queue until the backoff time has passed. Then, the scheduler will be contacted and all pending operations will be completed in a batch - a work fetch request (if deemed necessary), and all pending trickles reported.
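The queue-then-batch behaviour described in the post above can be pictured with a small toy model. This is only a sketch of the description, not BOINC client code: the class `ToyClient`, its fields, and its methods are all hypothetical, and the 3636-second delay is simply the value that appears in the log quoted later in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class ToyClient:
    """Toy model of scheduler contact for one project (hypothetical, not BOINC code)."""
    backoff_until: float = 0.0                      # no contact allowed before this time (s)
    pending_trickles: list = field(default_factory=list)
    needs_work: bool = False
    requests: list = field(default_factory=list)    # (time, trickles sent, work fetch?)

    def trickle_due(self, now: float, task: str) -> None:
        # A trickle is queued first, then sent at the next permitted contact.
        self.pending_trickles.append(task)
        self.try_contact(now)

    def try_contact(self, now: float, delay: float = 3636) -> bool:
        if now < self.backoff_until:
            return False                            # backed off: everything stays queued
        if not self.pending_trickles and not self.needs_work:
            return False                            # nothing to say to the scheduler
        # One request carries every pending trickle plus the work fetch, if any.
        self.requests.append((now, tuple(self.pending_trickles), self.needs_work))
        self.pending_trickles.clear()
        self.needs_work = False
        self.backoff_until = now + delay            # assumed: each contact restarts the delay
        return True


client = ToyClient()
client.trickle_due(0, "task_a")      # sent immediately, backoff starts
client.trickle_due(600, "task_b")    # due during backoff, so it is queued
client.needs_work = True             # a work fetch also becomes necessary
client.try_contact(1200)             # still backed off, nothing is sent
client.try_contact(3636)             # backoff over: trickle and work fetch go together
print(client.requests)
```

In this model the second request contains both the queued trickle and the work fetch, which is the batch described above. Whether the real client behaves this way in every case is exactly what the thread is debating.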

Mr. P Hucker Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918	Message 70301 - Posted: 3 Feb 2024, 10:59:57 UTC - in response to Message 70300. The problem arises if, say, you run another project at lower priority, with maybe three quarters of your threads running CPDN and one quarter on the other project. At some point there will be less work than the buffer you've set, and this point could come when the CPDN server is backed off due to a recent trickle-up. Boinc will therefore ask the other project instead, ad infinitum.

Richard Haselgrove Joined: 1 Jan 07 Posts: 1061 Credit: 36,690,861 RAC: 10,559	Message 70302 - Posted: 3 Feb 2024, 11:09:37 UTC - in response to Message 70301. Then suspend a few unstarted tasks for the lower priority project, and let it work off the cache for a while. If you've suspended enough for a work fetch to be needed, it will be done alongside any new trickle reports at the end of the backoff hour. CPDN doesn't need new work often enough to make that an onerous chore.

Mr. P Hucker Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918	Message 70303 - Posted: 3 Feb 2024, 11:34:17 UTC - in response to Message 70302. I prefer Boinc to automate as much as possible. I spend enough time repairing the hardware! I just thought perhaps it might be an easy setting to change on the server.

wujj123456 Joined: 14 Sep 08 Posts: 127 Credit: 41,541,636 RAC: 58,436	Message 70308 - Posted: 3 Feb 2024, 20:41:35 UTC - in response to Message 70301. Last modified: 3 Feb 2024, 20:43:44 UTC The Boinc client's scheduling leaves a lot to be desired, honestly. The trick I use in this situation is to set the low-priority project's resource share to 0 whenever work shows up for high-priority projects. That way, the Boinc client will only fetch the minimum number of tasks needed to fill all the cores, not the full buffer. The next time CPDN updates, it will request new work. It's not perfect, but at least I only need to manage the project shares occasionally, given how sporadic CPDN work is.

Mr. P Hucker Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918	Message 70309 - Posted: 3 Feb 2024, 23:08:04 UTC - in response to Message 70308. I have several projects set to 10,000 priority - all the rare ones. Currently though, Denis and CPDN (two of these rare ones) are fighting over my cores. Other projects, which I just want to run normally, I have set to 100. I use 0 only where a computer can run out due to project outages. With this setup, Denis (or even one of the 100 priority projects) will grab an excrementload of work when CPDN says "last contact too recent". Why Boinc gets so much I don't know - for example: 24 threads, with 20 occupied by CPDN (I set CPDN to 2 threads per task as it seems to work just as well overall and gets each one done faster). Boinc runs out of work from the other projects, so it needs my 2-day buffer for 4 threads. So why does it ask Denis for 2 days of work for all 24 threads?!? Since Denis has a 3-day deadline, and CPDN has a 3-month deadline (do they really not want them back sooner?), Boinc panics and shoves Denis on first. And it stays that way until the odd time it can get another CPDN task.
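The size of the gap complained about above is easy to put in numbers. The snippet below only multiplies the figures given in the post (4 idle threads, 24 threads in total, a 2-day buffer); the function name is hypothetical and this is not a description of how the Boinc client actually computes a work request.

```python
def buffer_thread_days(threads: int, buffer_days: float) -> float:
    """Work needed to keep `threads` busy for `buffer_days` (illustrative only)."""
    return threads * buffer_days


idle_only = buffer_thread_days(4, 2)     # buffer for just the 4 idle threads
whole_host = buffer_thread_days(24, 2)   # buffer sized as if all 24 threads were idle

print(idle_only, whole_host, whole_host / idle_only)
```

On these numbers the request described in the post is six times what the idle threads alone would need, which, combined with the 3-day deadline, would go some way to explaining why the shorter-deadline work gets pushed to the front.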

Richard Haselgrove Joined: 1 Jan 07 Posts: 1061 Credit: 36,690,861 RAC: 10,559	Message 70310 - Posted: 4 Feb 2024, 9:48:53 UTC A sidebar on all this. When a task finishes, it first reports a final trickle, and (a few seconds later) starts to upload the final data file. Here's the timing on one of my machines:
04/02/2024 09:27:28 | climateprediction.net | Sending scheduler request: To send trickle-up message.
04/02/2024 09:27:29 | climateprediction.net | Project requested delay of 3636 seconds
04/02/2024 09:29:52 | climateprediction.net | Computation for task wah2_nz25_n2fo_200705_25_1005_012257314_2 finished
04/02/2024 09:29:58 | climateprediction.net | Finished upload of wah2_nz25_n2fo_200705_25_1005_012257314_2_r1186546406_out.zip (1265821 bytes)
If you're using the project setting for the maximum number of tasks in progress, and you have that number running, you can't have a spare task ready to start immediately - that machine will have to wait for about 58 minutes before reporting/fetching. I'm thinking of trying a project setting of n+1 tasks, and using other tools like 'no new tasks' to control the actual resource use.
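The "about 58 minutes" figure can be checked from the timestamps in the log above. This is a minimal sketch that assumes the 3636-second delay is counted from the moment it was requested (09:27:29) and that the timestamps are in day/month/year order; the variable names are illustrative.

```python
from datetime import datetime, timedelta

FMT = "%d/%m/%Y %H:%M:%S"  # day/month/year, as in the event log

delay_requested = datetime.strptime("04/02/2024 09:27:29", FMT)
upload_finished = datetime.strptime("04/02/2024 09:29:58", FMT)
delay = timedelta(seconds=3636)  # "Project requested delay of 3636 seconds"

# Earliest time the next scheduler contact is allowed (assumption stated above).
next_contact = delay_requested + delay

# How long the finished task waits before it can be reported.
wait = next_contact - upload_finished

print(next_contact.strftime(FMT))  # 04/02/2024 10:28:05
print(wait)                        # 0:58:07
```

Under those assumptions the finished task cannot be reported, and a replacement cannot be fetched, for 58 minutes and 7 seconds after the upload completes, which matches the estimate in the post.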