Tasks available, but I am not getting them.

Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,193,804
RAC: 2,852
Message 71108 - Posted: 25 Jul 2024, 0:56:47 UTC

A couple of days ago, my Linux machine received a CPDN task, which it has not done in quite a while.
Task 22463632
Name 	wah2_nz25_2296_209805_25_1019_012300899_1
Workunit 	12300899
Created 	22 Jul 2024, 18:59:57 UTC
Sent 	22 Jul 2024, 19:00:03 UTC
Report deadline 	30 Oct 2024, 19:00:03 UTC
Server state 	In progress
Client state 	New
Exit status 	0 (0x00000000)
Computer ID 	1511241

It seems to be running just fine and has already uploaded 10 trickles.

My app_config file allows two of these to run at a time. Now there are a lot more of these available on the server ready for download, but when my machine tries to get some, I get:

Wed 24 Jul 2024 07:24:44 PM EDT | climateprediction.net | Sending scheduler request: To fetch work.
Wed 24 Jul 2024 07:24:44 PM EDT | climateprediction.net | Requesting new tasks for CPU
Wed 24 Jul 2024 07:24:47 PM EDT | climateprediction.net | Scheduler request completed: got 0 new tasks
Wed 24 Jul 2024 07:24:47 PM EDT | climateprediction.net | No tasks sent
Wed 24 Jul 2024 07:24:47 PM EDT | climateprediction.net | Project requested delay of 3636 seconds
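The last log line is the one that decides when the client will next ask for work. A minimal sketch of that arithmetic follows, assuming the three-field `timestamp | project | message` layout shown above and an English locale for the day and month names. The function name `next_request_times` is made up for illustration and is not part of BOINC.

```python
import re
from datetime import datetime, timedelta

# Two lines copied from the event log above.
LOG = """\
Wed 24 Jul 2024 07:24:47 PM EDT | climateprediction.net | No tasks sent
Wed 24 Jul 2024 07:24:47 PM EDT | climateprediction.net | Project requested delay of 3636 seconds
"""

DELAY_RE = re.compile(r"Project requested delay of (\d+) seconds")


def next_request_times(log_text):
    """Return (project, earliest next request) for each delay line found."""
    results = []
    for line in log_text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 3:
            continue
        stamp, project, message = parts
        match = DELAY_RE.search(message)
        if not match:
            continue
        # Drop the trailing time-zone abbreviation ("EDT"); the result is
        # a naive datetime in the same local time as the log.
        naive = stamp.rsplit(" ", 1)[0]
        logged = datetime.strptime(naive, "%a %d %b %Y %I:%M:%S %p")
        delay = timedelta(seconds=int(match.group(1)))
        results.append((project, logged + delay))
    return results


for project, when in next_request_times(LOG):
    print(project, "can be asked again at", when.strftime("%H:%M:%S"))
```

For the log above, 3636 seconds after 19:24:47 is 20:25:23 local time, so any manual update before then would only restart the wait.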


Is this a problem with my client, or is the server refusing me for some reason? Is the server refusing to send me more until I successfully return one completed task? It has been so long that I do not remember how this works.

Application details for host 1511241 shows this as its last entry:
Weather At Home 2 (wah2) 8.27 i686-pc-linux-gnu
Number of tasks completed 	0
Max tasks per day 	4
Number of tasks today 	0
Consecutive valid tasks 	0
Average turnaround time 	0.00 days

AndreyOR
Joined: 12 Apr 21
Posts: 314
Credit: 14,554,903
RAC: 18,109
Message 71109 - Posted: 25 Jul 2024, 7:12:18 UTC - in response to Message 71108.  

These new runs use v8.32, which is Windows-only. On a side note, I didn't realize WAH2 was available for Linux now. The Linux version is 8.27, which batch 1019 (and maybe earlier batches) appears to have used.

Glenn Carver
Joined: 29 Oct 17
Posts: 1044
Credit: 16,196,312
RAC: 12,647
Message 71110 - Posted: 25 Jul 2024, 8:33:56 UTC

I'm not sure how you got that task. The Linux app should be disabled and not used for Weather@Home. I will check with CPDN. All Weather@Home batches should be Windows only.

Let it finish. It'll be ok.

After fixing a few things in the code, the Linux version of W@H works fine. But before using it for batches, CPDN need to assess how its results differ from the Windows version's.
---
CPDN Visiting Scientist

wujj123456
Joined: 14 Sep 08
Posts: 127
Credit: 40,877,606
RAC: 56,897
Message 71113 - Posted: 25 Jul 2024, 11:19:10 UTC - in response to Message 71110.  

In case you need more data for the investigation, I got two on my Linux host as well. They are both from older batches but were sent a few hours apart.

https://www.cpdn.org/workunit.php?wuid=12296077
https://www.cpdn.org/workunit.php?wuid=12300636

Glenn Carver
Joined: 29 Oct 17
Posts: 1044
Credit: 16,196,312
RAC: 12,647
Message 71114 - Posted: 25 Jul 2024, 12:10:35 UTC - in response to Message 71113.  

CPDN have found the problem. The Linux app was deprecated but accidentally got re-enabled when the new wah2-ri v8.32 was installed. It has been deprecated again, which should stop any more Linux tasks going out. But let the tasks complete normally; there's no need to abort them.
---
CPDN Visiting Scientist

Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,193,804
RAC: 2,852
Message 71116 - Posted: 25 Jul 2024, 12:56:00 UTC - in response to Message 71110.  

I will let it finish. It has now completed 13 trickles, a little over 53% complete.
Task 22463632
Name 	wah2_nz25_2296_209805_25_1019_012300899_1
Workunit 	12300899
Created 	22 Jul 2024, 18:59:57 UTC
Sent 	22 Jul 2024, 19:00:03 UTC
Report deadline 	30 Oct 2024, 19:00:03 UTC


This work unit looks like this. Mine is the run still in progress.

Task 	Computer 	Sent 	Time reported or deadline 	Status 	Run time (sec) 	CPU time (sec) 	Credit 	Application
22463632 	1511241 	22 Jul 2024, 19:00:03 UTC 	30 Oct 2024, 19:00:03 UTC 	In progress 	--- 	--- 	10,789.80 	Weather At Home 2 (wah2) v8.27 i686-pc-linux-gnu
22456746 	1542760 	24 Jun 2024, 18:35:02 UTC 	22 Jul 2024, 18:59:56 UTC 	Error while computing

Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,193,804
RAC: 2,852
Message 71133 - Posted: 27 Jul 2024, 23:19:07 UTC - in response to Message 71110.  

It finished for me and completed successfully on Linux, machine 1511241.
It failed for my wingman (who got it first) on Windows.
Task 	Computer 	Sent 	Time reported or deadline 	Status 	Run time (sec) 	CPU time (sec) 	Credit 	Application
22463632 	1511241 	22 Jul 2024, 19:00:03 UTC 	27 Jul 2024, 23:11:04 UTC 	Completed 	446,355.27 	440,722.90 	20,729.77 	Weather At Home 2 (wah2) v8.27 i686-pc-linux-gnu
22456746 	1542760 	24 Jun 2024, 18:35:02 UTC 	22 Jul 2024, 18:59:56 UTC 	Error while computing 	178,741.50 	83,465.81 	4,163.15 	Weather At Home 2 (wah2) v8.24 windows_intelx86

Glenn Carver
Joined: 29 Oct 17
Posts: 1044
Credit: 16,196,312
RAC: 12,647
Message 71135 - Posted: 28 Jul 2024, 12:34:16 UTC - in response to Message 71133.  

There was a bug in version 8.24 of WaH that tended to make it crash when the model was restarted. This was fixed in version 8.27, which is what ran on your Linux machine. That's why it worked. Glad to know it completed the task.
---
CPDN Visiting Scientist

Conan
Joined: 6 Jul 06
Posts: 147
Credit: 3,615,496
RAC: 420
Message 71149 - Posted: 1 Aug 2024, 3:17:34 UTC

Mine didn't. After the 20th trickle, about when it was finishing, it failed on File Transfer:

Unable to load library wah2_se_8.27_i686-pc-linux-gnu.so
dlopen error: libnsl.so.1: cannot open shared object file: No such file or directory

I must not have had any 32-bit libraries installed, so it could not find it.
I have now installed the file it is complaining about, even though I probably won't need it, as I wanted to have only 64-bit applications running.

Conan

Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,193,804
RAC: 2,852
Message 71150 - Posted: 1 Aug 2024, 6:19:58 UTC - in response to Message 71149.  

Unable to load library wah2_se_8.27_i686-pc-linux-gnu.so
dlopen error: libnsl.so.1: cannot open shared object file: No such file or directory


On my machine, it does not seem to need libnsl.so, although I do have it installed:

$ locate libnsl.so.1
/usr/lib/libnsl.so.1
/usr/lib64/libnsl.so.1

$ ls -l /usr/lib/libnsl.so.1 /usr/lib64/libnsl.so.1
lrwxrwxrwx. 1 root root 14 Apr 26 14:27 /usr/lib64/libnsl.so.1 -> libnsl-2.28.so
lrwxrwxrwx. 1 root root 14 Apr 26 14:26 /usr/lib/libnsl.so.1 -> libnsl-2.28.so


[/var/lib/boinc/projects/climateprediction.net]
$ ldd wah2_8.27_i686-pc-linux-gnu
	linux-gate.so.1 (0xf7f52000)
	libpthread.so.0 => /lib/libpthread.so.0 (0xf7f1c000)
	libdl.so.2 => /lib/libdl.so.2 (0xf7f17000)
	libstdc++.so.6 => /lib/libstdc++.so.6 (0xf7d84000)
	libm.so.6 => /lib/libm.so.6 (0xf7cb2000)
	libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xf7c95000)
	libc.so.6 => /lib/libc.so.6 (0xf7aec000)
	/lib/ld-linux.so.2 (0xf7f54000)
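Note that the `ldd` run above is on the main executable, `wah2_8.27_i686-pc-linux-gnu`, while the error message names a different file, `wah2_se_8.27_i686-pc-linux-gnu.so`. A minimal sketch for spotting unresolved dependencies in `ldd` output follows. It relies on `ldd` printing `=> not found` for a library it cannot resolve. The helper name `missing_libraries` and the last sample line are made up for illustration and are not taken from any machine in this thread.

```python
def missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, arrow, target = line.strip().partition(" => ")
        if arrow and target.strip() == "not found":
            missing.append(name)
    return missing


# The first two lines are from the ldd output above; the last line is a
# hypothetical example of how ldd marks an unresolved dependency.
SAMPLE = """\
\tlinux-gate.so.1 (0xf7f52000)
\tlibpthread.so.0 => /lib/libpthread.so.0 (0xf7f1c000)
\tlibnsl.so.1 => not found
"""

print(missing_libraries(SAMPLE))
```

Feeding it the real output of `ldd` on the `.so` file named in the error (for example via `subprocess.run`) would show whether that file, rather than the main executable, is the one that wants libnsl.so.1.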

Glenn Carver
Joined: 29 Oct 17
Posts: 1044
Credit: 16,196,312
RAC: 12,647
Message 71153 - Posted: 1 Aug 2024, 9:13:56 UTC - in response to Message 71150.  

The error message is coming from libnsl; it doesn't mean it can't find libnsl. That is, libnsl can't find the dynamically loaded file wah2_se_8.27_i686-pc-linux-gnu.so.

Unable to load library wah2_se_8.27_i686-pc-linux-gnu.so
dlopen error: libnsl.so.1: cannot open shared object file: No such file or directory

---
CPDN Visiting Scientist

Conan
Joined: 6 Jul 06
Posts: 147
Credit: 3,615,496
RAC: 420
Message 71166 - Posted: 2 Aug 2024, 14:29:52 UTC

I didn't have libnsl.so.1 on my computer, so I have now installed it in case I need it later.

Conan

Dark Angel
Joined: 31 May 18
Posts: 53
Credit: 4,725,987
RAC: 9,174
Message 71172 - Posted: 4 Aug 2024, 5:32:36 UTC

I only run Windows in a VM but have been able to get work in the past. Currently the server simply refuses to send me any work for this instance. I've tried several times, left it overnight to sort itself out, reset the project, rebooted any number of times, and it still simply will not give me any work on the Windows instance. I managed to get a few Linux units on the host machine (OpenIFS work); those that have completed have been successful, but the Windows machine just sits idle.
Has something changed in the requirements? Have I not allocated enough RAM (4GB per core, four cores)? I get nothing from the logs in any useful time frame thanks to the hour backoff from the server every time.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4529
Credit: 18,661,594
RAC: 14,529
Message 71173 - Posted: 4 Aug 2024, 6:50:52 UTC - in response to Message 71172.  

I only run Windows in a VM but have been able to get work in the past. Currently the server simply refuses to send me any work for this instance.

That you can get OIFS work demonstrates that the problem is not a blacklisted IP address where you live. (We have had a couple of those with the project over the years.) Have you checked the disk space settings, both for the VM and for BOINC running in the VM? More than once when setting up a VM I have found this to be what stopped me getting work. Currently, I am just running tasks using BOINC under WINE, though I have a VM set up as well.

Dark Angel
Joined: 31 May 18
Posts: 53
Credit: 4,725,987
RAC: 9,174
Message 71174 - Posted: 4 Aug 2024, 9:03:48 UTC - in response to Message 71173.  

I've expanded the virtual hdd and added another 60GB to the filesystem so I'll see how that goes.

Glenn Carver
Joined: 29 Oct 17
Posts: 1044
Credit: 16,196,312
RAC: 12,647
Message 71179 - Posted: 4 Aug 2024, 17:12:30 UTC - in response to Message 71174.  

I have increased the disk space requirement for Weather@Home tasks recently as we prepare to roll out a new version of the app which puts all the task files into the slot directory instead of the project directory. So it's quite possible it's a limit on the disk space.

But you should see a message in the BOINC messages log that the server is unable to give you a task because there's not enough space?

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4529
Credit: 18,661,594
RAC: 14,529
Message 71180 - Posted: 4 Aug 2024, 17:27:58 UTC
Last modified: 4 Aug 2024, 19:25:45 UTC

But you should see a message in the BOINC messages log that the server is unable to give you a task because there's not enough space?
Yes, I have always seen that when it's a problem. I'm guessing that isn't it, as the machine in question has been in contact with the server since and still has no work.

I assume you are not hitting the update button for the project before the one hour back off time after requesting new work? If it isn't that, I am running out of ideas.

Dark Angel
Joined: 31 May 18
Posts: 53
Credit: 4,725,987
RAC: 9,174
Message 71181 - Posted: 4 Aug 2024, 20:43:49 UTC - in response to Message 71180.  

I'm avoiding hitting the update button precisely because of the backoff issue resetting every time.
Haven't seen any specific messages about storage space, and I got no work overnight.
This morning I wiped the client, rebooted, deleted all residual files, and installed fresh.

I enabled work_fetch_debug

4/08/2024 20:41:27 | | [work_fetch] ------- start work fetch state -------
4/08/2024 20:41:27 | | [work_fetch] target work buffer: 432000.00 + 432000.00 sec
4/08/2024 20:41:27 | | [work_fetch] --- project states ---
4/08/2024 20:41:27 | climateprediction.net | [work_fetch] REC 0.000 prio 0.000 can't request work: scheduler RPC backoff (2729.13 sec)
4/08/2024 20:41:27 | | [work_fetch] --- state for CPU ---
4/08/2024 20:41:27 | | [work_fetch] shortfall 3456000.00 nidle 4.00 saturated 0.00 busy 0.00
4/08/2024 20:41:27 | climateprediction.net | [work_fetch] share 0.000
4/08/2024 20:41:27 | | [work_fetch] ------- end work fetch state -------
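The numbers in that log hang together arithmetically. What follows is a rough reading of them, not the client's actual code. It assumes the two buffer figures are the "store at least" and "store up to an additional" day settings converted to seconds, and that with every CPU instance idle the shortfall is simply the number of instances times the whole buffer. The function names are invented for the example.

```python
SECONDS_PER_DAY = 86_400


def buffer_seconds(min_days, extra_days):
    """Convert the two work-buffer settings from days to seconds."""
    return (min_days * SECONDS_PER_DAY, extra_days * SECONDS_PER_DAY)


def idle_shortfall(idle_instances, min_days, extra_days):
    """Shortfall when every instance is idle: each one needs the whole
    buffer filled (assumed formula, chosen because it matches the log)."""
    low, extra = buffer_seconds(min_days, extra_days)
    return idle_instances * (low + extra)


# 5 days + 5 days reproduces "target work buffer: 432000.00 + 432000.00 sec".
assert buffer_seconds(5, 5) == (432000, 432000)

# nidle 4.00 with that buffer reproduces "shortfall 3456000.00".
assert idle_shortfall(4, 5, 5) == 3456000

print(buffer_seconds(5, 5), idle_shortfall(4, 5, 5))
```

On this reading the client does want work (the shortfall is large); the only thing stopping the request in this snapshot is the "scheduler RPC backoff" on the project line.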

I have the project set to 1000 priority. Previously it was set to 100. I don't recall ever setting this project to zero.

Dark Angel
Joined: 31 May 18
Posts: 53
Credit: 4,725,987
RAC: 9,174
Message 71182 - Posted: 4 Aug 2024, 21:32:00 UTC

Finally got some units.
The last thing I did was to set the work cache to 10 and 10

Don't know if that was the winning combination or if someone did something at the server end.

Glenn Carver
Joined: 29 Oct 17
Posts: 1044
Credit: 16,196,312
RAC: 12,647
Message 71183 - Posted: 4 Aug 2024, 22:11:41 UTC - in response to Message 71182.  
Last modified: 4 Aug 2024, 22:12:16 UTC

I've had to do that before, never understood why. Once the tasks complete I could set it back down to my usual values.

Maybe it's something to do with the client never having seen the tasks before and not knowing how long they really take?

Definitely wasn't anything at the scheduler end. Not on a Sunday evening!
---
CPDN Visiting Scientist

©2024 cpdn.org