climateprediction.net (CPDN) home page
Thread 'New work Discussion'

Daniel Van Meter

Joined: 19 May 11
Posts: 1
Credit: 11,641,843
RAC: 28,465
Message 61235 - Posted: 16 Oct 2019, 11:23:02 UTC - in response to Message 61038.  

Hello,
I have not received any work units in over two months. Do I have a problem on my side or is it on your side?
Thanks,
Daniel
ID: 61235 · Report as offensive
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,008,987
RAC: 21,524
Message 61236 - Posted: 16 Oct 2019, 13:29:30 UTC - in response to Message 61235.  

Hello,
I have not received any work units in over two months. Do I have a problem on my side or is it on your side?
Thanks,
Daniel


Hi Daniel, the lack of work is on the project side. Some work is hopefully coming later today or tomorrow morning, but if it arrives it is for Linux machines only. (There have been some problems with the batch.) I am afraid I haven't seen any clues as to when the next batch for Windows machines is due.
ID: 61236 · Report as offensive
Meerkat

Joined: 30 Nov 08
Posts: 2
Credit: 1,942,186
RAC: 0
Message 61240 - Posted: 17 Oct 2019, 4:48:45 UTC - in response to Message 61236.  
Last modified: 17 Oct 2019, 4:49:15 UTC

Hello,
I have not received any work units in over two months. Do I have a problem on my side or is it on your side?
Thanks,
Daniel


Hi Daniel, the lack of work is on the project side. Some work is hopefully coming later today or tomorrow morning, but if it arrives it is for Linux machines only. (There have been some problems with the batch.) I am afraid I haven't seen any clues as to when the next batch for Windows machines is due.


Hello there, I am in the same situation, except that I have not been assigned new work since June. Just glad not to be alone, and that it is not an issue at my end then...

Cheers
Meerkat (aka Paul)
ID: 61240 · Report as offensive
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 61241 - Posted: 17 Oct 2019, 6:11:55 UTC

These are the Windows batches released since June:


Batch  Number  Date        Place
841    1200    2019-10-03  Oregon
840    1450    2019-10-03  Oregon
833    500     2019-09-19  Oregon
832    600     2019-09-18  Mexico
829    3000    2019-06-23  Oregon
827    1200    2019-06-21  Oregon
823    860     2019-06-13  Mexico

As can be seen, only two research places have been active, and three of the batches have been small.
Any computer that hasn't picked up some of these is probably running other projects as well, and just didn't need work during the brief periods when work was available from cpdn.
ID: 61241 · Report as offensive
Jim1348

Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 61242 - Posted: 17 Oct 2019, 6:12:22 UTC

The N216 have arrived. I have them running on 11 cores of an i7-8700 (with one core reserved for a GPU), under Ubuntu 18.04.
They are taking 1348 MB each according to the BOINCTasks. The estimated run time is 3 days 14 hours, but that is only after 1/2 hour, so I would not place much credence in that.
ID: 61242 · Report as offensive
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,008,987
RAC: 21,524
Message 61243 - Posted: 17 Oct 2019, 6:19:36 UTC - in response to Message 61242.  

The N216 have arrived. I have them running on 11 cores of an i7-8700 (with one core reserved for a GPU), under Ubuntu 18.04.
They are taking 1348 MB each according to the BOINCTasks. The estimated run time is 3 days 14 hours, but that is only after 1/2 hour, so I would not place much credence in that.



HadAM4 N216 for winters starting in 2006-2015 (starting Nov 1, length 4 months) with high-frequency output in NH
Number_of_workunits: 3150

UK Met Office HadAM4 at N216 resolution 225 59

The server status page is showing a much lower number (a total of 284 running and unsent), which is just because it is out of date. If that were all there were, there wouldn't be any left for me to download, and two are doing so now, but they won't last long!
ID: 61243 · Report as offensive
entity

Joined: 27 Apr 13
Posts: 4
Credit: 7,391,230
RAC: 2,474
Message 61247 - Posted: 17 Oct 2019, 13:45:01 UTC - in response to Message 61243.  
Last modified: 17 Oct 2019, 14:06:45 UTC

Is any work being sent to machines with AMD processors? The only machines I have that don't have work are machines with AMD processors. These are Linux machines.
ID: 61247 · Report as offensive
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 61248 - Posted: 17 Oct 2019, 14:58:50 UTC - in response to Message 61247.  
Last modified: 17 Oct 2019, 16:15:46 UTC

There are only three things that matter for a computer to get work:

It MUST have all necessary 32 bit libraries
It must have sufficient memory, which is starting to get to 2 Gigs minimum per model
It must actually need work, as decided by BOINC
ID: 61248 · Report as offensive
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 61249 - Posted: 17 Oct 2019, 15:42:13 UTC - in response to Message 61247.  
Last modified: 17 Oct 2019, 20:04:25 UTC

Is any work being sent to machines with AMD processors? The only machines I have that don't have work are machines with AMD processors. These are Linux machines.

AMDs work fine. Make sure your computers have the right 32 bit libraries installed. See this post for more info...

https://www.cpdn.org/forum_thread.php?id=8008&postid=59939
ID: 61249 · Report as offensive
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 61250 - Posted: 17 Oct 2019, 16:33:36 UTC - in response to Message 61242.  

The N216 have arrived. I have them running on 11 cores of an i7-8700 (with one core reserved for a GPU), under Ubuntu 18.04.
They are taking 1348 MB each according to the BOINCTasks. The estimated run time is 3 days 14 hours, but that is only after 1/2 hour, so I would not place much credence in that.


I got one:
 
USER    PR  NI S  VIRT  RES  SHR  %MEM %CPU  TIME+      COMMAND                                                                                 
boinc   39  19 R 1384m 1.3g  4048 8.5  99.8  194:12.32 /home/boinc/projects/climateprediction.net/hadam4_um_8.52_i686-pc-linux-gnu 182480 


It is only 3.5 hours in. Predicted time to complete is 349 hours, but hadcm3s and hadam4 144 tasks seem to run about twice as fast as predicted, so I suppose these will too. OTOH, this one is predicted to run in about half the time of those others. Working set (of my 16 GByte RAM) is 1.3 GBytes, or 8.5% of RAM, so I could run one on each of my four cores.
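
The arithmetic in this post can be sketched in a few lines of Python. This is only a rough illustration: the function names are hypothetical, the 2x speed-up is the rule of thumb quoted above rather than a measured value, and no memory is set aside for the operating system or other programs.

```python
import math


def tasks_that_fit(total_ram_gb: float, per_task_gb: float, cores: int) -> int:
    """Number of tasks that can run at once, limited by both RAM and core count."""
    by_memory = math.floor(total_ram_gb / per_task_gb)
    return min(cores, by_memory)


def adjusted_hours(predicted_hours: float, speedup: float = 2.0) -> float:
    """Scale the predicted run time by an assumed speed-up factor (2x by default)."""
    return predicted_hours / speedup


if __name__ == "__main__":
    # Figures from the post: 16 GB RAM, 1.3 GB per task, four cores, 349 h predicted.
    print(tasks_that_fit(16, 1.3, 4))
    print(adjusted_hours(349))
```

With the figures above, memory would allow 12 tasks, so the four cores are the limit, and the adjusted estimate comes out at 174.5 hours (a little over a week).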
ID: 61250 · Report as offensive
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,008,987
RAC: 21,524
Message 61251 - Posted: 17 Oct 2019, 17:59:03 UTC

And there is now a second batch of 3150 of these in the hopper (batch 843, first was 842)
ID: 61251 · Report as offensive
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,008,987
RAC: 21,524
Message 61261 - Posted: 18 Oct 2019, 7:40:49 UTC

#844: A mini batch of 259 hadcm3s was released late last night (UK time).

If you are running a Linux box, please go to the Linux section and read the bit about 32 bit libs. The overwhelming majority of failures for both hadcm3s and hadam4 tasks are because a required 32 bit library is not installed.
ID: 61261 · Report as offensive
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 61262 - Posted: 18 Oct 2019, 12:21:54 UTC - in response to Message 61261.  

Mine look like these:

$ ldd hadam4_8.52_i686-pc-linux-gnu
	linux-gate.so.1 =>  (0x00ba3000)
	libpthread.so.0 => /lib/libpthread.so.0 (0x007c3000)
	libdl.so.2 => /lib/libdl.so.2 (0x007eb000)
	libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00c8b000) <---<<<
	libm.so.6 => /lib/libm.so.6 (0x00aaf000)
	libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00c1e000)
	libc.so.6 => /lib/libc.so.6 (0x0062a000)
	/lib/ld-linux.so.2 (0x565ed000)


$ ldd hadam4_um_8.52_i686-pc-linux-gnu
	linux-gate.so.1 =>  (0x006b6000)
	libdl.so.2 => /lib/libdl.so.2 (0x007eb000)
	libm.so.6 => /lib/libm.so.6 (0x00aaf000)
	libpthread.so.0 => /lib/libpthread.so.0 (0x007c3000)
	libc.so.6 => /lib/libc.so.6 (0x001a8000)
	/lib/ld-linux.so.2 (0x5657c000)


It seems to me that boinc runs hadam4_8.52_i686-pc-linux-gnu first, which forks off hadam4_um_8.52_i686-pc-linux-gnu to do all the work. When that is finished and exits, hadam4_8.52_i686-pc-linux-gnu finishes up, calling something in libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00c8b000); if this is not there, it then fails. It would be nice if hadam4_um_8.52_i686-pc-linux-gnu would make a dummy call into libstdc++.so.6 up front so that it would fail right away, rather than a week or more later when processing is almost complete.
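
Checking for an unresolved library before a task starts can be automated. The following is a minimal Python sketch, not anything the project ships: the function names are hypothetical, and the sample text is made up by editing the ldd output above so that one library shows the `not found` marker that ldd prints for a library it cannot resolve.

```python
import subprocess
from typing import List


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the names of libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=>" in line and "not found" in line:
            # The library name is the text to the left of the arrow.
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path: str) -> List[str]:
    """Run ldd on a binary and list any libraries it could not resolve."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True, check=False)
    return missing_libraries(result.stdout)


# Hypothetical sample: the real output above, edited so libstdc++ is unresolved.
sample = """\
	linux-gate.so.1 =>  (0x00ba3000)
	libpthread.so.0 => /lib/libpthread.so.0 (0x007c3000)
	libstdc++.so.6 => not found
	libc.so.6 => /lib/libc.so.6 (0x0062a000)
"""

if __name__ == "__main__":
    print(missing_libraries(sample))
```

Calling `check_binary` on each executable in the project directory would give an empty list when every library resolves, as in the two listings above.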
ID: 61262 · Report as offensive
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 61263 - Posted: 18 Oct 2019, 12:34:49 UTC - in response to Message 61262.  

If that file isn't there, the task fails after a few seconds.
Some examples here
ID: 61263 · Report as offensive
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,008,987
RAC: 21,524
Message 61264 - Posted: 18 Oct 2019, 12:46:04 UTC
Last modified: 18 Oct 2019, 12:48:07 UTC

When that is finished and exits, hadam4_8.52_i686-pc-linux-gnu finishes up, calling something in libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00c8b000); if this is not there, it then fails. It would be nice if hadam4_um_8.52_i686-pc-linux-gnu would make a dummy call into libstdc++.so.6 up front so that it would fail right away, rather than a week or more later when processing is almost complete.


I don't recall seeing tasks failing after a week or more processing due to missing libraries.

I just looked at batch 826 (a hadam4 batch) and a sample of about 20 hard failures. All of the failures due to missing libs occurred in less than 5 seconds. Failures after some credit had been granted were all either segmentation violations or invalid theta, so problems with the actual tasks.
ID: 61264 · Report as offensive
entity

Joined: 27 Apr 13
Posts: 4
Credit: 7,391,230
RAC: 2,474
Message 61268 - Posted: 18 Oct 2019, 18:50:34 UTC - in response to Message 61249.  

Is any work being sent to machines with AMD processors? The only machines I have that don't have work are machines with AMD processors. These are Linux machines.

AMDs work fine. Make sure your computers have the right 32 bit libraries installed. See this post for more info...

https://www.cpdn.org/forum_thread.php?id=8008&postid=59939


Thanks for the reply, they were missing a few of the needed 32-bit libraries. One of the libraries (lib32ncurses5) was not found when trying to install. Installed lib32ncurses6 and that seems to work OK. This is also running on Ubuntu 19.10. This machine has downloaded 16 WUs so far so all seems well.
ID: 61268 · Report as offensive
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,808,726
RAC: 5,192
Message 61273 - Posted: 18 Oct 2019, 21:15:27 UTC

Some Windows work: 7160 x WAH2 global, batch #845.
ID: 61273 · Report as offensive
Speedy

Joined: 20 Jul 05
Posts: 25
Credit: 414,873
RAC: 406
Message 61275 - Posted: 18 Oct 2019, 23:51:07 UTC - in response to Message 61273.  

Some Windows work: 7160 x WAH2 global, batch #845.

Yes, it was a nice surprise. I managed to pick up 21.
ID: 61275 · Report as offensive
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 61276 - Posted: 19 Oct 2019, 1:31:29 UTC - in response to Message 61264.  

I don't recall seeing tasks failing after a week or more processing due to missing libraries.


Something funny happened with 21752443. I find it sad that he had 11 trickles and uploads (I suppose) only to bomb. But something else was probably going on too, given all those
Suspended CPDN Monitor - Suspend request from BOINC...
messages.

I happen to be running this one: 21759319 and it seems to be OK so far.

Workunit 11900203
name: hadcm3s_qy57_190012_240_837_011900203
application: UK Met Office HadCM3 short
created: 26 Sep 2019, 16:04:05 UTC
minimum quorum: 1
initial replication: 1
max # of error/total/success tasks: 3, 3, 1

Task | Computer | Sent | Time reported or deadline | Status | Run time (sec) | CPU time (sec) | Credit | Application
21759319 | 1256552 | 16 Oct 2019, 21:15:59 UTC | 28 Sep 2020, 2:35:59 UTC | In progress | --- | --- | --- | UK Met Office HadCM3 short v8.36 i686-pc-linux-gnu
21752443 | 1471317 | 26 Sep 2019, 17:24:21 UTC | 16 Oct 2019, 20:11:35 UTC | Error while computing | 320,008.00 | 320,008.00 | 3,421.44 | UK Met Office HadCM3 short v8.36 i686-pc-linux-gnu
21759317 | 1435218 | 16 Oct 2019, 20:13:06 UTC | 16 Oct 2019, 21:13:46 UTC | Error while computing | 0.00 | 0.00 | --- | UK Met Office HadCM3 short v8.36 i686-pc-linux-gnu
ID: 61276 · Report as offensive
Alan K
Joined: 22 Feb 06
Posts: 491
Credit: 30,978,383
RAC: 14,247
Message 61282 - Posted: 19 Oct 2019, 22:46:40 UTC - in response to Message 61275.  

Blast. Missed them!
ID: 61282 · Report as offensive