Message boards : Number crunching : New work Discussion
Author | Message |
---|---|
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,826,970 RAC: 5,066 |
Batch #825 is 550 x HADAM4 Linux units. Batch 825: HadAM4 N144 HAPPI hist spinup, 17 June 2019, Spin up runs for HadAM4 N144 for 2006-2016 (starting Oct 2005, length 12 months) (550 simulations). Batch #824 appears to have been skipped. Another 1500 x SAM50/25 units have been added to batch #822, which was SAM50/24 until that point. Batch List |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
I just got four new work-units, like this: UK Met Office HadAM4 at N144 resolution v8.09 i686-pc-linux-gnu. Three of them have a little over four hours on their clocks, and the other has three hours 20 minutes. I hope the problem in v8.08, where shutting down the boinc client while these were running crashed the work-units, is fixed. I will try to keep them running. All boinc work-units have permission to stay in memory even when suspended (I have 16 GBytes of RAM, and each of them is using only about 4% of it). I have a 4-core 64-bit processor, but I do have the 32-bit compatibility libraries. It does not seem to want much:

$ ldd hadam4_um_8.09_i686-pc-linux-gnu
linux-gate.so.1 => (0x00ec5000)
libdl.so.2 => /lib/libdl.so.2 (0x007eb000)
libm.so.6 => /lib/libm.so.6 (0x00aaf000)
libpthread.so.0 => /lib/libpthread.so.0 (0x007c3000)
libc.so.6 => /lib/libc.so.6 (0x0062a000)
/lib/ld-linux.so.2 (0x565a1000)

$ ldd hadam4_8.09_i686-pc-linux-gnu
linux-gate.so.1 => (0x007c9000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00173000)
libdl.so.2 => /lib/libdl.so.2 (0x007eb000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x008bb000)
libm.so.6 => /lib/libm.so.6 (0x00aaf000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x005f5000)
libc.so.6 => /lib/libc.so.6 (0x0062a000)
/lib/ld-linux.so.2 (0x565fd000)

$ ls -l /usr/lib/libstdc++.so.6
19 Jun 19 2018 /usr/lib/libstdc++.so.6 -> libstdc++.so.6.0.13

$ rpm -qf /usr/lib/libstdc++.so.6.0.13
libstdc++-4.4.7-23.el6.i686

It does not seem to need the compatibility libraries any more. |
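For anyone who wants to run the same dependency check across several machines, here is a minimal sketch that parses `ldd`-style text and reports anything marked "not found". The function names `parse_ldd` and `missing_libs` are made up for illustration. The sample text is copied from the first `ldd` listing in the post above, so nothing here calls `ldd` itself.

```python
import re

# Sample text copied from the first ldd listing in the post above.
SAMPLE = """\
linux-gate.so.1 => (0x00ec5000)
libdl.so.2 => /lib/libdl.so.2 (0x007eb000)
libm.so.6 => /lib/libm.so.6 (0x00aaf000)
libpthread.so.0 => /lib/libpthread.so.0 (0x007c3000)
libc.so.6 => /lib/libc.so.6 (0x0062a000)
/lib/ld-linux.so.2 (0x565a1000)
"""


def parse_ldd(text):
    """Map each library name in ldd-style output to its resolved path.

    The value is None when ldd reports "not found" or gives no path
    (as for the virtual linux-gate.so.1 entry).
    """
    libs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop the trailing load address, e.g. "(0x007eb000)".
        line = re.sub(r"\s*\(0x[0-9a-fA-F]+\)$", "", line)
        if "=>" in line:
            name, _, target = line.partition("=>")
            name, target = name.strip(), target.strip()
            libs[name] = None if target in ("", "not found") else target
        else:
            # Lines without "=>" (the dynamic loader) are already a path.
            libs[line] = line
    return libs


def missing_libs(text):
    """Return the library names that ldd marked as "not found"."""
    return [line.split("=>")[0].strip()
            for line in text.splitlines() if "not found" in line]
```

On a real machine the text could come from running `ldd` on the application binary and capturing its output. An empty list from `missing_libs` would suggest every dependency resolved, which matches what the listing above shows.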
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
"I hope the problem with shutting down boinc client when these are running crashing the v8.08 work-units is fixed. I will try to keep them running. All boinc work-units have permission to keep them in memory even when suspended" All tests on the beta/dev site showed this newer version of the app allows for shutting down the tasks and then restarting them without a problem. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
"All tests on the beta/dev site showed this newer version of the app allows for shutting down the tasks and then restarting them without a problem." They do seem to work through a reboot. I was running four of the new tasks and I suspended two of them (just to see if it made a difference) and left the other two running. I then shut down the boinc client. I then rebooted to Windows, did what was necessary, and rebooted to Linux. Everything came up just fine and these four tasks have run all night, so everything seems fine. One of them even sent a trickle. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
#826: 550 HadAM4 N144 tasks. My machines are all tied up at the moment, so these Linux tasks will likely be gone by the time I have finished other work :( |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,826,970 RAC: 5,066 |
... closely followed by batch #827, which is 75 x Pacific North-West at 25 km for 27 months (batch list). The description is intriguing ... Batch 827: CA camp fire attribution, 21 June 2019, CA camp fire attribution (75 simulations). |
Send message Joined: 20 Jul 05 Posts: 25 Credit: 414,873 RAC: 406 |
I have a question that I am sure is on other people's minds. How come new work gets released when certain people are having trouble uploading work unit parts/tackle to servers because they are full? |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
"I have a question that I am sure is on other people's minds. How come new work gets released when certain people are having trouble uploading work unit parts/tackle to servers because they are full?" This is because the uploads don't all go to the same servers. With the WAH2 tasks, the area is given immediately after WAH2 in the task name. cam tasks (Central America) go to a server in Mexico, as I recall. Other tasks go to servers at universities all over the world. This means that while there may be problems with one task type, others will upload with no problems. There was a time when all the data went to servers at Oxford, but I suspect the vast amounts of data these days would be more than Oxford could cope with! |
Send message Joined: 20 Jul 05 Posts: 25 Credit: 414,873 RAC: 406 |
Thank you for your explanation. |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
To add to Dave's response, PNW tasks go to Oregon State University, here in the U.S. Pacific Northwest. Currently, in addition to the earlier-referenced SAM#822, my machines also have SAFR50 Batch#817 tasks hung. Other tasks are now suspended except the single PNW task (its uploads succeed). "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
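The routing described in the last few posts can be sketched as a lookup keyed on the region code that follows WAH2 in the task name. This is only an illustration: the name layout `wah2_<region><resolution>_...` is an assumption, the task names in the usage note are invented, and the table holds only the two destinations mentioned in this thread (Mexico for cam, as recalled above, and Oregon State University for PNW).

```python
import re

# Destinations mentioned in the thread; any other region is unknown here.
UPLOAD_SITES = {
    "cam": "Mexico (as recalled in the thread)",
    "pnw": "Oregon State University",
}

# Assumed layout: "wah2_<region><resolution>_...", e.g. "wah2_pnw25_...".
_NAME = re.compile(r"^wah2_([a-z]+)(\d+)_", re.IGNORECASE)


def region_of(task_name):
    """Return (region code, resolution) for a WAH2-style name, else None."""
    m = _NAME.match(task_name)
    if m is None:
        return None
    return m.group(1).lower(), int(m.group(2))


def upload_site(task_name):
    """Return the upload destination named in the thread, or None if unknown."""
    region = region_of(task_name)
    if region is None:
        return None
    return UPLOAD_SITES.get(region[0])
```

With an invented name such as `wah2_pnw25_example_0`, `region_of` gives `("pnw", 25)` and `upload_site` gives the Oregon State entry. A SAM50 or SAFR50 name returns `None`, because nobody in the thread says where those uploads go.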
Send message Joined: 28 Dec 17 Posts: 18 Credit: 1,097,261 RAC: 147 |
Hi all. I noticed on the server status page that the number of tasks ready to send is just over 4500 now, but yesterday it was at 11,000 or so. Was there a mass failure of tasks, did some tasks get cancelled, or did a large number of additional computers suddenly become active and snatch up these tasks? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
I think that was because several old batches were closed. The bulk of the results had been returned, and it was hoped to free up some space. |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,826,970 RAC: 5,066 |
A batch #809 download has just failed on one of my machines, so that batch is going to empty out pretty sharpish - and if it's a server rather than a batch problem then other batches too. |
Send message Joined: 18 Feb 06 Posts: 73 Credit: 61,751,384 RAC: 46,505 |
And now there is no more work available. ?? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Thus granting the wishes of several people that there should be no new work until there was lots of server space to accept the uploads. And the weather is so nice we've all Gone Fishing. |
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
Any sign of new work for Windows? The cupboard is getting bare. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
"Any sign of new work for Windows? The cupboard is getting bare." No hints of anything for any OS. But that doesn't always mean a lot, as on a number of occasions work has appeared with little or no warning. |
Send message Joined: 13 Jul 18 Posts: 38 Credit: 62,933,508 RAC: 84,702 |
Unable to reach https://www.climateprediction.net/index.php A good sign people are back and doing things. :-) |
Send message Joined: 4 Oct 15 Posts: 34 Credit: 9,075,151 RAC: 374 |
how are things going? when can we expect new work? |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
"how are things going?" Staff are back from the BOINC conference/workshop in America. Credits have been sorted. New work depends on the researchers at the other end of the system from us crunchers submitting it, and the moderators have no direct contact with them, as they did when the researchers were all based at Oxford. I am reminded of a phrase from my days in the forces: "Hurry up and wait." |