Number crunching: New work Discussion
Author | Message |
---|---|
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
A new batch (784) of hadcm3s just came out, and I got four of them. The first one has been running for 45 minutes with no problems, so they could work. |
Send message Joined: 15 May 09 Posts: 4537 Credit: 19,001,532 RAC: 21,726 |
Quote: "A new batch (784) of hadcm3s just came out, and I got four of them. The first one has been running for 45 minutes with no problems, so they could work." I just tried to get some but got the "database down" message, so I have to wait an hour before having another go. I know Andy knows about it, but I'm not sure why the machine should be so busy that it is causing problems at the moment. Edit: Took 8 attempts to post the above. Now to see how long the edit takes.... |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,803,756 RAC: 5,187 |
There are 3061 of them, but I can't get any for my Mac - which can only run HADCM3S - because of the database being "down" (batch list). |
Send message Joined: 15 May 09 Posts: 4537 Credit: 19,001,532 RAC: 21,726 |
Have snagged one on my desktop machine now. |
Send message Joined: 15 May 09 Posts: 4537 Credit: 19,001,532 RAC: 21,726 |
New model type for batch 785: HadAM4. I don't know what is different about this model type, but across two machines I have three of them running under Linux at the moment. If I understand what I have read correctly, this is a relatively small batch of 500 tasks, so they won't last long, especially if, as I suspect, they run on all three platforms. I have seen some on Windows of various types. I have not found any on Mac yet, but that means nothing as I only looked at about ten or twelve tasks. |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
All 100 of the running tasks I looked at were on the Linux app. I had tried Windows first when I saw there were new tasks, but the Windows clients wouldn't pick any up. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Yes, the new app is Linux only. (Yea!) And I've got some running on one computer, which now has both Linux/Wine/Windows, and Linux only. (Yea!) I just need to remember which icon starts which version. And I've come across the first mass killer, who is now running this batch. :( Batch 785 is a small spinup batch. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Now a bit over 3% at a bit over 6 hours on my 3.50 GHz Haswell computer, so about 8.5 days total. |
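For anyone wanting to repeat that back-of-envelope estimate, here is a minimal sketch in Python. It assumes progress is linear in elapsed time, which a real model run may not be, and the function name is made up for illustration; it is not part of BOINC or the CPDN apps.

```python
def estimate_total_days(elapsed_hours, fraction_done):
    """Extrapolate total run time in days, assuming linear progress."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done / 24.0


# Exactly 3% after exactly 6 hours works out to roughly 8.3 days; the
# "a bit over" on both figures in the post is consistent with about 8.5.
print(round(estimate_total_days(6.0, 0.03), 1))
```

Plugging in the percentage and elapsed time shown in the BOINC manager gives a comparable figure for any other machine.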
Send message Joined: 15 May 09 Posts: 4537 Credit: 19,001,532 RAC: 21,726 |
A lot seem to be crashing with "Model crashed: READDUMP: BAD BUFFIN OF DATA". This happened to four on testing when someone put a digger through a mains cable near where I live. It is happening anywhere from shortly after the model starts to several hours in. I suspect they don't like being interrupted. This is one example: https://www.cpdn.org/cpdnboinc/result.php?resultid=21488816 It looks like about one in six of those that don't fail due to missing libraries are failing with the above error. A few other errors were also spotted: a few with insufficient stack memory available, and one where the user had restricted memory usage to half a gig. Interestingly, that one had completed some hadcm3s tasks. |
Send message Joined: 15 May 09 Posts: 4537 Credit: 19,001,532 RAC: 21,726 |
And the new task type is now on the project Status page. |
Send message Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275 |
I had one of those failures with the bad buffin data when I stopped BOINC several timesteps after a checkpoint. There should have been nothing wrong with doing it. It was the only task running on that PC at the time. Not good. These things might be the biggest memory hogs in terms of active memory that we've had. Each model task takes about 650 MB of RAM, so for a fully loaded i7 with 8 tasks, about 5.5 GB of RAM used. I would imagine that, given cache and memory contention in that circumstance, it would REALLY slow model progress relative to some of our other model types/regions. My reasonably quick PCs running only 1 model each are averaging 7.5 to 9.5 sec/TS. |
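The memory arithmetic above can be sketched in a few lines of Python. This is only an illustration: the 650 MB per task is the approximate figure from the post, the function names are hypothetical, and the straight product for 8 tasks comes to about 5.2 GB, a little under the "about 5.5 GB" quoted (the thread does not say what accounts for the difference).

```python
def total_ram_gb(tasks, mb_per_task=650):
    """Approximate RAM in GB (1 GB = 1000 MB) used by `tasks` models."""
    return tasks * mb_per_task / 1000.0


def max_tasks(ram_mb, mb_per_task=650, reserved_mb=0):
    """How many models fit in `ram_mb`, leaving `reserved_mb` for the OS."""
    return max(0, (ram_mb - reserved_mb) // mb_per_task)


print(total_ram_gb(8))   # 8 tasks at ~650 MB each
print(max_tasks(1024))   # a hypothetical machine with 1 GB of RAM
```

The `reserved_mb` parameter is an assumption added here so that some headroom can be left for the operating system and other programs.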
Send message Joined: 15 May 09 Posts: 4537 Credit: 19,001,532 RAC: 21,726 |
Thanks George. Quote: "Each model task takes about 650 MB of RAM, so for a fully loaded i7 with 8 tasks, about 5.5 GB of RAM used. I would imagine given cache and memory contention, in that circumstance...." I hadn't looked at how much memory was being used, but what you say makes sense. There are a few machines out there that have crashed tasks due to running out of memory. One of them was admittedly a 4-core i7 with only 1 GB of RAM! |
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
Any sign of new work for Windows? I will have 4 empty cores by tomorrow. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Finally got to the zips. About 7.3M on average, so much smaller than what we've had for a while. I can confirm what George said about the virtual memory size. Definitely not for bare-bones machines. 14.0 sec/TS for the Haswell, and about 14.4 sec/TS for the Ivy Bridge. |
Send message Joined: 15 May 09 Posts: 4537 Credit: 19,001,532 RAC: 21,726 |
Quote: "Any sign of new work for Windows? I will have 4 empty cores by tomorrow." I think there is a batch in the pipeline. New files were uploaded to a potential cam25 batch at about 01:00 UTC, so someone was working late if based in Oxford! |
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
Hopefully, a large batch. |
Send message Joined: 18 Feb 06 Posts: 73 Credit: 61,570,550 RAC: 47,758 |
I have 20 cores idle. How long do we have to wait for new Windows work for CPDN? Just a question: is this the end, or is there some hope for better days? Thanks. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
It's not the end, just normal. There were several large batches released late last year. Now those researchers are waiting for the data to be returned so that they can study the results. And at long last work is in hand on new Linux models, for those of us who haven't had any work for a long time. And lots more people are joining the project every day, so there's no shortage of computers waiting. |
Send message Joined: 18 Feb 06 Posts: 73 Credit: 61,570,550 RAC: 47,758 |
Thanks, now there is new work. |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,803,756 RAC: 5,187 |
There are 4000 units for East Asia at 50 km at 18 month duration (batch list). |
©2024 cpdn.org