climateprediction.net (CPDN) home page
Thread '72 days for wah2_sam25?'

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4542
Credit: 19,039,635
RAC: 18,944
Message 59134 - Posted: 6 Dec 2018, 8:38:11 UTC
Last modified: 6 Dec 2018, 8:39:32 UTC

At least the long tasks reduce the number of people posting about no work.

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 59135 - Posted: 6 Dec 2018, 16:35:59 UTC - in response to Message 59134.  

I now have a semi-long task that has run for 1 1/2 days, and will run for 24 days more:
wah2_pnw25_cg13_204909_121_747_011620721_2

That is OK, once you know it is not a goof on their part, and that they really mean it. I run my machines 24/7 anyway. I will have some work through the New Year.

bernard_ivo
Joined: 18 Jul 13
Posts: 438
Credit: 25,805,483
RAC: 8,941
Message 59167 - Posted: 15 Dec 2018, 9:21:31 UTC - in response to Message 59127.  

OK, a batch 766 WU I ran under Wine completed successfully (https://www.cpdn.org/cpdnboinc/result.php?resultid=21353110). It was created on 4 Nov.

These sam25, however... between 100 (Win7) and 130 days (Wine)... I really do not think going down the backup micromanagement path is good for the project.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4542
Credit: 19,039,635
RAC: 18,944
Message 59168 - Posted: 15 Dec 2018, 11:04:40 UTC - in response to Message 59167.  
Last modified: 15 Dec 2018, 11:05:48 UTC

"I really do not think going down the backup micromanagement path is good for the project"


It is never going to be an option for the vast majority who set and forget. Even those of us who know a little more can easily make a mistake.

I know that from the last time I played with it. I have successfully backed up everything and restarted all models from before a crash, but that was in the days when I never had more than 2 cores.

Edit: On a fast machine, micromanaging would probably take more time for me than resuming all from before crash!

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 59170 - Posted: 15 Dec 2018, 13:28:03 UTC - in response to Message 59168.  

"Edit: On a fast machine, micromanaging would probably take more time for me than resuming all from before crash!"

That is why I like the earlier suggestion to allow us the option of running longs or shorts. That would allow us to assign them to machines most suitable.

I think a fast CPU could probably do them in around 30 days, which should be short enough to avoid most disasters, at least with a backup power supply. It would avoid a lot of re-sends of failed tasks, with most of them going to people with slow machines also, and so the cycle would repeat.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 59171 - Posted: 15 Dec 2018, 21:27:33 UTC

"... allow us the option of running longs or shorts ..."


The problem with that is that "the old hands" here are vastly outnumbered by the newcomers, who will mostly be set and forget, and waste lots of tasks.
Just have the occasional look at the last page of new joiners, (from the Main page), to see how many have computers, and of those, how many are computers that are going to be useful. A few weeks back there were a few 32 processor Macs, with the latest OS, which is 64 bit only.

And the project people probably don't have time to sort out a more refined system. They are, after all, professors and associate professors, who seem to travel often for conferences.
The day-to-day job of assembling all of the files into a single model is left to lesser mortals, who seem to be climate physicists, but young and enthusiastic.
And are like kids in a candy store: "I want one of everything".

There's an awful lot of data that can be returned from each model, which quickly builds up the size of the zip files. So they need to be a bit thoughtful about what is really needed, and what would be "nice to get while it's there". Especially with the larger areas, that seem to be getting attention lately.

And then there's the network speed needed to send back the data. ADSL doesn't cut it these days. :)

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4542
Credit: 19,039,635
RAC: 18,944
Message 59172 - Posted: 16 Dec 2018, 7:16:34 UTC - in response to Message 59171.  

"And then there's the network speed needed to send back the data. ADSL doesn't cut it these days. :)"


A while ago, we had a lodger in our spare room who was into gaming. When an upload started, howls of anguish could be heard from his room!

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 59176 - Posted: 16 Dec 2018, 14:22:58 UTC - in response to Message 59171.  

Thanks for the insight. But I have 50 Mbps down and 10 Mbps up, and can build faster machines.
I think the limitation is on their end, especially in the management. They seem much better at climate models than IT operations, if I may say so.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4542
Credit: 19,039,635
RAC: 18,944
Message 59177 - Posted: 16 Dec 2018, 16:05:06 UTC - in response to Message 59176.  

"But I have 50 Mbps down and 10 Mbps up"


Nice,
I have 6 Mbps down with a following wind and less than 1 up. Until earlier this year, it was even slower. If all my tasks are the ones with 100 MB plus on each upload, it can take a while!
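
To put rough numbers on the uplink speeds mentioned in this exchange, here is a minimal Python sketch. It assumes an ideal link with no protocol overhead or contention, and it takes "100 MB" to mean 100 megabytes per upload; the function name `upload_seconds` is made up for illustration and is not part of BOINC or CPDN.

```python
def upload_seconds(size_megabytes: float, uplink_mbps: float) -> float:
    """Ideal transfer time: convert megabytes to megabits, divide by link rate."""
    if uplink_mbps <= 0:
        raise ValueError("uplink_mbps must be positive")
    return size_megabytes * 8 / uplink_mbps


# Uplink rates taken from the posts above; the 1.0 figure stands in for
# "less than 1 up", so the real time on that line would be longer.
for label, mbps in [("about 1 Mbps up", 1.0), ("10 Mbps up", 10.0)]:
    seconds = upload_seconds(100, mbps)
    print(f"100 MB at {label}: {seconds / 60:.1f} minutes")
```

Real uploads will be slower than this, since the estimate ignores overhead and anything else sharing the connection.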

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 59181 - Posted: 16 Dec 2018, 18:19:53 UTC - in response to Message 59177.  

"If all my tasks are the ones with 100 MB plus on each upload, it can take a while!"

We have all been there. That is why they should allow some selection, or they will have problems getting their stuff completed.

I assume that they want to go to larger models too, with more data points. That will make for more impact for their project.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 59185 - Posted: 16 Dec 2018, 19:56:00 UTC

I've recently talked to Sarah about both the length of the models, and the size of the zips. Except, as I said a couple of posts ago, "conferences".

But reading between the lines of the private test area, it is being considered in a future project even as we speak.


I think that perhaps a re-think by the regulars here may work faster. As in:

1) DON'T grab a huge stockpile of tasks the moment new ones start to appear.
Just one to start, and then RUN IT IMMEDIATELY to see if there's a 6-30 second failure problem.
And REPORT any such immediately.
Also, check the details in the title for clues about length.

2) DON'T use slow computers for cpdn. Use those for other projects.

3) Note the area it's for. You can see how big the area is from the map here: Regions
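
As a rough illustration of checking a task name for clues, the sketch below splits the name quoted earlier in this thread on underscores. The field meanings are assumptions, not a documented CPDN naming scheme: the second field looks like a region code (compare "sam25" in the thread title), and the sixth looks like a batch number (compare the 7xx batches mentioned in this thread). The function name `split_task_name` is hypothetical.

```python
def split_task_name(name: str) -> list[str]:
    """Split a task name into its underscore-separated fields."""
    return name.split("_")


fields = split_task_name("wah2_pnw25_cg13_204909_121_747_011620721_2")

# ASSUMPTION: positions 1 and 5 are the region code and batch number.
# This is inferred from the names in this thread, not from CPDN documentation.
region_guess = fields[1]
batch_guess = fields[5]
print(f"region (assumed): {region_guess}, batch (assumed): {batch_guess}")
```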

And I did foretell all of this in a post not too far above this one, 3 years ago: Models are getting more detailed, and therefore bigger

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4542
Credit: 19,039,635
RAC: 18,944
Message 59187 - Posted: 16 Dec 2018, 20:50:56 UTC - in response to Message 59185.  

"DON'T use slow computers for cpdn."


Which leaves the question of what is "slow"?

I have a laptop that seems to be taking 30 days for the longest tasks I have downloaded recently and a desktop that takes a bit longer.

They are 2.16 and 2.70 GHz respectively. If both machines were a lot faster, it would be the uploads that presented a problem. I am happy with that sort of turnaround time.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 59189 - Posted: 16 Dec 2018, 21:19:00 UTC

That is, indeed, the question.

As far as speed is concerned, somewhere around 3.00 GHz perhaps.

BUT
My old HP is 2.80 GHz, not far below the 3.5 of the latest that I have, and it's very slow in comparison.
Perhaps because it's an old 860 processor.

So the architecture is also a big consideration. And built-in energy saving measures which slow things down at times.
And the size of the internal caches and pathways.

Possibly it comes down to desktop processors, as against laptop processors.

The best test may be the Average (sec/TS) values for the same batch, on different computers.

I'm definitely NOT going to be using the old Lenovo again!
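
A minimal sketch of how an Average (sec/TS) figure translates into wall-clock days, for comparing two machines on the same batch. The timestep count and the two sec/TS values below are invented purely for illustration and are not taken from any real CPDN task; `estimated_days` is a hypothetical helper.

```python
SECONDS_PER_DAY = 86_400


def estimated_days(sec_per_timestep: float, total_timesteps: int) -> float:
    """Wall-clock days for a task, assuming it runs continuously at a steady rate."""
    return sec_per_timestep * total_timesteps / SECONDS_PER_DAY


# HYPOTHETICAL numbers: same task length, two machines with different sec/TS.
total_timesteps = 1_000_000
for machine, sec_per_ts in [("faster desktop", 2.5), ("older machine", 6.0)]:
    days = estimated_days(sec_per_ts, total_timesteps)
    print(f"{machine}: {days:.1f} days")
```

Because the timestep count cancels out when comparing machines on the same batch, the ratio of the two sec/TS values is the ratio of the run times.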

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4542
Credit: 19,039,635
RAC: 18,944
Message 59190 - Posted: 17 Dec 2018, 8:41:55 UTC
Last modified: 17 Dec 2018, 8:57:48 UTC

"So the architecture is also a big consideration"


Yes, my laptop is faster than the desktop despite a clock speed about 0.5 GHz slower.

"I'm definitely NOT going to be using the old Lenovo again!"


I have retired my 32-bit Atom-powered netbook from BOINC, never mind from CPDN. It would take over 6 months to complete some of the tasks we get now. On most other projects, I would have tasks still running past the deadline, which, unlike with CPDN, is usually enforced.

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 59194 - Posted: 17 Dec 2018, 16:40:02 UTC - in response to Message 59190.  

"I have retired my 32-bit Atom-powered netbook from BOINC, never mind from CPDN. It would take over 6 months to complete some of the tasks we get now. On most other projects, I would have tasks still running past the deadline, which, unlike with CPDN, is usually enforced."


I’m not sure the Atom-powered machines were ever really fast enough to run CPDN. The shorter BOINC projects, maybe?

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4542
Credit: 19,039,635
RAC: 18,944
Message 59196 - Posted: 17 Dec 2018, 21:15:37 UTC - in response to Message 59194.  

"I’m not sure the Atom-powered machines were ever really fast enough to run CPDN. The shorter BOINC projects, maybe?"


Still quite a bit faster than the first machines I ran CPDN on, but at that time I was on dial-up and all tasks took a very long time.

ed2353
Joined: 15 Feb 06
Posts: 137
Credit: 35,517,114
RAC: 10,523
Message 59572 - Posted: 9 Feb 2019, 11:29:07 UTC

With nothing else available, my computer is working on two 762s, one 763 and two 764s. These are long-running SAM25s. One of them is now halfway through the estimated run time of 72 days; the others are around a quarter of the way.

Are these still of scientific value, or am I wasting electricity to run them?
They were giving me credits before the recent hiatus.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4542
Credit: 19,039,635
RAC: 18,944
Message 59573 - Posted: 9 Feb 2019, 15:30:50 UTC - in response to Message 59572.  

"With nothing else available, my computer is working on two 762s, one 763 and two 764s. These are long-running SAM25s. One of them is now halfway through the estimated run time of 72 days; the others are around a quarter of the way.

Are these still of scientific value, or am I wasting electricity to run them?
They were giving me credits before the recent hiatus."


Yes, these are known to be long models and the data will be of use.

Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 59575 - Posted: 9 Feb 2019, 16:48:05 UTC - in response to Message 59185.  

"2) DON'T use slow computers for cpdn. Use those for other projects."


Is mine a slow computer? It runs 24/7. Tasks, when I get them, take only a few days of wall-clock time.

10-Jan-2019 23:46:28 [---] max memory usage when active: 5845.24MB
10-Jan-2019 23:46:28 [---] max memory usage when idle: 7014.29MB
10-Jan-2019 23:46:28 [---] max disk usage: 21.63GB
10-Jan-2019 23:46:28 [---] max download rate: 9600000 bytes/sec
10-Jan-2019 23:46:28 [---] max upload rate: 9600000 bytes/sec

10-Jan-2019 23:46:28 [---] Running CPU benchmarks
10-Jan-2019 23:46:28 [---] Suspending computation - CPU benchmarks in progress
10-Jan-2019 23:47:00 [---] Benchmark results:
10-Jan-2019 23:47:00 [---] Number of CPUs: 4
10-Jan-2019 23:47:00 [---] 1277 floating point MIPS (Whetstone) per CPU
10-Jan-2019 23:47:00 [---] 3505 integer MIPS (Dhrystone) per CPU
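
For comparing machines, the benchmark figures can be pulled out of log text like the above with a short script. This is only a sketch: the regular expression is written against the three lines quoted in this post, and other BOINC client versions may word their log lines differently. The name `parse_benchmarks` is hypothetical.

```python
import re

# The benchmark lines exactly as quoted in the post above.
LOG = """\
10-Jan-2019 23:47:00 [---] Number of CPUs: 4
10-Jan-2019 23:47:00 [---] 1277 floating point MIPS (Whetstone) per CPU
10-Jan-2019 23:47:00 [---] 3505 integer MIPS (Dhrystone) per CPU
"""

# ASSUMPTION: the wording matches the lines above; adjust for other formats.
PATTERN = re.compile(
    r"(\d+) (?:floating point|integer) MIPS \((Whetstone|Dhrystone)\) per CPU"
)


def parse_benchmarks(log_text: str) -> dict[str, int]:
    """Return a mapping of benchmark name to MIPS per CPU."""
    return {name: int(value) for value, name in PATTERN.findall(log_text)}


results = parse_benchmarks(LOG)
print(results)
```

Note that these synthetic benchmarks are only a loose guide; the Average (sec/TS) for the same batch, as suggested earlier in the thread, is a more direct comparison.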

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 59576 - Posted: 9 Feb 2019, 17:37:26 UTC - in response to Message 59575.  

Well, your Xeon only runs at 1.8 GHz and has no turbo boost, so it's running at half the speed of some of the faster processors out there. On the plus side, it can use quad-channel memory bandwidth if populated with 4 DIMMs. On the negative side, though, it officially supports only up to DDR3-1066.

So, given the relatively low CPU clock speed, it's certainly not a speedy computer. For most tasks it chugs along just fine; it's those big/longer tasks where the difference in speed really shows up.

©2024 cpdn.org