Thread 'Africa v7.22 Errors'

ritterm
Joined: 29 May 08
Posts: 128
Credit: 6,289,876
RAC: 0
Message 51027 - Posted: 24 Dec 2014, 2:15:51 UTC
Last modified: 24 Dec 2014, 2:16:22 UTC

Getting a few errors on the new Africa batch. Wingmen are having problems, too.

Typical WU is 9428038.

Stderr output:
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
couldn't start app: CreateProcess() failed - The system cannot find the file specified.
(0x2)
</message>
]]>

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 51046 - Posted: 26 Dec 2014, 11:34:41 UTC

Finally got some processors free, and got 4 Africas.
3 crashed at 6 seconds with
INITTIME: Atmosphere basis time mismatch

which is a data file mismatch.

I've gone back to EUs for replacements.



ritterm
Joined: 29 May 08
Posts: 128
Credit: 6,289,876
RAC: 0
Message 51047 - Posted: 26 Dec 2014, 12:34:22 UTC - in response to Message 51046.  
Last modified: 26 Dec 2014, 12:35:18 UTC

Les Bayliss wrote:
Finally got some processors free, and got 4 Africas.
3 crashed at 6 seconds...

I haven't had any further problems with the AFRs I've got running now.

Les Bayliss also wrote:
...I've gone back to EUs for replacements.

Before I topped off with AFRs, I wasn't able to get any EUs (see my post here). Are those for Linux only?

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 51048 - Posted: 26 Dec 2014, 14:44:13 UTC - in response to Message 51047.  

Are those for Linux only?

I think that they may be. The applications page shows Windows and Mac as being version 6.09 from 23 Mar 2011, and Linux as 7.23 from 11 Dec 2014, and the ones that I have are version 7.23.



ritterm
Joined: 29 May 08
Posts: 128
Credit: 6,289,876
RAC: 0
Message 51049 - Posted: 26 Dec 2014, 15:26:41 UTC - in response to Message 51048.  

Les Bayliss wrote:
I think that they may be...

Thanks, Les, that's what I was afraid of. I saw the new Linux app listed with the older Windows apps, but was hoping there might still be Windows tasks to run.


Helmer Bryd
Joined: 16 Aug 04
Posts: 156
Credit: 9,035,872
RAC: 2,928
Message 51243 - Posted: 16 Jan 2015, 12:25:26 UTC
Last modified: 16 Jan 2015, 12:31:28 UTC

These latest _afr_7.22 WUs have worked very well on my Linux box.

But now it refuses to download any more, even though the server says it has 4,000+ available.

16-Jan-2015 12:29:43 [climateprediction.net] Sending scheduler request: To fetch work.
16-Jan-2015 12:29:43 [climateprediction.net] Requesting new tasks
16-Jan-2015 12:29:47 [climateprediction.net] Scheduler request completed: got 0 new tasks
16-Jan-2015 12:29:47 [climateprediction.net] Message from server: No work sent
16-Jan-2015 12:29:47 [climateprediction.net] Message from server: UK Met Office HadAM3P-HadRM3P Africa _("is not available for") Linux/x86.
16-Jan-2015 12:29:47 [climateprediction.net] Message from server: No work available for the applications you have selected. Please check your project preferences on the web site.


Huh?
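
For anyone who wants to pull just the server's replies out of a long event log, here is a minimal sketch. It assumes the line layout shown in the log above (timestamp, project name in square brackets, then the text); the function name `server_messages` is made up for illustration and is not part of BOINC.

```python
import re

# Assumed layout, taken from the log lines quoted in this thread:
# "16-Jan-2015 12:29:47 [project name] text"
LINE_RE = re.compile(
    r"^(?P<stamp>\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<project>[^\]]+)\] "
    r"(?P<text>.*)$"
)
PREFIX = "Message from server: "


def server_messages(log_text, project):
    """Return the text of every 'Message from server' line for one project."""
    found = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match or match.group("project") != project:
            continue
        text = match.group("text")
        if text.startswith(PREFIX):
            found.append(text[len(PREFIX):])
    return found


if __name__ == "__main__":
    sample = (
        "16-Jan-2015 12:29:47 [climateprediction.net] "
        "Scheduler request completed: got 0 new tasks\n"
        "16-Jan-2015 12:29:47 [climateprediction.net] "
        "Message from server: No work sent\n"
    )
    for message in server_messages(sample, "climateprediction.net"):
        print(message)
```

Lines that do not match the assumed layout are skipped rather than reported, so a different client version with a different log format would simply return an empty list.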

Alan K
Joined: 22 Feb 06
Posts: 491
Credit: 31,033,903
RAC: 14,766
Message 51244 - Posted: 16 Jan 2015, 12:58:56 UTC - in response to Message 51243.  

The current issues look as if they are Windows and Mac only. Check the applications in the sidebar.

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,824,485
RAC: 4,956
Message 51245 - Posted: 16 Jan 2015, 13:13:03 UTC - in response to Message 51244.  

The current issues look as if they are Windows and Mac only. Check the applications in the sidebar.

... looks like the AFR application was also removed to steer Linux users to the current EU model, which is Linux-only and high priority.

Helmer Bryd
Joined: 16 Aug 04
Posts: 156
Credit: 9,035,872
RAC: 2,928
Message 51246 - Posted: 16 Jan 2015, 14:44:01 UTC - in response to Message 51245.  
Last modified: 16 Jan 2015, 14:53:08 UTC


... looks like the AFR application was also removed to steer Linux users to the current EU model, which is Linux-only and high priority.


That is about the silliest thing I've ever seen here. Most of my completed models are reruns from crashed Windows models, rescued, recovered or whatever.

Well well there are other things to do like Dnetc or Asteroids and..

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 51249 - Posted: 16 Jan 2015, 20:42:02 UTC

A lot of the EU models are resends because they're being grabbed by the "set and forget" crowd of Linux users with missing 32-bit libs. Or, as I call them, "serial killers". Their computers are now being concentrated into one model type, and they are about to be targeted en masse for blocking.


Helmer Bryd
Joined: 16 Aug 04
Posts: 156
Credit: 9,035,872
RAC: 2,928
Message 51250 - Posted: 16 Jan 2015, 22:46:34 UTC

Well, that has nothing to do with _afr_7.22

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 51251 - Posted: 16 Jan 2015, 23:20:42 UTC

The Africa models have also been Linux only for at least a week now.
So, same thing.



Helmer Bryd
Joined: 16 Aug 04
Posts: 156
Credit: 9,035,872
RAC: 2,928
Message 51256 - Posted: 17 Jan 2015, 6:45:41 UTC

I really wonder if you know what you are talking about Les.

Bye bye, there is a big world outside..

Eirik Redd
Joined: 31 Aug 04
Posts: 391
Credit: 219,896,461
RAC: 649
Message 51257 - Posted: 17 Jan 2015, 8:29:25 UTC - in response to Message 51256.  

Big world, getting warmer.
Me, run the models here, despite the various tech problems, because --
Seems good to me to run the models --
Despite the many problems -- that's what science is about -- yeah?
Models fail - whatever that means --
Takes time, especially with the Monte Carlo model, to get useful results.
I keep on running the CPDN because I think (and have a clue about the statistics, and about the problems with Distributed Computing -- )
I think there's possible value here.
repeat
I think there's probable value here at CPDN.


JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 51261 - Posted: 17 Jan 2015, 15:53:12 UTC

What's with the sudden increase in batches of Linux-only WUs? The researchers may like Linux, but most of the crunchers don't. There may be 30,000 computers attached to this project, but I would be willing to bet that about 25,000 of them are running some form of Windows. If they drive them away by not providing work for Windows users, they (the researchers) may find that they have large numbers of tasks and a very small number of Linux computers to run them on. That will slow down the work.


tullio
Joined: 6 Aug 04
Posts: 264
Credit: 965,476
RAC: 0
Message 51262 - Posted: 17 Jan 2015, 17:00:15 UTC - in response to Message 51261.  

The only solution is to adopt a Virtual Machine model, like CERN does. Its Scientific Linux programs can run on Windows, Mac OS X and other Linux distros using a "wrapper" built by CERN around VirtualBox.
Tullio

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,824,485
RAC: 4,956
Message 51263 - Posted: 17 Jan 2015, 18:56:19 UTC - in response to Message 51261.  

What's with the sudden increase in batches of Linux-only WUs? ...

As far as I know, there isn't a change in policy, just some practical scheduling concerns. There are tens of thousands of models for Windows users and Linux users - so no great problem yet.

ed2353
Joined: 15 Feb 06
Posts: 137
Credit: 35,334,752
RAC: 12,890
Message 51304 - Posted: 24 Jan 2015, 16:59:36 UTC - in response to Message 51251.  
Last modified: 24 Jan 2015, 17:00:27 UTC

Les Bayliss wrote:
The Africa models have also been Linux only for at least a week now.
So, same thing.

So why are they still being downloaded to my Windows computer?

However, the AFR models I downloaded at the beginning of January seem to mis-estimate the run time. About halfway through (about 6 trickles out of 12), the original estimated time has already elapsed and the Remaining time actually starts to increase!

I've not seen that happen before.

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 51305 - Posted: 24 Jan 2015, 19:41:25 UTC - in response to Message 51304.  

However, the AFR models I downloaded at the beginning of January seem to mis-estimate the run time. About halfway through (about 6 trickles out of 12), the original estimated time has already elapsed and the Remaining time actually starts to increase!

I've not seen that happen before.

Errors in the time remaining counter aren't uncommon. They are also harmless and don't affect the outcome of the tasks. The hadam3p_afr currently running on my machine is going to take about 200 hours to finish. The time remaining counter reads 72 hours and the "elapsed" counter is at 28. It will probably reach "0" at the 50% point.
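
The arithmetic behind that 50% figure can be sketched as below. This is only an illustration under a simplifying assumption: that the counter counts down from a fixed initial estimate (elapsed plus remaining, so 28 + 72 = 100 hours here). It is not a description of how the BOINC client actually computes its estimate, and both function names are made up for the example.

```python
def projected_total_hours(elapsed_hours, fraction_done):
    """Linear extrapolation of total run time from progress so far."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


def counter_zero_fraction(initial_estimate_hours, true_total_hours):
    """Fraction of the task completed when a fixed countdown would hit zero.

    Assumes the counter just counts down from the initial estimate.
    """
    return min(1.0, initial_estimate_hours / true_total_hours)


if __name__ == "__main__":
    # Numbers from the post above: 28 hours elapsed, 72 hours shown as
    # remaining, and roughly 200 hours expected in total.
    initial_estimate = 28 + 72
    print(counter_zero_fraction(initial_estimate, 200))  # 0.5
```

In other words, an initial estimate that is half the real run time runs out at the halfway mark, and a progress-based figure such as `projected_total_hours` is the more useful number to watch after that.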


Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 51306 - Posted: 24 Jan 2015, 20:13:34 UTC - in response to Message 51304.  

So why are they still being downloaded to my Windows computer?


Because the available applications for the various models have been changed yet again, now that the newer Mac apps have been removed.
