climateprediction.net (CPDN) home page
Thread 'Workunit status, new charts'

tullus

Joined: 16 May 13
Posts: 48
Credit: 475,901
RAC: 0
Message 50599 - Posted: 24 Oct 2014, 5:15:07 UTC

I am happy to announce a new service:
http://ob.cakebox.net/cpdn_status/cpdn_tasks.html
This will run in addition to
http://ob.cakebox.net/cpdn_status/server_status.html
which is discussed here: http://climateapps2.oerc.ox.ac.uk/cpdnboinc/forum_thread.php?id=7778

One feature I have been thinking about is letting you filter the data by, for example, operating system, BOINC version, runtime, or application. I am just not sure how to implement it. The simplest approach would be to pre-code some charts. Is there a particular "breakdown" that you would like to see? For instance, I see that "runtime before failure" is being discussed in the forum; would that be interesting?

The description could be better; if anyone understands it, feel free to suggest a better one.

Let me know what you think!

Please note: I have asked Jonathan Miller whether the script puts any strain on the CPDN server, and I am waiting for a response. (I am guessing the answer will be no.)
tullus

Joined: 16 May 13
Posts: 48
Credit: 475,901
RAC: 0
Message 50850 - Posted: 19 Nov 2014, 9:02:45 UTC

In an attempt to look at the operating system dependency for failed model runs I added another chart (bottom):
http://ob.cakebox.net/cpdn_status/cpdn_tasks.html

The chart is based on WUs (workunits) that one computer has reported as "Completed" while others have reported "Error while computing". Note that currently 49% of the analysed WUs of this type failed on Windows but later succeeded on Windows, so there is no clear trend there. However, the total number that succeeded on Darwin (Mac) is terribly low, since the short WUs seem to fail on Macs.
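As a rough illustration of how a breakdown like this could be computed (a sketch only, not the script behind the chart), the Python below assumes each WU is available as a list of (operating system, outcome) pairs. The function name `mixed_outcome_breakdown` and the input layout are hypothetical; only the two outcome strings are taken from the description above.

```python
from collections import Counter

COMPLETED = "Completed"
ERROR = "Error while computing"


def mixed_outcome_breakdown(workunits):
    """Return the percentage of (failed_os, succeeded_os) pairs.

    `workunits` is a hypothetical mapping: WU id -> list of (os_name, outcome).
    Only WUs with at least one success AND at least one error are counted.
    """
    pairs = Counter()
    for results in workunits.values():
        ok = {os_name for os_name, outcome in results if outcome == COMPLETED}
        bad = {os_name for os_name, outcome in results if outcome == ERROR}
        if not ok or not bad:
            continue  # not a "mixed" WU, so it is outside this chart
        for failed_os in bad:
            for succeeded_os in ok:
                pairs[(failed_os, succeeded_os)] += 1
    total = sum(pairs.values())
    if total == 0:
        return {}
    return {pair: 100.0 * count / total for pair, count in pairs.items()}


# Made-up sample data, purely for illustration.
sample = {
    1: [("Windows", ERROR), ("Windows", COMPLETED)],
    2: [("Darwin", ERROR), ("Linux", COMPLETED)],
    3: [("Linux", COMPLETED)],   # ignored: no failure
    4: [("Windows", ERROR)],     # ignored: no success
}
breakdown = mixed_outcome_breakdown(sample)
```

A pair such as ("Windows", "Windows") then corresponds to the "failed on Windows, later succeeded on Windows" share mentioned above.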

Ideas for improvements are welcome.
Iain Inglis
Volunteer moderator

Joined: 16 Jan 10
Posts: 1084
Credit: 7,826,970
RAC: 5,066
Message 50866 - Posted: 20 Nov 2014, 23:16:57 UTC - in response to Message 50850.  
Last modified: 20 Nov 2014, 23:22:20 UTC

Interesting stuff. I would like to know if any Mac has ever completed a model with 7 in the application version number.

[Edit: I mean the CPDN version number not the BOINC Manager version number.]
tullus

Joined: 16 May 13
Posts: 48
Credit: 475,901
RAC: 0
Message 50869 - Posted: 21 Nov 2014, 7:57:26 UTC - in response to Message 50866.  
Last modified: 21 Nov 2014, 7:58:55 UTC

Far from perfect, but I made the following hack: it only looks at the "UK Met Office HadCM3 short" application (no version number specified):
http://ob.cakebox.net/cpdn_status/cpdn_tasks_short.html
(Currently not configured to update automatically)

No tasks were found to be successful on any of the Macs that the script looked at. For Darwin, the "Total failure" was 12% and the "Total success" was 0% (as the table at the bottom tries to show).

A note on the choice of computers to analyse is in order, as it will bias the results considerably:
- First, the top 20 hosts are visited: http://climateapps2.oerc.ox.ac.uk/cpdnboinc/top_hosts.php
- Then the remaining 30 hosts are chosen at random from the "discovered" wingmen.

This means that if there is a single Mac out there that completes all of its short models, it will not show up as a wingman for any other tasks (since CPDN does not use redundancy), and hence will not be visited (assuming it is not in the top 20 list).
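The selection steps above can be sketched roughly as follows. This is a minimal Python illustration under assumptions, not the actual script: `choose_hosts`, its parameters, and the idea that hosts are plain ID values are all hypothetical.

```python
import random


def choose_hosts(top_hosts, discovered_wingmen, n_top=20, n_random=30, seed=None):
    """Pick the first `n_top` top hosts, then `n_random` random wingmen.

    Both inputs are assumed to be lists of host IDs (hypothetical layout).
    """
    rng = random.Random(seed)
    chosen = list(top_hosts[:n_top])
    chosen_set = set(chosen)
    # De-duplicate wingmen (order preserved) and skip hosts already chosen.
    pool = [h for h in dict.fromkeys(discovered_wingmen) if h not in chosen_set]
    chosen += rng.sample(pool, min(n_random, len(pool)))
    return chosen


# Made-up host IDs, purely for illustration.
top = list(range(100, 130))
wingmen = list(range(0, 50)) + [100, 101]
hosts = choose_hosts(top, wingmen, seed=1)
```

A host that is neither in the top list nor ever discovered as a wingman can never be returned by this function, which is exactly the bias described above.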

There are of course a number of ways to improve this algorithm...
tullus

Joined: 16 May 13
Posts: 48
Credit: 475,901
RAC: 0
Message 50870 - Posted: 21 Nov 2014, 8:16:16 UTC - in response to Message 50866.  

Interesting stuff. I would like to know if any Mac has ever completed a model with 7 in the application version number.

[Edit: I mean the CPDN version number not the BOINC Manager version number.]


Ahh, I think I might have misunderstood. Yes, that would indeed be interesting. The version number is currently not parsed; I have added it to the to-do list.
Iain Inglis
Volunteer moderator

Joined: 16 Jan 10
Posts: 1084
Credit: 7,826,970
RAC: 5,066
Message 50873 - Posted: 21 Nov 2014, 10:13:08 UTC

Thanks, tullus. That rather confirms my prejudice. I know that my Mac can't run to completion any 7-series model - it would appear, for HADCM3S at least, that the problem is not particular to my machine but general to all Macs.

(I've flagged this on beta to the project staff, in case they replace HADAM3P/EU/ANZ with 7-series versions, in which case Mac users would be left with only HADCM3N to run - which is another kind of challenge!)
Dave Roberts

Joined: 15 Jan 11
Posts: 175
Credit: 6,242,691
RAC: 699
Message 50874 - Posted: 21 Nov 2014, 19:18:38 UTC

All my 7-series tasks for HADCM3S have come up with a compute error near the end of processing. Would it be better not to accept any 7-series tasks on a Mac at the moment?
If so, should I abort any of these that are currently in my queue?
Iain Inglis
Volunteer moderator

Joined: 16 Jan 10
Posts: 1084
Credit: 7,826,970
RAC: 5,066
Message 50875 - Posted: 21 Nov 2014, 19:26:23 UTC - in response to Message 50874.  

That is my belief. However, I haven't established that definitively for all Macs, just my own and the extension by tullus here. There's work in beta that might fix this soon, which will restore the full spectrum of models to Mac users and fix other problems as well. (That is of course to count one's beta chickens before they are hatched.)
Dave Roberts

Joined: 15 Jan 11
Posts: 175
Credit: 6,242,691
RAC: 699
Message 50876 - Posted: 21 Nov 2014, 19:57:33 UTC

Thanks Iain, that was my feeling too. Let's hope any new hatchlings are well bred.
