climateprediction.net (CPDN) home page
Thread 'no credit awarded?'

Richard Haselgrove

Joined: 1 Jan 07
Posts: 1061
Credit: 36,729,836
RAC: 7,099
Message 68546 - Posted: 3 Mar 2023, 8:13:18 UTC - in response to Message 68541.  
Last modified: 3 Mar 2023, 8:14:57 UTC

There have been times in the past when credit hasn't shown despite zips being on the website but most of them have been when there are problems with the credit script having fallen over or not been restarted after an event of some kind. There have also been times when the credits have appeared despite zips not showing on the task pages, presumably because the problem occurs after the processes to display them and the ones to go into the credit script separate.
I think that's just a simple matter of timing. The original system had two scripts - one to copy the trickles to a place where they could be seen on the website and used in credit calculations: and the other to work out the actual credit and RAC. They both took several hours to run, and the first had to finish before the second one started, otherwise some hosts got missed (that was another problem).

One script ran on an interval basis: "every 24 hours (then) since the project had last been restarted". The other ran as a cron job: "at hh:mm o'clock every day". If emergency maintenance meant that the project had to be restarted at an unusual time of day, those timings could clash, and credit was erratic until the staff could get round to an orderly, planned, restart - with a check that every component was active, and running in the right sequence. Until the next time ...
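To make the clash concrete, here is a minimal Python sketch. The thread does not say which script was which, so it assumes the trickle-copy script is the interval-based one (starting at the restart time and repeating every 24 hours) and the credit script is the cron job. The function name, the run length and the cron hour are illustrative, not CPDN's real values, and the check covers only the one condition described above: the copy must finish before the credit run starts.

```python
from datetime import datetime, timedelta

def credit_run_is_safe(restart, copy_hours, cron_hour, cron_minute=0):
    """Return True if the interval-based copy run finishes before the
    next cron-based credit run starts (all timings are assumptions)."""
    # Assumption: the copy run's daily phase is fixed by the restart time.
    copy_start = restart
    copy_end = copy_start + timedelta(hours=copy_hours)
    # Next firing of the daily cron job at or after the copy run starts.
    credit_start = copy_start.replace(
        hour=cron_hour, minute=cron_minute, second=0, microsecond=0
    )
    if credit_start < copy_start:
        credit_start += timedelta(days=1)
    return copy_end <= credit_start

# Planned restart at midnight, 4-hour copy, credit cron at 06:00.
print(credit_run_is_safe(datetime(2023, 3, 1, 0, 0), 4, 6))   # True
# Emergency restart at 04:30: the copy is still running at 06:00.
print(credit_run_is_safe(datetime(2023, 3, 1, 4, 30), 4, 6))  # False
```

Because both jobs repeat on a 24-hour cycle, a bad phase set by one unplanned restart would persist every day until the next restart, which fits the "erratic until an orderly restart" behaviour described.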

I don't know what the current mechanism is supposed to be: just that it doesn't appear to be going to plan. If my offer to take a look is taken up, I suppose the first question is: "can you supply me with a schematic flow-chart of the expected credit system as it stands now?". If they don't have one to hand, then drawing one up would be a useful first step.
ID: 68546
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,545,204
RAC: 16,601
Message 68547 - Posted: 3 Mar 2023, 13:23:29 UTC - in response to Message 68546.  

One script ran on an interval basis: "every 24 hours (then) since the project had last been restarted". The other ran as a cron job: "at hh:mm o'clock every day". If emergency maintenance meant that the project had to be restarted at an unusual time of day, those timings could clash, and credit was erratic until the staff could get round to an orderly, planned, restart - with a check that every component was active, and running in the right sequence. Until the next time ...
That's a bizarre way of doing it. If there's a dependency between the scripts, either they should both be in the same cron job, or the first script should fire a trigger on completion that starts the second.
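As a rough illustration of the "same job" option, the sketch below runs a list of commands strictly in order and stops at the first failure, so the second step can never start before the first has finished. The script names in the usage comment are hypothetical; the real CPDN scripts are not named in this thread.

```python
import subprocess
import sys

def run_in_sequence(commands):
    """Run each command in order and stop at the first failure.

    Returns 0 if every step succeeded, otherwise the exit code of the
    step that failed (later steps are not started)."""
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0

# Hypothetical usage from a single scheduled job:
#   run_in_sequence([
#       [sys.executable, "copy_trickles.py"],      # assumed name
#       [sys.executable, "calculate_credit.py"],   # assumed name
#   ])
```

Scheduling this one wrapper, rather than two independent timers, removes the need for the two start times to stay in step after a restart.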

Andy did tell me the credit script had been disrupted way back at the beginning of the year - he didn't go into details. That might be part of it but it doesn't seem to be working now either.
---
CPDN Visiting Scientist
ID: 68547
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,545,204
RAC: 16,601
Message 68548 - Posted: 3 Mar 2023, 13:30:50 UTC - in response to Message 68540.  

The problem seems to have started around the same time OIFS got released. OIFS is credited as all or nothing, and trickles never show up on the website. Hadley is credited per trickle and they (should) show up on the website.
It's not related to when OpenIFS was released. OpenIFS tasks first went out years ago. The models know nothing about each other, the controlling code is completely different between the two (though that might be the cause of the problem).
I'm not sure what you mean, as I don't know what went on behind the scenes, but I got my first OIFS tasks on 28 November of last year, and my first un-credited Hadley tasks were reported as completed on 30 November. So what I see is that the OIFS production release happened around the same time as Hadley models stopped getting credit. It seems like there might be a connection there somehow.

OpenIFS first appeared on the production CPDN site in 2020. There is a paper in the scientific literature based on the results from those batches. Then there was a long pause when the model was updated but small batches were released prior to the very big batches we saw end of last year. There has been no change to the way trickles are handled from the task/client side since 2020. I think the issues are at the CPDN server end.
ID: 68548
Ingleside

Joined: 5 Aug 04
Posts: 127
Credit: 24,517,158
RAC: 18,287
Message 68549 - Posted: 3 Mar 2023, 13:45:24 UTC - in response to Message 68546.  
Last modified: 3 Mar 2023, 13:47:18 UTC

The original system had two scripts - one to copy the trickles to a place where they could be seen on the website

This script, or whatever was supposed to replace this script, clearly isn't working, as seen from the "No trickle!" message on the website.

Based on the 11 August 2022 batch of WAH2 work, where trickles did work in August but not in December (when the original issue had errored out), it doesn't look like a misconfiguration of the actual WU.
Instead, some possibilities include:
1: The trickle script can't copy to the directory, because the directory was accidentally write-protected, is physically full, has hit its quota, or has accidentally lost access rights.
2: The ini file that tells the trickle script where to copy trickles was changed to point to a new directory, but neither the web pages nor the credit script were updated to use the new directory.
3: The trickle script is stuck on a specific trickle and, even if restarted, gets stuck on the same "bad" trickle.
4: The BOINC server was updated or reconfigured and someone "forgot" to extract trickle information from the scheduler, or it is extracted to a "wrong" directory, not the one the trickle script expects.
5: Since OpenIFS apparently does not rely on trickles for crediting, someone incorrectly assumed that trickles no longer needed to be copied.

Note, chances are that when the problem with trickles not showing up on the web page is fixed, the credit will also be fixed on the next credit run (unless I've overlooked the example where "recent" trickles do show on the web page but there is still no credit).
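Possibility 1 in the list above is the easiest to rule in or out from a shell on the server. A minimal Python sketch of such a check follows; the function name and the free-space threshold are illustrative assumptions, the real directory path is not known from this thread, and the check does not detect per-user quota limits.

```python
import os
import shutil

def check_trickle_dir(path, min_free_bytes=1024 * 1024):
    """Return a list of human-readable problems found with a target
    directory; an empty list means none of these checks failed."""
    if not os.path.isdir(path):
        return [f"{path} does not exist or is not a directory"]
    problems = []
    # Creating files in a directory needs write and search permission.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append(f"{path} is not writable by the current user")
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        problems.append(f"{path} has only {free} bytes free")
    return problems
```

It would need to be run as the same user the trickle script runs as, since access rights are checked for the current user.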
ID: 68549
Richard Haselgrove

Joined: 1 Jan 07
Posts: 1061
Credit: 36,729,836
RAC: 7,099
Message 68550 - Posted: 3 Mar 2023, 13:53:19 UTC - in response to Message 68547.  

That's a bizarre way of doing it.
I'm probably referring back to around late 2009 (that's when I last looked in detail at credit), or even earlier.

To me, it smells like a quick'n'dirty kludge, thrown together in the early days of the project (and of BOINC), to bridge the gap between two parts of an incomplete system. Never expected, or intended, to be still running 20 years later with today's vastly quicker flow of results from modern tech.

Did you refer to the history of David Anderson's involvement with BOINC, that he alluded to at the start of his talk to the workshop? The section on CPDN is illuminating, though I don't trust David's recall of history - my name appears later on in the blog, and the roles he ascribes to me are broadly accurate, but that's an amendment after Jord appealed. I still don't recognise myself.

But here's the CPDN section, for what it's worth:

When we released the BOINC-based version of SETI@home to the public, there was a lot of backlash. People don't like change in general, and they didn't like the complexity of BOINC. We lost a big fraction of our volunteer base; it went from 600K to 300K or something like that.

I was very eager to get Climateprediction.net (CPDN) working. It had very long jobs: 6 months on some computers. We added "trickle" mechanisms to let the jobs upload intermediate results, and grant partial credit. I went to Oxford and spent a month working with Carl Christensen and Tolu Aina.
Myles Allen, Climateprediction.net, and Oxford
Myles is a visionary climate scientist at Oxford University. He proposed using volunteer computing for climate research in a Nature article in 2000. I read this and immediately contacted him. They had done something remarkable: taking a state-of-the-art climate model - a giant FORTRAN program that had only been run on supercomputers - and getting it to run on Windows PCs. They initially hired a local company to develop the job-distribution software, but switched to BOINC as soon as it was available.

I was very eager to make CPDN a success. In 2005 I spent a month in Oxford, staying in Myles' house (he was away for the summer) and working with Carl and Tolu.

In my view, CPDN hasn't lived up to its potential. Carl didn't feel appreciated at CPDN, and he left in 2008. Tolu left a year or two later. That left CPDN without a lot of technical resources. Oxford appointed a "director of volunteer computing", but nothing came of it.
Note that the role of the BBC in promoting the early, pre-BOINC, stage of CPDN's life has escaped David's notice.
ID: 68550
marmot

Joined: 12 May 05
Posts: 34
Credit: 1,424,505
RAC: 2,849
Message 68551 - Posted: 3 Mar 2023, 14:06:35 UTC
Last modified: 3 Mar 2023, 14:14:57 UTC

It's been many years since I managed CPDN WU's.
Moved my Linux AntiX VM onto a server to let it do some climate modeling work.

And I'm still puzzled because the last conversation I had with Andy was that only trickles (for OpenIFS) are awarded credit, not completion. But I see what you mean. Richard's offered to take a closer look. Next time there's a tech meeting I'm in I'll bring it up, it will be more effective that way than myself or the moderators sending emails.


I have 4 OpenIFS marked valid, the trickles were all uploaded successfully (according to their log and my client event log), yet they all have 0 credit.
I'm not sure how to tell whether the WUs were only partially completed and the rest of the work went to another machine.
But since most of these lines are at less than 100%, I guess that means the model wasn't completed and was moved on:
STATS FOR ALL TASKS
 NUM ROUTINE                                     CALLS  MEAN(ms)   MAX(ms)   FRAC(%)  UNBAL(%)
   0 CNT0     - COMPLETE EXECUTION                   1 ********* *********    100.00      0.00
   1 CNT4     - FORWARD INTEGRATION                  1 ********* *********     99.98      0.00
   8 SCAN2M - GRID-POINT DYNAMICS                 3200   14521.4   14521.4     43.02      0.00
   9 SPCM     - SPECTRAL COMP.                    2952    1842.4    1842.4      5.03      0.00
  10 SCAN2M - PHYSICS                             2953    9882.9    9882.9     27.02      0.00
  11 IOPACK   - OUTPUT P.P. RESULTS                247    6811.2    6811.2      1.56      0.00
  12 SPNORM   - SPECTRAL NORM COMP.                126      82.3      82.3      0.01      0.00
  13 SCAN2M - RADIATION CALC.                      985   82359.3   82359.3     75.10      0.00
  14 SUINIF                                          1   14351.2   14351.2      0.01      0.00
  17 GRIDFPOS IN CNT4                              247     362.0     362.0      0.08      0.00
  18 SUSPECG                                         1    3399.0    3399.0      0.00      0.00
  19 SUSPEC                                          1    3468.4    3468.4      0.00      0.00
  24 SUGRIDU                                         1    7905.6    7905.6      0.01      0.00
  25 SPECRT                                          1    1461.0    1461.0      0.00      0.00
  26 SUGRIDF                                         1    1516.0    1516.0      0.00      0.00
  27 RESTART FILES - WRITING                       123   13675.8   13675.8      1.56      0.00
  28 RESTART FILES - READING                         1       0.0       0.0      0.00      0.00
  29 SU4FPOS IN CNT4                               247       1.4       1.4      0.00      0.00
  30 DYNFPOS IN CNT4                               247   17375.5   17375.5      3.97      0.00
  31 POSDDH IN STEPO                                13      36.4      36.4      0.00      0.00
  37 CPGLAG   - SL COMPUTATIONS                   2953  -53919.1       0.0      0.00    147.40
  38 WAM      - TOTAL COST OF WAVE MODEL          2952   23517.5   23517.5     64.27      0.00
  39 SU0YOMB                                         1    1564.1    1564.1      0.00      0.00
  51 SCAN2M   - SL COMM. PART 1                   2953      59.5      59.5      0.16      0.00
  54 SPCM     - M TO S/S TO M TRANSP.             2952     367.6     367.6      1.00      0.00
  55 SPCIMPF  - S TO M/M TO S TRANSP.             2952      82.1      82.1      0.22      0.00
  56 SPNORM   - SPECTRAL NORM COMM.                126       1.3       1.3      0.00      0.00
 102 LTINV_CTL   - INVERSE LEGENDRE TRANSFORM    10094    1333.5    1333.5     12.46      0.00
 103 LTDIR_CTL   - DIRECT LEGENDRE TRANSFORM      6152    1427.9    1427.9      8.13      0.00
 106 FTDIR_CTL   - DIRECT FOURIER TRANSFORM       6152     228.6     228.6      1.30      0.00
 107 FTINV_CTL   - INVERSE FOURIER TRANSFORM     10094     233.5     233.5      2.18      0.00
 140 SULEG       - COMP. OF LEGENDRE POL.            2     127.7     127.7      0.00      0.00
 152 LTINV_CTL   - M TO L TRANSPOSITION          10094      59.8      59.8      0.56      0.00
 153 LTDIR_CTL   - L TO M TRANSPOSITION           6152      64.6      64.6      0.37      0.00
 157 FTINV_CTL   - L TO G TRANSPOSITION          10094      78.6      78.6      0.73      0.00
 158 FTDIR_CTL   - G TO L TRANSPOSITION           6152      65.8      65.8      0.37      0.00
 400 GSTATS                                     589499       0.0       0.0      0.00      0.00
 401 GSTATS HOOK                                564603       0.0       0.0      0.00      0.00
TOTAL MEASURED IMBALANCE =       0.0 SECONDS,  0.0 PERCENT
TOTAL WALLCLOCK TIME   108019.4 CPU TIME  503376.7 VECTOR TIME   503376.7


Going by Richard's last comment, I'm guessing the new credit script failed to catch these 4 WUs, or the scripts haven't run since they completed about 11 hours ago.

I see nothing abnormal in this run, so rather than starting a new thread, I guess this should be added to this conversation.
https://www.cpdn.org/result.php?resultid=22315582

--------
For Glenn Carver:
My BOINC credit was awarded coins, which translated into $1800 in fiat dollars; that bought 3 used rack servers, which made me a more productive member of the BOINC community. Proof-of-work coins are still viable, and the electricity goes into actual science work (and heating my home), like finding primes, but not currently climate research, which is a shame.
All labor that human hands and minds do must become paid work as AI rises to take over more duties, and we may eventually need to "pay" the AIs so that human wages can compete with their "wages". Their wages will need to go to charities, or to fund a basic monthly income for humans, as they take over more employment. Chat bots are already making inroads into help desk duties. Wealth disparities can lead societies to civil wars (https://phys.org/news/2014-06-rich-poor-gap-civil-war.html), and the disparities are growing, and that's not just the pandemic's effect.

So yeah, I at least want a cookie, or some credit, for my time spent on these WU's.
And great, you all found some people willing to pay for the modeling services.
If they are paying then send some of those funds our way because managing 400-800 BOINC computing cores is human work, not an AI's, yet... I'm 60, with a physics degree, yet looking at never being able to retire, and needing to work till I die.

And if you think my anger isn't appropriate, then stop making disparaging comments about users who like to get simple tokens of credit for the time, which is worth money, spent running your research. It's like I tell believers: "If you don't want your religion criticized, then don't bring it up".
Do not tell us we should not even worry about getting credit.
We deserve credit and we also deserve cash for our labor.
ID: 68551
Ingleside

Joined: 5 Aug 04
Posts: 127
Credit: 24,517,158
RAC: 18,287
Message 68553 - Posted: 3 Mar 2023, 14:17:49 UTC - in response to Message 68550.  

Note that the role of the BBC in promoting the early, pre-BOINC, stage of CPDN's life has escaped David's notice.
While I did run the pre-BOINC CPDN client, I can't remember the BBC being mentioned here, but then again it's roughly 20 years ago. The "special" BBC CPDN experiment that started in 2006 on the other hand did use BOINC.

BTW, maybe my recollection is too fuzzy, but after the BBC experiment shut down, didn't these BBC credits once show up here as a separate field on individual users' pages? I just checked and didn't see such a field.
ID: 68553
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,545,204
RAC: 16,601
Message 68554 - Posted: 3 Mar 2023, 14:25:20 UTC - in response to Message 68550.  
Last modified: 3 Mar 2023, 14:28:47 UTC

Richard wrote:
Did you refer to the history of David Anderson's involvement with BOINC....here's the CPDN section, for what it's worth:

David A.'s online article wrote:
I was very eager to get Climateprediction.net (CPDN) working. We added "trickle" mechanisms to let the jobs upload intermediate results, and grant partial credit. I went to Oxford and spent a month working with Carl Christensen and Tolu Aina.
......
In my view, CPDN hasn't lived up to its potential. Carl didn't feel appreciated at CPDN, and he left in 2008. Tolu left a year or two later. Oxford appointed a "director of volunteer computing", but nothing came of it.
Note that the role of the BBC in promoting the early, pre-BOINC, stage of CPDN's life has escaped David's notice.
Yes, exactly, what rubbish. David clearly has an issue with CPDN. What he doesn't know is there were issues with Carl's behaviour which I am not at liberty to talk about.

If you look at the list of publications from boinc projects in the scientific literature (https://boinc.berkeley.edu/pubs.php), CPDN stands 2nd with 140 publications; only Rosetta has more and most boinc projects publish a lot less. Given how much effort & time it takes to get a boinc project up and running, that's a poor return on grant money for a lot of boinc projects. If I was still on grant panels, I'd want to see a better publication record. Scientific publications are still a key measure of scientific impact. I really don't know what basis or measure David A. has for his comments. It reads badly for him, frankly; it looks like sour grapes on his part.

I sat in on the online boinc workshop last week, it was not great. It came across as a bunch of older men patting themselves on the back, led by David A. The only highlight was the talk by the Prof introducing BlackHoles@Home, but he highlighted the various issues projects still have using boinc. It's a shame, boinc is a great idea but needs to be steered better from the front.

Anyway, this is drifting off topic, I agree the implementation of trickles & credit smells like a kludge. I think we've seen the software is not very robust in places. I am hoping to get the chance to chat to CPDN next week about this.
---
CPDN Visiting Scientist
ID: 68554
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 68555 - Posted: 3 Mar 2023, 14:30:58 UTC

I have 4 OpenIFS marked valid, the trickles were all uploaded successfully (according to their log and my client event log), yet they all have 0 credit.
Those tasks only completed yesterday. The credit script only runs once a week so they should be credited some time late Saturday or early Sunday.
ID: 68555
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 68556 - Posted: 3 Mar 2023, 14:38:46 UTC

The "special" BBC CPDN experiment that started in 2006 on the other hand did use BOINC.
That is what got me started with CPDN though the loss of an email means I now have a different user name.
ID: 68556
Richard Haselgrove

Joined: 1 Jan 07
Posts: 1061
Credit: 36,729,836
RAC: 7,099
Message 68557 - Posted: 3 Mar 2023, 14:50:55 UTC

I was possibly misled by a page I pulled up during an earlier conversation with Glenn: http://news.bbc.co.uk/1/hi/sci/tech/3100024.stm

A page dated September 2003 says:

A massive worldwide online effort to predict how the global climate will change this century is being launched in the UK.

Computer users anywhere on Earth can join by downloading a climate model from a website.

The organisers say it will be the world's largest climate prediction experiment.

They hope it will result in a much more robust picture of the probable future climate.

The experiment is being launched on 12 September at the Science Museum in London and at the British Association science festival in Salford.

It is the fruit of collaboration between the universities of Oxford and Reading, the Met Office, the Open University, the Rutherford Appleton Laboratory, and a software company, Tessella Support Services.
I assumed that was the start of the Beeb's editorial backing of the project as part of its educational support services, but I may have conflated two separate events.
ID: 68557
Richard Haselgrove

Joined: 1 Jan 07
Posts: 1061
Credit: 36,729,836
RAC: 7,099
Message 68558 - Posted: 3 Mar 2023, 15:28:40 UTC

OK, back to credit issues, and specifically the breakdown of credit awards for Hadley tasks in late 2022. I find one of AndreyOR's computers very helpful in isolating the start of this event:

Filtered list of HadSM4 at N144 resolution tasks for computer 1526028, page 5

It is clear that tasks reported on Tuesday 29 Nov 2022, up to 20:29:05 UTC, have been granted credit.
Tasks reported on Wednesday 30 Nov 2022, from 10:56:28 UTC, have not.

Because it happened mid-week, it's unlikely to be a strict "credit script" event: it would most likely have become visible at a weekend, if that was the case. And looking at individual sample tasks, trickles disappeared from the task display in the same time interval. So I think it's more likely to be a problem introduced into the trickle transfer stage of the process.
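The bracketing argument can be written down as a small Python sketch, using only the two report times quoted above for computer 1526028. The function name is illustrative; nothing here queries the CPDN site.

```python
from datetime import datetime, timezone

# The two report times quoted above (UTC).
last_credited = datetime(2022, 11, 29, 20, 29, 5, tzinfo=timezone.utc)
first_uncredited = datetime(2022, 11, 30, 10, 56, 28, tzinfo=timezone.utc)

def failure_window(last_ok, first_bad):
    """Return the interval in which the breakage must have started."""
    if first_bad <= last_ok:
        raise ValueError("expected the first uncredited report to be later")
    return first_bad - last_ok

window = failure_window(last_credited, first_uncredited)
print(window)  # 14:27:23

# weekday(): Monday is 0, so 1 is Tuesday and 2 is Wednesday.
print(last_credited.weekday(), first_uncredited.weekday())  # 1 2
```

A window of about fourteen and a half hours between a Tuesday evening and a Wednesday morning is what makes a weekend credit-script run an unlikely trigger.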

Switching to trickles I've captured on my own machines at various times. These are from September 2014, and different task types, but they illustrate the flow.

A trickle starts life as an XML file in the project's directory:

<variety>year</variety>
<wu>hadcm3s_1aby_2001_2_008988784</wu>
<result>hadcm3s_1aby_2001_2_008988784_1</result>
<ph>1</ph>
<ts>51840</ts>
<cp>187137</cp>
<vr>7.24</vr>
<ppname>
trickle_hadcm3s_1aby_2001_2_008988784_1_2003.zip</ppname>
<pplen>
110326</pplen>
<ppdataz>
0MT $0! "  " DJ=O4$_U^CWDV4  00!&  , <' H%&9CUV,S]5,A)6>?)#,P$S7
R\%,P@3.X@S-X0S7Q\5;E%F;A]E,P S,?!'9NXV831D8 P+     ( PNX<.(C1&8
...
[snip]
...
</ppdataz>
This gets copied by BOINC into a "sched_request" message to the project server. I'll ignore the ppdata to save space.

<msg_from_host>
      <result_name>hadcm3s_1aby_2001_2_008988784_1</result_name>
      <time>1410789211</time>
<variety>year</variety>
<wu>hadcm3s_1aby_2001_2_008988784</wu>
<result>hadcm3s_1aby_2001_2_008988784_1</result>
<ph>1</ph>
<ts>51840</ts>
<cp>187137</cp>
<vr>7.24</vr>
...
[pp fields snipped]
...
</msg_from_host>
Note that at this stage, we only know the result by name: it has to be matched up by the server with the full result record in the database, which is keyed by ResultID number. I'm suspicious that this may be where our problems start.

At this stage, I have to switch to a Linux machine for the next part of the story. Be right back ...
ID: 68558
Richard Haselgrove

Joined: 1 Jan 07
Posts: 1061
Credit: 36,729,836
RAC: 7,099
Message 68559 - Posted: 3 Mar 2023, 15:45:32 UTC - in response to Message 68558.  

One click on the KVM button later...

Here's a version of that final <msg_from_host>, recorded from an IFS_bl task a couple of weeks ago.

<msg_from_host>
      <result_name>oifs_43r3_bl_a27b_2016092300_15_991_12209642_0</result_name>
      <time>1676364290</time>
<variety>orig</variety>
<wu>oifs_43r3_bl_a27b_2016092300_15_991_12209642</wu>
<result>oifs_43r3_bl_a27b_2016092300_15_991_12209642_0_r863024831</result>
<ph></ph>
<ts>864000</ts>
<cp>17458</cp>
<vr></vr>
</msg_from_host>
The pp fields are no longer used, and a couple of others are blank, but I doubt that matters.

But please compare carefully the tag <result>.

In the old hadcm3 tasks, that's identical to the <result_name> tag added by BOINC. But in IFS, it's been extended by _r863024831 - used in the upload file names.

IF (and that's a very big if) CPDN were relying on <result> to match a trickle to its ResultID, that would be a point of failure. It's the first smoking bearing in a very big machine.
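The difference between the two tags is easy to demonstrate with the two messages quoted in this thread. The Python sketch below parses each one and compares <result_name> with <result>. It says nothing about what the CPDN server actually does, which is unknown here; the function name is illustrative, and the hadcm3 message is reproduced without its pp fields.

```python
import xml.etree.ElementTree as ET

HADCM3_MSG = """<msg_from_host>
      <result_name>hadcm3s_1aby_2001_2_008988784_1</result_name>
      <time>1410789211</time>
<variety>year</variety>
<wu>hadcm3s_1aby_2001_2_008988784</wu>
<result>hadcm3s_1aby_2001_2_008988784_1</result>
<ph>1</ph>
<ts>51840</ts>
<cp>187137</cp>
<vr>7.24</vr>
</msg_from_host>"""

OIFS_MSG = """<msg_from_host>
      <result_name>oifs_43r3_bl_a27b_2016092300_15_991_12209642_0</result_name>
      <time>1676364290</time>
<variety>orig</variety>
<wu>oifs_43r3_bl_a27b_2016092300_15_991_12209642</wu>
<result>oifs_43r3_bl_a27b_2016092300_15_991_12209642_0_r863024831</result>
<ph></ph>
<ts>864000</ts>
<cp>17458</cp>
<vr></vr>
</msg_from_host>"""

def result_tags_agree(msg_xml):
    """True if the name BOINC added matches the name the model wrote."""
    root = ET.fromstring(msg_xml)
    return root.findtext("result_name") == root.findtext("result")

print(result_tags_agree(HADCM3_MSG))  # True
print(result_tags_agree(OIFS_MSG))    # False
```

Any lookup keyed on <result> would therefore find the hadcm3 task by name but not the IFS one, while a lookup keyed on <result_name> would find both.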
ID: 68559
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,545,204
RAC: 16,601
Message 68560 - Posted: 3 Mar 2023, 16:09:08 UTC - in response to Message 68559.  

Richard, thanks. This all looks promising, bottom line though is that this conversation needs to be had with Andy/CPDN. No-one on the forums will be able to progress this. There's a tech meeting Monday. I will show them and ask if you can help out. If you have any other input send me a PM.
ID: 68560
AndreyOR

Joined: 12 Apr 21
Posts: 317
Credit: 14,913,871
RAC: 16,233
Message 68561 - Posted: 3 Mar 2023, 21:57:02 UTC - in response to Message 68548.  

OpenIFS first appeared on the production CPDN site in 2020. There is a paper in the scientific literature based on the results from those batches. Then there was a long pause when the model was updated but small batches were released prior to the very big batches we saw end of last year. There has been no change to the way trickles are handled from the task/client side since 2020. I think the issues are at the CPDN server end.

Ok. That's before my time here so that's why I didn't know about it. I believe you didn't show up on the forums until last year too so to me it seemed like OIFS just started at CPDN last year, although I did see evidence on the website that its arrival has been in the works at least.

I've always assumed that the problem is at the CPDN server end. With my comments, I was thinking that possibly the arrival of OIFS disrupted some things with trickles and credit handling by CPDN servers, not that there's an issue with the model or BOINC client.
ID: 68561
AndreyOR

Joined: 12 Apr 21
Posts: 317
Credit: 14,913,871
RAC: 16,233
Message 68562 - Posted: 3 Mar 2023, 22:18:10 UTC

I have a trickle pending from a task of the latest OIFS run, here's some current data on what Richard was talking about.

The entire contents of trickle_up_oifs_43r3_001t_2019110100_123_993_12213503_0_1677879929.xml file:
<variety>orig</variety>
<wu>oifs_43r3_001t_2019110100_123_993_12213503</wu>
<result>oifs_43r3_001t_2019110100_123_993_12213503_0_r1949673894</result>
<ph></ph>
<ts>10623600</ts>
<cp>84718</cp>
<vr></vr>

What I think is the relevant section of the sched_request_climateprediction.net.xml file:
<msg_from_host>
      <result_name>oifs_43r3_001t_2019110100_123_993_12213503_0</result_name>
      <time>1677877810</time>
<variety>orig</variety>
<wu>oifs_43r3_001t_2019110100_123_993_12213503</wu>
<result>oifs_43r3_001t_2019110100_123_993_12213503_0_r1949673894</result>
<ph></ph>
<ts>10368000</ts>
<cp>82660</cp>
<vr></vr>

  </msg_from_host>
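If the extended <result> value did turn out to be the problem, one possible workaround on the receiving side would be to strip the trailing "_r" plus digits before looking the task up. This is only a sketch under that assumption; the function name is hypothetical and it is not known whether any CPDN script handles the field this way.

```python
import re

# A trailing "_r" followed only by digits, as seen in the OpenIFS
# <result> values quoted in this thread.
_UPLOAD_SUFFIX = re.compile(r"_r\d+$")

def strip_upload_suffix(result):
    """Return the result name without a trailing _r<digits> suffix."""
    return _UPLOAD_SUFFIX.sub("", result)

print(strip_upload_suffix(
    "oifs_43r3_001t_2019110100_123_993_12213503_0_r1949673894"
))
# oifs_43r3_001t_2019110100_123_993_12213503_0
```

Hadley-style names such as hadcm3s_1aby_2001_2_008988784_1 have no such suffix and pass through unchanged.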
ID: 68562
bullschuck

Joined: 22 May 21
Posts: 39
Credit: 1,208,413
RAC: 3,997
Message 68564 - Posted: 9 Mar 2023, 13:37:42 UTC - in response to Message 68560.  

Richard, thanks. This all looks promising, bottom line though is that this conversation needs to be had with Andy/CPDN. No-one on the forums will be able to progress this. There's a tech meeting Monday. I will show them and ask if you can help out. If you have any other input send me a PM.


Any report from the tech meeting on Monday?
ID: 68564
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,545,204
RAC: 16,601
Message 68565 - Posted: 9 Mar 2023, 14:57:26 UTC - in response to Message 68564.  
Last modified: 9 Mar 2023, 14:57:35 UTC

Richard, thanks. This all looks promising, bottom line though is that this conversation needs to be had with Andy/CPDN. No-one on the forums will be able to progress this. There's a tech meeting Monday. I will show them and ask if you can help out. If you have any other input send me a PM.
Any report from the tech meeting on Monday?
Richard is engaged with CPDN to isolate the problem. I'm sure he'll report here when there's more info.
---
CPDN Visiting Scientist
ID: 68565
Richard Haselgrove

Joined: 1 Jan 07
Posts: 1061
Credit: 36,729,836
RAC: 7,099
Message 68567 - Posted: 9 Mar 2023, 15:27:10 UTC - in response to Message 68565.  

Yes, I've written to Andy (who was busy with the BOINC workshop yesterday), and requested a specific chunk of data which will help us localise where the problems start. Once I receive that, I can work out whether we need to search forwards or back to the source of the trouble.

It'll take several steps, and I won't keep up a running commentary, but I'll let you know when we make any significant change that may be observable in your own accounts.
ID: 68567
bullschuck

Joined: 22 May 21
Posts: 39
Credit: 1,208,413
RAC: 3,997
Message 68622 - Posted: 22 Mar 2023, 16:22:28 UTC - in response to Message 68567.  

Yes, I've written to Andy (who was busy with the BOINC workshop yesterday), and requested a specific chunk of data which will help us localise where the problems start. Once I receive that, I can work out whether we need to search forwards or back to the source of the trouble.

It'll take several steps, and I won't keep up a running commentary, but I'll let you know when we make any significant change that may be observable in your own accounts.


Any updates yet? Please forgive me if I'm being a pest.

Thanks!
ID: 68622