climateprediction.net (CPDN)
Thread 'New work Discussion'

Message boards : Number crunching : New work Discussion

Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 58867 - Posted: 17 Oct 2018, 21:41:49 UTC - in response to Message 58866.  

And AGAIN they're only showing one trickle. :(
I've sent an email about it.


I have gotten used to that. In the case of these two, it does not matter much, since I got the first and only trickle about halfway through.

The ones that bothered me were the ones that got me one trickle after a day or so, and then ran a week or two with no more trickles and no more credits, and then terminated successfully.

Since I seem to get only a few work units a year these days, it does not matter all that much anymore. Maybe I will get some more by Easter.

Bill F
Joined: 17 Jan 09
Posts: 124
Credit: 2,026,181
RAC: 2,642
Message 58924 - Posted: 30 Oct 2018, 2:43:30 UTC

We have about 7,000 active users running about 9,000 active systems, and there are about 103,000 WUs out being worked on right now.

That works out to roughly 11 WUs per active system.
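
(For what it's worth, a back-of-the-envelope check of that arithmetic, using the rough figures quoted above:)

```python
active_users = 7_000
active_systems = 9_000
wus_in_progress = 103_000

print(wus_in_progress / active_systems)  # ~11.4 WUs per active system
print(wus_in_progress / active_users)    # ~14.7 WUs per active user
```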

I have a system that needs a WU, but there are none available.

Under the heading of orphaned WUs: if a system holds a WU that has not trickled in 90 days, and the WU is not in one of the batches with known trickle issues, could these be aborted by the project and reissued?

Science is science; it should be completed!


Thanks
Bill F
In October 1969 I took an oath to support and defend the Constitution of the United States against all enemies, foreign and domestic;
There was no expiration date.



Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4536
Credit: 18,997,390
RAC: 21,721
Message 58925 - Posted: 30 Oct 2018, 7:30:49 UTC

Under the heading of orphaned WUs: if a system holds a WU that has not trickled in 90 days, and the WU is not in one of the batches with known trickle issues, could these be aborted by the project and reissued?


Perhaps a simpler system would be to just reduce the time limit on tasks? It may not release as many tasks as you think, however. The figure is tasks per computer, and I suspect the average number of CPUs per computer is quite a lot higher than the three between my two boxes. With BOINC estimating even the quicker tasks as taking over ten days, and some over 30 days, on my boxes I am not getting any in the queue at the moment. (That is a problem between WINE and BOINC that results in the GFLOPS value for my computers being greatly under-reported.)
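
To illustrate the mechanism (a rough sketch only; the real BOINC client also applies correction factors, and the task size below is a made-up figure, not a real CPDN value):

```python
def estimated_runtime_days(rsc_fpops_est, host_gflops):
    """Rough BOINC-style duration estimate: the task's estimated
    floating-point operations divided by the speed the client
    believes the host has."""
    seconds = rsc_fpops_est / (host_gflops * 1e9)
    return seconds / 86_400

task_fpops = 2.6e15  # hypothetical task size in FLOPs

# If WINE/BOINC under-reports the benchmark by 3x, the estimate triples
# and the client stops asking for more work:
print(estimated_runtime_days(task_fpops, host_gflops=3.0))  # ~10 days
print(estimated_runtime_days(task_fpops, host_gflops=1.0))  # ~30 days
```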

Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 58927 - Posted: 30 Oct 2018, 12:43:14 UTC - in response to Message 58924.  

Under the heading of orphaned WUs: if a system holds a WU that has not trickled in 90 days, and the WU is not in one of the batches with known trickle issues, could these be aborted by the project and reissued?


I would not mind, but it would not help me at all.

I got two work units on 2018 October 15 that completed on 2018 October 17.
The previous two work units arrived on 2018 February 14 and completed on 2018 February 28 and March 1.
The two before that arrived on 2017 October 11 and completed on 2017 October 25.

A big reason for the paucity of work units I receive is that I run a Linux system with a 4-core processor, and no amount of fiddling with completion dates will get me more work units. And the work units I do get send only one trickle near the beginning and none after that (none that are acknowledged, anyway), so no credits ensue, even though the tasks eventually complete correctly.

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 58930 - Posted: 30 Oct 2018, 20:35:37 UTC - in response to Message 58925.  

Perhaps a simpler system would be to just reduce the time limit on tasks?

Yes! The climate will have changed before some of them get back.

Bill F
Joined: 17 Jan 09
Posts: 124
Credit: 2,026,181
RAC: 2,642
Message 58931 - Posted: 31 Oct 2018, 3:48:27 UTC - in response to Message 58930.  

Well, to restate my original thought about a WU with no trickle in 90 days (assuming that it is not one of the batches known to have trickle issues):

If a user has so many WUs queued that they can't get one started within 90 days, they have too many WUs and need to release some so others can do the science.

OR

If the user has reinstalled an OS and no longer has access to the "assigned" WU to work on it, then it is orphaned and should be recycled.

How hard would this be to implement?
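
(Purely as a sketch of the rule being proposed, with hypothetical record and helper names; this is not how the actual BOINC server code is organised. The selection logic itself is simple:)

```python
from datetime import datetime, timedelta

TRICKLE_DEADLINE = timedelta(days=90)

def find_orphaned_tasks(tasks, known_bad_batches, now=None):
    """Tasks sent more than 90 days ago with no trickle received,
    skipping batches that are known to have trickle problems."""
    now = now or datetime.utcnow()
    orphans = []
    for task in tasks:  # hypothetical per-task records
        if task["batch"] in known_bad_batches:
            continue
        last_contact = task["last_trickle"] or task["sent_time"]
        if now - last_contact > TRICKLE_DEADLINE:
            orphans.append(task)
    return orphans

# A project-side script could then cancel each orphan and issue a new copy:
# for task in find_orphaned_tasks(all_tasks, known_bad_batches):
#     cancel_and_reissue(task)  # hypothetical helper
```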

Bill F
In October 1969 I took an oath to support and defend the Constitution of the United States against all enemies, foreign and domestic;
There was no expiration date.



Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 58932 - Posted: 31 Oct 2018, 6:46:52 UTC

I'll ask. Might take a while.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 58934 - Posted: 31 Oct 2018, 14:35:38 UTC

"Project Aborts" aren't possible, but changes are going to be made. Which will take time, I guess.

These include deprecating the older, unused applications (so those numbers will disappear from the total), and using "project kill" to remove the old, unneeded tasks from people's computers. This should also improve the appearance of the total number running.

Already in use is "no resends" when a batch is closed.

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 58935 - Posted: 31 Oct 2018, 15:28:49 UTC
Last modified: 31 Oct 2018, 15:29:32 UTC

Back to the original reason for this thread: does anyone know when there will be new work? I am starting to see empty cores on my machines.

astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 58936 - Posted: 31 Oct 2018, 19:04:37 UTC - in response to Message 58935.  
Last modified: 31 Oct 2018, 19:08:19 UTC

Not me. But it might be next academic term (quarter or semester) before another deluge of batches is dropped into the queue, as we experienced after the summer break. (That is conjecture, admittedly of limited value.)
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.

Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 58937 - Posted: 31 Oct 2018, 20:29:58 UTC - in response to Message 58935.  

I am starting to see empty cores on my machines.


Not me, because even though I get very few work units (I run Linux), I also run SETI@home, Rosetta, and World Community Grid.

My priorities are CPDN 44%, WCG 30%, SETI 13%, and Rosetta 13%.

Back when there were Linux work units, I usually kept three cores running CPDN 24/7 for weeks or months at a time. But in the last year or so I have received only four work units altogether.

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 58938 - Posted: 31 Oct 2018, 21:16:44 UTC - in response to Message 58937.  

Not me, because even though I get very few work units (I run Linux), I also run SETI@home, Rosetta, and World Community Grid.

My priorities are CPDN 44%, WCG 30%, SETI 13%, and Rosetta 13%.

Windows is the problem. What works best for me is Rosetta, set to the 24-hour work units, since that best approximates the length of the CPDN work. Then I set the Rosetta resource share to "0", so that it downloads work only when needed. I don't like to do that with other projects whose work units are short, since the PC must keep downloading new ones.
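
(For anyone unfamiliar with the trick: a project with resource share 0 is treated by the BOINC client as a backup project and is asked for work only when the other projects can't fill the machine. A toy model of that behaviour, with made-up names and greatly simplified compared with the real work-fetch logic:)

```python
def projects_to_ask_for_work(projects, idle_cores):
    """Toy work-fetch rule: zero-share (backup) projects are asked
    for work only when the normal projects have nothing to offer."""
    if idle_cores == 0:
        return []
    normal = [p for p in projects if p["share"] > 0]
    backup = [p for p in projects if p["share"] == 0]
    if any(p["has_work_available"] for p in normal):
        return normal
    return backup

projects = [
    {"name": "CPDN",    "share": 100, "has_work_available": False},
    {"name": "Rosetta", "share": 0,   "has_work_available": True},
]
print(projects_to_ask_for_work(projects, idle_cores=2))  # falls back to Rosetta
```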

But I have long advocated reducing the timeouts too, to prevent hoarding of the work and to allow it to reach the people who can turn it around more quickly. Isn't that what the project (any project) is for?

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,803,756
RAC: 5,187
Message 58943 - Posted: 2 Nov 2018, 22:47:04 UTC

There are six new batches of SAM25 models, batches #760-765, covering a range of model durations: 60, 61, 72 and 73 months (batch list).

Alan K
Joined: 22 Feb 06
Posts: 491
Credit: 30,967,615
RAC: 14,422
Message 58944 - Posted: 2 Nov 2018, 23:57:02 UTC - in response to Message 54840.  
Last modified: 2 Nov 2018, 23:58:33 UTC

Batch 760 consists of 60-month runs and batch 761 of 72-month runs. Long estimated times.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 58945 - Posted: 3 Nov 2018, 0:25:51 UTC

And they're SAM25s.

That's climate research for you. :)

astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 58946 - Posted: 3 Nov 2018, 1:54:54 UTC
Last modified: 3 Nov 2018, 1:57:29 UTC

Joyful news! Nothing new in SAM25 land!

Had three contaminate my i7, one each from batches 763/4/5 (initial estimates in the late 40s of days! I haven't seen anything like that in a long time).

They died the premature death of their earlier peers: two in 3m 21s, one in 3m 20s.

(Edited for typo.)
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 58947 - Posted: 3 Nov 2018, 2:19:29 UTC

They sure don't seem to like Windows.

I've started up my Linux/WINE machines again.
Might get some re-do's.

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,803,756
RAC: 5,187
Message 58949 - Posted: 3 Nov 2018, 9:44:44 UTC

One out of two running: the failed one had already performed the two-minute self-destruct on two other machines before doing the same on mine.

mngn
Joined: 13 Jul 18
Posts: 38
Credit: 62,933,508
RAC: 84,702
Message 58950 - Posted: 3 Nov 2018, 14:11:22 UTC
Last modified: 3 Nov 2018, 14:12:08 UTC

I've been running three tasks from batches 764 and 765 for 21 hours now and they are doing fine. The computer is a virtual machine: 32-bit Windows XP in VirtualBox on Ubuntu 16.04.
https://www.cpdn.org/cpdnboinc/show_host_detail.php?hostid=1469687

The estimated time is 5-6 months. Should I let them run or abort?

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 58951 - Posted: 3 Nov 2018, 14:17:54 UTC

RUN!

The "early failures" are a problem with people's computers.
Once you're past that, it should be smooth sailing.

And running Linux with WINE also works.