climateprediction.net (CPDN) home page
Thread 'New work discussion - 2'

Message boards : Number crunching : New work discussion - 2
Mr. P Hucker

Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 69633 - Posted: 17 Sep 2023, 14:43:50 UTC - in response to Message 69628.  

Do you run them on virtualbox or natively on linux?
I only run native Linux WUs and never use virtualbox.
You got LHC to run natively? Their instructions are in Greek. I can't get their native stuff to work and I can't get their Squid to work either. Why can't they make their tasks share big datasets instead of downloading them 24 times for 24 cores?
ID: 69633 · Report as offensive
Mr. P Hucker

Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 69634 - Posted: 17 Sep 2023, 14:46:57 UTC - in response to Message 69629.  

I've only found two coins which aren't scams, Gridcoin and Curecoin. They give you coins for work you're already doing in Boinc and Folding@Home, not ask you to run mindless calculations for other people.
I found it was absolutely trivial to set up and run DNX. The early miners had bugs.
The problem is the virus inside it.

GRC and CURE are scams too. All supply and no demand.
How can it be a scam if I end up with a small amount of cash instead of no cash for running Boinc?

Nobody actually needs cryptocurrency
It was a nice idea, we could have got away from banks and governments, but they lack the ability to remain stable. I don't want to have 50 grand of money which might be 1 grand tomorrow.

Someone should figure out a way to pay workers using real money so they cover their electric bills. Somebody will but crypto is not the answer.
Solar is the answer, stop buying power off the grid. 10x cheaper....
ID: 69634 · Report as offensive
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 69635 - Posted: 17 Sep 2023, 17:54:31 UTC - in response to Message 69630.  

Einstein doesn't freeze my computers. Boinc removes tasks if the memory is too full.


Einstein does not freeze my computers either.
I do not know if Boinc removes tasks if memory is too full, whatever that means. I know Linux can do that, but I have never had it happen and I have been running Linux since about 1998 (Red Hat not enterprise Linux 5 to begin with).
I am currently running Red Hat Enterprise Linux release 8.8 (Ootpa)
ID: 69635 · Report as offensive
Mr. P Hucker

Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 69636 - Posted: 17 Sep 2023, 18:04:01 UTC - in response to Message 69635.  

I do not know if Boinc removes tasks if memory is too full, whatever that means.
Boinc is set to use x% of RAM. If that is exceeded, it will suspend a task. Just as it won't start one if there isn't room. You will then notice only 6 tasks running on an 8 core computer for example.

I know Linux can do that, but I have never had it happen and I have been running Linux since about 1998 (Red Hat not enterprise Linux 5 to begin with).
I am currently running Red Hat Enterprise Linux release 8.8 (Ootpa)
I would hope an OS would never do that. The application could be important and have unsaved work. Or did you mean swap to disk?
ID: 69636 · Report as offensive
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 69637 - Posted: 17 Sep 2023, 19:19:42 UTC - in response to Message 69636.  
Last modified: 17 Sep 2023, 19:35:46 UTC

I know Linux can do that, but I have never had it happen and I have been running Linux since about 1998 (Red Hat not enterprise Linux 5 to begin with).
I am currently running Red Hat Enterprise Linux release 8.8 (Ootpa)

I would hope an OS would never do that. The application could be important and have unsaved work. Or did you mean swap to disk?


I do not mean swap to disk.

https://neo4j.com/developer/kb/linux-out-of-memory-killer/

This one is probably better:

https://rakeshjain-devops.medium.com/linux-out-of-memory-killer-31e477a45759
ID: 69637 · Report as offensive
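For readers following the out-of-memory killer links above: on Linux, each process exposes the kernel's current "badness" score in `/proc/<pid>/oom_score` (higher means more likely to be killed) and a user-adjustable bias in `/proc/<pid>/oom_score_adj`. The following is a minimal Python sketch for inspecting them. The function name `read_oom_value` and its parameters are illustrative only, and it returns `None` on systems where these files do not exist.

```python
from pathlib import Path
from typing import Optional


def read_oom_value(pid: str = "self", name: str = "oom_score",
                   proc_root: str = "/proc") -> Optional[int]:
    """Return the integer stored in <proc_root>/<pid>/<name>, or None if unavailable."""
    path = Path(proc_root) / pid / name
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Not Linux, no such process, or unexpected file contents.
        return None


if __name__ == "__main__":
    # Inspect the current process; pass a numeric pid string for another one.
    print("oom_score:", read_oom_value())
    print("oom_score_adj:", read_oom_value(name="oom_score_adj"))
```

Comparing the score of a large BOINC task with that of other processes gives a rough idea of which one the kernel would pick first if memory ran out.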
Mr. P Hucker

Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 69638 - Posted: 17 Sep 2023, 19:29:32 UTC - in response to Message 69637.  

I do not mean swap to disk.

https://neo4j.com/developer/kb/linux-out-of-memory-killer/
Wow, that's pure insanity, but being Linux it doesn't surprise me. Swapping would be more sensible.
ID: 69638 · Report as offensive
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 69639 - Posted: 17 Sep 2023, 19:41:25 UTC - in response to Message 69638.  
Last modified: 17 Sep 2023, 19:42:55 UTC

Wow, that's pure insanity, but being Linux it doesn't surprise me. Swapping would be more sensible.


I would not blame Linux. And when things get so bad as to run the system out of memory, swapping may not be possible: buffers would be required to do the swap, and there is probably no space for the needed buffers.

As I said earlier, in over 20 years of running Linux, this has never happened to me.
ID: 69639 · Report as offensive
Mr. P Hucker

Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 69640 - Posted: 18 Sep 2023, 5:03:10 UTC - in response to Message 69639.  
Last modified: 18 Sep 2023, 5:03:27 UTC

I would not blame Linux. And when things get so bad as to run the system out of memory, swapping may not be possible: buffers would be required to do the swap, and there is probably no space for the needed buffers.

As I said earlier, in over 20 years of running Linux, this has never happened to me.
Windows swaps. Just don't let it get that bad. It's like letting your car run completely out of petrol.
ID: 69640 · Report as offensive
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,445,768
RAC: 14,599
Message 69641 - Posted: 18 Sep 2023, 11:33:59 UTC - in response to Message 69636.  

I do not know if Boinc removes tasks if memory is too full, whatever that means.
Boinc is set to use x% of RAM. If that is exceeded, it will suspend a task. Just as it won't start one if there isn't room. You will then notice only 6 tasks running on an 8 core computer for example.
That's not accurate. The boinc client ignores the memory requirement of the task specified in the workunit XML when deciding whether to start it or not. It only checks the memory when the task is running. This is why we have problems with OpenIFS tasks and their multi-GB memory requirement. If a host gets multiple OIFS tasks but does not have enough RAM to run them all concurrently, the client still starts them and they will crash because they run out of memory. The client doesn't check frequently enough to catch the rapid increase in RAM of the processes.

It's a known longstanding 'issue' with the boinc client and why CPDN, like LHC, have to limit the number of tasks sent to volunteers for large memory tasks. That's the only workaround.
---
CPDN Visiting Scientist
ID: 69641 · Report as offensive
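As a rough illustration of the arithmetic behind limiting large-memory tasks, the sketch below estimates how many tasks fit in the RAM a host makes available. It is not BOINC client or CPDN server code. The function name and all figures are hypothetical, and the per-task peak memory has to come from the project's own guidance.

```python
import math


def max_concurrent_tasks(total_ram_gb: float, usable_fraction: float,
                         peak_task_gb: float) -> int:
    """Number of tasks whose combined peak memory fits in the usable RAM."""
    if peak_task_gb <= 0:
        raise ValueError("peak_task_gb must be positive")
    budget_gb = total_ram_gb * usable_fraction
    return max(0, math.floor(budget_gb / peak_task_gb))


if __name__ == "__main__":
    # Hypothetical host: 16 GB RAM, half of it allowed for BOINC,
    # and an assumed peak of 5 GB per task.
    print(max_concurrent_tasks(16, 0.5, 5))
```

If the host is handed more tasks than this estimate and the client starts them all anyway, their combined peak usage exceeds the budget, which is the failure mode described in the post above.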
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,029,695
RAC: 19,917
Message 69642 - Posted: 18 Sep 2023, 11:45:06 UTC
Last modified: 18 Sep 2023, 11:54:27 UTC

Another known longstanding 'issue' is that many of us who have used BOINC for a long time have picked up bits and bobs of information that we have taken as gospel, spouted them out for years and only when someone who really knows either the BOINC code or the details of what the task code does sees us spouting them do we learn the truth! I could point to probably half a dozen things I had assumed to be the case that you have put me right on, Glenn, many of them things I picked up from those who have been involved with BOINC even longer than I have!
Edit:
The boinc client ignores the memory requirement of the task specified in the workunit XML when deciding whether to start it or not. It only checks the memory when the task is running.
I assume this has been raised over at GitHub, though a quick search didn't find anything.
ID: 69642 · Report as offensive
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,445,768
RAC: 14,599
Message 69643 - Posted: 18 Sep 2023, 11:50:23 UTC - in response to Message 69642.  

Another known longstanding 'issue' is that many of us who have used BOINC for a long time have picked up bits and bobs of information that we have taken as gospel, spouted them out for years and only when someone who really knows either the BOINC code or the details of what the task code does sees us spouting them do we learn the truth! I could point to probably half a dozen things I had assumed to be the case that you have put me right on, Glenn, many of them things I picked up from those who have been involved with BOINC even longer than I have!
To be fair, I assumed the client did look at the memory of the task before starting it, because that's how it 'looks like it should work' in the client settings. Why would the memory behaviour be any different to the behaviour with disk limits? It was only when investigating with OpenIFS and talking to LHC that we found out.
---
CPDN Visiting Scientist
ID: 69643 · Report as offensive
Mr. P Hucker

Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 69644 - Posted: 18 Sep 2023, 13:13:30 UTC - in response to Message 69643.  

I've seen it refusing to start tasks due to not enough memory. I guess that might be because it's already reached the limit, or maybe it started one then stopped it, but I'm 90% sure it looked like it was being sensible. It never overloads the computer anyway, worst case scenario, a task starts, stops, then has to start again another time.

LHC impose no limits. Any limits have to be inserted yourself in app_config. Or I guess on their webpage, but you have to make the choice and work out how many.
ID: 69644 · Report as offensive
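For anyone wanting to set such a limit themselves, BOINC reads an `app_config.xml` file from the project's directory, and its `<max_concurrent>` element caps how many tasks of a named application run at once. The Python sketch below only generates that file's contents. The application name `example_app` is a placeholder, since the real short name has to be taken from the project, and the helper `build_app_config` is illustrative.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Return app_config.xml text limiting one application's concurrent tasks."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "example_app" is a placeholder, not a real LHC or CPDN application name.
    print(build_app_config("example_app", 2))
```

The output would be saved as `app_config.xml` in the relevant project folder, after which the client has to re-read its config files (or be restarted) for the limit to take effect.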
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,445,768
RAC: 14,599
Message 69648 - Posted: 27 Sep 2023, 9:47:18 UTC

A heads-up that there will be a new batch for WAH2 coming out in the next week or two. This is the resend of the East Asia (Korean) batch that had a lot of problems. It's a new configuration which has been tested to behave better.

Also further ahead is another New Zealand WAH2 batch.
---
CPDN Visiting Scientist
ID: 69648 · Report as offensive
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,445,768
RAC: 14,599
Message 69660 - Posted: 5 Oct 2023, 15:17:45 UTC

Next East-Asia (eas25) batch being released now (Windows only).

Any problems, please report in new thread rather than this one. Thx.
---
CPDN Visiting Scientist
ID: 69660 · Report as offensive
Richard Haselgrove

Joined: 1 Jan 07
Posts: 1061
Credit: 36,710,763
RAC: 8,968
Message 69661 - Posted: 5 Oct 2023, 15:45:24 UTC

Collected one task on each of two machines, to see how they run. Both made a clean start, looking good so far.
ID: 69661 · Report as offensive
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 69662 - Posted: 5 Oct 2023, 17:00:16 UTC - in response to Message 69661.  

Got one on my pipsqueak Windows 10 machine and it has over 15 minutes on it so far. Predicting 9 days 18 hours to go.
Task 22340449
Computer 1512658
ID: 69662 · Report as offensive
Mr. P Hucker

Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 69781 - Posted: 12 Oct 2023, 15:24:46 UTC

I'm getting a lot of resends, and I'm breaking them myself too. Looks like you haven't fixed the problem of the task going wrong if the computer is rebooted. People reboot because they want to, or because the computer crashes, or because Windows Update illegally restarts your machine which isn't their property for updates (I've tried everything, M$ keeps resetting the workarounds). How hard can it be to write proper checkpoints?
ID: 69781 · Report as offensive
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,029,695
RAC: 19,917
Message 69796 - Posted: 13 Oct 2023, 6:30:44 UTC - in response to Message 69781.  

M$ keeps resetting the workarounds).

Have you tried blocking the MS domain in your router? Those I know who have done that find it works pretty well.
How hard can it be to write proper checkpoints?
Perhaps you should learn Fortran and volunteer your services?
ID: 69796 · Report as offensive
Mr. P Hucker

Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 69798 - Posted: 13 Oct 2023, 6:59:41 UTC - in response to Message 69796.  

M$ keeps resetting the workarounds).
Have you tried blocking the MS domain in your router? Those I know who have done that find it works pretty well.
But I want updates, just no reboots until I say so.

How hard can it be to write proper checkpoints?
Perhaps you should learn Fortran and volunteer your services?
I feel it's not the programming at fault, but the common sense. It writes checkpoints, presumably something daft is happening like deleting the old one before creating the new one. I see no reason even cutting the machine's power should ever cause a problem.
ID: 69798 · Report as offensive
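The "delete the old one before creating the new one" hazard speculated about above is usually avoided with a write-then-rename pattern, sketched here in Python. This is a generic illustration and makes no claim about how the CPDN models (which are Fortran) actually write their checkpoints. The function name is hypothetical, and the rename is only guaranteed atomic on POSIX filesystems.

```python
import os
import tempfile


def write_checkpoint_atomically(path: str, data: bytes) -> None:
    """Write data to a temp file in the same directory, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        # The old checkpoint stays intact until this single rename succeeds.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temp file; the previous checkpoint is untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "model.ckpt")
        write_checkpoint_atomically(target, b"timestep 100")
        write_checkpoint_atomically(target, b"timestep 200")
        with open(target, "rb") as fh:
            print(fh.read())
```

With this pattern an interruption leaves either the old or the new file on disk. It does not help when a restart needs several files to be mutually consistent, which is a separate problem.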
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,445,768
RAC: 14,599
Message 69814 - Posted: 13 Oct 2023, 13:13:50 UTC - in response to Message 69798.  

I feel it's not the programming at fault, but the common sense. It writes checkpoints, presumably something daft is happening like deleting the old one before creating the new one. I see no reason even cutting the machine's power should ever cause a problem.
If it was something daft it would have been fixed ages ago. It's more subtle than that. You see no reason because you don't understand the problem or the way the models work. If I had a quid for every time you moaned about something on the forums I'd have enough money to employ someone to fix it :D .. in the meantime you'll have to wait and remember (a) these models are not designed to run on systems that can be shut down instantly, (b) CPDN has little money for developers and the staff they do have are stretched with teaching & research as well as CPDN.
ID: 69814 · Report as offensive