Message boards : Number crunching : New work discussion - 2
Author | Message |
---|---|
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
Looks like the upload servers need a major upgrade. They're uploading, so the server is running, but they keep getting stuck halfway. One of them has retried (automatically, I didn't nudge it) 147 times! Can I assume your server supports continuing a half done upload? If not, you're making it worse. |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
Since your forum refuses to let me edit my own post, I'll have to write another one. It seems your server does allow continuing a stuck file where it left off, that's something. I nudged a few when I saw one working and managed 11 minutes of uploading, but now none will go again. Something needs steroids. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,028,039 RAC: 20,189 |
"Since your forum refuses to let me edit my own post, I'll have to write another one." You have an hour to decide whether you want to make changes. That is the default in the BOINC server software, so it is the same on most projects. |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"You have an hour to decide you want to make changes. That is the default in the BOINC server software so is the same on most projects." I never said this project was unique. Just the policy is insane. The only purpose I can think of to stop me editing older ones is so there aren't people who've responded to a now outdated message. But... lots of people respond in under an hour. If there's going to be an anti-edit function, it should be "has there been a reply afterwards?" |
Send message Joined: 23 Feb 05 Posts: 7 Credit: 1,423,261 RAC: 213 |
Rather than further derailing the wrong thread I'll reply here instead... "It may be a waste of electricity but my practice is to let them run rather than have them go to another machine if I abort where they may sit for another year. That way at least they get cleared from the 'Tasks in progress' column on the server status page." OK, I'll let it run for now. Still, is there no contact for the DOCILE project we can ask about it? I think we really shouldn't be emitting more CO2 than necessary on this project of all projects. BTW if you weren't aware, the project admins can use either scripts or the web interface to cancel workunits. Though it's possible that there aren't enough options to make it easy enough. https://boinc.berkeley.edu/trac/wiki/CancelJobs |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"I think we really shouldn't be emitting more CO2 than necessary on this project of all projects." ROFL, buy a houseplant. |
Send message Joined: 23 Feb 05 Posts: 7 Credit: 1,423,261 RAC: 213 |
ROFL, wait til you find out what happens when plants die. |
Send message Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918 |
"ROFL, wait til you find out what happens when plants die." They get turned into coal and the carbon is lost. Until we put it back for them. |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,440,799 RAC: 14,227 |
A heads-up on further batches. A new Weather-at-Home NZ25 batch is being prepared & tested. It will go out before the Christmas break (the NZ25 config does not suffer the same level of failures as the recent East Asia batch). A new HadAM4 model batch is in preparation but needs more time for setup & testing. It's anticipated this will go out at the beginning of the new year. A new OpenIFS multi-core app is also under test & development. It's possible a larger scale test on the main site to all volunteers will go out before Christmas -- there will be more news about this before it's sent. --- CPDN Visiting Scientist |
Send message Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463 |
Ooooh. Very exciting! I suppose I should set up a few Windows VMs for W@H processing over on my Linux hosts. Though I get the impression there's no shortage of hungry Windows compute nodes laying around. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,028,039 RAC: 20,189 |
"Ooooh. Very exciting! I suppose I should set up a few Windows VMs for W@H processing over on my Linux hosts. Though I get the impression there's no shortage of hungry Windows compute nodes laying around." There isn't, but getting the results back as quickly as possible is still good. I don't think there will be an issue with using WINE with this batch, and I get about a 15-20% increase in speed using WINE rather than a VM. |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,440,799 RAC: 14,227 |
"Ooooh. Very exciting! I suppose I should set up a few Windows VMs for W@H processing over on my Linux hosts. Though I get the impression there's no shortage of hungry Windows compute nodes laying around." "There isn't but getting the results back as quickly as possible is still good. I don't think there will be an issue with using WINE with this batch and I get about 15-20% increase in speed using WINE rather than a VM." Except with a VM you can use 'save state', which avoids the dreaded task fail on restart because it preserved the exact machine restart... ahh.. but then WINE does something weird with segv anyway :D |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,028,039 RAC: 20,189 |
"ahh.. but then WINE does something weird with segv anyway :D" It still loses some on shutdown and restart, just not nearly as many as Windows. Also, I tend to shut down with sleep or hibernate anyway, which also avoids that particular problem. |
Send message Joined: 5 Jun 09 Posts: 97 Credit: 3,736,855 RAC: 4,073 |
Oooo - I just looked, and my current two tasks must have survived two shut-downs, as they each have over 12 hours of processing and it's only a few minutes since the last reboot. |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,440,799 RAC: 14,227 |
"ahh.. but then WINE does something weird with segv anyway :D" "It still loses some on shutdown and restart, just not nearly as many as Windows. Also, I tend to shutdown with sleep or hibernate anyway which also avoids that particular problem." Ah, that's useful to know. From previous conversation I thought WINE never had those errors. |
Send message Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463 |
"There isn't but getting the results back as quickly as possible is still good. I don't think there will be an issue with using WINE with this batch and I get about 15-20% increase in speed using WINE rather than a VM." Interesting, I don't think I've got that set up - I'll have to mess around with it. I don't have a GUI on these systems, though - will WINE/BOINC/etc get along without a GUI environment installed? |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,028,039 RAC: 20,189 |
Yes, WINE will work without a GUI. Some time ago I ran BOINC under WINE without a GUI on a system with much less memory, but I would probably have to relearn a bit to do it now. |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,440,799 RAC: 14,227 |
You may not need to go to the trouble of setting this up because I'm working on an updated Linux version of Weather-at-Home, which we'll use as the code for the Windows version. If I get it tested quickly enough, it might be ready for the next batches, though most likely not before Christmas. |
Send message Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463 |
Very interesting, a Linux version would be most welcome! I don't know the details of the code or how it profiles memory-wise, but I know there are gains to be had from using large pages in allocations - far fewer TLB misses on large memory footprint code. If you're reworking stuff, that might be worth looking into. |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,440,799 RAC: 14,227 |
TLB misses aren't significant on the small memory config used for WaH. Really only makes a difference at the higher resolutions. |
©2024 cpdn.org