Message boards : Number crunching : New work Discussion
Author | Message |
---|---|
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
And batch 742 has been paused while thinking is in progress. Yes, crunch away. I am. Zips at about every 8%, 92.6+ Megs for each, about 10 days total crunching. Sending back lots of data is a good way to help, either to find out what's wrong, or simply to return good data, if that's what the un-failed models are doing. Only the researchers can search the vast amounts of data to see what's what. (By "paused", they mean that downloads from this batch have been stopped.) |
Send message Joined: 15 May 09 Posts: 4536 Credit: 18,993,249 RAC: 21,753 |
And batch 742 has been paused while thinking is in progress. Thinking has been done and the people at Oxford are certain the tasks that don't crash are producing worthwhile data. The sending out of these tasks has therefore resumed. The entire purpose of BOINC is to enable multiple projects to be run on individual PC's, not supercomputers. Dinking around with the global settings inherent in BOINC to PERHAPS stabilize one project - i.e., CPDN - at the risk of destabilizing other BOINC-related projects - i.e., SETI, LHC, Cosmology, Milky Way, etc, etc, etc - is NOT a solution and is in fact foolhardy. There are many reasons for tinkering about with the global settings of BOINC. These are mostly related to how different projects play together or how any other programs running on the computer work alongside BOINC. The settings which reduce the chances of CPDN tasks crashing are likely to reduce the chances of tasks from other projects crashing too. That said, the tasks I have personally had crash on other projects (with the exception of the Android platform, which isn't supported by CPDN) have also crashed on every other computer they ran on, and did not show the frustrating pattern, or sometimes lack of pattern, that crashes with CPDN show. As for the data being useful, past history has shown that a high percentage of crashes with the segmentation violation has never rendered invalid the data from the tasks which do complete. |
Send message Joined: 15 May 09 Posts: 4536 Credit: 18,993,249 RAC: 21,753 |
Batch 745 is, I think, about 1,000 eu25 13-month tasks. Possibly there are more that are not showing on the server status page yet. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Front page says batch 745 is 20,000 (wow), but the status page says 10,000, and I think a lot of those are batch 742. So either tasks are going fast, or 745 hasn't been fully released yet. And the trickle program has stopped running. I'll go and see if anyone's home. |
Send message Joined: 15 May 09 Posts: 4536 Credit: 18,993,249 RAC: 21,753 |
Front page says batch 745 is 20,000 (wow), but the status page says 10,000, and I think a lot of those are batch 742. Now wondering if I misread the numbers when I said 1,000, or if that's all there were when I looked. It wasn't up on the front page then, so any misreading would have been in my looking at the workunits. |
Send message Joined: 15 May 09 Posts: 4536 Credit: 18,993,249 RAC: 21,753 |
Batch 746: EUR25 2010-2016 with 10newPP 8,600 simulations. (from front page.) |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,803,756 RAC: 5,187 |
[Les Bayliss wrote:] Front page says batch 745 is 20,000 (wow), but the status page says 10,000, and I think a lot of those are batch 742. Split batch is what I'm seeing too: batch list. More to come if the front page total is right. |
Send message Joined: 16 Jan 10 Posts: 1084 Credit: 7,803,756 RAC: 5,187 |
Some long models just added: batch #747 PNW at 25 km for 121 months (2000) and 61 months (1000) (batch list). |
Send message Joined: 15 May 09 Posts: 4536 Credit: 18,993,249 RAC: 21,753 |
Batch 748: 200 Hadcm3s tasks, and batch 749: 420 Hadcm3s tasks. And THEY RUN ON LINUX!!! Well, they start at least; mine are 5 minutes in with no problems so far. Edit: all four have checkpointed. Will report back when they have been going a bit longer. And they won't last long! |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
Batch 748 200 Hadcm3s tasks and batch 749 420 Hadcm3s tasks And Perhaps they do, but since I get so few work units (none lately) my BOINC client now queries the server only once every three days, so unless three days' worth of Linux-worthy work units turn up, I am unlikely to get any. |
Send message Joined: 15 May 09 Posts: 4536 Credit: 18,993,249 RAC: 21,753 |
And the server status page is showing just one left now, though if, as I suspect, these are Linux or Linux and Mac only, some will come back in because of people without the 32-bit libs installed. Mine are running at just over 3.5 hours/1%. Currently a bit over 0.6% completed. The danger point when some batches have failed is just before completion of the first zip. |
Send message Joined: 16 Jun 05 Posts: 16 Credit: 19,439,110 RAC: 9,127 |
And server status page showing just one left now, though if as I suspect these are Linux or Linux and Mac only, some will come back in because of people without the 32bit libs installed. Mine are running at just over 3.5 hours/1%. Currently a bit over 0.6% completed. The danger point when some batches have failed is just before completion of first zip. I got one of the Linux WUs on my 64-bit Linux machine and it seems to be running fine. There is a lot of "HOW TO run 32-bit dynamic apps on 64-bit Linux" information about making sure that a 64-bit installation has the right 32-bit libraries. A pretty easy way to check that the right 32-bit libraries are installed seems to be to write a small 32-bit test app that needs the same libraries. The KEY would be the build: compile as 32-bit to "a.out" with a command line that forces the correct libraries to be present:
    g++ h.cpp -m32 -lpthread -ldl -lstdc++ -lm -lgcc_s -lc -lz -lnsl
Example of any C++ program (Hello World), h.cpp:
    cat h.cpp
    #include <iostream>
    using namespace std;
    int main (int argc, char** argv)
    {
        cout << "Hello world!" << endl;
        return 0;
    }
32-bit libraries I needed for my 32-bit creation to say "Hello World":
    ldd a.out
        linux-gate.so.1 (0xf7fcd000)
        libpthread.so.0 => /lib/libpthread.so.0 (0xf7f7d000)
        libdl.so.2 => /lib/libdl.so.2 (0xf7f78000)
        libstdc++.so.6 => /lib/libstdc++.so.6 (0xf7df3000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xf7dd6000)
        libz.so.1 => /lib/libz.so.1 (0xf7dbd000)
        libnsl.so.1 => /lib/libnsl.so.1 (0xf7da2000)
        libm.so.6 => /lib/libm.so.6 (0xf7ca8000)
        libc.so.6 => /lib/libc.so.6 (0xf7b10000)
        /lib/ld-linux.so.2 (0xf7fcf000)
Notice the same 32-bit libraries I needed for the CPDN application:
    ldd *gnu *gnu.so
    hadcm3s_8.34_i686-pc-linux-gnu:
        linux-gate.so.1 (0xf7fd1000)
        libpthread.so.0 => /lib/libpthread.so.0 (0xf7f81000)
        libdl.so.2 => /lib/libdl.so.2 (0xf7f7c000)
        libstdc++.so.6 => /lib/libstdc++.so.6 (0xf7df7000)
        libm.so.6 => /lib/libm.so.6 (0xf7cfd000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xf7ce0000)
        libc.so.6 => /lib/libc.so.6 (0xf7b48000)
        /lib/ld-linux.so.2 (0xf7fd3000)
    hadcm3s_um_8.34_i686-pc-linux-gnu:
        linux-gate.so.1 (0xf7f69000)
        libdl.so.2 => /lib/libdl.so.2 (0xf7f33000)
        libm.so.6 => /lib/libm.so.6 (0xf7e39000)
        libpthread.so.0 => /lib/libpthread.so.0 (0xf7e1a000)
        libc.so.6 => /lib/libc.so.6 (0xf7c82000)
        /lib/ld-linux.so.2 (0xf7f6b000)
    hadcm3s_se_8.34_i686-pc-linux-gnu.so:
        linux-gate.so.1 (0xf7f2d000)
        libz.so.1 => /lib/libz.so.1 (0xf7e59000)
        libnsl.so.1 => /lib/libnsl.so.1 (0xf7e3e000)
        libstdc++.so.6 => /lib/libstdc++.so.6 (0xf7cb9000)
        libm.so.6 => /lib/libm.so.6 (0xf7bbf000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xf7ba2000)
        libc.so.6 => /lib/libc.so.6 (0xf7a0a000)
        /lib/ld-linux.so.2 (0xf7f2f000) |
Send message Joined: 15 Dec 12 Posts: 8 Credit: 535,242 RAC: 0 |
It's been quite a while, but for the first time since the meltdown, my iMac has gotten three projects to run that will take 2 days and 22.5 hrs. Keep them coming! |
Send message Joined: 15 May 09 Posts: 4536 Credit: 18,993,249 RAC: 21,753 |
Just noticed, the 748's are only 12 months while the 749's are 120 months. |
Send message Joined: 17 Jan 09 Posts: 124 Credit: 2,025,353 RAC: 2,666 |
Interesting: before the Great Crash (CPDN's, not the stock market's) we had so many users active that getting a WU was a prize. We have lost so many active users that WUs are lying in the system begging to be taken. How times change. Bill F Dallas TX In October 1969 I took an oath to support and defend the Constitution of the United States against all enemies, foreign and domestic; There was no expiration date. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Get them while you can. Things are about to change. :) And something is seriously wrong with your i5-5200U. Perhaps it's still using the "training wheels" setting? |
Send message Joined: 3 Sep 04 Posts: 126 Credit: 26,610,380 RAC: 3,377 |
Some information in the batch list about problems with a batch would be useful, e.g. what to do with it or a link to a message about it. |
Send message Joined: 15 May 09 Posts: 4536 Credit: 18,993,249 RAC: 21,753 |
Some information in the batch list about problems with a batch would be useful, e.g. what to do with it or a link to a message about it. Is there a specific batch you are having problems with? If so some of us may be able to respond with some more information. |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 30,967,615 RAC: 14,422 |
Certainly a lot of computing error failures with batches 738 and 742, if that helps. There is an error thread further down in Number crunching. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Batch 738 had a setup error (INANCILA, which means mismatched data files). A message was posted to abort them. ****************** Batch 742, the sam25's. Yes, there were a lot of failures with these. The project person checked everything and couldn't find anything wrong. And I had run several that had already failed elsewhere, some with 1 earlier failure and some with 2, and there weren't any problems. So we decided it was most likely just people's computers, and a sensitive modeling area. And I've since run lots of sam25's that have failed on other computers, all with no problems. Possibly a lot of people were/are running BOINC with the default "training wheels" settings. This apparently works well with other projects, but is a computing hazard with CPDN. |
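
For anyone wanting to automate the 32-bit library check described in the 64-bit Linux post earlier in this thread, here is a rough sketch rather than anything official from the project. It relies only on the fact that `ldd` marks an unresolved library with `=> not found`. The function names `missing_libs` and `check_binaries` are made up for this example, and the script assumes bash with `awk` and `ldd` available.

```bash
#!/usr/bin/env bash
# Sketch only: list shared libraries that ldd reports as "not found".
# missing_libs and check_binaries are hypothetical helper names.

# Read ldd-style output on stdin and print the name of every library
# whose line contains "=> not found" (ldd's marker for an unresolved library).
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Run ldd on each file given as an argument and report unresolved libraries.
# Returns non-zero if any file has a missing library or cannot be inspected.
check_binaries() {
    local status=0 f out rc missing
    for f in "$@"; do
        out=$(ldd "$f" 2>&1)
        rc=$?
        missing=$(printf '%s\n' "$out" | missing_libs)
        if [ -n "$missing" ]; then
            echo "$f is missing:"
            printf '%s\n' "$missing" | sed 's/^/    /'
            status=1
        elif [ "$rc" -ne 0 ]; then
            echo "$f: ldd could not inspect this file"
            status=1
        else
            echo "$f: all libraries resolved"
        fi
    done
    return "$status"
}
```

To use it, source the file and run `check_binaries *gnu *gnu.so` from the directory that holds the application files, matching the `ldd *gnu *gnu.so` command in that post. Any library it lists would need its 32-bit package installed; the package names differ between distributions, so they are not guessed at here.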
©2024 cpdn.org