Message boards : Number crunching : The old -131 (file size too big) shows up again
Joined: 3 Sep 04 Posts: 105 Credit: 5,646,090 RAC: 102,785
It does annoy me that this error crashes the model right at the end. In this case a 20 day jobbie. What a waste. This one: https://www.cpdn.org/result.php?resultid=22153705. The other 3 finished OK.
Joined: 15 May 09 Posts: 4541 Credit: 19,039,635 RAC: 18,944
"It does annoy me that this error crashes the model right at the end. In this case a 20 day jobbie. What a waste."

I raised this with Sarah after getting the same error on one of either #920 or #921, and I think the file size limit was increased on #922. If you have a reasonably fast connection and internet access isn't turned off as the task nears its end, the zip gets uploaded before the file size check runs when the task actually finishes, and you don't get a problem. If other files are already uploading when the zip is created, that can cause the problem on my broadband. I went through all of mine from those batches and added a 0 to the end of the file size in client_state.xml, but that intervention is at the user's own risk.
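A minimal sketch of that "add a 0" edit, at the user's own risk and under stated assumptions: the BOINC client is shut down first, the data directory path is the Linux default, and the old limit is the 150000000.000000 value quoted later in this thread. The path and backup name are illustrative, not the project's instructions.

```python
# Rough sketch of the "add a 0" edit: append a zero to the upload-size limit
# in client_state.xml.  Assumptions: BOINC client stopped first, Linux default
# data directory, old limit as quoted elsewhere in this thread.
import shutil

STATE_FILE = "/var/lib/boinc-client/client_state.xml"  # assumed location
OLD = "<max_nbytes>150000000.000000</max_nbytes>"
NEW = "<max_nbytes>1500000000.000000</max_nbytes>"      # one extra zero

shutil.copy(STATE_FILE, STATE_FILE + ".bak")            # keep a backup

with open(STATE_FILE) as f:
    text = f.read()

count = text.count(OLD)
with open(STATE_FILE, "w") as f:
    f.write(text.replace(OLD, NEW))

print(f"Raised the limit on {count} file entries")
```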
Joined: 3 Sep 04 Posts: 105 Credit: 5,646,090 RAC: 102,785
I thought these errors had been fixed, but there were 2 work units finishing at the same time and the uploads were slow. I know there is a fix to change the xml file, but it requires a work unit restart(?) and we all know how hit & miss that can be. I didn't even look to edit any files. Why can't the work units be sent out with a large file size set as default? I'm sure there is a technical reason.
Joined: 15 May 09 Posts: 4541 Credit: 19,039,635 RAC: 18,944
"I thought these errors had been fixed, but there were 2 work units finishing at the same time and the uploads were slow. I know there is a fix to change the xml file, but it requires a work unit restart(?) and we all know how hit & miss that can be. I didn't even look to edit any files. Why can't the work units be sent out with a large file size set as default? I'm sure there is a technical reason."

Certainly batch #924 has the line <max_nbytes>200000000.000000</max_nbytes> for the zip files, compared with 150000000.000000 for #920 and #921, so I think the limit has been increased for new batches since I raised it with the project. And I have changed my initial reply to reflect the fact that it is client_state.xml, not cc_config.xml, that I edited. Doubtless as computers get faster, the file sizes will increase still further as more complex models are computed. (In testing, OpenIFS tasks have had uploads of over 500 MB recently.)
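For anyone who wants to see which limit their queued tasks actually carry before editing anything, a small read-only sketch (same assumed data directory as above) that just prints every max_nbytes line in client_state.xml:

```python
# Read-only check: list the <max_nbytes> values currently in client_state.xml
# so you can see whether your tasks carry the old 150 MB or the newer 200 MB
# limit.  The data directory path is an assumption; adjust to your install.
STATE_FILE = "/var/lib/boinc-client/client_state.xml"

with open(STATE_FILE) as f:
    for line in f:
        if "<max_nbytes>" in line:
            print(line.strip())
```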
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0
As Dave said, the limit was increased some time ago, so all but the really slow connections should be OK. But then, it depends on the computer: one that's running 64 tasks at once, all finishing at about the same time, can STILL have problems, even with a very large limit. With a slow connection, I'd suggest that you suspend all except one task, wait until it's had a few hours' head start, then resume them one at a time, allowing a few hours before the next one. That way they won't all finish at once. Also, that computer needs lots more memory for this new generation of models: 2-3 GB per core.
Joined: 7 Aug 04 Posts: 2187 Credit: 64,822,615 RAC: 5,275
While by far most of the hadam4h N216 models in the queue to be sent out are from batches 922 to 925, there are still quite a few from 920 and 921. They can't change the xml files on the server for batches already issued, so they can't fix this for 920 and 921 server-side.
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0
I've just remembered something, vaguely, about the more recent batches having a check on the memory a computer has before download. The batches in the 900s need lots more than the ones from a year ago. If this has been implemented, it may explain why that computer is getting all old tasks, with the smaller upload limit, and not the more recent ones.
Joined: 15 May 09 Posts: 4541 Credit: 19,039,635 RAC: 18,944
"If this has been implemented, it may explain why that computer is getting all old tasks, with the smaller upload limit, and not the more recent ones."

I think it would pass the test for sufficient memory: before my laptop died it could still get 4 OpenIFS tasks despite only having 8 GB of RAM. The test only checks whether there is enough memory to run one task, not whether there is enough to run tasks on all cores. Or that is my understanding at any rate.
Joined: 3 Sep 04 Posts: 105 Credit: 5,646,090 RAC: 102,785
Well, the machine might only have 12 GB of memory, but with 4 models running it still uses very little swap space. I try to fill the other 4 threads with low impact tasks such as COVID tasks from WCG or other projects. If I run 4 ARP & climate tasks at once it all grinds to a slow crawl, but still little swap is used. Just 4 models at once seems OK. I know an i7 processor is only a glorified i5 processor..... once upon a time a P4 was the mighty processor. I still have a couple of P75 machines.....

Regardless of the size of the processor etc., why can't the file size be set at the maximum size? Upload speed can be slow for a variety of reasons.

Anyway..... what's a decent upgrade? A Ryzen 7/motherboard combo? We always want more speed.
Joined: 15 May 09 Posts: 4541 Credit: 19,039,635 RAC: 18,944
"Regardless of the size of the processor etc., why can't the file size be set at the maximum size? Upload speed can be slow for a variety of reasons."

As has been said, the limit has been increased on newer batches.

"Anyway..... what's a decent upgrade? A Ryzen 7/motherboard combo? We always want more speed."

I went for a Ryzen 7 with 32 GB of RAM. I am now thinking of swapping that 32 GB for 64 GB of slightly faster RAM, because while they have yet to make it out of the testing branch of the project, OpenIFS tasks are using over 5 GB of memory per task. My upgrade was from a Core 2 Duo desktop; the laptop was 4 cores with 8 GB of RAM. Running more than 5 N216 tasks still results in a slowdown as that reaches the limit of the CPU cache memory. It will be interesting to see how much faster RAM mitigates that slowdown.
Joined: 3 Sep 04 Posts: 105 Credit: 5,646,090 RAC: 102,785
I used to take the matching of CPU to motherboard and type of RAM very seriously. This was many years ago, when a P3 700 MHz Slot 2 Xeon with 2 MB of L2 cache was the biz. I still have that machine (dual Slot 2), but today it's utterly useless for ..... well, anything. Fedora Core 4 is not quite leading edge anymore. Nowadays I am content to copy what others do. I have spent a small amount of time looking at the various Ryzen setups. It would be nice to have a reasonably modern bit of kit for a while. I doubt I will be around long enough to see it consigned to the trash can like the dual Slot 2.
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0
I suspect that your main problem is your connection speed to the internet. If it is slow, then the collection of files at the end gets clogged up trying to upload. You should get about half an hour from the time the last big zip is created until the rest start showing up.