Message boards : Number crunching : Missing Zip file 13?
Author | Message |
---|---|
Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
I just completed WU Hadam3p_eu_2tlp_1996_1_008161279_0. I know that one of the upload servers is down. Zip files 8, 9, 10, 11 and 12 are backed up in the transfer tab and will remain there until the upload server is back online. That's not my problem. The problem (if there is one) is that I can't find any trace of zip file 13. It is not in the transfer tab, and I can't find any trace of it in the event log either. Did it somehow get lost? I know that this is a very important file, as it contains the dump that allows the server to generate the next segment of the model. I have a recent backup, so I can do a restore and run it to the end again if that would create the missing file. |
Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
This is the event log for the above posting. As you can see, zip files 11 and 12 were created and tried to upload but couldn't, because of the non-working upload server. What I don't see is any mention of zip 13. The WU is still listed at 100% and "ready to report".
hadam3p_eu_2tlp_1996_1_008161279_0_11.zip
9/2/2012 12:29:37 PM | climateprediction.net | Started upload of hadam3p_eu_2tlp_1996_1_008161279_0_12.zip
9/2/2012 12:29:42 PM | climateprediction.net | Restarting task hadcm3n_u3sm_1980_40_008026584_4 using hadcm3n version 607 in slot 3
9/2/2012 12:29:42 PM | climateprediction.net | Restarting task hadam3p_eu_9c98_1960_1_008138976_0 using hadam3p_eu version 609 in slot 0
9/2/2012 12:29:59 PM | climateprediction.net | Temporarily failed upload of hadam3p_eu_2tlp_1996_1_008161279_0_11.zip: connect() failed
9/2/2012 12:29:59 PM | climateprediction.net | Backing off 1 hr 17 min 18 sec on upload of hadam3p_eu_2tlp_1996_1_008161279_0_11.zip
9/2/2012 12:29:59 PM | climateprediction.net | Temporarily failed upload of hadam3p_eu_2tlp_1996_1_008161279_0_12.zip: connect() failed
9/2/2012 12:29:59 PM | climateprediction.net | Backing off 16 min 52 sec on upload of hadam3p_eu_2tlp_1996_1_008161279_0_12.zip
9/2/2012 12:30:07 PM | | Project communication failed: attempting access to reference site
9/2/2012 12:30:09 PM | | Internet access OK - project servers may be temporarily down.
9/2/2012 12:37:13 PM | | Suspending network activity - user request |
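Scanning a long event log by eye for one file name is easy to get wrong, so a short script can list which zip indices are mentioned at all. This is only a minimal sketch: it assumes the log lines have been copied out as plain text in the format shown above, and `zip_mentions` is a hypothetical helper, not part of BOINC.

```python
import re
from collections import defaultdict


def zip_mentions(log_lines, task):
    """Map each zip index seen for `task` to the log lines that mention it.

    `task` is the task name without the trailing _N.zip part.
    Matching ignores case, since the task name may be typed with
    different capitalisation than the log uses.
    """
    pattern = re.compile(re.escape(task) + r"_(\d+)\.zip", re.IGNORECASE)
    found = defaultdict(list)
    for line in log_lines:
        for match in pattern.finditer(line):
            found[int(match.group(1))].append(line.strip())
    return dict(found)


# Two lines taken from the log excerpt in this thread.
sample = [
    "9/2/2012 12:29:37 PM | climateprediction.net | Started upload of "
    "hadam3p_eu_2tlp_1996_1_008161279_0_12.zip",
    "9/2/2012 12:29:59 PM | climateprediction.net | Temporarily failed upload of "
    "hadam3p_eu_2tlp_1996_1_008161279_0_11.zip: connect() failed",
]

mentions = zip_mentions(sample, "Hadam3p_eu_2tlp_1996_1_008161279_0")

# Indices 1 to 13 are assumed from the zip numbering discussed in this thread.
missing = [i for i in range(1, 14) if i not in mentions]
```

Run against the whole log rather than a short excerpt, an index that still shows up in `missing` was never mentioned in the text that was searched. That alone does not prove the file was never created, since older log entries may have scrolled out of the saved copy.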
Joined: 1 Jan 07 Posts: 1061 Credit: 36,719,896 RAC: 7,946 |
The _13.zip files for EU models are sent to a different upload server from the _1 to _12 files. That server is still running, and my _13 files have uploaded. If you are running a recent version of BOINC, it will try every new file transfer once (upload or download), even if other files and the project as a whole are in transfer backoff. That's designed to cope with exactly the situation we're in this morning, with one upload server out of action but others working. How thoroughly did you search your message log? I'd expect that the _13 file uploaded some time before the section you posted, since it's generated before the task reaches completion. On the other hand, I'm surprised that you say the task is 'ready to report'. Mine (with _13 uploaded but earlier files stuck) are still showing as 'uploading'. That sounds as if the model possibly crashed during the last few minutes. Try to avoid reporting it before the upload server is fixed (supposed to be today, if the delivery arrives in time). Then you can complete the uploads and look at the outcome afterwards. |
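The "try every new transfer once" rule described above can be pictured roughly as follows. This is an illustrative sketch of the behaviour as described in this post, not BOINC's actual code; the `Transfer` class, its fields, and `should_attempt` are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    name: str
    attempts: int = 0        # how many times this file has been tried so far
    next_retry: float = 0.0  # per-file backoff deadline, in seconds


def should_attempt(transfer, now, project_backoff_until):
    """Decide whether to start a transfer right now (assumed policy)."""
    if transfer.attempts == 0:
        # A brand-new transfer gets one try even while the project
        # as a whole is in transfer backoff.
        return True
    # A transfer that has already failed waits for both its own
    # backoff and the project-wide backoff to expire.
    return now >= max(transfer.next_retry, project_backoff_until)


new_file = Transfer("hadam3p_eu_2tlp_1996_1_008161279_0_13.zip")
stuck_file = Transfer(
    "hadam3p_eu_2tlp_1996_1_008161279_0_12.zip", attempts=1, next_retry=1000.0
)

try_new_now = should_attempt(new_file, now=0.0, project_backoff_until=3600.0)
try_stuck_now = should_attempt(stuck_file, now=0.0, project_backoff_until=3600.0)
try_stuck_later = should_attempt(stuck_file, now=4000.0, project_backoff_until=3600.0)
```

Under these assumptions the new _13 file is attempted straight away and can reach its own working server, while the _12 file that has already failed stays queued until the backoff periods run out.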
©2024 cpdn.org