Message boards : Number crunching : OpenIFS Discussion
Author | Message |
---|---|
Send message Joined: 29 Nov 17 Posts: 82 Credit: 14,387,344 RAC: 91,190 |
Task will get full credit. A few days ago Glenn posted an explanation about tasks that finish successfully but appear to fail. I will see if I can find it later. From what I recall, I didn't read it carefully enough to fully understand it. Glenn said this in post https://www.cpdn.org/forum_thread.php?id=9162&postid=66949#66949 : Agreed. I've asked CPDN if there is a way of getting the server to check the upload was received OK to reclassify this as a success. It may not be easy as the uploads go to a cloud server first. Not my expertise. |
Send message Joined: 20 Dec 20 Posts: 13 Credit: 40,045,863 RAC: 9,755 |
Thank you, Dave and PDW. Kali. |
Send message Joined: 4 Oct 15 Posts: 34 Credit: 9,075,151 RAC: 374 |
It seems like there is enough work for the rest of the year available ;) Merry Christmas Felix |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 30,960,988 RAC: 14,084 |
Getting transient HTTP message: Sat 24 Dec 2022 17:54:15 GMT | climateprediction.net | Temporarily failed upload of oifs_43r3_ps_0873_1981050100_123_950_12167517_0_r1054526626_61.zip: transient HTTP error Sat 24 Dec 2022 17:54:15 GMT | climateprediction.net | Backing off 00:02:50 on upload of oifs_43r3_ps_0873_1981050100_123_950_12167517_0_r1054526626_61.zip Sat 24 Dec 2022 17:54:15 GMT | climateprediction.net | Started upload of oifs_43r3_ps_0873_1981050100_123_950_12167517_0_r1054526626_62.zip Sat 24 Dec 2022 17:54:16 GMT | | Internet access OK - project servers may be temporarily down. Guess that's it for the next few days because of the Hols. Network activity suspended and tasks reduced to 1 until I hear otherwise. Happy Xmas everyone. |
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,696,681 RAC: 10,226 |
If the network holds up ... All my uploads are timing out, since about 17:23 - tracert gets no further than 11 18 ms 18 ms 18 ms ral-r26.ja.net [146.97.41.34] 12 * * * Request timed out. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
All my uploads are timing out, Mine too. But my traceroute seems to work OK. Problem seems to have started: Sat 24 Dec 2022 12:24:01 PM EST | climateprediction.net | Started upload of oifs_43r3_ps_0438_1998050100_123_967_12184082_1_r1277369987_67.zip Sat 24 Dec 2022 12:24:10 PM EST | climateprediction.net | Computation for task oifs_43r3_ps_0447_1995050100_123_964_12181091_0 finished Sat 24 Dec 2022 12:24:10 PM EST | climateprediction.net | Starting task oifs_43r3_ps_0257_2002050100_123_971_12187901_0 Sat 24 Dec 2022 12:24:21 PM EST | climateprediction.net | Started upload of oifs_43r3_ps_0160_1996050100_123_965_12181804_0_r1040728371_122.zip Sat 24 Dec 2022 12:26:03 PM EST | | Project communication failed: attempting access to reference site Sat 24 Dec 2022 12:26:03 PM EST | climateprediction.net | Temporarily failed upload of oifs_43r3_ps_0438_1998050100_123_967_12184082_1_r1277369987_67.zip: transient HTTP error Sat 24 Dec 2022 12:26:03 PM EST | climateprediction.net | Backing off 00:02:10 on upload of oifs_43r3_ps_0438_1998050100_123_967_12184082_1_r1277369987_67.zip Sat 24 Dec 2022 12:26:03 PM EST | climateprediction.net | Started upload of oifs_43r3_ps_0144_2001050100_123_970_12186788_0_r1420963935_107.zip Sat 24 Dec 2022 12:26:05 PM EST | | Internet access OK - project servers may be temporarily down. 
Sat 24 Dec 2022 12:26:21 PM EST | climateprediction.net | Computation for task oifs_43r3_ps_0160_1996050100_123_965_12181804_0 finished Sat 24 Dec 2022 12:26:21 PM EST | climateprediction.net | Starting task oifs_43r3_ps_0675_2002050100_123_971_12188319_0 Sat 24 Dec 2022 12:26:23 PM EST | | Project communication failed: attempting access to reference site Sat 24 Dec 2022 12:26:23 PM EST | climateprediction.net | Temporarily failed upload of oifs_43r3_ps_0160_1996050100_123_965_12181804_0_r1040728371_122.zip: transient HTTP error Sat 24 Dec 2022 12:26:23 PM EST | climateprediction.net | Backing off 00:02:08 on upload of oifs_43r3_ps_0160_1996050100_123_965_12181804_0_r1040728371_122.zip Sat 24 Dec 2022 12:26:24 PM EST | | Internet access OK - project servers may be temporarily down. $ traceroute 146.97.41.34 traceroute to 146.97.41.34 (146.97.41.34), 30 hops max, 60 byte packets 1 Fios_Quantum_Gateway.fios-router.home (192.168.0.1) 0.341 ms 0.441 ms 1.725 ms 2 lo0-100.NWRKNJ-VFTTP-309.verizon-gni.net (71.127.205.1) 4.126 ms 6.555 ms 8.083 ms 3 at-0-0-0-1717.ALT2-CORE-RTR2.verizon-gni.net (100.41.5.70) 10.021 ms 9.159 ms 10.106 ms 4 0.csi1.NBWKNJNB-MSE01-BB-SU1.ALTER.NET (140.222.4.106) 11.529 ms 0.csi1.NWRKNJ02-MSE01-BB-SU1.ALTER.NET (140.222.4.104) 11.727 ms 0.csi1.NBWKNJNB-MSE01-BB-SU1.ALTER.NET (140.222.4.106) 11.615 ms 5 * * * 6 * * * 7 nyk-b2-link.ip.twelve99.net (80.239.192.36) 6.703 ms 6.880 ms 6.784 ms 8 nyk-bb2-link.ip.twelve99.net (62.115.135.162) 9.091 ms 6.156 ms 6.017 ms 9 ldn-bb4-link.ip.twelve99.net (62.115.112.245) 75.917 ms ldn-bb1-link.ip.twelve99.net (62.115.113.21) 78.821 ms ldn-bb4-link.ip.twelve99.net (62.115.112.245) 78.562 ms 10 ldn-b2-link.ip.twelve99.net (62.115.122.189) 83.447 ms ldn-b2-link.ip.twelve99.net (62.115.120.239) 78.660 ms ldn-b2-link.ip.twelve99.net (62.115.122.189) 83.480 ms 11 jisc-ic345131-ldn-b2.ip.twelve99-cust.net (62.115.175.131) 80.799 ms 81.055 ms 77.332 ms 12 ae24.londhx-sbr1.ja.net (146.97.35.197) 75.097 ms 73.519 ms 
78.504 ms 13 ae29.londpg-sbr2.ja.net (146.97.33.2) 76.043 ms 78.558 ms 76.892 ms 14 ral-r26.ja.net (146.97.41.34) 79.165 ms 79.091 ms 76.619 ms |
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,696,681 RAC: 10,226 |
But somehow, we need to bridge the gap between .ja.net (I think that's the UK's "Joint Academic Network") and upload11.cpdn.org |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
There is a big gap there... There is a big delay in crossing the ocean from here in North America to Europe, although I very much doubt the delay has anything to do with the present problem. Is that by cable or by satellite? Thank gawd they are no longer putting rolls of magnetic tape on airplanes as they did a few decades ago. Good enough for e-mail, I guess. $ traceroute upload11.cpdn.org traceroute to upload11.cpdn.org (192.171.169.187), 30 hops max, 60 byte packets 1 Fios_Quantum_Gateway.fios-router.home (192.168.0.1) 0.364 ms 0.509 ms 0.641 ms 2 lo0-100.NWRKNJ-VFTTP-309.verizon-gni.net (71.127.205.1) 7.899 ms 5.678 ms 7.925 ms 3 at-0-0-0-1717.ALT2-CORE-RTR2.verizon-gni.net (100.41.5.70) 8.002 ms at-0-0-0-1716.ALT2-CORE-RTR1.verizon-gni.net (100.41.5.68) 10.877 ms 8.141 ms 4 0.csi1.NWRKNJ02-MSE01-BB-SU1.ALTER.NET (140.222.4.104) 11.025 ms 0.csi1.NBWKNJNB-MSE01-BB-SU1.ALTER.NET (140.222.4.106) 10.587 ms 0.csi1.NWRKNJ02-MSE01-BB-SU1.ALTER.NET (140.222.4.104) 10.918 ms 5 * * * 6 * * * 7 nyk-b2-link.ip.twelve99.net (80.239.192.36) 6.520 ms 6.612 ms 6.688 ms 8 * nyk-bb2-link.ip.twelve99.net (62.115.135.162) 8.858 ms nyk-bb1-link.ip.twelve99.net (62.115.135.160) 11.298 ms 9 * ldn-bb1-link.ip.twelve99.net (62.115.113.21) 77.185 ms 79.592 ms 10 ldn-b2-link.ip.twelve99.net (62.115.120.239) 76.907 ms ldn-b2-link.ip.twelve99.net (62.115.122.189) 82.196 ms ldn-b2-link.ip.twelve99.net (62.115.120.239) 76.953 ms 11 jisc-ic345131-ldn-b2.ip.twelve99-cust.net (62.115.175.131) 82.019 ms 79.404 ms 81.916 ms 12 ae24.londhx-sbr1.ja.net (146.97.35.197) 74.691 ms 74.754 ms 77.150 ms 13 ae29.londpg-sbr2.ja.net (146.97.33.2) 77.274 ms 75.398 ms 74.099 ms 14 ae31.erdiss-sbr2.ja.net (146.97.33.22) 81.635 ms 79.846 ms 82.346 ms 15 * * * 16 ral-r26.ja.net (146.97.41.34) 82.361 ms 79.864 ms 79.845 ms 17 * * * 18 * * * 19 * * * 20 * * * 21 * * * 22 * * * 23 * * * 24 * * * 25 * * * 26 * * * 27 * * * 28 * * * 29 * * * 30 * * * |
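A minimal sketch of how the point where a trace goes silent could be picked out automatically, assuming the traceroute output is available one hop per line (the posts above show it flattened). The function name `last_responding_hop` and the sample list are hypothetical; the sample hops are copied from the trace to upload11.cpdn.org above.

```python
import re

# A hop line starts with the hop number, followed by the rest of the line.
HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")
# A responding hop shows at least one "hostname (a.b.c.d)" pair.
HOST_RE = re.compile(r"(\S+) \((\d{1,3}(?:\.\d{1,3}){3})\)")


def last_responding_hop(lines):
    """Return (hop_number, hostname, ip) of the last hop that answered, or None."""
    last = None
    for line in lines:
        hop = HOP_RE.match(line)
        if not hop:
            continue  # header line or anything that is not a numbered hop
        host = HOST_RE.search(hop.group(2))
        if host:
            last = (int(hop.group(1)), host.group(1), host.group(2))
    return last


# Sample hops taken from the traceroute in the post above.
sample = [
    "traceroute to upload11.cpdn.org (192.171.169.187), 30 hops max, 60 byte packets",
    "14 ae31.erdiss-sbr2.ja.net (146.97.33.22) 81.635 ms 79.846 ms 82.346 ms",
    "15 * * *",
    "16 ral-r26.ja.net (146.97.41.34) 82.361 ms 79.864 ms 79.845 ms",
    "17 * * *",
    "18 * * *",
]

result = last_responding_hop(sample)
print(result)
```

A hop that never answers is not proof of a fault on its own, since many routers simply drop traceroute probes; the sketch only reports where replies stop.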
Send message Joined: 15 May 09 Posts: 4535 Credit: 18,989,107 RAC: 21,788 |
Email sent, but I do not see why Andy should sort this on Christmas Day! It may well be the new year before it gets sorted. |
Send message Joined: 15 May 09 Posts: 4535 Credit: 18,989,107 RAC: 21,788 |
From Andy Hi Dave, |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
It does persist. |
Send message Joined: 15 May 09 Posts: 4535 Credit: 18,989,107 RAC: 21,788 |
It does persist. The message in the event log has changed from "Internet access OK..." to "transient HTTP error". This suggests something may have changed and it is now the server getting hammered that is causing a problem. Edit: Maybe I spoke too soon. The "project servers" message appears eventually, though users in past 24 hours has gone up from 0 to 1, so maybe someone has gotten a task through. Will try again in an hour or so. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
Message in event log has changed from, "internet access ok...." to transient http error" This suggests something may have changed and it is now the server getting hammered that is causing a problem. I get both. N.B.: I am in EST time zone, Mon 26 Dec 2022 11:12:48 AM EST | climateprediction.net | Started upload of oifs_43r3_ps_0514_2006050100_123_975_12192158_0_r1364261494_0.zip Mon 26 Dec 2022 11:12:48 AM EST | climateprediction.net | Started upload of oifs_43r3_ps_0438_1998050100_123_967_12184082_1_r1277369987_69.zip Mon 26 Dec 2022 11:14:49 AM EST | | Project communication failed: attempting access to reference site Mon 26 Dec 2022 11:14:49 AM EST | climateprediction.net | Temporarily failed upload of oifs_43r3_ps_0514_2006050100_123_975_12192158_0_r1364261494_0.zip: transient HTTP error Mon 26 Dec 2022 11:14:49 AM EST | climateprediction.net | Backing off 01:43:50 on upload of oifs_43r3_ps_0514_2006050100_123_975_12192158_0_r1364261494_0.zip Mon 26 Dec 2022 11:14:49 AM EST | climateprediction.net | Temporarily failed upload of oifs_43r3_ps_0438_1998050100_123_967_12184082_1_r1277369987_69.zip: transient HTTP error Mon 26 Dec 2022 11:14:49 AM EST | climateprediction.net | Backing off 00:19:18 on upload of oifs_43r3_ps_0438_1998050100_123_967_12184082_1_r1277369987_69.zip Mon 26 Dec 2022 11:14:51 AM EST | | Internet access OK - project servers may be temporarily down. |
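For anyone wanting to see how long each stuck upload is being held back, here is a rough sketch that pulls the back-off durations out of event log lines like the ones above. It assumes the log is available one message per line and that the back-off is printed as HH:MM:SS; the names `BACKOFF_RE` and `parse_backoffs` are made up for illustration and are not part of BOINC.

```python
import re
from datetime import timedelta

# Matches e.g. "Backing off 01:43:50 on upload of <file>.zip"
# (HH:MM:SS layout is an assumption based on the log lines quoted above).
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")


def parse_backoffs(log_lines):
    """Map each upload file name to its most recently logged back-off."""
    backoffs = {}
    for line in log_lines:
        match = BACKOFF_RE.search(line)
        if match:
            hours, minutes, seconds = (int(x) for x in match.group(1, 2, 3))
            backoffs[match.group(4)] = timedelta(
                hours=hours, minutes=minutes, seconds=seconds
            )
    return backoffs


# Two lines copied from the event log in the post above.
sample = [
    "Mon 26 Dec 2022 11:14:49 AM EST | climateprediction.net | Backing off 01:43:50 on upload of oifs_43r3_ps_0514_2006050100_123_975_12192158_0_r1364261494_0.zip",
    "Mon 26 Dec 2022 11:14:49 AM EST | climateprediction.net | Backing off 00:19:18 on upload of oifs_43r3_ps_0438_1998050100_123_967_12184082_1_r1277369987_69.zip",
]

backoffs = parse_backoffs(sample)
for name, delay in backoffs.items():
    print(name, delay)
```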
Send message Joined: 15 May 09 Posts: 4535 Credit: 18,989,107 RAC: 21,788 |
Will confirm to Andy nothing moving in the morning. |
Send message Joined: 2 Oct 19 Posts: 21 Credit: 47,674,094 RAC: 24,265 |
Will confirm to Andy nothing moving in the morning. That would be great. I have a total backlog of 13,315 files of 14.5 MB each to upload from 4 computers. That's around 193 GB. |
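The backlog figure can be checked with a couple of lines. This is only a back-of-the-envelope sketch using the numbers from the post above; it assumes decimal units (1 GB = 1000 MB), and `backlog_gb` is a hypothetical helper name.

```python
# Figures from the post above; decimal units (1 GB = 1000 MB) are an assumption.
FILE_COUNT = 13_315
FILE_SIZE_MB = 14.5


def backlog_gb(file_count: int, file_size_mb: float) -> float:
    """Total size of the pending uploads in gigabytes."""
    return file_count * file_size_mb / 1000


total = backlog_gb(FILE_COUNT, FILE_SIZE_MB)
print(f"{total:.1f} GB")  # prints "193.1 GB"
```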
Send message Joined: 12 Apr 21 Posts: 317 Credit: 14,803,682 RAC: 19,762 |
That would be great. I have a total backlog of 13,315 files of 14.5 MB each to upload from 4 computers. That's around 193 GB. I have 100 GB in total and I thought I had a lot. :-) |
Send message Joined: 15 May 09 Posts: 4535 Credit: 18,989,107 RAC: 21,788 |
Strange, the number of users in the past 24 hours for this type of task has gone up from 1 to 2. This implies someone has finished a task and got it to report, though I see no sign of uploads shifting. I wonder if one of the older, smaller batches went to a different server? Email sent. Edit: I guess it could be computers that had finished uploading and were turned off before the one-hour backoff finished for them to report the tasks? |
Send message Joined: 23 Nov 19 Posts: 4 Credit: 6,597,088 RAC: 79,816 |
Strange, number of users in past 24 hours for this type of task has gone up from 1 to 2. This implies, someone has finished a task and got it to report though I see no sign of uploads shifting. I wonder if one of the older smaller batches went to a different server? Is the active users figure counting results or active jobs? Because I was sooo happy that CPDN had an abundance of jobs and joined the party, only to then find out I can't get rid of my results. Two machines crunching, two hard drives slowly filling up. /Oliver |
Send message Joined: 15 May 09 Posts: 4535 Credit: 18,989,107 RAC: 21,788 |
Is the active users counting results or active jobs? "Users in past 24 hours" means the number of users who have completed and reported tasks. Currently the only one on the server status page is from WAH2 Windows work, which goes to a different server. |
Send message Joined: 12 Apr 21 Posts: 317 Credit: 14,803,682 RAC: 19,762 |
... I was sooo happy the CPDN had an abundance of jobs and joined the party - only to then find out I can't get rid of my results. Thanks, that's funny. :-) Initially it's "Where's the work?!", now it's "How do I get rid of the results?!" |
©2024 cpdn.org